2013
In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain support vector machines that learn to extract relevant features from the input patterns or from the extracted features of one layer below. The highest level SVM performs the actual prediction using the highest-level extracted features as inputs. The system is trained by a simple gradient ascent learning rule on a min-max formulation of the optimization problem. A two-layer DSVM is compared to the regular SVM on ten regression datasets and the results show that the DSVM outperforms the SVM.
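A minimal Python sketch of the two-layer idea, assuming scikit-learn SVR components. The class name, the bootstrap-based feature extractors, and the greedy layer-by-layer fitting are illustrative stand-ins; the paper itself trains all layers jointly by gradient ascent on a min-max formulation, which this sketch only approximates.

```python
import numpy as np
from sklearn.svm import SVR

class TwoLayerDSVM:
    """Greedy sketch of a two-layer deep SVM: layer-1 SVRs extract features,
    a top-level SVR predicts from them (approximation of the joint training)."""
    def __init__(self, n_features=3, C=1.0, gamma="scale"):
        self.layer1 = [SVR(C=C, gamma=gamma) for _ in range(n_features)]
        self.top = SVR(C=C, gamma=gamma)

    def fit(self, X, y, seed=0):
        rng = np.random.default_rng(seed)
        for svr in self.layer1:
            # each feature extractor sees a different bootstrap of the data
            idx = rng.choice(len(X), size=len(X), replace=True)
            svr.fit(X[idx], y[idx])
        Z = np.column_stack([svr.predict(X) for svr in self.layer1])
        self.top.fit(Z, y)
        return self

    def predict(self, X):
        Z = np.column_stack([svr.predict(X) for svr in self.layer1])
        return self.top.predict(Z)
```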
2019
Support Vector Machines (SVMs) are one of the most popular machine learning models for supervised problems and have proved to achieve great performance in a broad range of prediction tasks. However, they can suffer from scalability issues when working with large sample sizes, a common situation in the big data era. On the other hand, Deep Neural Networks (DNNs) can handle large datasets with greater ease, and in this paper we propose Deep SVM models that combine the highly non-linear feature processing of DNNs with SVM loss functions. As we will show, these models can achieve performances similar to those of a standard SVM while having greater sample scalability.
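A hedged sketch of the classification side of this idea, assuming PyTorch; the layer sizes and names are illustrative, not taken from the paper. The network learns the features and a final linear layer is trained with the SVM hinge loss.

```python
import torch
import torch.nn as nn

class DeepSVM(nn.Module):
    """DNN feature extractor topped by a linear score, trained with a hinge loss."""
    def __init__(self, n_inputs, n_hidden=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(n_inputs, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
        )
        self.score = nn.Linear(n_hidden, 1)  # the "SVM" acting on learned features

    def forward(self, x):
        return self.score(self.features(x)).squeeze(-1)

def hinge_loss(scores, y):
    # y in {-1, +1}; optimizer weight decay plays the role of the margin term
    return torch.clamp(1.0 - y * scores, min=0).mean()

# usage sketch: opt = torch.optim.Adam(model.parameters(), weight_decay=1e-4)
```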
Integrated Computer-Aided Engineering, 2020
Kernel-based Support Vector Machines (SVMs), among the most popular machine learning models, usually achieve top performances in two-class classification and regression problems. However, their training cost is at least quadratic in sample size, which makes them unsuitable for large-sample problems. Deep Neural Networks (DNNs), in contrast, have a cost linear in sample size and are able to solve big data problems relatively easily. In this work we propose to combine the advanced representations that DNNs can achieve in their last hidden layers with the hinge and ε-insensitive losses that are used in two-class SVM classification and regression. We can thus obtain much better scalability while achieving performances comparable to those of SVMs. Moreover, we will also show that the resulting Deep SVM models are competitive with standard DNNs in two-class classification problems but have an edge in regression ones.
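For the regression side, the ε-insensitive loss referred to above can be written directly on top of the network's output; a hedged PyTorch sketch, where the ε value and the weight-decay surrogate for the SVM margin term are illustrative choices:

```python
import torch

def epsilon_insensitive_loss(pred, target, eps=0.1):
    # Zero penalty inside the epsilon-tube, linear penalty outside, as in SVR.
    return torch.clamp((pred - target).abs() - eps, min=0).mean()

# A squared (L2) variant is obtained with .pow(2) before .mean();
# the SVM regularizer 0.5 * ||w||^2 is approximated by optimizer weight decay.
```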
2013
Support Vector-based learning methods are an important part of Computational Intelligence techniques. Recent efforts have dealt with the problem of learning from very large datasets. This paper reviews the most commonly used formulations of support vector machines for regression (SVRs), aiming to emphasize their usability in large-scale applications. We review the general concept of support vector machines (SVMs), address the state of the art in SVM training methods, and explain the fundamental principle of SVRs. The most common learning methods for SVRs are introduced, and linear programming-based SVR formulations are explained, emphasizing their suitability for large-scale learning. Finally, the paper also discusses some open problems and current trends.
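For reference, the standard ε-insensitive SVR primal that such reviews start from is given below; the LP-based variants mentioned above are typically obtained by replacing the quadratic regularizer with an ℓ1 penalty on the expansion coefficients, which turns the optimization into a linear program.

```latex
\min_{w,\,b,\,\xi,\,\xi^{*}}\;\; \tfrac{1}{2}\lVert w\rVert^{2}
  + C\sum_{i=1}^{m}\bigl(\xi_i + \xi_i^{*}\bigr)
\quad\text{s.t.}\quad
\begin{cases}
  y_i - \langle w, x_i\rangle - b \;\le\; \epsilon + \xi_i,\\[2pt]
  \langle w, x_i\rangle + b - y_i \;\le\; \epsilon + \xi_i^{*},\\[2pt]
  \xi_i,\;\xi_i^{*} \;\ge\; 0,\qquad i = 1,\dots,m.
\end{cases}
```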
Convolutional Neural Networks (CNNs) are a class of supervised learning algorithms that are very similar to regular neural networks and aim to find an optimal predictive model that assigns the input to the correct label. In contrast to the Multilayer Perceptron (MLP) architecture, which uses fully connected layers, a CNN does not need to expose the entire feature space to every hidden-layer node; instead it breaks the input matrix into regions and connects each region to a single hidden node. With this regional breakdown and the assignment of small local groups of features to different hidden nodes, CNNs perform very well on image recognition tasks. A Support Vector Machine classifier, on the other hand, tries to separate the data into K classes by maximizing the distance between the differently labeled data. If the data are not linearly separable, an appropriate kernel function can map them into a higher dimension where they become linearly separable; the linear boundary is found there, and transforming it back to the original lower-dimensional space yields a non-linear separator. In this project we are going to replace the standard sigmoid activation function of the penultimate layer of the network with a linear Support Vector Machine classifier and investigate the performance differences. We will implement the standard CNN architecture as a benchmark model and see how it compares with a Deep Learning SVC, so that we can choose the best model for the final solution.
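A hedged PyTorch sketch of that swap. The small CNN below is only illustrative (the benchmark architecture is not specified above); the final linear layer produces class scores, and a multi-class hinge loss replaces the usual softmax/cross-entropy objective.

```python
import torch
import torch.nn as nn

class CNNWithSVMHead(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # linear SVM-style class scores (assumes 28x28 single-channel inputs)
        self.head = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        return self.head(self.conv(x).flatten(1))

model = CNNWithSVMHead()
svm_loss = nn.MultiMarginLoss(margin=1.0)   # multi-class hinge instead of cross-entropy
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=5e-4)  # decay ~ margin term
```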
2017 International Artificial Intelligence and Data Processing Symposium (IDAP), 2017
This paper introduces a novel deep recurrent support vector regressor (DRSVR) model for online regression. The DRSVR model is constructed from a state equation followed by an output construction. The inner layer is a least squares support vector regressor (LS-SVR) of the states with an adaptive kernel function. In addition, an infinite impulse response (IIR) filter is adopted in the model. The LS-SVR and the IIR filter together constitute an intermediate layer which performs the recursive state update. Each internal state has a recurrency which is a function of the observed input-output data and the previous states. Hence, the internal states track the temporal dependencies in the feature space. The outer layer is a linear combination of the states. The model parameters, including the Gaussian kernel width parameter, are updated simultaneously, which enables the model to capture the time-varying dynamics of the data quickly. Parameters are adaptively tuned using error-square minimization via c...
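The inner-layer LS-SVR component, in its standard batch (non-recursive) form, reduces to solving one linear system. A NumPy sketch under that assumption follows; the Gaussian kernel width is fixed here, so the adaptive kernel and the recursive IIR-based state update described above are not reproduced.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def lssvr_fit(X, y, gamma=1.0, C=10.0):
    """Batch LS-SVR: solve [[0, 1^T], [1, K + I/C]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, gamma) + np.eye(n) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]          # alpha, b

def lssvr_predict(X_train, alpha, b, X_new, gamma=1.0):
    return rbf_kernel(X_new, X_train, gamma) @ alpha + b
```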
Advances in Neural Information Processing Systems-9, 1997
Support Vector Regression Machines. Harris Drucker, Chris J. C. Burges, Linda Kaufman, Alex Smola, Vladimir Vapnik. Bell Labs and Monmouth University, Department of Electronic Engineering, West Long Branch, NJ 07764; Bell ...
arXiv (Cornell University), 2022
We propose a deep learning methodology for multivariable regression that is based on pattern recognition and enables fast learning over sensor data. We use a sensors-to-image conversion, which allows us to take advantage of computer vision architectures and training processes. In addition to this data preparation methodology, we explore the use of state-of-the-art architectures to generate regression outputs that predict continuous agricultural crop yield information. Finally, we compare with some of the top models reported in MLCAS2021. We found that, using a straightforward training process, we were able to achieve an MAE of 4.394, an RMSE of 5.945, and an R² of 0.861.
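A minimal sketch of the sensors-to-image step, assuming each sample is an (n_sensors × n_timesteps) block of readings; the min-max scaling and nearest-neighbour resizing are illustrative choices, not the paper's exact recipe, and the resulting arrays would then be fed to a standard CNN regression head.

```python
import numpy as np

def sensors_to_image(readings, size=(32, 32)):
    """Map a (n_sensors, n_timesteps) block of readings onto a fixed-size
    grayscale 'image' by min-max scaling and nearest-neighbour resampling."""
    lo, hi = readings.min(), readings.max()
    scaled = (readings - lo) / (hi - lo + 1e-8)
    rows = np.linspace(0, readings.shape[0] - 1, size[0]).astype(int)
    cols = np.linspace(0, readings.shape[1] - 1, size[1]).astype(int)
    return scaled[np.ix_(rows, cols)]   # shape == size, values in [0, 1]
```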
Neural Networks, 2010
This paper describes a new machine learning algorithm for regression and dimensionality reduction tasks. The Neural Support Vector Machine (NSVM) is a hybrid learning algorithm consisting of neural networks and support vector machines (SVMs). The output of the NSVM is given by SVMs that take a central feature layer as their input. The feature-layer representation is the output of a number of neural networks that are trained to minimize the dual objectives of the SVMs. Because the NSVM uses a shared feature layer, the learning architecture is able to handle multiple outputs and therefore it can also be used as a dimensionality reduction method. The results on 7 regression datasets show that the NSVM in general outperforms a standard SVM and a multi-layer perceptron. Furthermore, experiments on eye images show that the NSVM autoencoder outperforms state-of-the-art dimensionality reduction methods.
Recently, fully-connected and convolutional neural networks have been trained to achieve state-of-the-art performance on a wide variety of tasks such as speech recognition, image classification, natural language processing, and bioinformatics. For classification tasks, most of these "deep learning" models employ the softmax activation function for prediction and minimize cross-entropy loss. In this paper, we demonstrate a small but consistent advantage of replacing the softmax layer with a linear support vector machine. Learning minimizes a margin-based loss instead of the cross-entropy loss. While there have been various combinations of neural nets and SVMs in prior art, our results using L2-SVMs show that simply replacing softmax with linear SVMs gives significant gains on popular deep learning datasets: MNIST, CIFAR-10, and the ICML 2013 Representation Learning Workshop's face expression recognition challenge.
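The L2-SVM objective used there amounts to a one-vs-all squared hinge on the final linear layer's scores; a hedged PyTorch sketch of that loss:

```python
import torch

def l2_svm_loss(scores, labels):
    """scores: (batch, n_classes) outputs of the final linear layer;
    labels: (batch,) integer class ids. One-vs-all targets in {-1, +1},
    squared hinge (L2-SVM) as described above."""
    targets = torch.full_like(scores, -1.0)
    targets[torch.arange(scores.size(0)), labels] = 1.0
    return torch.clamp(1.0 - targets * scores, min=0).pow(2).sum(dim=1).mean()
```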
Journal of Imaging, 2021
Features play a crucial role in computer vision. Initially designed to detect salient elements by means of handcrafted algorithms, features are now often learned using different layers in convolutional neural networks (CNNs). This paper develops a generic computer vision system based on features extracted from trained CNNs. Multiple learned features are combined into a single structure to work on different image classification tasks. The proposed system was derived by testing several approaches for extracting features from the inner layers of CNNs and using them as inputs to support vector machines that are then combined by sum rule. Several dimensionality reduction techniques were tested for reducing the high dimensionality of the inner layers so that they can work with SVMs. The empirically derived generic vision system, based on applying a discrete cosine transform (DCT) separately to each channel, is shown to significantly boost the performance of standard CNNs across a large and ...
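A rough Python sketch of that pipeline, assuming the inner-layer activations have already been extracted as NumPy arrays; the DCT truncation size and the choice of SVM are illustrative, not the tuned settings of the paper.

```python
import numpy as np
from scipy.fft import dct
from sklearn.svm import SVC

def dct_features(feature_maps, keep=8):
    """feature_maps: (n_samples, channels, h, w) activations from one inner CNN layer.
    Apply a 2-D DCT separately to each channel and keep the low-frequency
    coefficients to reduce dimensionality before the SVM."""
    coeffs = dct(dct(feature_maps, axis=-1, norm="ortho"), axis=-2, norm="ortho")
    return coeffs[..., :keep, :keep].reshape(feature_maps.shape[0], -1)

def fuse_by_sum_rule(layer_features, labels, train_idx, test_idx):
    """Train one SVM per layer's descriptor and fuse their scores by the sum rule."""
    total = None
    for feats in layer_features:                      # one array per inner layer
        clf = SVC(kernel="rbf").fit(feats[train_idx], labels[train_idx])
        scores = clf.decision_function(feats[test_idx])
        total = scores if total is None else total + scores
    return total.argmax(axis=1) if total.ndim > 1 else (total > 0).astype(int)
```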
The 2010 International Joint Conference on Neural Networks (IJCNN), 2010
Support Vector Machines (SVMs) with various kernels have played a dominant role in machine learning for many years, finding numerous applications. Although they have many attractive features, interpretation of their solutions is quite difficult, the use of a single kernel type may not be appropriate in all areas of the input space, convergence problems for some kernels are not uncommon, and the standard quadratic programming solution has O(m³) time and O(m²) space complexity for m training patterns. Kernel methods work because they implicitly provide new, useful features. Such features, derived from various kernels and other vector transformations, may be used directly in any machine learning algorithm, facilitating multiresolution, heterogeneous models of data. Therefore Support Feature Machines (SFMs), based on linear models in the extended feature spaces and enabling control over the selection of support features, give at least as good results as any kernel-based SVM, removing all problems related to interpretation, scaling and convergence. This is demonstrated for a number of benchmark datasets analyzed with linear discrimination, SVM, decision trees and nearest neighbor methods.
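A minimal sketch of the support-feature idea under stated assumptions: kernel values against a set of reference vectors are used as explicit new features, appended to the raw inputs, and an ordinary linear model is trained in the extended space. The reference-vector choice and the RBF width below are illustrative, not the SFM selection procedure itself.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def support_features(X, reference, gamma=0.5):
    """Explicit kernel-derived features: one RBF similarity per reference vector,
    appended to the raw inputs to form the extended feature space."""
    return np.hstack([X, rbf_kernel(X, reference, gamma=gamma)])

# usage sketch (X_train, y_train, X_test assumed to exist):
# ref = X_train[:50]                       # e.g. a subset of training points
# clf = make_pipeline(StandardScaler(), LinearSVC())
# clf.fit(support_features(X_train, ref), y_train)
# preds = clf.predict(support_features(X_test, ref))
```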
Support Vector Machine, 2019
A Support Vector Machine (SVM) is a discriminative classifier that can be used for both classification and regression problems. The goal of SVM is to identify an optimal separating hyperplane which maximizes the margin between different classes of the training data. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane that categorizes new examples while creating the largest possible margin, which reduces an upper bound on the generalization error. Support vectors are simply the data points that lie nearest to the optimal separating hyperplane; they provide the most useful information for SVM classification. In addition, an appropriate kernel function can be used to transform the data into a higher-dimensional space in which linear discriminant functions can be applied.
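A short scikit-learn example of the procedure described above on a toy non-linearly separable dataset; the dataset and hyperparameters are only illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Non-linearly separable toy data: the RBF kernel implicitly maps it to a
# space where a maximum-margin hyperplane exists.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
print("number of support vectors:", clf.support_vectors_.shape[0])
print("test accuracy:", clf.score(X_te, y_te))
```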
1999
In this report we show that the ε-tube size in Support Vector Machine (SVM) for regression is 2ε/√(1 + ||w||²). By using this result we show that, in the case where all the data points are inside the ε-tube, minimizing ||w||² in SVM for regression is equivalent to maximizing the distance between the approximating hyperplane and the farthest points in the training set. Moreover, in the most general setting, in which the data points also live outside the ε-tube, we show that, for a fixed value of ε, minimizing ||w||² is equivalent to maximizing the sparsity of the representation of the optimal approximating hyperplane, that is, equivalent to minimizing the number of coefficients different from zero in the expression of the optimal w. Then, the solution found by SVM for regression is a tradeoff between sparsity of the representation and closeness to the data. We also include a complete derivation of SVM for regression in the case of linear approximation.
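One way to see the tube-size result: in the joint (x, y) space the ε-tube is bounded by two parallel hyperplanes with common normal vector (w, -1), so its width is the distance between them.

```latex
y = \langle w, x\rangle + b + \epsilon
\quad\text{and}\quad
y = \langle w, x\rangle + b - \epsilon
\;\Longrightarrow\;
d \;=\; \frac{2\epsilon}{\lVert (w,\,-1)\rVert}
   \;=\; \frac{2\epsilon}{\sqrt{1 + \lVert w\rVert^{2}}}.
```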