Recently, fully connected and convolutional neural networks have been trained to achieve state-of-the-art performance on a wide variety of tasks such as speech recognition, image classification, natural language processing, and bioinformatics. For classification tasks, most of these "deep learning" models employ the softmax activation function for prediction and minimize cross-entropy loss. In this paper, we demonstrate a small but consistent advantage of replacing the softmax layer with a linear support vector machine. Learning minimizes a margin-based loss instead of the cross-entropy loss. While there have been various combinations of neural nets and SVMs in prior art, our results using L2-SVMs show that simply replacing softmax with linear SVMs gives significant gains on popular deep learning datasets: MNIST, CIFAR-10, and the ICML 2013 Representation Learning Workshop's facial expression recognition challenge.
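As a hedged illustration of the idea described above, the following PyTorch sketch implements a one-vs-rest squared-hinge (L2-SVM) objective that can replace cross-entropy on the raw scores of the final linear layer. The class name, the constant c, and the use of optimizer weight decay for the ½‖w‖² term are our own choices, not details taken from the paper.

```python
import torch
import torch.nn as nn

class L2SVMLoss(nn.Module):
    """One-vs-rest squared hinge (L2-SVM) loss; a drop-in replacement for
    nn.CrossEntropyLoss on raw, un-normalised class scores."""
    def __init__(self, c: float = 1.0):
        super().__init__()
        self.c = c  # weight on the margin-violation term (assumed hyperparameter)

    def forward(self, scores: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Encode targets as +1 for the true class and -1 for every other class.
        y = -torch.ones_like(scores)
        y.scatter_(1, target.unsqueeze(1), 1.0)
        # sum_k max(0, 1 - y_k * s_k)^2, averaged over the batch.
        margin = torch.clamp(1.0 - y * scores, min=0.0)
        return self.c * (margin ** 2).sum(dim=1).mean()

# Usage: criterion = L2SVMLoss(); the ||w||^2 / 2 term of the SVM objective can be
# approximated with weight decay on the final linear layer's parameters.
```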
Integrated Computer-Aided Engineering, 2020
Kernel-based Support Vector Machines (SVMs), among the most popular machine learning models, usually achieve top performance in two-class classification and regression problems. However, their training cost is at least quadratic in sample size, which makes them unsuitable for large-sample problems. Deep Neural Networks (DNNs), with a cost linear in sample size, can instead handle big data problems relatively easily. In this work we propose to combine the advanced representations that DNNs build in their last hidden layers with the hinge and ϵ-insensitive losses used in two-class SVM classification and regression. We can thus obtain much better scalability while achieving performance comparable to that of SVMs. Moreover, we also show that the resulting Deep SVM models are competitive with standard DNNs in two-class classification problems and have an edge in regression ones.
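A minimal sketch of the regression side of this idea, assuming PyTorch: a small MLP whose linear output is trained with an ϵ-insensitive (SVR-style) loss. The layer sizes and the value of ϵ are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class EpsilonInsensitiveLoss(nn.Module):
    """SVR-style epsilon-insensitive loss: deviations within +/-epsilon cost nothing."""
    def __init__(self, epsilon: float = 0.1):
        super().__init__()
        self.epsilon = epsilon

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # target is expected to have the same shape as pred.
        return torch.clamp((pred - target).abs() - self.epsilon, min=0.0).mean()

# The last hidden layer plays the role of the SVM's feature map; its linear
# output is trained directly with the epsilon-insensitive loss.
model = nn.Sequential(nn.Linear(20, 128), nn.ReLU(),
                      nn.Linear(128, 64), nn.ReLU(),
                      nn.Linear(64, 1))
criterion = EpsilonInsensitiveLoss(epsilon=0.1)
```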
2019
Support Vector Machines (SVMs) are among the most popular machine learning models for supervised problems and have proved to achieve great performance in a broad range of prediction tasks. However, they can suffer from scalability issues when working with large sample sizes, a common situation in the big data era. On the other hand, Deep Neural Networks (DNNs) can handle large datasets with greater ease, and in this paper we propose Deep SVM models that combine the highly non-linear feature processing of DNNs with SVM loss functions. As we will show, these models can achieve performance similar to that of standard SVMs while offering greater sample scalability.
Journal of Imaging, 2021
Features play a crucial role in computer vision. Initially designed to detect salient elements by means of handcrafted algorithms, features are now often learned using different layers in convolutional neural networks (CNNs). This paper develops a generic computer vision system based on features extracted from trained CNNs. Multiple learned features are combined into a single structure to work on different image classification tasks. The proposed system was derived by testing several approaches for extracting features from the inner layers of CNNs and using them as inputs to support vector machines that are then combined by sum rule. Several dimensionality reduction techniques were tested to reduce the high dimensionality of the inner layers so that they can work with SVMs. The empirically derived generic vision system, based on applying a discrete cosine transform (DCT) separately to each channel, is shown to significantly boost the performance of standard CNNs across a large and ...
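The pipeline described above can be sketched roughly as follows (Python, with SciPy and scikit-learn). The number of retained DCT coefficients, the RBF kernel, and the helper names dct_reduce and sum_rule are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np
from scipy.fftpack import dct
from sklearn.svm import SVC

def dct_reduce(activation, keep=20):
    """Apply a 2-D DCT separately to each channel of a (C, H, W) inner-layer
    activation and keep the top-left keep x keep low-frequency coefficients
    (assumes H, W >= keep) as a compact descriptor."""
    coeffs = dct(dct(activation, axis=-1, norm='ortho'), axis=-2, norm='ortho')
    return coeffs[:, :keep, :keep].ravel()

def sum_rule(prob_list):
    """Fuse per-layer SVM outputs by summing (averaging) class probabilities."""
    return np.mean(prob_list, axis=0).argmax(axis=1)

# One SVM per extracted CNN layer, each trained on DCT-reduced features, e.g.:
# svms = [SVC(kernel='rbf', probability=True).fit(X_layer, y) for X_layer in train_feats]
# preds = sum_rule([clf.predict_proba(X_te) for clf, X_te in zip(svms, test_feats)])
```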
Convolutional Neural Networks (CNNs) are a class of supervised learning algorithms, closely related to regular neural networks, that aim to find an optimal predictive model assigning each input to the correct label. In contrast to the Multilayer Perceptron (MLP) architecture, which uses fully connected layers, a CNN does not feed the entire feature space to every hidden node; instead, it breaks the input matrix into regions and connects each region to a single hidden node. With this regional breakdown and the assignment of small local groups of features to different hidden nodes, CNNs perform very well on image recognition tasks. A Support Vector Machine classifier, on the other hand, tries to separate the data into K classes by maximizing the margin between differently labeled data. If the data are not linearly separable, an appropriate kernel function can map them into a higher-dimensional space where they become linearly separable; the linear boundary found there corresponds to a non-linear separator in the original, lower-dimensional space. In this project we replace the standard sigmoid activation function of the penultimate layer of the network with a linear Support Vector Machine classifier and investigate the performance differences. We implement the standard CNN architecture as a benchmark model and compare it with a deep learning SVC in order to choose the best model for the final solution.
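One way to realize the combination described in this project, sketched here under the assumption of a PyTorch feature extractor and scikit-learn's LinearSVC: the flattened activations of the penultimate stage are used as inputs to a linear SVM. The tiny two-block architecture is illustrative only, not the project's actual network.

```python
import torch
import torch.nn as nn
from sklearn.svm import LinearSVC

# Illustrative CNN trunk; its flattened output plays the role of the
# penultimate-layer features handed to the SVM classifier.
features = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
)

def extract(images: torch.Tensor):
    # Frozen forward pass; returns a NumPy matrix of per-image feature vectors.
    with torch.no_grad():
        return features(images).numpy()

# svm = LinearSVC(C=1.0).fit(extract(train_images), train_labels)
# accuracy = svm.score(extract(test_images), test_labels)
```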
IJEER, 2022
Facial emotion recognition has been a very popular research area in the last few decades, and it is a challenging and complex task due to large intra-class variation. Existing frameworks for this type of problem depend mostly on techniques such as Gabor filters, principal component analysis (PCA), and independent component analysis (ICA), followed by classification techniques trained on the given videos and images. Most of these frameworks work reasonably well on image databases acquired under limited conditions but do not perform well on dynamic images with varying faces. In past years, various researchers have introduced frameworks for facial emotion recognition using deep learning methods. Although they work well, gaps remain in this research. In this work, we introduce a hybrid approach based on RNNs and CNNs that is able to retrieve important parts of the given data and achieves very good results on databases such as EMOTIC, FER-2013, and FERG. We also show that our hybrid framework accomplishes promising accuracies on these datasets.
IEEE Access, 2021
In recent years, researchers have proposed many deep learning (DL) methods for various tasks, and face recognition (FR) in particular has made an enormous leap using these techniques. Deep FR systems benefit from the hierarchical architecture of DL methods to learn discriminative face representations. Therefore, DL techniques significantly improve state-of-the-art performance of FR systems and encourage diverse and efficient real-world applications. In this paper, we present a comprehensive analysis of various FR systems that leverage different types of DL techniques, summarizing 171 recent contributions from this area. We discuss papers related to different algorithms, architectures, loss functions, activation functions, datasets, challenges, improvement ideas, and current and future trends of DL-based FR systems. We provide a detailed discussion of various DL methods to understand the current state of the art, and then we discuss various activation and loss f...
With the recent advancement of digital technologies, data sets have become so large that traditional data processing and machine learning techniques cannot cope with them effectively. Analyzing complex, high-dimensional, and noise-contaminated data sets is a huge challenge, and it is crucial to develop novel algorithms that can summarize, classify, and extract important information and convert it into an understandable form.
International Journal of Engineering Trends and Technology, 2023
The inspiration behind the great attention given to face recognition systems by the research community and computer vision specialists is the need to enhance their effectiveness, accuracy, and speed. Recognizing the human face by machine is complex due to variations in pose, illumination, age, facial expression, occlusion, personal appearance, and cosmetic effects, which makes face recognition challenging and a robust computational system difficult to implement. The study's main goal is to enhance current deep learning approaches to face recognition using an efficient hybrid deep learning method that combines a multi-layer CNN and an SVM. The model incorporates a newly developed middle block convolutional regularization algorithm (MBCRA) and a pre-activation batch normalization method for computational stability and convergence speed. The combination of CNN and SVM enables the system to obtain more significant face features from the images of the proposed AS_Darmaset, a database with six classes of images, each containing face images with a specific variation problem. The experimental results demonstrate that the multi-layer CNN+SVM achieves 99.87% accuracy, and the comparative analysis shows that the proposed model is more resilient for face image classification under unconstrained settings than the most developed deep learning models for face recognition.
iris.sel.eesc.usp.br
2013
In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain support vector machines that learn to extract relevant features from the input patterns or from the extracted features of one layer below. The highest level SVM performs the actual prediction using the highest-level extracted features as inputs. The system is trained by a simple gradient ascent learning rule on a min-max formulation of the optimization problem. A two-layer DSVM is compared to the regular SVM on ten regression datasets and the results show that the DSVM outperforms the SVM.
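For reference, the min-max formulation mentioned above builds on the standard kernel SVM dual; the sketch below is for the two-class case, with θ denoting the parameters of the lower-layer feature extractors f_θ (our notation, not necessarily the paper's; the regression experiments would use the analogous ε-SVR dual).

```latex
\min_{\theta}\; \max_{\alpha}\;
  \sum_{i} \alpha_i
  - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j\, y_i y_j\,
      K\!\bigl(f_{\theta}(x_i),\, f_{\theta}(x_j)\bigr)
\qquad \text{s.t.}\quad 0 \le \alpha_i \le C,\;\; \sum_i \alpha_i y_i = 0 .
```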
2017 IEEE International Conference on Consumer Electronics (ICCE), 2017
With the growing capacity and processing power of today's handheld devices, a wide range of capabilities can be implemented to make them more intelligent and user friendly. Determining the mood of the user can be used to provide suitable reactions from the device under different conditions. One of the most studied ways of mood detection is through facial expressions, which remains one of the challenging fields in pattern recognition and machine learning. Deep Neural Networks (DNNs) have been widely used to overcome the difficulties in facial expression classification. In this paper it is shown that classification accuracy is significantly lower when the network is trained on one database and tested on a different database. A solution for obtaining a general and robust network is given as well.
2019 25th International Conference on Automation and Computing (ICAC), 2019
Facial expressions are important in people's daily communication. Recognising facial expressions also has many important applications in areas such as healthcare and e-learning. Existing facial expression recognition systems suffer from problems such as background interference. Furthermore, systems using traditional approaches like the SVM (Support Vector Machine) are weak at dealing with unseen images, while systems using deep neural networks require a GPU, long training times, and large amounts of memory. To overcome the shortcomings of pure deep neural networks and traditional facial recognition approaches, this paper presents a new facial expression recognition approach that uses image preprocessing techniques to remove unnecessary background information and combines the deep neural network ResNet50 with a traditional classifier, the multiclass Support Vector Machine, to recognise facial expressions. The proposed approach has better recognition accuracy than traditional approaches like the Support Vector Machine and does not need a GPU. We compared three proposed frameworks with a traditional SVM approach on the Karolinska Directed Emotional Faces (KDEF) database, the Japanese Female Facial Expression (JAFFE) database, and the extended Cohn-Kanade dataset (CK+). The experimental results show that the features extracted from the layer 49Relu give the best performance on these three datasets.
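A rough sketch of the feature-extraction step, assuming torchvision's pretrained ResNet50 and scikit-learn's SVC; the paper's "layer 49Relu" activation is approximated here by the output of the final pooling stage, and the kernel and C value are placeholders.

```python
import torch
import torch.nn as nn
import torchvision.models as models
from sklearn.svm import SVC

# Drop the classification head so the backbone returns pooled 2048-d embeddings.
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone = nn.Sequential(*list(resnet.children())[:-1], nn.Flatten())
backbone.eval()

def embed(batch: torch.Tensor):
    """batch: (N, 3, 224, 224) ImageNet-normalised face crops."""
    with torch.no_grad():
        return backbone(batch).numpy()

# Multiclass SVM (scikit-learn handles the multiclass decomposition) on embeddings:
# clf = SVC(kernel='linear', C=1.0).fit(embed(train_faces), train_labels)
# print(clf.score(embed(test_faces), test_labels))
```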
Cornell University - arXiv, 2018
In recent years, deep learning has garnered tremendous success in a variety of application domains. This new field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as some new areas that present more opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and unsupervised learning. Experimental results show state-of-the-art performance using deep learning when compared to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing (NLP), cybersecurity, and many others. This report presents a brief survey of the advances that have occurred in the area of DL, starting with the Deep Neural Network (DNN). The survey goes on to cover the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN) including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), the Auto-Encoder (AE), the Deep Belief Network (DBN), the Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). Additionally, we include recent developments such as advanced variant DL techniques based on these approaches. This work considers most of the papers published after 2012, when the modern history of deep learning began. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey, along with recently developed frameworks, SDKs, and benchmark datasets used for implementing and evaluating deep learning approaches. Some surveys have been published on deep learning using neural networks [1, 38], as well as a survey on RL [234]. However, those papers did not discuss the individual advanced techniques for training large-scale deep learning models or the recently developed methods of generative models [1].
Electronics
In recent years, deep learning has garnered tremendous success in a variety of application domains. This new field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as some new areas that present more opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and unsupervised learning. Experimental results show state-of-the-art performance using deep learning when compared to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing, cybersecurity, and many others. This article presents a brief survey of the advances that have occurred in the area of Deep Learning (DL), starting with the Deep Neural Network (DNN). The survey goes on to cover Convolutiona...
International Journal of Machine Learning and Computing
Although there have been many breakthroughs in the use of convolutional neural networks (CNNs) for image classification, facial expression recognition (FER) in real life is still a challenge in this research area. This paper proposes a method to leverage state-of-the-art multi-deep CNN encoders with support vector machines (SVMs) to classify facial expressions. We conducted experiments showing that combining features from multiple deep CNNs is better than using a single deep CNN model. Beyond combining multiple CNN models, we show that using rules to remove noisy images from the training dataset improves the performance of the FER system. The FER2013 dataset was used to evaluate the proposed approach, which achieved 73.78% accuracy. Index Terms: convolutional neural networks, deep convolutional neural network features, facial expression recognition in the wild, FER2013.
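A minimal sketch of the feature-fusion step, assuming pre-computed embeddings from several CNN encoders (the variable names below are placeholders): the per-image descriptors are concatenated and a single SVM is trained on the joint vector.

```python
import numpy as np
from sklearn.svm import LinearSVC

def fuse(feature_sets):
    """Concatenate per-image embeddings from several CNN encoders.
    feature_sets: list of (N, d_k) arrays, one per encoder."""
    return np.concatenate(feature_sets, axis=1)

# Hypothetical usage with three encoders' embeddings:
# X_train = fuse([vgg_feats, resnet_feats, densenet_feats])
# clf = LinearSVC(C=1.0).fit(X_train, train_labels)
```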