Papers by Alexander Binder
We have developed an R interface for our machine learning toolbox SHOGUN. It features algorithms to train hidden Markov models and to learn regression and two-class classification problems. While the toolbox's focus is on kernel methods such as Support Vector Machines, it also implements a number of linear methods such as Linear Discriminant Analysis, Linear Programming Machines and Perceptrons.
2010 20th International Conference on Pattern Recognition, 2010
In recent years, bag-of-visual-words representations have gained increasing popularity in the field of image classification. Their performance relies heavily on creating a good visual vocabulary from a set of image features (e.g. SIFT). For real-world photo archives such as Flickr, codebooks larger than a few thousand words are desirable, but building them with standard k-means clustering is infeasible. In this paper, we propose a two-step procedure that generates more informative codebooks efficiently by class-wise k-means and a novel procedure for word selection. Our approach compared favorably with the standard k-means procedure on the PASCAL VOC data sets.
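The class-wise clustering step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `kmeans` and `classwise_codebook`, the deterministic initialisation, and the toy 2-D descriptors are all assumptions, and the word-selection step is omitted because the abstract does not describe it.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Assumption: the first k points serve as initial centers so the
    sketch stays deterministic.
    """
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Recompute each center as its cluster mean; an empty cluster keeps its center.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers


def classwise_codebook(features_by_class, words_per_class):
    """Cluster each class separately and concatenate the resulting centers."""
    codebook = []
    for label in sorted(features_by_class):
        codebook.extend(kmeans(features_by_class[label], words_per_class))
    return codebook


# Toy 2-D "descriptors" for two hypothetical classes.
features_by_class = {
    "cat": [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)],
    "dog": [(5.0, 5.0), (5.0, 6.0), (-5.0, -5.0), (-5.0, -6.0)],
}
codebook = classwise_codebook(features_by_class, words_per_class=2)
print(codebook)
```

Because each class is clustered on its own, every k-means run sees only a fraction of the data, which is one plausible reason a class-wise scheme scales to larger vocabularies than a single global clustering.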
A hybrid supervised-unsupervised visual vocabulary algorithm for concept recognition
Vocabulary generation is the essential step in the bag-of-words image representation for visual concept recognition, because its quality affects classification performance substantially. In this paper, we propose a hybrid method for visual word ...

IEICE Technical Report (電子情報通信学会技術研究報告), Nov 26, 2009
Recent research has shown that combining various image features significantly improves object classification performance. Multiple kernel learning (MKL) approaches, where the mixing weights at the kernel level are optimized simultaneously with the classifier parameters, give a well-founded framework to control the importance of each feature. As alternatives, we can also use boosting approaches, where single-kernel classifier outputs are combined with the optimal mixing weights. Most of those approaches employ an ℓ1-regularization on the mixing weights that promotes sparse solutions. Although sparsity offers several advantages, e.g., interpretability and less computation time in the test phase, the accuracy of sparse methods is often even worse than that of the simplest flat-weight combination. In this paper, we compare the accuracy of our recently developed non-sparse methods with the standard sparse counterparts on the PASCAL VOC ...
Device for tensioning components for laser micro-welding of thin components has a part of a tensioning unit facing the laser beam which is transparent
B 17 Machine Learning, Pattern Recognition in Image Processing
Layer-Wise Relevance Propagation for Deep Neural Network Architectures
Lecture Notes in Electrical Engineering, 2016
Method and System for the Automatic Analysis of an Image of a Biological Sample
Multi-modal identification and tracking of vehicles in partially observed environments
2014 International Conference on Indoor Positioning and Indoor Navigation (IPIN), 2014
Monatsschrift Kinderheilkunde, 2014
Genetic analyses and "biobanking" for research on infectious diseases in children
Pädiatrie & Pädologie, 2014

Deep Neural Networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multi-layer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the "importance" of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012 and MIT Places data sets. Our main result is that the recently proposed Layer-wise Relevance Propagation (LRP) algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of neural network performance.
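The idea of evaluating a heatmap by perturbing pixels in order of relevance can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's exact protocol: single pixels stand in for regions, "perturbing" is assumed to mean overwriting with a constant `fill_value`, the classifier is a hypothetical linear scorer, and the names `perturbation_curve` and `mean_score_drop` are invented for this example.

```python
def perturbation_curve(image, heatmap, score_fn, fill_value=0.0):
    """Classifier score after perturbing 0, 1, 2, ... pixels, most relevant first."""
    order = sorted(range(len(image)), key=lambda i: heatmap[i], reverse=True)
    perturbed = list(image)
    curve = [score_fn(perturbed)]
    for idx in order:
        # Assumption: a perturbation overwrites the pixel with a constant value.
        perturbed[idx] = fill_value
        curve.append(score_fn(perturbed))
    return curve


def mean_score_drop(curve):
    """Average drop relative to the unperturbed score; larger means a better ranking."""
    return sum(curve[0] - s for s in curve[1:]) / (len(curve) - 1)


# Hypothetical linear classifier over a flattened 4-pixel image.
weights = [3.0, 0.0, 1.0, 2.0]


def score_fn(pixels):
    return sum(w * p for w, p in zip(weights, pixels))


image = [1.0, 1.0, 1.0, 1.0]
good_heatmap = [3.0, 0.0, 1.0, 2.0]  # ranks pixels by their true contribution
bad_heatmap = [0.0, 3.0, 2.0, 1.0]   # ranks the irrelevant pixel first

good_curve = perturbation_curve(image, good_heatmap, score_fn)
bad_curve = perturbation_curve(image, bad_heatmap, score_fn)
print(mean_score_drop(good_curve), mean_score_drop(bad_curve))
```

A heatmap that identifies the truly important pixels makes the score fall quickly, so its mean drop is larger. This gives an objective number for comparing heatmap methods without relying on human judgment.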
This paper studies the generalization performance of multi-class classification algorithms, for which we obtain, for the first time, a data-dependent generalization error bound with a logarithmic dependence on the class size, substantially improving the state-of-the-art linear dependence in the existing data-dependent generalization analysis. The theoretical analysis motivates us to introduce a new multi-class classification machine based on $\ell_p$-norm regularization, where the parameter $p$ controls the complexity of the corresponding bounds. We derive an efficient optimization algorithm based on Fenchel duality theory. Benchmarks on several real-world datasets show that the proposed algorithm can achieve significant accuracy gains over the state of the art.
We propose a localized approach to multiple kernel learning that, in contrast to prevalent approaches, can be formulated as a convex optimization problem over a given cluster structure. From this formulation we obtain the first generalization error bounds for localized multiple kernel learning and derive an efficient optimization algorithm based on the Fenchel dual representation. Experiments on real-world datasets from the application domains of computational biology and computer vision show that the convex approach to localized multiple kernel learning can achieve higher prediction accuracies than its global and non-convex local counterparts.

Identification of vehicle tracks and association to wireless endpoints by multiple sensor modalities
International Conference on Indoor Positioning and Indoor Navigation, 2013
Vehicular positioning technologies enable a broad range of applications and services such as navigation systems, driver assistance systems and self-driving vehicles. However, Global Navigation Satellite Systems (GNSS) do not work in enclosed areas such as parking garages. For these scenarios, a wide range of indoor positioning technologies is available, both inside the vehicle (internal) and based on infrastructure (external). Building on our previous work, we use off-the-shelf network video cameras to detect the position of moving vehicles within the parking garage in multiple non-overlapping camera views. To use this system as a positioning source for vehicles, detected positions need to be transmitted to the communication endpoint in the correct vehicle. The key problem is the association of the externally observed position with the endpoint in the corresponding vehicle. State-of-the-art tracking-by-detection techniques can differentiate multiple camera-detected vehicles, but the generated tracks are anonymous and cannot inherently be associated with the corresponding vehicle. To bridge this gap, we present a tracking-by-identification solution that analyzes vehicle movement patterns from multiple vehicle sensor modalities and compares them with camera-detected tracks to identify the track with the best correlation. The presented approach is based on Kalman filters and is suitable for real-time operation. Test results show that a correct and robust association between endpoints and camera-detected tracks is achieved and that occurring identity switches can be resolved.
A procedure of adaptive kernel combination with kernel-target alignment for object classification
Proceeding of the ACM International Conference on Image and Video Retrieval - CIVR '09, 2009
In order to achieve good performance in object classification problems, it is necessary to combine information from various image features. Because the large margin classifiers are constructed based on similarity measures between samples called kernels, ...
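The abstract above is truncated, so the authors' actual combination procedure is not shown here. As a rough illustration of the kernel-target alignment named in the title, the sketch below computes the standard alignment score, the normalized Frobenius inner product between a kernel matrix and the label matrix yy^T. Weighting each kernel in proportion to its alignment is a hypothetical heuristic chosen for this example only, and all function names are assumptions.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product of two equally sized matrices (lists of rows)."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def kernel_target_alignment(kernel, labels):
    """Alignment between a kernel matrix and the ideal target matrix y y^T."""
    target = [[yi * yj for yj in labels] for yi in labels]
    return frobenius_inner(kernel, target) / math.sqrt(
        frobenius_inner(kernel, kernel) * frobenius_inner(target, target)
    )


labels = [1, 1, -1, -1]
# A kernel that matches the labels perfectly, and an uninformative identity kernel.
ideal_kernel = [[yi * yj for yj in labels] for yi in labels]
identity_kernel = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

alignments = [
    kernel_target_alignment(ideal_kernel, labels),
    kernel_target_alignment(identity_kernel, labels),
]

# Hypothetical heuristic: mixing weights proportional to alignment, summing to one.
weights = [a / sum(alignments) for a in alignments]
combined_kernel = [
    [weights[0] * k0 + weights[1] * k1 for k0, k1 in zip(row0, row1)]
    for row0, row1 in zip(ideal_kernel, identity_kernel)
]
print(alignments, weights)
```

The score reaches 1 only when the kernel is proportional to yy^T, so it gives a cheap way to rank candidate kernels before any classifier is trained.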
Lecture Notes in Computer Science, 2011
In object classification tasks from digital photographs, multiple categories are considered for annotation. Some of these visual concepts may have semantic relations and can appear simultaneously in images. Although taxonomical relations and co-occurrence structures between object categories have been studied, it is not easy to use such information to enhance performance of object classification. In this paper, we propose a novel multi-task learning procedure which extracts useful information from the classifiers for the other categories. Our approach is based on non-sparse multiple kernel learning (MKL) which has been successfully applied to adaptive feature selection for image classification. Experimental results on PASCAL VOC 2009 data show the potential of our method.
Lecture Notes in Computer Science, 2010
In order to achieve good performance in image annotation tasks, it is necessary to combine information from various image features. In recent competitions on photo annotation, many groups employed bag-of-words (BoW) representations based on SIFT descriptors over various color channels. In fact, it has been observed that adding other, less informative features to the standard BoW degrades recognition performance. In this contribution, we show that even primitive color histograms can enhance the standard classifiers in the ImageCLEF 2009 photo annotation task, if the feature weights are tuned optimally by the non-sparse multiple kernel learning (MKL) proposed by Kloft et al. Additionally, we propose a sorting scheme for image subregions to deal with spatial variability within each visual concept.

PLOS ONE, 2015
Understanding and interpreting classification decisions of automated image classification systems is of high value in many applications, as it allows one to verify the reasoning of the system and provides additional information to the human expert. Although machine learning methods successfully solve a plethora of tasks, in most cases they have the disadvantage of acting as a black box, not providing any information about what made them arrive at a particular decision. This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers. We introduce a methodology that makes it possible to visualize the contributions of single pixels to predictions for kernel-based classifiers over Bag of Words features and for multilayered neural networks. These pixel contributions can be visualized as heatmaps and are provided to a human expert, who can intuitively not only verify the validity of the classification decision, but also focus further analysis on regions of potential interest. We evaluate our method for classifiers trained on PASCAL VOC 2009 images, synthetic image data containing geometric shapes, the MNIST handwritten digits data set and for the pre-trained ImageNet model available as part of the Caffe open source package.
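A pixel-wise decomposition for a multilayered network can be illustrated with a toy example. The sketch below assumes a two-layer ReLU network without biases and a simple proportional redistribution rule with a small stabilizer `eps`. It is a simplified illustration rather than the paper's full formulation, and the names `forward` and `redistribute` are invented here.

```python
def forward(x, w1, w2):
    """Two-layer ReLU network without biases; returns hidden activations and score."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, w_j))) for w_j in w1]
    output = sum(h * w for h, w in zip(hidden, w2))
    return hidden, output


def redistribute(activations, weights, relevance_out, eps=1e-9):
    """Split each upper-layer relevance over lower units in proportion to a_i * w_ij."""
    relevance_in = [0.0] * len(activations)
    for w_j, r_j in zip(weights, relevance_out):
        contributions = [a * w for a, w in zip(activations, w_j)]
        total = sum(contributions)
        # Stabilizer keeps the denominator away from zero (an assumption of this sketch).
        total += eps if total >= 0 else -eps
        for i, z in enumerate(contributions):
            relevance_in[i] += z / total * r_j
    return relevance_in


x = [1.0, 2.0, 0.0]                        # toy 3-pixel input
w1 = [[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]]   # one weight vector per hidden unit
w2 = [2.0, 1.0]                            # output weights

hidden, output = forward(x, w1, w2)
# Start with the output score as total relevance and propagate it back layer by layer.
hidden_relevance = redistribute(hidden, [w2], [output])
pixel_relevance = redistribute(x, w1, hidden_relevance)
print(output, pixel_relevance)
```

In this setup the pixel relevances sum (up to the stabilizer) to the classifier score, so the resulting heatmap can be read as a decomposition of the prediction over input pixels.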