Papers by Marco A. Wiering

Pattern Recognition Applications and Methods, 2012
Abstract: Feature extraction techniques can be important in character recognition, because they can enhance the efficacy of recognition in comparison to featureless or pixel-based approaches. This study aims to investigate the novel feature extraction technique called the hotspot technique in order to use it for representing handwritten characters and digits. In the hotspot technique, the distance values between the closest black pixels and the hotspots in each direction are used as representation for a character. The hotspot technique is ...
European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning - ESANN
Search and rescue is often time- and labour-intensive. We present a system to be used in drones to make search and rescue operations more effective. The system uses a drone's downward-facing camera to detect people and to evaluate potential sites as being safe or not for the drone to land. Histogram of Oriented Gradients (HOG) features are extracted and a Support Vector Machine (SVM) is used as classifier. Our results show good performance on classifying frames as containing people (Sensitivity > 78%, Specificity > 83%), and distinguishing between safe and dangerous landing sites (Sensitivity > 87%, Specificity > 98%).

2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Dec 9, 2014
This paper describes a novel multi-objective reinforcement learning algorithm. The proposed algorithm first learns a model of the multi-objective sequential decision making problem, after which this learned model is used by a multi-objective dynamic programming method to compute Pareto optimal policies. The advantage of this model-based multi-objective reinforcement learning method is that once an accurate model has been estimated from the experiences of an agent in some environment, the dynamic programming method will compute all Pareto optimal policies. Therefore it is important that the agent explores the environment in an intelligent way by using a good exploration strategy. In this paper we supply the agent with two different exploration strategies and compare their effectiveness in estimating accurate models within a reasonable amount of time. The experimental results show that our method with the best exploration strategy is able to quickly learn all Pareto optimal policies for the Deep Sea Treasure problem.
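As a small illustration of what "Pareto optimal" means in the abstract above, the sketch below filters a set of value vectors down to the non-dominated ones. It is not the paper's dynamic programming method; the function names and the (treasure, negative time) numbers are made up for illustration, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if vector a is at least as good as b in every objective and
    strictly better in at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(vectors):
    """Keep only the vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


# Hypothetical (treasure value, negative time cost) outcomes of five policies.
candidate_values = [(1, -1), (2, -3), (1, -2), (3, -5), (2, -4)]
front = pareto_front(candidate_values)
```

With these made-up values, the policies that collect the same treasure in more time are filtered out, and the remaining vectors trade treasure against time.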

Proceedings of the 10th International Conference on Computer Vision Theory and Applications, 2015
In face recognition, face rotation alignment is an important part of the recognition process. In this paper, we present a hierarchical detector system using eye and eye-pair detectors combined with a geometrical method for calculating the in-plane angle of a face image. Two feature extraction methods, the restricted Boltzmann machine and the histogram of oriented gradients, are compared to extract feature vectors from a sliding window. Then a support vector machine is used to accurately localize the eyes. After the eye coordinates are obtained through our eye detector, the in-plane angle is estimated by calculating the arctangent of the vertical and horizontal components of the displacement between the left and right eye center points. By using this calculated in-plane angle, the face is subsequently rotationally aligned. We tested our approach on three different face datasets: IMM, Labeled Faces in the Wild (LFW) and FERET. Moreover, to compare the effect of rotational alignment on face recognition performance, we performed experiments in which a face recognition method was applied to rotationally aligned and non-aligned face images from the IMM dataset. The results show that our method calculates the in-plane rotation angle with high precision and this leads to a significant gain in face recognition performance.
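The geometric step in the abstract above can be sketched as follows. This is a minimal illustration, assuming eye centers are given as (x, y) pixel coordinates; the function names are hypothetical and the eye detection itself is not shown.

```python
import math


def in_plane_angle_degrees(left_eye, right_eye):
    """Angle (in degrees) of the line joining the two eye centers,
    computed from the vertical and horizontal displacement."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_degrees):
    """Rotate a point about a center using the standard 2D rotation matrix."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    x = point[0] - center[0]
    y = point[1] - center[1]
    return (center[0] + x * cos_t - y * sin_t,
            center[1] + x * sin_t + y * cos_t)


# Hypothetical eye centers of a tilted face.
left_eye, right_eye = (0.0, 0.0), (10.0, 10.0)
angle = in_plane_angle_degrees(left_eye, right_eye)
# Rotating by the negative angle puts both eyes on the same horizontal line.
aligned_right_eye = rotate_point(right_eye, left_eye, -angle)
```

Using `atan2` rather than a plain arctangent of the ratio avoids a division by zero when the eyes are vertically stacked; whether the paper does the same is not stated in the abstract.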
This paper describes the use of a novel A* path-planning algorithm for performing line segmentation of handwritten documents. The novelty of the proposed approach lies in the use of a smart combination of simple soft cost functions that allows an artificial agent to compute paths separating the upper and lower text fields. The use of soft cost functions enables the agent to compute near-optimal separating paths even if the upper and lower text parts are overlapping in particular places. We have performed experiments on the Saint Gall and Monk line segmentation (MLS) datasets. The experimental results show that our proposed method performs very well on the Saint Gall dataset, and also demonstrate that our algorithm is able to cope well with the much more complicated MLS dataset.
Real-world control problems are often modeled as Markov Decision Processes (MDPs) with discrete action spaces to facilitate the use of the many reinforcement learning algorithms that exist to find solutions for such MDPs. For many of these problems an underlying continuous action space can be assumed. We investigate the performance of the Cacla algorithm, which uses a continuous actor, on two such MDPs: the mountain car and the cart pole. We show that Cacla has clear advantages over discrete algorithms such as Q-learning and Sarsa, even though its continuous actions get rounded to actions in the same finite action space that may contain only a small number of actions. In particular, we show that Cacla retains much better performance when the action space is changed by removing some actions after some time of learning.
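The rounding of continuous actions mentioned in the abstract above might look like the sketch below. It assumes one-dimensional actions and nearest-value rounding; the function name and the action values are hypothetical, and the Cacla learning rule itself is not shown.

```python
def nearest_discrete_action(continuous_action, discrete_actions):
    """Map a continuous actor output to the closest available discrete action."""
    return min(discrete_actions, key=lambda a: abs(a - continuous_action))


# Hypothetical action set, e.g. push left, do nothing, push right.
actions = [-1.0, 0.0, 1.0]
chosen = nearest_discrete_action(0.37, actions)

# If an action is removed later, the same actor output is simply
# rounded to whichever remaining action is closest.
reduced_actions = [-1.0, 1.0]
chosen_after_removal = nearest_discrete_action(0.37, reduced_actions)
```

This illustrates why a continuous actor can keep working when the discrete action set changes: the actor's output stays meaningful and only the rounding target moves.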

2015 International Joint Conference on Neural Networks (IJCNN), 2015
Robotic mapping and localization methods are mostly dominated by using a combination of spatial alignment of sensory inputs, loop closure detection, and a global fine-tuning step. This requires either expensive depth sensing systems, or fast computational hardware at run-time to produce a 2D or 3D map of the environment. In a similar context, deep neural networks are used extensively in scene recognition applications, but are not yet applied to localization and mapping problems. In this paper, we adopt a novel approach by using denoising autoencoders and image information for tackling robot localization problems. We use semi-supervised learning with location values that are provided by traditional mapping methods. After training, our method requires far less run-time computation, and therefore can perform real-time localization on normal processing units. We compare the effects of different feature vectors such as plain images, the scale-invariant feature transform and histograms of oriented gradients on the localization precision. The best system can localize with an average positional error of ten centimeters and an angular error of four degrees in 3D simulation.

2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2013
This paper compares three strategies in using reinforcement learning algorithms to let an artificial agent learn to play the game of Othello. The three strategies that are compared are: learning by self-play, learning from playing against a fixed opponent, and learning from playing against a fixed opponent while learning from the opponent's moves as well. These issues are considered for the algorithms Q-learning, Sarsa and TD-learning. These three reinforcement learning algorithms are combined with multi-layer perceptrons and trained and tested against three fixed opponents. It is found that the best strategy of learning differs per algorithm. Q-learning and Sarsa perform best when trained against the fixed opponent they are also tested against, whereas TD-learning performs best when trained through self-play. Surprisingly, Q-learning and Sarsa outperform TD-learning against the stronger fixed opponents, when all methods use their best strategy. Learning from the opponent's moves as well leads to worse results compared to learning only from the learning agent's own moves.

Concurrency and Computation: Practice and Experience, 2014
The Support Vector Machine (SVM) is a supervised learning algorithm used for recognizing patterns in data. It is a very popular technique in Machine Learning and has been successfully used in applications such as image classification, protein classification, and handwriting recognition. However, the computational complexity of the kernelized version of the algorithm grows quadratically with the number of training examples. To tackle this high computational complexity we have developed a directive-based approach that converts a gradient-ascent based training algorithm for the CPU to an efficient GPU implementation. We compare our GPU-based SVM training algorithm to the standard LibSVM CPU implementation, a highly-optimized GPU-LIBSVM implementation, as well as to a directive-based OpenACC implementation. The results on different handwritten digit classification datasets demonstrate an important speed-up for the current approach when compared to the CPU and OpenACC versions. Furthermore, our solution is almost as fast and sometimes even faster than the highly optimized CUBLAS-based GPU-LIBSVM implementation, without sacrificing the algorithm's accuracy.

Engineering Applications of Artificial Intelligence, 2014
While face and eye detection are well-known research topics in the field of object detection, eye-pair detection has not been much researched. Finding the location and size of an eye-pair in an image containing a face can enable a face recognition application to extract features from a face corresponding to different entities. Furthermore, it allows us to align different faces, so that more accurate recognition results can be obtained. To the best of our knowledge, currently there is only one eye-pair detector, which is a part of the Viola-Jones object detection framework. However, as we will show in this paper, this eye-pair detector is not very accurate for detecting eye-pairs from different face images. Therefore, in this paper we describe several novel eye-pair detection methods based on different feature extraction methods and a support vector machine (SVM) to classify image patches as containing an eye-pair or not. To find the location of an eye-pair on unseen test images, a sliding window approach is used, and the location and size of the window giving the highest output of the SVM classifier are returned. We have tested the different methods on three different datasets: the IMM, the Caltech and the Indian face dataset. The results show that the linear restricted Boltzmann machine feature extraction technique and principal component analysis result in the best performances. The SVM with these feature extraction methods is able to very accurately detect eye-pairs. Furthermore, the results show that our best eye-pair detection methods perform much better than the Viola-Jones eye-pair detector.
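The sliding window search described in the abstract above can be sketched as a plain arg-max over window positions and sizes. In this illustration the `score` callable stands in for feature extraction followed by the SVM output; it, the stride, and the window sizes are assumptions and not taken from the paper.

```python
def best_window(image_width, image_height, window_sizes, stride, score):
    """Return (score, x, y, width, height) of the highest-scoring window.

    `score(x, y, w, h)` is a hypothetical callable that returns the
    classifier output for the patch with top-left corner (x, y).
    """
    best = None
    for w, h in window_sizes:
        for y in range(0, image_height - h + 1, stride):
            for x in range(0, image_width - w + 1, stride):
                s = score(x, y, w, h)
                if best is None or s > best[0]:
                    best = (s, x, y, w, h)
    return best


def toy_score(x, y, w, h):
    """Toy stand-in for the classifier: peaks at x=20, y=10, width=40."""
    return -(abs(x - 20) + abs(y - 10) + abs(w - 40))


result = best_window(100, 60, [(30, 10), (40, 14)], 5, toy_score)
```

Returning the window size together with the position matches the abstract's statement that both the location and the size of the best window are reported.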

2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2013
Reinforcement learning algorithms enable an agent to optimize its behavior from interacting with a specific environment. Although some very successful applications of reinforcement learning algorithms have been developed, it is still an open research question how to scale up to large dynamic environments. In this paper we will study the use of reinforcement learning on the popular arcade video game Ms. Pac-Man. In order to let Ms. Pac-Man quickly learn, we designed particular smart feature extraction algorithms that produce higher-order inputs from the game-state. These inputs are then given to a neural network that is trained using Q-learning. We constructed higher-order features which are relative to the action of Ms. Pac-Man. These relative inputs are then given to a single neural network which sequentially propagates the action-relative inputs to obtain the different Q-values of different actions. The experimental results show that this approach allows the use of only 7 input units in the neural network, while still quickly obtaining very good playing behavior. Furthermore, the experiments show that our approach enables Ms. Pac-Man to successfully transfer its learned policy to a different maze on which it was not trained before.

2013 12th International Conference on Document Analysis and Recognition, 2013
We propose a novel handwritten character recognition method for isolated handwritten Bangla digits. A feature is introduced for such patterns, the contour angular technique. It is compared to other methods, such as the hotspot feature, the gray-level normalized character image and a basic low-resolution pixel-based method. One of the goals of this study is to explore performance differences between dedicated feature methods and the pixel-based methods. The four methods are compared with support vector machine (SVM) classifiers on the collection of handwritten Bangla digit images. The results show that the fast contour angular technique outperforms the other techniques when not very many training examples are used. The fast contour angular technique captures aspects of curvature of the handwritten image and results in much faster character classification than the gray pixel-based method. Still, this feature obtains a recognition rate similar to that of the gray pixel-based method when a large training set is used. In order to investigate further whether the different feature methods represent complementary aspects of shape, the effect of majority voting is explored. The results indicate that the majority voting method achieves the best recognition performance on this dataset.
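The majority voting step mentioned in the abstract above can be illustrated as follows. This is a minimal sketch with made-up predictions from four classifiers; the function names are hypothetical, and the tie-breaking rule (the label seen first wins) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the votes.

    On a tie, the label encountered first is returned (an assumption).
    """
    return Counter(votes).most_common(1)[0][0]


def vote_per_sample(per_classifier_predictions):
    """Combine predictions from several classifiers, one row per classifier."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


# Hypothetical digit predictions of four classifiers on three test images.
predictions = [
    [1, 2, 3],
    [1, 2, 4],
    [1, 5, 4],
    [0, 2, 4],
]
combined = vote_per_sample(predictions)
```

Voting helps only when the classifiers make different mistakes, which is the complementarity question the abstract sets out to investigate.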
In this paper we describe a novel extension of the support vector machine, called the deep support vector machine (DSVM). The original SVM has a single layer with kernel functions and is therefore a shallow model. The DSVM can use an arbitrary number of layers, in which lower-level layers contain support vector machines that learn to extract relevant features from the input patterns or from the extracted features of one layer below. The highest level SVM performs the actual prediction using the highest-level extracted features as inputs. The system is trained by a simple gradient ascent learning rule on a min-max formulation of the optimization problem. A two-layer DSVM is compared to the regular SVM on ten regression datasets and the results show that the DSVM outperforms the SVM.
Clockwork Orange: The Dutch RoboSoccer Team. Matthijs Spaan (1), Marco Wiering (2), Robert Bartelds (3), Raymond Donkervoort (1), Pieter Jonker (3), and Frans Groen (1). (1) Intelligent Autonomous Systems Group, University of Amsterdam, the Netherlands

Neural-Fitted TD-Leaf Learning for Playing Othello With Structured Neural Networks
IEEE Transactions on Neural Networks and Learning Systems, 2012
This paper describes a methodology for quickly learning to play games at a strong level. The methodology consists of a novel combination of three techniques, and a variety of experiments on the game of Othello demonstrates their usefulness. First, structures or topologies in neural network connectivity patterns are used to decrease the number of learning parameters and to deal more effectively with the structural credit assignment problem, which is to change individual network weights based on the obtained feedback. Furthermore, the structured neural networks are trained with the novel neural-fitted temporal difference (TD) learning algorithm to create a system that can exploit most of the training experiences and enhance learning speed and performance. Finally, we use the neural-fitted TD-leaf algorithm to learn more effectively when look-ahead search is performed by the game-playing program. Our extensive experimental study clearly indicates that the proposed method outperforms linear networks and fully connected neural networks or evaluation functions evolved with evolutionary algorithms.
2013 IEEE Congress on Evolutionary Computation, 2013
In this paper we propose a novel algorithm called the Bandit-Inspired Memetic Algorithm (BIMA) and apply it to different large instances of the Quadratic Assignment Problem (QAP). Like other memetic algorithms, BIMA makes use of local search and a population of solutions. The novelty lies in the use of multi-armed bandit algorithms and assignment matrices for generating novel solutions, which will then be brought to a local minimum by local search. We have compared BIMA to multi-start local search (MLS) and iterated local search (ILS) on five QAP instances, and the results show that BIMA significantly outperforms these competitors.
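To make the phrase "brought to a local minimum by local search" concrete, the sketch below evaluates the standard QAP objective and applies a simple pairwise-swap local search. It is not BIMA: the bandit-based solution generation is omitted, the swap neighborhood is an assumption about the kind of local search used, and the small flow and distance matrices are made up.

```python
from itertools import combinations


def qap_cost(flow, dist, perm):
    """QAP objective: facility i is placed at location perm[i]."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))


def swap_local_search(flow, dist, perm):
    """Accept improving pairwise swaps until none is left (a local minimum)."""
    perm = list(perm)
    best = qap_cost(flow, dist, perm)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(perm)), 2):
            perm[i], perm[j] = perm[j], perm[i]
            cost = qap_cost(flow, dist, perm)
            if cost < best:
                best = cost
                improved = True
            else:
                perm[i], perm[j] = perm[j], perm[i]  # undo the swap
    return perm, best


# Made-up 3x3 instance for illustration only.
flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
dist = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
start = [0, 1, 2]
local_perm, local_cost = swap_local_search(flow, dist, start)
```

Recomputing the full cost after every swap keeps the sketch short; practical QAP solvers usually compute the cost change of a swap incrementally.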

How Longer Saccade Latencies Lead to a Competition for Salience
Psychological Science, 2011
It has been suggested that independent bottom-up and top-down processes govern saccadic selection. However, recent findings are hard to explain in such terms. We hypothesized that differences in visual-processing time can explain these findings, and we tested this using search displays containing two deviating elements, one requiring a short processing time and one requiring a long processing time. Following short saccade latencies, the deviation requiring less processing time was selected most frequently. This bias disappeared following long saccade latencies. Our results suggest that an element that attracts eye movements following short saccade latencies does so because it is the only element processed at that time. The temporal constraints of processing visual information therefore seem to be a determining factor in saccadic selection. Thus, relative saliency is a time-dependent phenomenon.