2006, 18th International Conference on Pattern Recognition (ICPR'06)
At present, the object categorisation literature is still dominated by the use of individual class detectors. Detecting multiple classes then implies the subsequent application of multiple such detectors, but such an approach does not scale to large numbers of classes. This paper presents an alternative strategy, where multiple classes are detected in a combined way. This includes a decision tree approach, where ternary rather than binary nodes are used, and where nodes share features. This yields an efficient scheme, which scales much better. The paper proposes a strategy where the object samples are first distinguished from the background. Then, in a second stage, the actual object class membership of each sample is determined. The focus of the paper lies entirely on the first stage, i.e. the distinction from background. The tree approach for this step is compared against two alternative strategies, one of them being the popular cascade approach. While classification accuracy tends to be better or comparable, the speed of the proposed method is systematically better. This advantage becomes more pronounced as the number of object classes increases.
Viola and Jones [VJ] demonstrate that cascade classification methods can successfully detect objects belonging to a single class, such as faces. Detecting and identifying objects that belong to any of a set of "classes", many class detection, is a much more challenging problem. We show that objects from each class can form a "cluster" in a "classifier space" and illustrate examples of such clusters using images of real world objects. Our detection algorithm uses a "decision tree classifier" (whose internal nodes each correspond to a VJ classifier) to propose a class label for every sub-image W of a test image (or reject it as a negative instance). If this W reaches a leaf of this tree, we then pass W through a subsequent VJ cascade of classifiers, specific to the identified class, to determine whether W is truly an instance of the proposed class. We perform several empirical studies to compare our system for detecting objects of any of M classes, to the obvious approach of running a set of M learned VJ cascade classifiers, one for each class of objects, on the same image. We found that the detection rates are comparable, and our many-class detection system is about as fast as running a single VJ cascade, and scales up well as the number of classes increases.
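The two-step flow described in the abstract above (a tree proposes a class label, then a class-specific cascade verifies it) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `cascade_accepts`, `detect`, `propose_label`, the `(score_fn, threshold)` stage interface, and the toy scores are all hypothetical stand-ins for learned classifiers.

```python
def cascade_accepts(window, stages):
    """Run a rejection cascade on one window.

    stages is a list of (score_fn, threshold) pairs (hypothetical interface).
    Returns (accepted, number_of_stages_evaluated).
    """
    for evaluated, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(window) < threshold:
            return False, evaluated  # early rejection: later stages never run
    return True, len(stages)


def detect(window, propose_label, class_cascades):
    """Propose a label first, then verify it with that class's cascade only."""
    label = propose_label(window)
    if label is None:
        return None  # rejected as a negative instance by the proposal step
    accepted, _ = cascade_accepts(window, class_cascades[label])
    return label if accepted else None


# Toy stand-ins (assumptions): a "window" is a list of numbers, the proposal
# step thresholds its mean, and the cascade stages use max and min as scores.
def propose_label(window):
    mean = sum(window) / len(window)
    if mean < 0.2:
        return None
    return "face" if mean < 0.6 else "car"


class_cascades = {
    "face": [(max, 0.3), (min, 0.1)],
    "car": [(max, 0.8), (min, 0.5)],
}

print(detect([0.4, 0.5], propose_label, class_cascades))
```

Only one class-specific cascade runs per window in this sketch, which is the property the abstract credits for scaling with the number of classes.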
Proceedings of the British Machine Vision Conference 2008, 2008
We propose a novel multi-class object detector that optimizes the detection costs while retaining a desired detection rate. The detector uses a cascade that unites the handling of similar object classes while separating off classes at appropriate levels of the cascade. No prior knowledge about the relationship between classes is needed, as the classifier structure is automatically determined during the training phase. The detection nodes in the cascade use Haar wavelet features and Gentle AdaBoost; however, the approach is not dependent on the specific features used and can easily be extended to other cases. Experiments are presented for several numbers of object classes and the approach is compared to other classification schemes. The results demonstrate a large efficiency gain that is particularly prominent for a greater number of classes. The complexity of the training also scales well with the number of classes.
Building efficient object detection systems is an important goal of computer and robot vision. If several object types are to be detected, the simplest solution is to run several object-specific classifiers independently of each other (in parallel). This solution is computationally expensive if several object classes are to be detected. In this paper, TCAS, a new classifier structure designed for multiclass object detection problems, is introduced as an alternative solution. TCAS offers an efficient solution and reduces the aggregated false detection rate. TCAS extends cascade classifiers (introduced by Viola & Jones) to the multiclass case and corresponds to a nested coarse-to-fine tree of multiclass nested boosted cascades. Results for three different object detection problems are presented: face and hand detection, robot detection, and multiview face detection. In the experiments, the obtained TCAS have classification times about two times shorter than those obtained using parallel cascades, and have the same or a lower number of false positives (for the same detection rate).
2006
Viola and Jones (VJ) cascade classification methods have proven to be very successful in detecting objects belonging to a single class, e.g., faces. This paper addresses the more challenging "many class detection" problem: detecting and identifying objects that belong to any of a set of classes. We use a set of learned weights (corresponding to the parameters of a set of binary linear separators) to identify these objects. We show that objects within many real-world classes tend to form clusters in this induced "classifier space". As the result of a sequence of classifiers can suggest a possible label for each object, we formulate this task as a Markov Decision Process. Our system first uses a "decision tree classifier" (i.e., a policy produced using dynamic programming) to specify when to apply which classifier to produce a possible class label for each sub-image W of a test image. This corresponds to a leaf of the decision tree. It then uses a cascade of classifiers, specific to this leaf, to confirm that W is an instance of the proposed class. We present empirical evidence that our ideas work effectively, showing that our system is essentially as accurate as running a set of cascade classifiers (one for each class of objects), but is much faster than that approach.
Multimedia Tools and Applications, 2012
In this paper we study the problem of detecting semantic objects from known categories in images. Unlike existing techniques, which operate at the pixel or patch level for recognition, we propose to rely on the categorization of image segments. Recent work has highlighted that image segments provide a sound support for visual object class recognition. In this work, we use image segments as primitives to extract robust features and train detection models for a predefined set of categories. Several segmentation algorithms are benchmarked and their performance for segment recognition is compared. We then propose two methods for enhancing segment classification, one based on the fusion of the classification results obtained with the different segmentations, the other based on the optimization of the global labelling by correcting local ambiguities between neighboring segments. We use the Microsoft MSRC-21 image database as a benchmark and show that our method competes with the current state-of-the-art.
This paper proposes a multiclass recognition scheme which uses multiple feature trees with an extended scoring method evolved from TF-IDF. Feature trees consisting of different feature descriptors such as SIFT and SURF are built by the hierarchical k-means algorithm. The experimental results show that the proposed scoring method combined with the proposed multiple feature trees yields high accuracy for multiclass recognition and achieves significant improvement compared to methods using a single feature tree with the original TF-IDF.
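For orientation, the single-tree baseline that the abstract above compares against (original TF-IDF over visual words) can be sketched as below. This is only an illustrative sketch: the paper's extended scoring method is not specified in the abstract and is not reproduced here. The function names, the integer visual-word ids (standing in for leaf indices of a feature tree), and the cosine similarity used for ranking are assumptions.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Inverse document frequency per visual word; docs maps label -> word list."""
    n_docs = len(docs)
    doc_freq = Counter()
    for words in docs.values():
        doc_freq.update(set(words))
    return {w: math.log(n_docs / df) for w, df in doc_freq.items()}


def tfidf_vector(words, idf):
    """Sparse TF-IDF vector; words unseen in the database get weight 0."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: (c / total) * idf.get(w, 0.0) for w, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(query_words, docs):
    """Return the best-scoring label and all per-class scores."""
    idf = idf_weights(docs)
    query = tfidf_vector(query_words, idf)
    scores = {label: cosine(query, tfidf_vector(words, idf))
              for label, words in docs.items()}
    return max(scores, key=scores.get), scores


# Toy database (assumption): word 5 occurs in both classes, so its IDF is 0.
docs = {"car": [1, 1, 2, 5], "face": [3, 3, 4, 5]}
label, scores = classify([1, 2, 2], docs)
print(label, scores)
```

A multiple-tree variant would compute one such score per feature tree (for example one for SIFT and one for SURF) and combine them; how the paper combines them is not stated in the abstract.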
International Journal of Organizational and Collective Intelligence, 2020
In recent years, several descriptors have been proposed for image classification applications. Accelerated-KAZE (A-KAZE) is considered one of the descriptors that has shown high performance for feature extraction. A-KAZE uses a binary descriptor called modified-local difference binary, which is very efficient and invariant to changes in rotation and scale. This representation does not allow spatial information between objects in the image to be considered, which can reduce image classification performance. This article presents a new approach to improve the performance of the A-KAZE descriptor for image classification. The authors first establish the connection between the A-KAZE descriptor and the bag of feature model. Then the Spatial Pyramid Matching (SPM) is adopted by exploiting the A-KAZE descriptor to reinforce its robustness by introducing spatial information. The results of the experiments on several datasets show that the A-KAZE des...
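The way a spatial pyramid adds location information to a bag-of-features histogram can be sketched as follows. This is a minimal sketch under stated assumptions, not the article's pipeline: keypoints are assumed to be already quantized into visual-word ids, coordinates are assumed normalized to [0, 1), the function name `spatial_pyramid` is hypothetical, and the per-level weighting used in the original SPM kernel is omitted.

```python
def spatial_pyramid(keypoints, vocab_size, levels=2):
    """Concatenate per-cell visual-word histograms over a pyramid of grids.

    keypoints: list of (x, y, word) with x, y in [0, 1) and 0 <= word < vocab_size.
    Level l splits the image into 2**l by 2**l cells.
    """
    feature = []
    for level in range(levels + 1):
        cells = 2 ** level
        hist = [[0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in keypoints:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hist[row * cells + col][word] += 1
        for cell_hist in hist:
            feature.extend(cell_hist)
    return feature


# Three keypoints, a two-word vocabulary, and levels 0 and 1 only.
keypoints = [(0.1, 0.1, 0), (0.9, 0.9, 1), (0.6, 0.2, 1)]
feature = spatial_pyramid(keypoints, vocab_size=2, levels=1)
print(feature)
```

The first `vocab_size` entries are the plain bag-of-features histogram; the remaining entries record in which image region each visual word occurred.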
2014 International Conference on Computer Vision Theory and Applications (VISAPP), 2014
In practice, multiple objects in images are located by consecutively applying one detector for each class and taking the most confident score. In this work, we propose to show the advantage of grouping similar object classes into a hierarchical structure. While this approach has found interest in image classification, it has not been analyzed for the object detection task. Each node in the hierarchy represents one decision line. All the decision lines are learned jointly using a novel problem formulation. Based on experiments using the PASCAL VOC 2007 dataset, we show that our approach improves detection performance compared to a baseline approach.
International Journal on Recent and Innovation Trends in Computing and Communication
Object recognition is a significant approach employed for recognizing suitable objects from the image. Various improvements, particularly in computer vision, make it possible to address highly difficult tasks with the assistance of local feature detection methodologies. Detecting multi-class objects is quite challenging, and many existing studies have worked to enhance the overall accuracy. However, they suffer from limitations such as higher network loss, degraded training ability, improper consideration of features, and poor convergence. The proposed research introduces a hybrid convolutional neural network (H-CNN) approach to overcome these drawbacks. The collected input images are first pre-processed through Gaussian filtering to remove noise and enhance the image quality. Following image pre-processing, the objects present in the images are localized using Grid Guided Localization (GGL). The effective features are extracted from the localized objects using the AlexN...
Contextual information is exploited to detect and localize multiple object categories in an image. Our context model incorporates global image features, dependencies among object categories and the outputs of local detectors into one probabilistic framework. However, the performance benefit of context models has been limited because the Markov Random Field technique was tested on data sets with only a few object categories, in which most images contain one or two object categories. The project uses the SUN 09 dataset, whose images contain many instances of different object categories. The coherent structure among object categories models object co-occurrences and spatial relationships using a tree-structured graphical model. The Boosted Random Field (BRF) technique is introduced to combine Boosting and Conditional Random Fields to improve accuracy and speed. BRF provides better performance and requires fewer computations. BRF searches for objects in an image and detects stuff things in an office. The context model and spatial relationships improve object recognition performance, provide a coherent interpretation of the scene, and enable a reliable image querying system by multiple object categories.
We describe a method for visual object detection based on an ensemble of optimized decision trees organized in a cascade of rejectors. The trees use pixel intensity comparisons in their internal nodes and this makes them able to process image regions very fast. Experimental analysis is provided through a face detection problem. The obtained results are encouraging and demonstrate that the method has practical value. Additionally, we analyse its sensitivity to noise and show how to perform fast rotation invariant object detection.
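The kind of internal-node test the abstract above describes (comparing two pixel intensities) is cheap because it needs no feature computation beyond two memory reads. A rough sketch of one such tree follows. The heap-order layout, the convention that a true test goes to the right child, and the names `pixel_test` and `tree_leaf` are assumptions for illustration, not details taken from the paper.

```python
def pixel_test(patch, p1, p2):
    """Binary test: is the intensity at p1 less than or equal to that at p2?"""
    return patch[p1[0]][p1[1]] <= patch[p2[0]][p2[1]]


def tree_leaf(patch, tests):
    """Walk a complete binary tree stored in heap order and return the leaf index.

    tests[i] = (p1, p2) holds the pixel pair for internal node i. The children
    of node i are 2*i + 1 (test false) and 2*i + 2 (test true).
    """
    node = 0
    while node < len(tests):
        p1, p2 = tests[node]
        node = 2 * node + 1 + int(pixel_test(patch, p1, p2))
    return node - len(tests)  # leaves are numbered 0 .. len(tests)


# A 2x2 grayscale patch and a depth-2 tree (3 internal nodes, 4 leaves).
patch = [[10, 200],
         [50, 50]]
tests = [((0, 0), (0, 1)),   # root
         ((1, 0), (1, 1)),   # left child of root
         ((0, 1), (0, 0))]   # right child of root
print(tree_leaf(patch, tests))
```

In a detector, each leaf would store a learned output value, and an ensemble of such trees would be thresholded stage by stage to reject regions early.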
2015
Object class detectors typically apply a window classifier to all the windows in a large set, either in a sliding window manner or using object proposals. In this paper, we develop an active search strategy that sequentially chooses the next window to evaluate based on all the information gathered before. This results in a substantial reduction in the number of classifier evaluations and in a more elegant approach in general. Our search strategy is guided by two forces. First, we exploit context as the statistical relation between the appearance of a window and its location relative to the object, as observed in the training set. This enables the search to jump across distant regions in the image (e.g. observing a sky region suggests that cars might be far below) and is done efficiently in a Random Forest framework. Second, we exploit the score of the classifier to attract the search to promising areas surrounding a highly scored window, and to keep away from areas near low scored ones. Our search strategy can be applied on top of any classifier as it treats it as a black-box. In experiments with R-CNN on the challenging SUN2012 dataset, our method matches the detection accuracy of evaluating all windows independently, while evaluating 9× fewer windows.
Advanced Concepts for Intelligent …, 2009
Multi-class object learning and detection is a challenging problem due to the large number of object classes and their high visual variability. Specialized detectors usually excel in performance, while joint representations optimize sharing and reduce inference time, but are complex to train. Conveniently, sequential class learning cuts down training time by transferring existing knowledge to novel classes, but cannot fully exploit the shareability of features among object classes and might depend on ordering of classes during learning. In hierarchical frameworks these issues have been little explored. In this paper, we provide a rigorous experimental analysis of various multiple object class learning strategies within a generative hierarchical framework. Specifically, we propose, evaluate and compare three important types of multi-class learning: 1.) independent training of individual categories, 2.) joint training of classes, and 3.) sequential learning of classes. We explore and compare their computational behavior (space and time) and detection performance as a function of the number of learned object classes on several recognition datasets. We show that sequential training achieves the best trade-off between inference and training times at a comparable detection performance and could thus be used to learn the classes on a larger scale.
Lecture Notes in Computer Science, 2008
The main difficulty in the binary object classification field lies in dealing with a high variability of symbol appearance. Rotation, partial occlusions, elastic deformations, or intra-class and inter-class variabilities are just a few problems. In this paper, we introduce a novel object description for this type of symbols. The shape of the object is aligned based on principal components to make the recognition invariant to rotation and reflection. We propose the Blurred Shape Model (BSM) to describe the binary objects. This descriptor encodes the probability of appearance of the pixels that outline the object's shape. Besides, we present the use of this descriptor in a system to improve the BSM performance and deal with binary objects multi-classification problems. Adaboost is used to train the binary classifiers, learning the BSM features that better split object classes. Then, the different binary problems learned by the Adaboost are embedded in the Error Correcting Output Codes framework (ECOC) to deal with the multi-class case. The methodology is evaluated in a wide set of object classes from the MPEG07 repository. Different state-of-the-art descriptors are compared, showing the robustness and better performance of the proposed scheme when classifying objects with high variability of appearance.
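How an ECOC framework turns the outputs of several binary classifiers into one multi-class decision can be sketched with Hamming-distance decoding. This is a generic illustration under stated assumptions: the abstract does not say which coding design or decoding rule the paper uses, the 4-class, 7-bit code matrix below is only an example, and the names `hamming`, `ecoc_decode` and `CODE_MATRIX` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(outputs, code_matrix):
    """Return the class whose codeword is closest to the binary outputs."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], outputs))


# Example code matrix (assumption): each column defines one binary problem,
# i.e. which classes a binary classifier should label 1 versus 0.
# Every pair of rows differs in 4 positions, so one wrong bit is corrected.
CODE_MATRIX = {
    "c1": (1, 1, 1, 1, 1, 1, 1),
    "c2": (0, 0, 0, 0, 1, 1, 1),
    "c3": (0, 0, 1, 1, 0, 0, 1),
    "c4": (0, 1, 0, 1, 0, 1, 0),
}

# Outputs of the seven binary classifiers for a "c3" sample, last bit wrong.
outputs = (0, 0, 1, 1, 0, 0, 0)
print(ecoc_decode(outputs, CODE_MATRIX))
```

In the setting of the abstract, each of the seven bits would come from one AdaBoost classifier trained on BSM features.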
2012 IEEE 6th International Conference on Information and Automation for Sustainability, 2012
The support vector machine (SVM) is a state-of-the-art learning machine used in areas such as pattern recognition, computer vision, data mining and bioinformatics. SVMs were originally developed for solving binary classification problems, but binary SVMs have also been extended to solve the problem of multi-class pattern classification. There are different techniques employed by SVMs to tackle multi-class problems, namely one-versus-one (OVO), one-versus-all (OVA), and directed acyclic graph (DAG). When dealing with multi-class classification, one needs an appropriate technique to effectively extend these binary classification methods for multi-class classification. We address this issue by extending a novel architecture that we refer to as unbalanced decision tree (UDT). UDT is a binary decision tree arranged in a top-down manner, using the optimal margin classifier at each split to relieve the excessive time in classifying the test data when compared with the DAG-SVMs. The initial version of the UDT required a longer training time in finding the optimal model for each decision node of the tree. In this work, we have drastically reduced the excessive training time by finding the order of classifiers based on their performances during the selection of the root node and fixing this order to form the hierarchy of the decision tree. UDT involves fewer classifiers than OVO, OVA and DAG-SVMs, while maintaining accuracy comparable to those standard techniques.
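To make the comparison among the standard schemes concrete, the number of binary classifiers each one trains and evaluates for k classes can be tabulated as below. The counts for OVO, OVA and DAG are the standard ones for those schemes; UDT is deliberately left out because the abstract only states that it involves fewer classifiers. The function name `classifier_counts` is hypothetical.

```python
def classifier_counts(k):
    """Binary classifiers trained, and evaluated per test sample, for k classes."""
    pairs = k * (k - 1) // 2  # one classifier per unordered pair of classes
    return {
        "OVO": {"trained": pairs, "evaluated": pairs},
        "OVA": {"trained": k, "evaluated": k},
        # DAG trains the same pairwise classifiers as OVO but follows a single
        # path through the graph, eliminating one class per evaluation.
        "DAG": {"trained": pairs, "evaluated": k - 1},
    }


for k in (4, 10, 20):
    print(k, classifier_counts(k))
```

The quadratic growth of the pairwise count is what makes schemes with fewer classifiers attractive as the number of classes increases.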
2009
Current work in object categorization discriminates among objects that typically possess gross differences which are readily apparent. However, many applications require making much finer distinctions. We address an insect categorization problem that is so challenging that even trained human experts cannot readily categorize the insects based on their images. The state of the art that uses visual dictionaries, when applied to this problem, yields mediocre results (16.1% error). Three possible explanations for this are (a) the dictionaries are unsupervised, (b) the dictionaries lose the detailed information contained in each keypoint, and (c) these methods rely on hand-engineered decisions about dictionary size. This paper presents a novel, dictionary-free methodology. A random forest of trees is first trained to predict the class of an image based on individual keypoint descriptors. A unique aspect of these trees is that they do not make decisions but instead merely record evidence, i.e., the number of descriptors from training examples of each category that reached each leaf of the tree. We provide a mathematical model showing that voting evidence is better than voting decisions. To categorize a new image, descriptors for all detected keypoints are "dropped" through the trees, and the evidence at each leaf is summed to obtain an overall evidence vector. This is then sent to a second-level classifier to make the categorization decision. We achieve excellent performance (6.4% error) on the 9-class STONEFLY9 data set. Also, our method achieves an average AUC of 0.921 on the PASCAL06 VOC, which places it fifth out of 21 methods reported in the literature and demonstrates that the method also works well for generic object categorization.
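The evidence-summing step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: each tree is modelled as a plain function that maps a descriptor to the per-class training counts stored at the leaf it reaches, the descriptors are scalars for brevity, and the stump trees and their counts are made up. The second-level classifier that consumes the evidence vector is not shown.

```python
def evidence_vector(descriptors, forest, n_classes):
    """Sum the per-class leaf counts reached by every descriptor in every tree."""
    total = [0] * n_classes
    for descriptor in descriptors:
        for tree in forest:
            leaf_counts = tree(descriptor)  # evidence recorded at the leaf
            for cls, count in enumerate(leaf_counts):
                total[cls] += count
    return total


# Toy forest (assumption): two stumps over scalar descriptors, two classes.
def tree_a(d):
    return [3, 1] if d < 0.5 else [0, 4]


def tree_b(d):
    return [2, 2] if d < 0.2 else [1, 5]


evidence = evidence_vector([0.1, 0.9], [tree_a, tree_b], n_classes=2)
print(evidence)
```

Voting decisions would instead add one vote for the majority class at each leaf, which discards how strongly each leaf favours that class.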
Computer and Information Sciences …, 2008
Categorization of real world images without human intervention is a challenging, ongoing research problem. The nature of this problem requires the use of multiclass classification techniques.
2010 20th International Conference on Pattern Recognition, 2010
We propose a new algorithm for detecting multiple object categories that exploits the fact that different categories may share common features but with different geometric distributions. This yields an efficient detector which, in contrast to existing approaches, considerably reduces the computation cost at runtime, where the feature computation step is traditionally the most expensive. More specifically, at the learning stage we compute common features by applying the same Random Ferns over the Histograms of Oriented Gradients on the training images. We then apply a boosting step to build discriminative weak classifiers, and learn the specific geometric distribution of the Random Ferns for each class. At runtime, only a few Random Ferns have to be densely computed over each input image, and their geometric distribution allows performing the detection.
Lecture Notes in Computer Science, 2010
In order for recognition systems to scale to a larger number of object categories, building visual class taxonomies is important to achieve running times logarithmic in the number of classes. In this paper we propose a novel approach for speeding up recognition times of multi-class part-based object representations. The main idea is to construct a taxonomy of constellation models cascaded from coarse-to-fine resolution and use it in recognition with an efficient search strategy. The taxonomy is built automatically in a way to minimize the number of expected computations during recognition by optimizing the cost-to-power ratio. The structure and the depth of the taxonomy are not pre-determined but are inferred from the data. The approach is utilized on the hierarchy-of-parts model, achieving efficiency both in the representation of the structure of objects and in the number of modeled object classes. We achieve speed-up even for a small number of object classes on the ETHZ and TUD datasets. On a larger scale, our approach achieves detection time that is logarithmic in the number of classes.