2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop
Recently, an optimization approach for fast visual tracking of articulated structures based on Stochastic Meta-Descent (SMD) has been presented. SMD is a gradient descent with local step size adaptation that combines rapid convergence with excellent scalability. Stochastic sampling helps to avoid local minima in the optimization process. We have extended the SMD algorithm with new features for fast and accurate tracking by adapting the different step sizes between as well as within video frames and by introducing a robust likelihood function which incorporates both depths and surface orientations. A realistic deformable hand model reinforces the accuracy of our tracker. The advantages of the resulting tracker over state-of-the-art methods are corroborated through experiments.
Image and Vision Computing, 2007
Recently, an optimization approach for fast visual tracking of articulated structures based on stochastic meta-descent (SMD) has been presented. SMD is a gradient descent with local step size adaptation that combines rapid convergence with excellent scalability. Stochastic sampling helps to avoid local minima in the optimization process. We have extended the SMD algorithm with new features for fast and accurate tracking by adapting the different step sizes between as well as within video frames and by introducing a robust cost function, which incorporates both depths and surface orientations. The advantages of the resulting tracker over state-of-the-art methods are supported through 3D hand tracking experiments. A realistic deformable hand model reinforces the accuracy of our tracker.
2004
The main challenge of tracking articulated structures like hands is their large number of degrees of freedom (DOFs). A realistic 3D model of the human hand has at least 26 DOFs. The arsenal of tracking approaches that can track such structures fast and reliably is still very small. This paper proposes a tracker based on 'Stochastic Meta-Descent' (SMD) for optimizations in such high-dimensional state spaces. This new algorithm is based on a gradient descent approach with adaptive and parameter-specific step sizes. The SMD tracker facilitates the integration of constraints and, combined with a stochastic sampling technique, can get out of spurious local minima. Furthermore, the integration of a deformable hand model based on linear blend skinning and anthropometrical measurements reinforces the robustness of our tracker. Experiments show the efficiency of the SMD algorithm in comparison with common optimization methods.
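The idea of a gradient descent with adaptive, parameter-specific step sizes can be illustrated with a minimal sketch. The update rules below follow the commonly cited formulation of SMD (per-parameter step sizes `p`, an auxiliary vector `v`, and a Hessian-vector product); they are an assumption here, since the abstract does not state them. The function name `smd_minimize`, the hyperparameters, and the quadratic test cost are hypothetical and are not taken from the paper's tracker.

```python
import numpy as np

def smd_minimize(grad, hess_vec, x0, p0=0.05, mu=0.05, lam=0.9, steps=200):
    """Sketch of gradient descent with per-parameter step size adaptation.

    grad(x)        -> gradient of the cost at x
    hess_vec(x, v) -> Hessian-vector product H(x) @ v
    All names and default values are illustrative assumptions.
    """
    x = np.asarray(x0, dtype=float).copy()
    p = np.full_like(x, p0)   # one step size per parameter
    v = np.zeros_like(x)      # tracks how x depends on past step sizes
    for _ in range(steps):
        delta = -grad(x)                      # descent direction
        # Grow a step size when the current and past directions agree,
        # shrink it (by at most a factor of 2) when they disagree.
        p = p * np.maximum(0.5, 1.0 + mu * delta * v)
        hv = hess_vec(x, v)                   # curvature at the current point
        x = x + p * delta                     # parameter update
        v = lam * v + p * (delta - lam * hv)  # update the auxiliary vector
    return x, p

# Toy cost f(x) = 0.5 * sum(A * x**2) with different curvature per axis.
A = np.array([1.0, 10.0])
x_final, p_final = smd_minimize(lambda x: A * x, lambda x, v: A * v, [1.0, 1.0])
print(x_final, p_final)
```

In a tracker, the cost would measure the mismatch between the projected hand model and the image data, and the stochastic sampling mentioned in the abstract would perturb which measurements enter each gradient evaluation; neither is modelled in this toy example.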
IEE Proceedings - Vision, Image, and Signal Processing, 2005
The main challenge of tracking articulated structures like hands is their many degrees of freedom (DOFs). A realistic 3-D model of the human hand has at least 26 DOFs. The arsenal of tracking approaches that can track such structures fast and reliably is still very small. This paper proposes a tracker based on stochastic meta-descent (SMD) for optimisations in such high-dimensional state spaces. This new algorithm is based on a gradient descent approach with adaptive and parameter-specific step sizes. The SMD tracker facilitates the integration of constraints and, combined with a stochastic sampling technique, can get out of spurious local minima. Furthermore, the integration of a deformable hand model based on linear blend skinning and anthropometrical measurements reinforces the robustness of the tracker. Experiments show the efficiency of the SMD algorithm in comparison with common optimisation methods.
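Linear blend skinning, mentioned above as the basis of the deformable hand model, deforms each surface vertex by a weighted blend of the rigid transforms of the bones that influence it. The following is a minimal sketch of that general technique; the function name, array layouts, and the two-bone example are assumptions for illustration and do not come from the paper.

```python
import numpy as np

def linear_blend_skinning(rest_vertices, bone_transforms, weights):
    """Blend per-bone rigid transforms for every vertex.

    rest_vertices:   (V, 3) vertex positions in the rest pose
    bone_transforms: (B, 4, 4) homogeneous transform of each bone
    weights:         (V, B) skinning weights, each row summing to 1
    """
    ones = np.ones((rest_vertices.shape[0], 1))
    homogeneous = np.concatenate([rest_vertices, ones], axis=1)  # (V, 4)
    # Position of every vertex under every bone transform: (V, B, 4)
    per_bone = np.einsum("bij,vj->vbi", bone_transforms, homogeneous)
    # Weighted sum over bones: (V, 4)
    blended = np.einsum("vb,vbi->vi", weights, per_bone)
    return blended[:, :3]

# Hypothetical example: one vertex influenced equally by a fixed bone
# and a bone translated by 2 units along x.
identity = np.eye(4)
shifted = np.eye(4)
shifted[0, 3] = 2.0
vertex = np.array([[0.0, 1.0, 0.0]])
skinned = linear_blend_skinning(vertex, np.stack([identity, shifted]),
                                np.array([[0.5, 0.5]]))
print(skinned)
```

With equal weights the vertex moves halfway between the two bone-specific positions, which is the smooth behaviour near joints that makes the surface deform rather than break apart.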
Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings., 2004
Tracking an articulated structure in a reasonable time is a complex task, mainly due to the high dimensionality of the problem. A new optimization method, called Stochastic Meta-Descent (SMD), based on gradient descent with adaptive and parameter-specific step sizes, was introduced recently [1] to solve this challenging problem. While the local optimization works very well, reaching the global optimum is not guaranteed. We therefore propose a novel algorithm which combines the SMD optimization with a particle filter to form 'smart particles'. After propagating the particles, SMD is performed and the resulting new particle set is included such that the original Bayesian distribution is not altered. The resulting 'smart particle filter' (SPF) tracks high-dimensional articulated structures with far fewer samples than previous methods. Additionally, it can handle multiple hypotheses, clutter and occlusion, with which pure optimization approaches have problems. The performance of the SMD particle filter is illustrated in challenging 3D hand tracking sequences, demonstrating better robustness and accuracy than those of a single SMD optimization or an annealed particle filter.
Proceedings of the British Machine Vision Conference 2006, 2006
In this paper, we propose a novel model-based approach to recover 3D hand pose from 2D images through a compact articulated 3D hand model whose parameters are inferred in a Bayesian manner. To this end, we propose generative models for hand and background pixels leading to a log-likelihood objective function which aims at enclosing hand-like pixels within the silhouette of the projected 3D model while excluding background-like pixels. Segmentation and hand pose estimation are unified through the minimization of a single likelihood function, which is novel and improves overall robustness. We derive the gradient in the hand parameter space of such an area-based objective function, which is new and allows a faster convergence rate than gradient-free methods. Furthermore, we propose a new constrained variable metric gradient descent to speed up convergence, and finally the so-called smart particle filter is used to improve robustness through multiple hypotheses and to exploit temporal coherence. Very promising experimental results demonstrate the potential of our approach.
2005
This paper describes two methods of fitting deformable templates when tracking articulated objects using particle filters. One method fits a template to each of the links of an articulated object in a hierarchical way. The method first fits a template for the base of the articulated object and then fits a template for each of the links deeper in the hierarchy. The second method fits the whole articulated object as a rigid object, and then refines the fitting for each of the links of the articulated object in a hierarchical way, starting from the base. Advantages and disadvantages of each method are discussed and a way of combining the best of each method in a single tracker is presented.
IEEE ISUVR 2013, 2013
We present a method for articulated hand tracking that relies on visual input acquired by a calibrated multicamera system. A state-of-the-art result on this problem has been presented in [12]. In that work, hand tracking is formulated as the minimization of an objective function that quantifies the discrepancy between a hand pose hypothesis and the observations. The objective function treats the observations from each camera view in an independent way. We follow the same general optimization framework but we choose to employ the visual hull [10] as the main observation cue, which results from the integration of information from all available views prior to optimization. We investigate the behavior of the resulting method in extensive experiments and in comparison with that of [12]. The obtained results demonstrate that for low levels of noise contamination, regardless of the number of cameras, the two methods perform comparably. The situation changes when noisy observations or as few as two cameras with short baselines are employed. In these cases, the proposed method is more accurate than that of [12]. Thus, the proposed method is preferable in real-world scenarios with noisy observations obtained from easy-to-deploy, stereo camera setups.
2010
Model-based methods for tracking an articulated hand in a video sequence can be divided into two categories. The first one, called stochastic methods, uses stochastic filters such as Kalman or particle filters. The second category, named deterministic methods, defines a dissimilarity function to measure how well the hand model is aligned with the hand images of a video sequence. This dissimilarity function is then minimized to achieve the hand tracking. Two well-known problems are related to the minimization algorithms. The first one is that of local minima. The second problem is that of the computing time required to reach the solution. These problems are compounded by the large number of degrees of freedom (DOF) of the hand (around 26). The choice of the function to be minimized and that of the minimization process can be an answer to these problems. In this paper two major contributions are presented. The first one defines a new dissimilarity function, which gives better results...
2003
A new method for visual tracking of articulated objects is presented. Analyzing articulated motion is challenging because the increase in dimensionality potentially demands a tremendous increase in computation. To ease this problem, we propose an approach that analyzes subparts locally while reinforcing the structural constraints at the same time. The computational model of the proposed approach is based on a dynamic Markov network, a generative model which characterizes the dynamics and the image observations of each individual subpart as well as the motion constraints among different subparts. Probabilistic variational analysis of the model reveals a mean field approximation to the posterior densities of each subpart given visual evidence, and provides a computationally efficient way for such a difficult Bayesian inference problem. In addition, we design mean field Monte Carlo (MFMC) algorithms, in which a set of low-dimensional particle filters interact with each other and solve the high-dimensional problem collaboratively. Extensive experiments on tracking human body parts demonstrate the effectiveness, significance and computational efficiency of the proposed method.
Lecture Notes in Computer Science, 2013
Discriminative techniques are good for hand part detection; however, they fail due to sensor noise and high inter-finger occlusion. Additionally, these techniques do not incorporate any kinematic or temporal constraints. Even though model-based descriptive (for example Markov Random Field) or generative (for example Hidden Markov Model) techniques utilize kinematic and temporal constraints well, they are computationally expensive and hardly recover from tracking failure. This paper presents a unified framework for 3D hand tracking, utilizing the best of both methodologies. Hand joints are detected using a regression forest, which uses an efficient voting technique for joint location prediction. The voting distributions are multimodal in nature; hence, rather than using the highest-scoring mode of the voting distribution for each joint separately, we fit the five high-scoring modes of each joint on a tree-structured Markovian model along with a kinematic prior and temporal information. Experimentally, we observed that relying on the discriminative technique (i.e. joint detection) produces better results. We therefore efficiently incorporate this observation in our framework by conditioning the 50% low-scoring joint modes on the remaining high-scoring joint modes. This strategy reduces the computational cost and produces good results for 3D hand tracking on RGB-D data.
ACM Transactions on Graphics, 2016
Series in Machine Perception and Artificial Intelligence, 2009
Visual tracking of articulated motion is a complex task with high computational costs. Because articulated objects are usually represented as a set of linked limbs, tracking is performed with the support of a model. Model-based tracking makes it easy to determine object pose and to handle occlusions. However, the use of articulated models generates a multidimensional state space and, therefore, the tracking becomes computationally very expensive or even infeasible.
Computer Vision and Image Understanding, 2007
Tracking articulated structures like a hand or body within a reasonable time is challenging because of the high dimensionality of the state space. Recently, a new optimization method, called 'Stochastic Meta-Descent' (SMD), has been introduced in computer vision. This is a gradient descent scheme with adaptive and parameter-specific step sizes that is able to operate in a constrained space. However, while the local optimization works very well, reaching the global optimum is not guaranteed. We therefore propose an enhanced algorithm that wraps a particle filter around multiple SMD-based trackers, which play the role of as many particles, i.e. they act as 'smart particles'. After the standard particle propagation on the basis of a simple motion model, SMD is performed and the resulting new particle set is included such that the original Bayesian distribution is not altered. The resulting 'Smart Particle Filter' (SPF) tracks high-dimensional articulated structures with far fewer samples than previous methods. Additionally, it can handle multiple hypotheses and clutter, where pure optimization approaches have problems. Good performance is demonstrated for the case of hand tracking from 3D range data.
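The propagate-then-optimize loop described above can be sketched on a toy problem. The code below is a plain particle filter with a local gradient refinement step on a one-dimensional state; it uses ordinary gradient descent in place of SMD and does not reproduce the SPF step that includes the refined particles without altering the Bayesian distribution, which the abstract does not specify. The function name, the quadratic cost, and all parameter values are assumptions.

```python
import numpy as np

def track_with_refined_particles(observations, n_particles=100,
                                 motion_std=0.2, obs_std=0.3,
                                 refine_steps=5, lr=0.2, seed=0):
    """Toy 1-D tracker: propagate, locally refine, weight, resample."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, size=n_particles)
    estimates = []
    for z in observations:
        # 1. Propagate with a simple random-walk motion model.
        particles = particles + rng.normal(0.0, motion_std, size=n_particles)
        # 2. Local optimization of the cost 0.5 * (x - z)**2 per particle
        #    (plain gradient descent stands in for SMD here).
        for _ in range(refine_steps):
            particles = particles - lr * (particles - z)
        # 3. Weight by a Gaussian likelihood and normalize.
        weights = np.exp(-0.5 * ((particles - z) / obs_std) ** 2)
        weights = weights / weights.sum()
        estimates.append(float(np.sum(weights * particles)))
        # 4. Resample according to the weights.
        particles = rng.choice(particles, size=n_particles, p=weights)
    return np.array(estimates)

# Hypothetical target moving at constant speed, observed without noise.
truth = np.linspace(0.0, 2.0, 21)
estimates = track_with_refined_particles(truth)
print(abs(estimates[-1] - truth[-1]))
```

Because each particle is pulled toward a nearby cost minimum before weighting, few particles are wasted in low-likelihood regions, which is the intuition behind needing far fewer samples than an unrefined particle filter.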
Proceedings of the AAAI Conference on Artificial Intelligence
We propose a novel meta-learning framework for real-time object tracking with efficient model adaptation and channel pruning. Given an object tracker, our framework learns to fine-tune its model parameters in only a few gradient-descent iterations during tracking while pruning its network channels using the target ground-truth at the first frame. Such a learning problem is formulated as a meta-learning task, where a meta-tracker is trained by updating its meta-parameters for initial weights, learning rates, and pruning masks through carefully designed tracking simulations. The integrated meta-tracker greatly improves tracking performance by accelerating the convergence of online learning and reducing the cost of feature computation. Experimental evaluation on the standard datasets demonstrates its outstanding accuracy and speed compared to the state-of-the-art methods.
Particle Filter -ArPF-, which has been specifically designed for an efficient sampling of the hierarchical spaces generated by articulated objects. Our approach decomposes the articulated motion into layers for efficiency purposes, making use of a careful modeling of the diffusion noise along with its propagation through the articulations. This increases accuracy and prevents divergence. The algorithm is tested on hand tracking due to its complex hierarchical articulated nature. For this purpose, a new dataset generation tool for quantitative evaluation is also presented in this paper.
2007
In this paper, we present two new articulated motion analysis and object tracking approaches: the decentralized articulated object tracking method and the hierarchical articulated object tracking method. The first approach avoids the common practice of using a high-dimensional joint state representation for articulated object tracking. Instead, we introduce a decentralized scheme and model the inter-part interaction within an innovative Bayesian framework.
2009 IEEE International Conference on Robotics and Automation, 2009
We describe a general methodology for tracking 3-dimensional objects in monocular and stereo video that makes use of GPU-accelerated filtering and rendering in combination with machine learning techniques. The method operates on targets consisting of kinematic chains with known geometry. The tracked target is divided into one or more areas of consistent appearance. The appearance of each area is represented by a classifier trained to assign a class-conditional probability to image feature vectors. A search is then performed on the configuration space of the target to find the maximum likelihood configuration. In the search, candidate hypotheses are evaluated by rendering a 3D model of the target object and measuring its consistency with the class probability map. The method is demonstrated for tool tracking on videos from two surgical domains, as well as in a human hand-tracking task.
Three-Dimensional Image Capture and Applications 2008, 2008
In this paper, we present a stochastic framework for articulated 3D human motion tracking. Tracking full body human motion is a challenging task, because the tracking performance normally suffers from several issues such as self-occlusion, foreground segmentation noise and high computational cost. In our work, we use explicit 3D reconstructions of the human body based on a visual hull algorithm as our system input, which effectively eliminates self-occlusion. To improve tracking efficiency as well as robustness, we use a Kalman particle filter framework based on an interacting multiple model (IMM). The posterior density is approximated by a set of weighted particles, which include both sample means and covariances. Therefore, tracking is equivalent to searching the maximum a posteriori (MAP) of the probability distribution. During Kalman filtering, several dynamical models of human motion (e.g., zero order, first order) are assumed which interact with each other for more robust tracking results. Our measurement step is performed by a local optimization method using simulated physical force/moment for 3D registration. The likelihood function is designed to be the fitting score between the reconstructed human body and our 3D human model, which is composed of a set of cylinders. This proposed tracking framework is tested on a real motion sequence. Our experimental results show that the proposed method improves the sampling efficiency compared with most particle filter based methods and achieves high tracking accuracy.
2009 IEEE 12th International Conference on Computer Vision, 2009
We present a method for tracking a hand while it is interacting with an object. This setting is arguably the one where hand-tracking has most practical relevance, but poses significant additional challenges: strong occlusions by the object as well as self-occlusions are the norm, and classical anatomical constraints need to be softened due to the external forces between hand and object. To achieve robustness to partial occlusions, we use an individual local tracker for each segment of the articulated structure. The segments are connected in a pairwise Markov random field, which enforces the anatomical hand structure through soft constraints on the joints between adjacent segments. The most likely hand configuration is found with belief propagation. Both range and color data are used as input. Experiments are presented for synthetic data with ground truth and for real data of people manipulating objects.
Robotics: Science and Systems X, 2014
This paper introduces DART, a general framework for tracking articulated objects composed of rigid bodies connected through a kinematic tree. DART covers a broad set of objects encountered in indoor environments, including furniture and tools, and human and robot bodies, hands and manipulators. To achieve efficient and robust tracking, DART extends the signed distance function representation to articulated objects and takes full advantage of highly parallel GPU algorithms for data association and pose optimization. We demonstrate the capabilities of DART on different types of objects that have each required dedicated tracking techniques in the past.