2007, Image and Vision Computing
…
13 pages
Recently, an optimization approach for fast visual tracking of articulated structures based on stochastic meta-descent (SMD) has been presented. SMD is a gradient descent with local step size adaptation that combines rapid convergence with excellent scalability. Stochastic sampling helps to avoid local minima in the optimization process. We have extended the SMD algorithm with new features for fast and accurate tracking by adapting the different step sizes between as well as within video frames and by introducing a robust cost function, which incorporates both depths and surface orientations. The advantages of the resulting tracker over state-of-the-art methods are supported through 3D hand tracking experiments. A realistic deformable hand model reinforces the accuracy of our tracker.
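The idea of gradient descent with local step size adaptation can be illustrated with a minimal sketch. It follows the SMD update rules as they are commonly stated in the optimization literature, not the tracker described in this abstract: the between-frame and within-frame step size adaptation, the stochastic sampling, and the depth and surface orientation cost are not reproduced. The function name `smd_minimize`, the toy quadratic cost, the default values of `p0`, `mu` and `lam`, and the finite-difference Hessian-vector product are all assumptions made for illustration.

```python
import numpy as np

def smd_minimize(grad, theta0, p0=0.01, mu=0.1, lam=0.9, steps=2000, eps=1e-6):
    """Minimize a cost with per-parameter step sizes adapted in the SMD style.

    grad   : callable returning the gradient of the cost at theta
    p0     : initial step size shared by all parameters (assumed value)
    mu     : meta step size controlling how fast step sizes change (assumed)
    lam    : decay factor of the auxiliary vector v (assumed)
    """
    theta = np.asarray(theta0, dtype=float).copy()
    p = np.full_like(theta, p0)   # one step size per parameter
    v = np.zeros_like(theta)      # tracks how theta depends on log step sizes
    for _ in range(steps):
        g = grad(theta)
        # Grow a step size while g and v disagree in sign (consistent descent),
        # shrink it by at most a factor of 2 when they agree (overshoot).
        p = p * np.maximum(0.5, 1.0 - mu * g * v)
        # Hessian-vector product approximated by a finite difference of gradients.
        hv = (grad(theta + eps * v) - g) / eps
        v = lam * v - p * (g + lam * hv)
        theta = theta - p * g
    return theta, p

def quadratic_grad(theta):
    # Gradient of the toy cost 0.5 * (theta[0]**2 + 10 * theta[1]**2).
    return np.array([1.0, 10.0]) * theta

if __name__ == "__main__":
    theta, p = smd_minimize(quadratic_grad, [1.0, 1.0])
    print("solution:", theta)
    print("adapted step sizes:", p)
```

Each parameter keeps its own step size, so a flat direction and a steep direction of the cost can be descended at different rates without hand tuning a single global learning rate.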
2004 Conference on Computer Vision and Pattern Recognition Workshop, 2004
Recently, an optimization approach for fast visual tracking of articulated structures based on Stochastic Meta-Descent (SMD) has been presented. SMD is a gradient descent with local step size adaptation that combines rapid convergence with excellent scalability. Stochastic sampling helps to avoid local minima in the optimization process. We have extended the SMD algorithm with new features for fast and accurate tracking by adapting the different step sizes between as well as within video frames and by introducing a robust likelihood function which incorporates both depths and surface orientations. A realistic deformable hand model reinforces the accuracy of our tracker. The advantages of the resulting tracker over state-of-the-art methods are corroborated through experiments.
2004
The main challenge of tracking articulated structures like hands is their large number of degrees of freedom (DOFs). A realistic 3D model of the human hand has at least 26 DOFs. The arsenal of tracking approaches that can track such structures fast and reliably is still very small. This paper proposes a tracker based on 'Stochastic Meta-Descent' (SMD) for optimizations in such high-dimensional state spaces. This new algorithm is based on a gradient descent approach with adaptive and parameter-specific step sizes. The SMD tracker facilitates the integration of constraints and, combined with a stochastic sampling technique, can get out of spurious local minima. Furthermore, the integration of a deformable hand model based on linear blend skinning and anthropometrical measurements reinforces the robustness of our tracker. Experiments show the efficiency of the SMD algorithm in comparison with common optimization methods.
IEE Proceedings - Vision, Image, and Signal Processing, 2005
The main challenge of tracking articulated structures like hands is their many degrees of freedom (DOFs). A realistic 3-D model of the human hand has at least 26 DOFs. The arsenal of tracking approaches that can track such structures fast and reliably is still very small. This paper proposes a tracker based on stochastic meta-descent (SMD) for optimisations in such high-dimensional state spaces. This new algorithm is based on a gradient descent approach with adaptive and parameter-specific step sizes. The SMD tracker facilitates the integration of constraints and, combined with a stochastic sampling technique, can get out of spurious local minima. Furthermore, the integration of a deformable hand model based on linear blend skinning and anthropometrical measurements reinforces the robustness of the tracker. Experiments show the efficiency of the SMD algorithm in comparison with common optimisation methods.
Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings., 2004
Tracking an articulated structure in a reasonable time is a complex task, mainly due to the high dimensionality of the problem. A new optimization method called Stochastic Meta-Descent (SMD), based on gradient descent with adaptive and parameter-specific step sizes, was introduced recently [1] to solve this challenging problem. While the local optimization works very well, reaching the global optimum is not guaranteed. We therefore propose a novel algorithm which combines the SMD optimization with a particle filter to form 'smart particles'. After propagating the particles, SMD is performed and the resulting new particle set is included such that the original Bayesian distribution is not altered. The resulting 'smart particle filter' (SPF) tracks high-dimensional articulated structures with far fewer samples than previous methods. Additionally, it can handle multiple hypotheses, clutter and occlusion, with which pure optimization approaches have problems. The performance of the SMD particle filter is illustrated in challenging 3D hand tracking sequences, demonstrating better robustness and accuracy than those of a single SMD optimization or an annealed particle filter.
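The combination of particle propagation with a local optimization of each particle can be sketched on a toy problem. This is only a rough illustration under stated assumptions, not the smart particle filter of the abstract: plain fixed-step gradient descent stands in for SMD, the correction that keeps the original Bayesian distribution unaltered is omitted, and the Gaussian motion and observation models, the function names and all parameter values are hypothetical.

```python
import numpy as np

def refine(x, grad, step=0.2, iters=5):
    """Move particles downhill with a few gradient steps (stand-in for SMD)."""
    for _ in range(iters):
        x = x - step * grad(x)
    return x

def smart_particle_step(particles, observation, rng, motion_std=0.5, obs_std=0.2):
    """One filter step: propagate, refine locally, reweight, resample."""
    n = len(particles)
    # 1. Propagate every particle with an assumed random-walk motion model.
    moved = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # 2. Refine each particle on the toy cost 0.5 * ||x - observation||^2,
    #    whose gradient is simply x - observation.
    refined = refine(moved, lambda x: x - observation)
    # 3. Weight by an assumed Gaussian likelihood, computed in log space
    #    and shifted by the maximum so the exponentials do not underflow.
    sq_dist = np.sum((refined - observation) ** 2, axis=1)
    log_w = -0.5 * sq_dist / obs_std**2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # 4. Resample particles in proportion to their weights.
    idx = rng.choice(n, size=n, p=w)
    return refined[idx]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    particles = rng.normal(0.0, 1.0, size=(200, 2))
    observation = np.array([2.0, -1.0])
    particles = smart_particle_step(particles, observation, rng)
    print("mean state estimate:", particles.mean(axis=0))
```

Because every particle is pulled toward a nearby mode before weighting, fewer particles are wasted in low-likelihood regions, which is the intuition behind needing far fewer samples than a plain particle filter.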
ACM Transactions on Graphics, 2016
Proceedings of the British Machine Vision Conference 2006, 2006
In this paper, we propose a novel model-based approach to recover 3D hand pose from 2D images through a compact articulated 3D hand model whose parameters are inferred in a Bayesian manner. To this end, we propose generative models for hand and background pixels, leading to a log-likelihood objective function which aims at enclosing hand-like pixels within the silhouette of the projected 3D model while excluding background-like pixels. Segmentation and hand pose estimation are unified through the minimization of a single likelihood function, which is novel and improves overall robustness. We derive the gradient of this area-based objective function in the hand parameter space, which is new and allows a faster convergence rate than gradient-free methods. Furthermore, we propose a new constrained variable metric gradient descent to speed up convergence, and finally the so-called smart particle filter is used to improve robustness through multiple hypotheses and to exploit temporal coherence. Very promising experimental results demonstrate the potential of our approach.
Three-Dimensional Image Capture and Applications 2008, 2008
In this paper, we present a stochastic framework for articulated 3D human motion tracking. Tracking full body human motion is a challenging task, because the tracking performance normally suffers from several issues such as self-occlusion, foreground segmentation noise and high computational cost. In our work, we use explicit 3D reconstructions of the human body based on a visual hull algorithm as our system input, which effectively eliminates self-occlusion. To improve tracking efficiency as well as robustness, we use a Kalman particle filter framework based on an interacting multiple model (IMM). The posterior density is approximated by a set of weighted particles, which include both sample means and covariances. Therefore, tracking is equivalent to searching for the maximum a posteriori (MAP) estimate of the probability distribution. During Kalman filtering, several dynamical models of human motion (e.g., zero-order, first-order) are assumed, which interact with each other for more robust tracking results. Our measurement step is performed by a local optimization method using simulated physical force/moment for 3D registration. The likelihood function is designed to be the fitting score between the reconstructed human body and our 3D human model, which is composed of a set of cylinders. The proposed tracking framework is tested on a real motion sequence. Our experimental results show that the proposed method improves the sampling efficiency compared with most particle-filter-based methods and achieves high tracking accuracy.
IEEE ISUVR 2013, 2013
We present a method for articulated hand tracking that relies on visual input acquired by a calibrated multicamera system. A state-of-the-art result on this problem has been presented in [12]. In that work, hand tracking is formulated as the minimization of an objective function that quantifies the discrepancy between a hand pose hypothesis and the observations. The objective function treats the observations from each camera view in an independent way. We follow the same general optimization framework but we choose to employ the visual hull [10] as the main observation cue, which results from the integration of information from all available views prior to optimization. We investigate the behavior of the resulting method in extensive experiments and in comparison with that of [12]. The obtained results demonstrate that for low levels of noise contamination, regardless of the number of cameras, the two methods perform comparably. The situation changes when noisy observations or as few as two cameras with short baselines are employed. In these cases, the proposed method is more accurate than that of [12]. Thus, the proposed method is preferable in real-world scenarios with noisy observations obtained from easy-to-deploy, stereo camera setups.
Lecture Notes in Computer Science, 2013
Discriminative techniques are good for hand part detection; however, they fail due to sensor noise and high inter-finger occlusion. Additionally, these techniques do not incorporate any kinematic or temporal constraints. Even though model-based descriptive (for example, Markov Random Field) or generative (for example, Hidden Markov Model) techniques utilize kinematic and temporal constraints well, they are computationally expensive and rarely recover from tracking failure. This paper presents a unified framework for 3D hand tracking, utilizing the best of both methodologies. Hand joints are detected using a regression forest, which uses an efficient voting technique for joint location prediction. The voting distributions are multimodal in nature; hence, rather than using the highest-scoring mode of the voting distribution for each joint separately, we fit the five high-scoring modes of each joint on a tree-structured Markovian model along with a kinematic prior and temporal information. Experimentally, we observed that relying on the discriminative technique (i.e., joint detection) produces better results. We therefore efficiently incorporate this observation in our framework by conditioning the 50% low-scoring joint modes on the remaining high-scoring joint modes. This strategy reduces the computational cost and produces good results for 3D hand tracking on RGB-D data.
2010
Model-based methods for tracking an articulated hand in a video sequence can be divided into two categories. The first, called stochastic methods, uses stochastic filters such as Kalman or particle filters. The second category, named deterministic methods, defines a dissimilarity function to measure how well the hand model is aligned with the hand images of a video sequence. This dissimilarity function is then minimized to achieve the hand tracking. Two well-known problems are related to the minimization algorithms. The first is that of local minima. The second is the computing time required to reach the solution. These problems are compounded by the large number of degrees of freedom (DOF) of the hand (around 26). The choice of the function to be minimized and that of the minimization process can be an answer to these problems. In this paper two major contributions are presented. The first one defines a new dissimilarity function, which gives better results...
Series in Machine Perception and Artificial Intelligence, 2009
Robotics: Science and Systems X, 2014
Computer Vision – ECCV 2018, 2018
Computer Vision and Image Understanding, 2007
2009 IEEE International Conference on Robotics and Automation, 2009
IEEE CVPR 2012, 2012
2009 IEEE 12th International Conference on Computer Vision, 2009
ERCIM News, 2013
Workshop on Motion and Video Computing, 2002. Proceedings., 2000
Advances in Intelligent Systems and Computing, 2015
IEEE CVPR 2014, 2014