In this paper, we describe a novel variational Monte Carlo approach for modeling and tracking body parts of articulated objects. An articulated object (human target) is represented as a dynamic Markov network of the different constituent parts. The proposed approach combines local information of individual body parts and other spatial constraints influenced by neighboring parts. The movement of the relative parts of the articulated body is modeled with local information of displacements from the Markov network and the global information from other neighboring parts. We explore the effect of certain model parameters (such as the number of parts tracked and the number of Monte Carlo cycles) on system accuracy and show that our variational Monte Carlo approach achieves better efficiency and effectiveness compared to other methods on a number of real-time video datasets containing single targets.
2003
A new method for visual tracking of articulated objects is presented. Analyzing articulated motion is challenging because the increase in dimensionality can demand a tremendous increase in computation. To ease this problem, we propose an approach that analyzes subparts locally while reinforcing the structural constraints at the same time. The computational model of the proposed approach is based on a dynamic Markov network, a generative model which characterizes the dynamics and the image observations of each individual subpart as well as the motion constraints among different subparts. Probabilistic variational analysis of the model reveals a mean field approximation to the posterior densities of each subpart given visual evidence, and provides a computationally efficient way to solve such a difficult Bayesian inference problem. In addition, we design mean field Monte Carlo (MFMC) algorithms, in which a set of low dimensional particle filters interact with each other and solve the high dimensional problem collaboratively. Extensive experiments on tracking human body parts demonstrate the effectiveness, significance and computational efficiency of the proposed method.
The Fourth International Conference on Computer and Information Technology, 2004. CIT '04.
We present a novel method for tracking the motion of an articulated structure in a video sequence. The analysis of articulated motion is challenging because of the potentially large number of degrees of freedom (DOFs) of an articulated body. For particle filter based algorithms, the number of samples required with high dimensional problems can be computationally prohibitive. To alleviate this problem, we represent the articulated object as an undirected graphical model (or Markov Random Field, MRF) in which soft constraints between adjacent subparts are captured by conditional probability distributions. The graphical model is extended across time frames to implement a tracker. The tracking algorithm can be interpreted as a belief inference procedure on a dynamic Bayesian network. The discretisation of the state vectors makes it possible to utilise the efficient belief propagation (BP) and mean field (MF) algorithms to reason in this network. Experiments on real video sequences demonstrate that the proposed method is computationally efficient and performs well in tracking the human body.
In recent years Sequential Monte Carlo (SMC) algorithms have been applied to capture the motion of humans. In this paper we apply an SMC algorithm to capture the motion of an articulated chain, e.g., a human arm, and show how the SMC algorithm can be improved in this context by applying auxiliary information. In parallel to a model-based approach we detect skin color blobs in the image as our auxiliary information and find the probabilities of each blob representing the hand. The probabilities of these blobs are used to control the drawing of particles in the SMC algorithm and to correct the predicted particles. The approach is tested against the standard SMC algorithm and we find that our approach improves on the standard SMC algorithm.
2007
In this paper, we present a new approach for the stable tracking of variable interacting targets under severe occlusion in 3D space. We formulate the state of multiple targets as a union state space of each target, and recursively estimate the multi-body configuration and the position of each target in 3D space by using the framework of Trans-dimensional Markov Chain Monte Carlo (MCMC). The 3D environmental model, which replicates the real-world 3D structure, is used for handling occlusions created by fixed objects in the environment, and reliably estimating the number of targets in the monitoring area. Experiments show that our system can stably track multiple humans that are interacting with each other and entering and leaving the monitored area.
Three-Dimensional Image Capture and Applications 2008, 2008
In this paper, we present a stochastic framework for articulated 3D human motion tracking. Tracking full body human motion is a challenging task, because the tracking performance normally suffers from several issues such as self-occlusion, foreground segmentation noise and high computational cost. In our work, we use explicit 3D reconstructions of the human body based on a visual hull algorithm as our system input, which effectively eliminates self-occlusion. To improve tracking efficiency as well as robustness, we use a Kalman particle filter framework based on an interacting multiple model (IMM). The posterior density is approximated by a set of weighted particles, which include both sample means and covariances. Therefore, tracking is equivalent to searching for the maximum a posteriori (MAP) estimate of the probability distribution. During Kalman filtering, several dynamical models of human motion (e.g., zero order, first order) are assumed which interact with each other for more robust tracking results. Our measurement step is performed by a local optimization method using simulated physical force/moment for 3D registration. The likelihood function is designed to be the fitting score between the reconstructed human body and our 3D human model, which is composed of a set of cylinders. This proposed tracking framework is tested on a real motion sequence. Our experimental results show that the proposed method improves the sampling efficiency compared with most particle filter based methods and achieves high tracking accuracy.
2009
We study articulated human tracking by combining spatial and temporal priors in an integrated online learning and inference framework, where body parts can be localized and segmented simultaneously. The temporal prior is represented by the motion trajectory in a low dimensional latent space learned from tracking history, and it predicts the configuration of each body part for the next frame. The spatial prior is encoded by a star-structured graphical model and embedded in the temporal prior, and it can be constructed "on-the-fly" from the predicted pose and used to evaluate and correct the prediction by assembling part detection results. Both temporal and spatial priors can be online learned incrementally through the Back Constrained-Gaussian Process Latent Variable Model (BC-GPLVM) that involves a temporal sliding window for online learning. Experiments show that the proposed algorithm can achieve accurate and robust tracking results for different walking subjects with significant appearance and motion variability.
2002
This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthesis that treat images as representing an implicit empirical distribution. These methods replace the problem of representing the probability of a texture pattern with that of searching the training data for similar instances of that pattern. We extend this idea to temporal data representing 3D human motion with a large database of example motions. To make the method useful in practice, we must address the problem of efficient search in a large training set; efficiency is particularly important for tracking. Towards that end, we learn a low dimensional linear model of human motion that is used to structure the example motion database into a binary tree. An approximate probabilistic tree search method exploits the coefficients of this low-dimensional representation and runs in sub-linear time. This probabilistic tree search returns a particular sample human motion with probability approximating the true distribution of human motions in the database. This sampling method is suitable for use with particle filtering techniques and is applied to articulated 3D tracking of humans within a Bayesian framework. Successful tracking results are presented, along with examples of synthesizing human motion using the model.
Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, 2001
Particle filters are used for hidden state estimation with nonlinear dynamical systems. The inference of 3-d human motion is a natural application, given the nonlinear dynamics of the body and the nonlinear relation between states and image observations. However, the application of particle filters has been limited to cases where the number of state variables is relatively small, because the number of samples needed with high dimensional problems can be prohibitive. We describe a filter that uses hybrid Monte Carlo (HMC) to obtain samples in high dimensional spaces. It uses multiple Markov chains that use posterior gradients to rapidly explore the state space, yielding fair samples from the posterior. We find that the HMC filter is several thousand times faster than a conventional particle filter on a 28D people tracking problem.
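As a rough illustration of the hybrid Monte Carlo idea mentioned in the abstract above (following posterior gradients to propose distant states, then accepting or rejecting them), here is a minimal sketch of a plain HMC sampler on a toy one-dimensional target. It is not the HMC filter from the paper: the function name `hmc_sample`, the Gaussian target, and the step size and trajectory length are all assumptions chosen for illustration, and the sketch uses a single chain with no temporal filtering.

```python
import math
import random


def hmc_sample(neg_log_post, grad, x0, n_samples=2000, step=0.2,
               n_leapfrog=10, seed=0):
    """Draw samples from a 1D density using hybrid (Hamiltonian) Monte Carlo.

    neg_log_post: negative log of the (unnormalised) target density.
    grad: derivative of neg_log_post.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Draw a fresh auxiliary momentum for this trajectory.
        p = rng.gauss(0.0, 1.0)
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, full position steps,
        # full momentum steps in between, and a final half momentum step.
        p_new -= 0.5 * step * grad(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new -= step * grad(x_new)
        p_new -= 0.5 * step * grad(x_new)
        # Metropolis test on the change in total energy.
        h_old = neg_log_post(x) + 0.5 * p * p
        h_new = neg_log_post(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Toy target (an assumption): Gaussian with mean 3 and unit variance.
samples = hmc_sample(lambda x: 0.5 * (x - 3.0) ** 2,
                     lambda x: x - 3.0,
                     x0=0.0)
kept = samples[200:]  # discard an initial burn-in segment
mean = sum(kept) / len(kept)
```

The gradient term is what lets each proposal travel far from the current state while keeping a high acceptance rate, which is the property the abstract relies on for high dimensional state spaces.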
2009 Workshop on Motion and Video Computing (WMVC), 2009
Efficient monocular human pose tracking in dynamic scenes is an important problem. Existing pose tracking methods either use activity priors to restrict the search space, or use generative body models with weak kinematic constraints to infer pose over multiple frames; these often tend to be slow. We develop an efficient algorithm to track human pose by estimating multi-frame body dynamics without activity priors. We present a Monte Carlo approximation of the body dynamics using spatio-temporal distributions over part tracks. To obtain tracks that favor kinematically feasible body poses, we propose a novel "kinematically constrained" particle filtering approach which results in more accurate pose tracking than other stochastic approaches that use single frame priors. We demonstrate the effectiveness of our approach on videos with actors performing various actions in indoor dynamic scenes.
2000
We propose a novel hierarchical model of human dynamics for view independent tracking of the human body in monocular video sequences. The model is trained using real data from a collection of people. Kinematics are encoded using Hierarchical Principal Component Analysis, and dynamics are encoded using Hidden Markov Models. The top of the hierarchy contains information about the whole body. The lower levels of the hierarchy contain more detailed information about possible poses of some subpart of the body. When tracking, the lower levels of the hierarchy are shown to improve accuracy. In this article we describe our model and present experiments that show we can recover 3D skeletons from 2D images in a view independent manner, and also track people the system was not trained on.
Pattern Analysis and Machine Intelligence, …, 2008
We propose a framework for tracking multiple targets, where the input is a set of candidate regions in each frame, as obtained from a state-of-the-art background segmentation module, and the goal is to recover trajectories of targets over time. Due to occlusions by targets and static objects, as well as noisy segmentation and false alarms, one foreground region may not correspond to one target faithfully. Therefore, the one-to-one assumption used in most data association algorithms is not always satisfied. Our method overcomes the one-to-one assumption by formulating the visual tracking problem in terms of finding the best spatial and temporal association of observations, which maximizes the consistency of both motion and appearance of trajectories. To avoid enumerating all possible solutions, we take a Data-Driven Markov Chain Monte Carlo (DD-MCMC) approach to sample the solution space efficiently. The sampling is driven by an informed proposal scheme controlled by a joint probability model combining motion and appearance. Comparative experiments with quantitative evaluations are provided.
2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010
We introduce a new class of probabilistic latent variable model called the Implicit Mixture of Conditional Restricted Boltzmann Machines (imCRBM) for use in human pose tracking. Key properties of the imCRBM are as follows: (1) learning is linear in the number of training exemplars so it can be learned from large datasets; (2) it learns coherent models of multiple activities; (3) it automatically discovers atomic "movemes"; and (4) it can infer transitions between activities, even when such transitions are not present in the training set. We describe the model and how it is learned and we demonstrate its use in the context of Bayesian filtering for multi-view and monocular pose tracking. The model handles difficult scenarios including multiple activities and transitions among activities. We report state-of-the-art results on the HumanEva dataset.
2011
This chapter provides an introduction to models of human pose and motion for use in 3D human pose tracking. We concentrate on probabilistic latent variable models of kinematics, most of which are learned from motion capture data, and on recent physics-based models. We briefly discuss important open problems and future research challenges.
Proceedings of the 6th International Conference on Computer Vision / Computer Graphics Collaboration Techniques and Applications, 2013
We present a solution to the people tracking problem using a monocular vision approach from a bird's eye view and Sequential Monte Carlo filtering. Each tracked human is represented by an individual particle filter using spheroids as a three-dimensional approximation to the shape of the upright human body. We use the bearings-only model as the state update function for the particles. Our measurement likelihood function, which estimates the probability of each particle, imitates the image formation process. This also accounts for partial occlusion caused by other humans moving in neighboring areas. Due to algorithmic optimization the system is real-time capable and is therefore not limited to surveillance or human motion analysis; it could also be used for Human-Computer Interaction (HCI) and indoor location. To demonstrate these capabilities we evaluated the accuracy of the system and show its robustness at different levels of difficulty.
2000
A probabilistic method for tracking 3D articulated human figures in monocular image sequences is presented. Within a Bayesian framework, we define a generative model of image appearance, a robust likelihood function based on image graylevel differences, and a prior probability distribution over pose and joint angles that models how humans move. The posterior probability distribution over model parameters is represented using a discrete set of samples and is propagated over time using particle filtering. The approach extends previous work on parameterized optical flow estimation to exploit a complex 3D articulated motion model. It also extends previous work on human motion tracking by including a perspective camera model, by modeling limb self occlusion, and by recovering 3D motion from a monocular sequence. The explicit posterior probability distribution represents ambiguities due to image matching, model singularities, and perspective projection. The method relies only on a frame-to-frame assumption of brightness constancy and hence is able to track people under changing viewpoints, in grayscale image sequences, and with complex unknown backgrounds.
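The abstract above represents the posterior with a discrete set of samples propagated over time by particle filtering. The predict, weight, and resample loop behind that idea can be sketched on a toy problem. This is a minimal bootstrap particle filter for a one-dimensional random-walk state, not the articulated 3D tracker from the paper: the function name `bootstrap_particle_filter`, the random-walk dynamics, and the Gaussian likelihood are assumptions made for illustration.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=1.0, obs_std=1.0, seed=0):
    """Track a 1D random-walk state from noisy observations.

    Returns the posterior-mean estimate at each time step.
    The dynamics and likelihood are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Initialise particles around the first observation.
    particles = [observations[0] + rng.gauss(0.0, obs_std)
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: propagate each particle through the random-walk dynamics.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Weight: Gaussian likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((z - x) / obs_std) ** 2)
                   for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: weighted mean of the particle set.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample: draw an equally weighted set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Usage: a stationary target observed at 5.0 for 20 frames.
estimates = bootstrap_particle_filter([5.0] * 20)
```

In a pose tracker the scalar state would be replaced by a vector of pose and joint-angle parameters and the Gaussian likelihood by an image-based one, which is where the sample count becomes the bottleneck that several of the abstracts on this page address.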
IEE Proceedings - Radar, Sonar and Navigation, 2005
In this paper we consider the problem of extended object tracking. An extended object is modelled as a set of point features in a target reference frame. The dynamics of the extended object are formulated in terms of the translation and rotation of the target reference frame relative to a fixed reference frame. This leads to realistic, yet simple, models for the object motion. We assume that the measurements of the point features are unlabelled, and contaminated with a high level of clutter, leading to measurement association uncertainty. Marginalising over all the association hypotheses may be computationally prohibitive for realistic numbers of point features and clutter measurements. We present an alternative approach within the context of particle filtering, where we augment the state with the unknown association hypothesis, and sample candidate values from an efficiently designed proposal distribution. This proposal elegantly captures the notion of a soft gating function. We demonstrate the performance of the algorithm on a challenging synthetic tracking problem, where the ground truth is known, in order to compare between different algorithms.
Digital Signal Processing
This work presents the current state-of-the-art in techniques for tracking a number of objects moving in a coordinated and interacting fashion. Groups are structured objects characterized with particular motion patterns. The group can be comprised of a small number of interacting objects (e.g. pedestrians, sport players, convoy of cars) or of hundreds or thousands of components such as crowds of people. The group object tracking is closely linked with extended object tracking but at the same time has particular features which differentiate it from extended objects. Extended objects, such as in maritime surveillance, are characterized by their kinematic states and their size or volume. Both group and extended objects give rise to a varying number of measurements and require trajectory maintenance. An emphasis is given here to sequential Monte Carlo (SMC) methods and their variants. Methods for small groups and for large groups are presented, including Markov Chain Monte Carlo (MCMC) ...
2011 18th IEEE International Conference on Image Processing, 2011
In this paper, we present a visual pose tracking algorithm based on Monte Carlo sampling of special Euclidean group SE(3) and knowledge of a 3D model. In general, the relative pose of an object in 3D space can be described by sequential transformation matrices at each time step. Thus, the objective of this work is to find a transformation matrix in SE(3) so that the projection of an object transformed by this matrix coincides with an object of interest in the 2D image plane. To do this, first, the set of these transformation matrices is randomly generated via an autoregressive model. Next, 3D transformation is performed on a 3D model by these matrices. Finally, a region-based energy model is designed to evaluate the optimality of a transformed model's projection. Experimental results demonstrate the robustness of the proposed method in several tracking scenarios.
2002
Statistical inefficiency often limits the effectiveness of particle filters for high-dimensional Bayesian tracking problems. To improve sampling efficiency on continuous domains, we propose the use of a particle filter with hybrid Monte Carlo (HMC), an MCMC method that follows posterior gradients toward high probability states, while ensuring a properly weighted approximation to the posterior. We use HMC filtering to infer the 3D shape and motion of people from natural, monocular image sequences. The approach currently uses an empirical, edge-based likelihood function, and a second-order dynamical model with soft bio-mechanical joint constraints.