3D Human Pose Estimation from a Single 2D Silhouette
Fabrice Dieudonné Atrevi, Damien Vivet, Florent Duculty and Bruno Emile
Univ. Orléans, PRISME, EA 4229, F-45072, Orléans, France
{damien.vivet, bruno.emile, florent.duculty}@univ-orleans.fr, [email protected]
Keywords: Pose estimation, 3D pose, 3D modeling, skeleton extraction, shape descriptor, geometric moment, Krawtchouk moment.
Abstract: This work focuses on the problem of automatically extracting human 3D poses from a single 2D image. By pose we mean the configuration of the human bones needed to reconstruct a 3D skeleton representing the 3D posture of the detected human. This problem is highly non-linear in nature and confounds standard regression techniques. Our approach combines prior learned correspondences between silhouettes and skeletons extracted from 3D human models. In order to match detected silhouettes with simulated silhouettes, we use the Krawtchouk geometric moments as shape descriptors. We provide quantitative results for image retrieval across different actions and subjects, captured from differing viewpoints, and show that our approach gives promising results for 3D pose extraction from a single silhouette.
1 INTRODUCTION

Recognizing human actions has been a real challenge for computer vision researchers over the last two decades (Wang et al., 2011). Human action recognition systems nevertheless have many possible applications in surveillance, pedestrian tracking and human-machine interaction (Aggarwal and Cai, 1999), and human pose estimation is a key step toward action recognition.

A human action is often represented as a succession of human poses (Wang et al., 2013). As these poses can be 2D or 3D, their estimation has attracted a lot of attention. A 2D pose is usually represented by a set of joint locations (Yang and Ramanan, 2011), whose estimation remains challenging because of human body shape variability, viewpoint changes, etc. A 3D pose is usually represented by a skeleton model parameterized by joint locations (Taylor, 2000) or by rotation angles (Lee and Nevatia, 2009). Such a representation has the advantage of being viewpoint-invariant; however, estimating 3D poses from a single image remains a difficult problem, for several reasons. First, multiple 3D poses may project onto the same 2D pose. Second, the 3D pose is inferred from detected 2D joint locations, so the reliability of the 2D pose is essential: it greatly affects skeleton estimation performance. In the camera networks used in a video-surveillance context, image quality is often poor, which makes 2D joint detection a difficult task; moreover, the camera parameters are unknown, which complicates the 2D/3D correspondence.

In this work we propose a new technique for extracting 3D skeleton pose hypotheses from a single 2D image, based on silhouette shape recognition. The technique relies on a 3D human pose and action simulator. A silhouette database is built from this simulator and is used to match the nearest silhouettes and, as a result, the possible 3D human poses.

This article presents a silhouette shape description and a comparison between different subjects and action steps, and shows that a 3D skeleton configuration can be obtained from a single detected 2D silhouette. Section 2 presents related work on human skeleton and action recognition. Section 3 presents the global framework of the method and the 3D simulation used. Section 4 deals with the Krawtchouk shape descriptors applied to human silhouettes. Finally, Sections 3.1 and 5 present the databases and the obtained results.

2 RELATED WORKS
Many methods in the state of the art deal with human pose estimation and action recognition. Nevertheless, these tasks remain challenging for the computer vision community. Human activity analysis started with O'Rourke and Badler (O'Rourke et al., 1980) and Hogg (Hogg, 1983) in the eighties. Over the last decades, scientists have proposed many approaches, which can be divided into two main categories: on one hand, methods using 3D information; on the other hand, techniques using only 2D data.

Most approaches use a 3D model or 3D detection to estimate the pose of a subject and to classify actions. Rehg and Kanade (Rehg and Kanade, 1994) presented a 3D model-based hand tracking system that can recover the state of a 27-DOF skeleton. Gavrila and Davis (Gavrila and Davis, 1996) used 3D model-based tracking of unconstrained human movement, with image sequences acquired from multiple views to recover the 3D body pose of a human. Bourdev and Malik (Bourdev and Malik, 2009) estimated the human pose from key points, using a data set of human annotations with 3D joint information inferred under anthropometric constraints, later applied to human action classification (Maji et al., 2011). Hiyadi et al. (Hiyadi et al., 2015) used the depth information from a Kinect sensor and a tracking algorithm for 3D human gesture recognition. Jiang (Jiang, 2010) proposed an exemplar-based method that prunes the hypotheses with a kd-tree and achieves real-time performance. Andriluka et al. (Andriluka et al., 2010) proposed a three-stage process for recovering 3D poses in uncontrolled environments. Valmadre and Lucey (Valmadre and Lucey, 2010) used deterministic structure from motion over multiple views, based on the related work of Wei and Chai (Wei and Chai, 2009), for 3D pose estimation. These approaches need multiple sensors or specific devices, such as time-of-flight or active cameras, to acquire 3D information; these models also need good parametrization.

The second category of approaches, to which our proposed method belongs, uses 2D models trained from various images. Baumberg and Hogg (Baumberg and Hogg, 1994) used an active shape model to track pedestrians in real-world scenes, with B-splines as the shape vector for training the model. Wren et al. (Wren et al., 1997) tracked people and interpreted their behaviour by using a multiclass statistical model of colour and shape to obtain a 2D representation of head and hands. Gorelick et al. (Gorelick et al., 2005) used the solution of Poisson's equation to extract spatiotemporal features, such as the saliency and orientation of the shape, for action recognition and human pose estimation. Guo et al. (Guo et al., 2009) used a normalized geometrical vector of dimension 13 to describe the shape of a human. Mori and Malik (Mori and Malik, 2002), and Agarwal and Triggs (Agarwal and Triggs, 2006), used the shape context in their research on human pose estimation. de La Gorce et al. (de La Gorce et al., 2011) estimated and tracked the human hand from monocular video through the minimization of an objective function; this minimization is done with a quasi-Newton method, for which they provide a rigorous derivation of the objective function gradient. Yang and Ramanan (Yang and Ramanan, 2011) estimated the pose by capturing the orientation of each part with a mixture of templates modeled by linear SVMs. All of these methods focus on 2D image interpretation in order to detect human poses or actions; for this purpose, learning is required, and such algorithms need complex and expensive systems to obtain a training data set with ground truth.

Our method is based on a very simple silhouette extraction and description. We use the robust Krawtchouk geometric moments for shape analysis in monocular images. For the database, we propose to use software applications from the open-source community; this software makes realistic simulation of various human poses and actions possible. We show in this work that, by using 3D simulations for learning, without a complex machine learning algorithm and with a simple real-time shape descriptor, we can achieve 3D pose estimation on real data with good accuracy from a single 2D image.

3 METHODOLOGY

The proposed approach for pose estimation is based on shape analysis of the human silhouette. The method can be decomposed into four parts: (1) a simulated silhouette and skeleton database, (2) human detection and 2D silhouette extraction, (3) silhouette shape matching, and (4) skeleton scaling and validation. The workflow is presented in Fig. 1.

Figure 1: Human pose estimation methodology.

(1) First, the silhouette and skeleton database is built thanks to open-source 3D software (see Section 3.1). This database is composed of human silhouettes and their corresponding 3D skeletons for the different kinds of actions we want to recognize. For a requested silhouette, it is then possible to find the matching silhouette in the database and, from it, the corresponding 3D skeleton.

(2) 2D silhouette detection is a well-studied field in machine learning and computer vision. For this purpose we used the classical real-time approach proposed by Dollar et al. (Dollar et al., 2010) based on multiscale HOG (Dalal and Triggs, 2005). Once the human silhouette is detected, we convert it into a 128 x 48 pixel image to solve the translation and scale problem.
(3) Silhouette description and similarity measurement is the key point of our methodology. The main objective is to describe the shape of the silhouette accurately. For this task, we use the Krawtchouk geometric moments because of their robustness compared to the Hu, Zernike or shape-context descriptors (see Section 4). Based on this descriptor, a characteristic vector is computed for each silhouette in the database. The similarity between characteristic vectors is measured with the squared Euclidean distance:

    d(z_r, z_t) = \sum_{i=1}^{T} (z_{r_i} - z_{t_i})^2        (1)

where z_r and z_t are, respectively, the characteristic vectors of the request silhouette and of the t-th silhouette in the database.

(4) Skeleton scaling and validation. For each silhouette we retrieve a 3D skeleton, which is scaled to the current silhouette size. At this step we use the ground-truth simulated database to validate the approach: the confidence score is computed by measuring the reprojection error of the predicted joints on the silhouette.
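To make the matching step concrete, here is a minimal sketch of step (3), assuming each database entry pairs a NumPy descriptor with its 3D skeleton; the data structure and function names are illustrative, not the authors' implementation:

```python
import numpy as np

def match_silhouette(z_r, database, n_best=7):
    """Rank database silhouettes by the distance of Eq. (1).

    z_r      : (T,) characteristic vector of the request silhouette.
    database : list of (descriptor, skeleton_3d) pairs, one entry per
               simulated silhouette (hypothetical structure).
    n_best   : number of pose hypotheses to return (the experiments
               in Section 5 keep the N = 7 best matches).
    """
    z_r = np.asarray(z_r, dtype=float)
    # Squared Euclidean distance of Eq. (1) against every stored descriptor.
    dists = [float(np.sum((z_r - np.asarray(z_t, dtype=float)) ** 2))
             for z_t, _ in database]
    order = np.argsort(dists)[:n_best]
    return [database[i] for i in order]   # best silhouette/skeleton pairs
```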
3.1 Construction of the 2D/3D Matching Database

3.1.1 3D Human Avatar and Action Simulation

In order to build our simulated humans, we chose a professional, free and open-source 3D computer graphics application, Blender (https://www.blender.org/), combined with MakeHuman (http://www.makehuman.org/), a free tool for creating realistic 3D humans (see Fig. 2). These avatars can be animated with motion capture data in order to simulate very realistic actions.

Figure 2: 3D simulated avatar and its associated skeleton.

With this software, we simulate human avatars with different morphologies and clothes, and animate them with different realistic motions taken from the CMU motion capture database (mocap.cs.cmu.edu).

3.1.2 Database Construction

In the 3D computer graphics software, we positioned a virtual camera on a hemisphere, looking at the subject. For each movement of the avatar, we record
both the 2D image and silhouette (see Fig. 3), the 3D camera pose, and the 3D joint and bone poses. As a result, for each subject pose we can collect the detected silhouette together with its related 3D skeleton, which contains 19 bones. We recorded 4 subjects with different phenotypes performing 4 different animations: walk cycle, basketball action, jump and climb. As a result, we obtained 2925 silhouette / 3D skeleton pairs. For each silhouette, we computed the feature vector of the shape descriptors presented in Section 4, as well as the 2D poses of the reprojected joints for the quantitative evaluation of the method.

Figure 3: Extracted human silhouette.
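The exact angular sampling of the camera hemisphere is not specified in the text; the sketch below shows one plausible way to generate such viewpoints (the radius and grid density are assumptions), which could then drive a renderer such as Blender:

```python
import numpy as np

def hemisphere_viewpoints(radius=5.0, n_azimuth=12, n_elevation=3):
    """Camera positions on a hemisphere centred on the subject.

    The paper places a virtual camera on a hemisphere looking at the
    avatar; the sampling below (12 azimuths x 3 elevations) is an
    assumption, not the authors' exact setup.
    """
    cams = []
    for el in np.linspace(np.pi / 12, np.pi / 3, n_elevation):  # elevation
        for az in np.linspace(0.0, 2.0 * np.pi, n_azimuth, endpoint=False):
            x = radius * np.cos(el) * np.cos(az)
            y = radius * np.cos(el) * np.sin(az)
            z = radius * np.sin(el)
            cams.append((x, y, z))   # each camera looks at the origin
    return np.array(cams)
```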
4 KRAWTCHOUK POLYNOMIALS AND MOMENTS

4.1 Krawtchouk Polynomials

The n-th order Krawtchouk polynomial is based on the hypergeometric function and is defined as:

    K_n(x; p, N) = \sum_{k=0}^{N} a_{k,n,p} x^k = {}_2F_1(-n, -x; -N; 1/p)        (2)

where x, n = 0, 1, 2, ..., N, N > 0, p \in (0, 1), and the hypergeometric function is defined as:

    {}_2F_1(a, b; c; z) = \sum_{k=0}^{\infty} \frac{(a)_k (b)_k}{(c)_k} \frac{z^k}{k!}        (3)

with

    (a)_k = a(a+1) \cdots (a+k-1) = \frac{\Gamma(a+k)}{\Gamma(a)}        (4)

the Pochhammer symbol. The set of (N+1) Krawtchouk polynomials forms a complete set of discrete basis functions with weight function

    w(x; p, N) = \binom{N}{x} p^x (1-p)^{N-x}        (5)

and satisfies the orthogonality condition:

    \sum_{x=0}^{N} w(x; p, N) K_n(x; p, N) K_m(x; p, N) = \rho(n; p, N) \delta_{nm}        (6)

where \rho(n; p, N) = (-1)^n \left(\frac{1-p}{p}\right)^n \frac{n!}{(-N)_n} and \delta_{nm} is the Kronecker delta.

In order to eliminate the large variability in the dynamic range, a normalization process is applied. The set of normalized (weighted) Krawtchouk polynomials is then defined (Yap et al., 2003) as:

    \bar{K}_n(x; p, N) = K_n(x; p, N) \sqrt{\frac{w(x; p, N)}{\rho(n; p, N)}}        (7)

4.2 Krawtchouk Moments

Krawtchouk moments were first used in image analysis by P.-T. Yap et al. (Yap et al., 2003). Based on the weighted Krawtchouk polynomials, the (n+m)-order Krawtchouk moment of an N x M image with intensity function f(x, y) is defined as:

    Q_{nm} = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \bar{K}_n(x; p_1, N-1) \bar{K}_m(y; p_2, M-1) f(x, y)        (8)

The parameters p_1 and p_2 can be viewed as translation factors. Indeed, if p = 0.5 + \Delta p, the weighted Krawtchouk polynomials are shifted by about N \Delta p. The direction of the shift depends on the sign of \Delta p: the polynomials shift along the +x direction when \Delta p is positive, and vice versa. This property makes it possible to extract local properties of an image. For software such as Matlab, there is a matrix form of the Krawtchouk moments:

    Q = K_2 A K_1^T        (9)

where Q = \{Q_{ji}\}_{i,j=0}^{N-1}, K_v = \{\bar{K}_i(j; p_v, N-1)\}_{i,j=0}^{N-1} and A = \{f(j, i)\}_{i,j=0}^{N-1}.
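As a sanity check on Eqs. (2)-(9), the following sketch evaluates the weighted Krawtchouk polynomials directly from the terminating hypergeometric series and assembles the moment matrix in the spirit of Eq. (9). The direct series is exact by definition but numerically delicate at high orders, where a three-term recurrence would be preferred in practice; all function names are ours:

```python
import numpy as np
from math import comb, factorial

def poch(a, k):
    """Pochhammer symbol (a)_k = a(a+1)...(a+k-1), Eq. (4)."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def krawtchouk(n, x, p, N):
    """K_n(x; p, N) via the terminating series of Eqs. (2)-(3)."""
    # The series stops at k = n because (-n)_k vanishes beyond it.
    return sum(poch(-n, k) * poch(-x, k) / (poch(-N, k) * factorial(k))
               * (1.0 / p) ** k for k in range(n + 1))

def weighted_krawtchouk_matrix(orders, p, N):
    """Rows Kbar_n(x; p, N) for n in `orders`, x = 0..N (Eqs. (5)-(7))."""
    # Weight function of Eq. (5).
    w = np.array([comb(N, x) * p**x * (1 - p)**(N - x) for x in range(N + 1)])
    K = np.empty((len(orders), N + 1))
    for r, n in enumerate(orders):
        # rho of Eq. (6); using (-N)_n = (-1)^n N!/(N-n)! gives the
        # positive closed form below.
        rho = ((1 - p) / p) ** n * factorial(n) * factorial(N - n) / factorial(N)
        K[r] = [krawtchouk(n, x, p, N) * np.sqrt(w[x] / rho)
                for x in range(N + 1)]
    return K

def krawtchouk_moments(img, orders, p1, p2):
    """Moments Q_nm of Eq. (8), computed in matrix form as in Eq. (9).

    `img` is indexed img[x, y] with shape (N, M)."""
    N, M = img.shape
    K1 = weighted_krawtchouk_matrix(orders, p1, N - 1)
    K2 = weighted_krawtchouk_matrix(orders, p2, M - 1)
    return K1 @ img @ K2.T
```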
4.3 Feature Extraction

For a given human silhouette image, we use Krawtchouk moments to describe the shape of the human in the image, i.e., we compute the characteristic vector of the image for different moment orders. Thanks to the ability of Krawtchouk moments to extract features from specific regions of the image, we divide each silhouette into two parts, top and bottom (Fig. 4), with the parameters p_1 = 0.5, p_2 = 0.1 for the top and p_1 = 0.5, p_2 = 0.95 for the bottom. We then compute two characteristic vectors and concatenate them into a single descriptor vector. Each extracted human silhouette is converted to a common 128 x 48 space to obtain invariance to translation and scale. For rotation invariance, we assume that the vertical direction is preserved.

Figure 4: Krawtchouk polynomials for the top and bottom parts.

Following related work, we compute the Krawtchouk moments with the parameter m = n. In order to find the best value of n, we used a database of 600 simulated silhouettes and performed cross-validation over all of them. Fig. 5 shows that from the order n = m = 24 onward, we obtain a stable and best accuracy for pose recognition. The final feature vector therefore has 48 dimensions.

Figure 5: Accuracy of cross-validation for different values of n.
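A plausible reading of this construction (assuming the 48 dimensions are the diagonal moments Q_nn, n = 1..24, of each half) is sketched below, reusing krawtchouk_moments() from the previous sketch:

```python
import numpy as np

def silhouette_descriptor(sil):
    """48-D descriptor of a 128 x 48 binary silhouette (Section 4.3).

    Assumed interpretation: diagonal moments Q_nn for n = 1..24,
    computed with (p1, p2) = (0.5, 0.1) to emphasise the top of the
    shape and (0.5, 0.95) for the bottom, then concatenated.
    """
    img = np.asarray(sil, dtype=float).T   # img[x, y]: x horizontal, y vertical
    orders = list(range(1, 25))            # m = n, orders 1..24
    top = np.diag(krawtchouk_moments(img, orders, p1=0.5, p2=0.1))
    bottom = np.diag(krawtchouk_moments(img, orders, p1=0.5, p2=0.95))
    return np.concatenate([top, bottom])   # 24 + 24 = 48 dimensions
```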
5 EXPERIMENTS

As described in Section 3.1, for each 2D silhouette image of the database we store both the silhouette descriptor vector and the associated 3D skeleton composed of 19 joints. Then, for a test image with an extracted silhouette, the similarity between the computed descriptor vector and the database descriptors is measured with the Euclidean distance. As a result, we extract the corresponding silhouette in the database and its 3D joint poses. Note that the approach does not only give the most suitable silhouette: it returns, in ranked order, the N most probable silhouettes. In order to evaluate the results, we used the simulation: knowing the real skeleton of the test image, we can compute the reprojection error of the estimated 3D joints. According to experimental results, when the mean error is less than 5 pixels, the pose of the result is considered similar to the pose of the request silhouette; for this empirical threshold, the difference between two silhouettes is hardly visible to a human.
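This acceptance criterion can be written compactly as follows, assuming the 19 predicted joints have already been scaled and reprojected onto the query image (the array layout is ours):

```python
import numpy as np

def pose_is_accepted(pred_joints_2d, true_joints_2d, thresh_px=5.0):
    """Validation criterion used in the experiments.

    A retrieved pose counts as correct when the mean reprojection
    error of its 19 predicted 2D joint locations stays below the
    empirical 5-pixel threshold. Inputs: (19, 2) pixel coordinates.
    """
    pred = np.asarray(pred_joints_2d, dtype=float)
    true = np.asarray(true_joints_2d, dtype=float)
    per_joint_err = np.linalg.norm(pred - true, axis=1)  # px per joint
    return per_joint_err.mean() < thresh_px
```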
5.1 Representativity and Descriptor Robustness to Noise

Silhouette extraction is still an active research field, and it is well known that extraction is subject to noise, so our first experiment checks the robustness of our descriptors to noise. For this, we conducted experiments with two databases of simulated data for human avatars with different morphologies and different actions. The first database contains 2925 training samples with Gaussian noise around the contour of the shape, and the second contains 608 unlearned samples. The aim of this experiment is to evaluate the capacity of the shape descriptors to encode various shapes for different values of the standard deviation of the Gaussian noise. Considering x_0 = [0, 0] the center of the silhouette, let x_i = [\rho_i, \theta_i] be the polar coordinates of a contour point. The noise \Delta_\sigma is applied on \rho_i, with \Delta_\sigma \sim N(0, std) and std = {0, 1, 2, 3}. Examples of noised silhouettes are presented in Figure 6.

Figure 6: Noised silhouettes with \Delta_\sigma \sim N(0, std) and std = {1, 2, 3}.

This experiment also checks whether the shape descriptor can accurately encode a silhouette and distinguish between close postures: the silhouettes in the database can be very similar because they are extracted from a video of the motion, so two nearby frames provide very similar silhouettes. For std = 0 we keep the original silhouette; for std > 0, Gaussian white noise is added to the silhouette. Figure 7 shows that the recognition accuracy decreases as std increases. For this test we used a training set of 2925 silhouettes and a test set of 608 silhouettes. For a single neighbour (N = 1) and std = {0, 1, 2, 3}, the recognition rates are respectively RR = {98.81, 96.43, 74.6, 44.84}. However, if we increase the number N of hypotheses returned by the program, the recognition rate grows quickly: for N = 7 and std = {0, 1, 2, 3}, the RR are {100, 100, 96.43, 73.41}. Considering that the silhouettes are very similar and the noise very strong, the method gives very good results. For the rest of the article, we consider the N = 7 first silhouettes given by the matcher.
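A minimal sketch of this contour noise model, assuming the contour is given as Cartesian points that we perturb radially around the silhouette centre:

```python
import numpy as np

def noisy_contour(contour_xy, center_xy, std=2.0, rng=None):
    """Radial Gaussian perturbation of Section 5.1.

    Each contour point x_i = (rho_i, theta_i), in polar coordinates
    around the silhouette centre, receives rho_i + N(0, std) while
    theta_i is kept unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = np.asarray(contour_xy, dtype=float) - np.asarray(center_xy, dtype=float)
    rho = np.hypot(d[:, 0], d[:, 1])            # to polar coordinates
    theta = np.arctan2(d[:, 1], d[:, 0])
    rho = rho + rng.normal(0.0, std, size=rho.shape)  # noise on rho only
    return np.asarray(center_xy, dtype=float) + np.stack(
        [rho * np.cos(theta), rho * np.sin(theta)], axis=1)
```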
In order to evaluate the extracted 3D skeletons, we use the same request silhouettes as in the previous experiment. For each extracted silhouette, we compute the reprojection error and evaluate the accuracy for different values of N. Figure 8 shows skeleton estimations from a single monocular image. For this result, the reprojection error of the first image (human walking) is 2.4739 px and that of the second image (human in cross position) is 1.2614 px. These mean errors show that the retrieved pose is close to the original pose. Note that the database contains no avatar with a similar appearance, so this error is reasonable. The images used as requests in Fig. 8 are simulated images, so we obtain an almost perfect result with a low reprojection error.

The result of the 3D skeleton extraction in Fig. 9 is also very accurate, because this pose is unique and easy to find, and the silhouette extraction is easy thanks to a static and uniform background. In Fig. 10, we used a real-world image extracted from a walking-action video. The chosen pose is similar but not identical to the poses in the learning database, so we do not expect to retrieve exactly the same 3D pose, only similar ones. The result is good in terms of the shape of the pose, but confusion occurred between the right and left foot and arm.

In order to evaluate the stability and robustness of our approach, we considered the successive detections over a complete video of the movement. Note that the timeline is not used: each frame is processed independently. Figure 11 (a) shows the tracking results for four human joints during the execution of the climbing motion. The red curves show the real positions over time and the green curves the estimated positions. The different curves share the same shape, which means that the successive detections are stable in time and that our shape descriptor is reliable. There is, however, an offset due to shape scaling. The mean error over the motion execution is 1.9765 px. Figure 11 also shows that the shape of the curves changes as a function of the motion; the mean error obtained for the jump motion is 1.9892 px. This discrimination confirms that the 3D poses can be used for action classification in video.

5.2 Application to Action Recognition on Real Data

We used the same shape descriptor for human action classification in video, on the public Weizmann database (see Figure 12). As we do not use temporal information, our method matches each frame to an action class and takes the class with the highest associated rate as the action class. The database is a collection of 90 low-resolution (180 x 144, deinterlaced, 50 fps) video sequences showing 9 different people, each performing 10 natural actions: run, walk, skip, jumping-jack, jump, gallop-sideways, wave-two-hands, wave-one-hand, and bend. On the Weizmann database, we performed cross-validation over the different movements and the different phenotypes. In each case, we apply our shape matching method to each frame; as the silhouette retrieved from the database belongs to a specific movement class, we simply count the number of occurrences, and the most represented class is taken as the detected movement.

Based on this very simple workflow, we obtained 71.66% correct action classification. The confusion matrix is shown in Fig. 13. Of course, this accuracy is lower than the recent accuracies obtained on the same database (Blank, 99.64% (Blank et al., 2005), and Gorelick, 97.83% (Gorelick et al., 2007)), but both of these approaches use space-time cubes to analyse the motion, while we do not yet consider the temporal correlation between successive frames. According to Gorelick et al., many successive frames from a first action (e.g., run) may exhibit high spatial similarity to successive frames from a second one, and ignoring the dynamics within the frames may lead to confusion between the two actions. As our approach does not take the time dimension into account, frame-to-frame comparison leads to misclassification for these very similar actions: run, skip and jump.

In future work, we will combine the proposed approach with multi-hypothesis tracking techniques (using the N neighbours) to improve the accuracy of action classification. In this way, we will take into account the temporal information and the dynamics of the action.
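The frame-wise voting scheme can be sketched as follows; match_frame stands for a hypothetical hook that returns the action class of the best-matching database silhouette for one frame:

```python
from collections import Counter

def classify_video(frames, match_frame):
    """Frame-wise action classification by majority vote (Section 5.2).

    Each frame is matched independently (no temporal information);
    the most frequent class over the whole video is reported as the
    detected action.
    """
    votes = Counter(match_frame(f) for f in frames)
    return votes.most_common(1)[0][0]
```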
Figure 7: Histogram of accuracy: the colors represent the noise amplitudes, respectively {0, 1, 2, 3} pixels; the abscissa represents the number N of neighbours considered, {1, 3, 5, 7}.

Figure 8: 3D pose estimation result: on the left, the request silhouette; then, from left to right, the 3D estimated skeleton from various viewpoints.

Figure 9: Real-world data 1.

Figure 10: Real-world data 2.

Figure 11: Tracking results for (a) the climb motion and (b) the jump motion.

Figure 12: Some images of the Weizmann database.

Figure 13: Confusion matrix.

6 CONCLUSIONS

In this paper, we presented a new approach for 3D human pose estimation and action classification in video. The learning database is easily generated thanks to open-source software that allows the simulation of any human pose. The proposed posture recognition method is based on the geometric Krawtchouk moments and gives promising results. Both the application to 3D pose estimation and the application to action classification have been presented. In our work, we tested different moment orders and selected the most suitable one for our approach. We compared our approach with related work in action classification and concluded that it can be improved by using multi-hypothesis tracking during action identification and classification. In future work, we will use a combination of local and global shape descriptors to improve the pose estimation, and use the estimated poses to construct an action model for activity classification.
7 ACKNOWLEDGEMENTS

This work is part of the LUMINEUX project, supported by the Région Centre-Val de Loire (France). The authors would like to thank the Conseil Régional du Centre-Val de Loire for its support.

REFERENCES

Agarwal, A. and Triggs, B. (2006). Recovering 3D human pose from monocular images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(1):44–58.

Aggarwal, J. and Cai, Q. (1999). Human motion analysis: A review. Computer Vision and Image Understanding, 73(3):428–440.

Andriluka, M., Roth, S., and Schiele, B. (2010). Monocular 3D pose estimation and tracking by detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 623–630.

Baumberg, A. and Hogg, D. (1994). Learning flexible models from image sequences. Springer.

Blank, M., Gorelick, L., Shechtman, E., Irani, M., and Basri, R. (2005). Actions as space-time shapes. In The Tenth IEEE International Conference on Computer Vision (ICCV'05), pages 1395–1402.

Bourdev, L. and Malik, J. (2009). Poselets: Body part detectors trained using 3D human pose annotations. In IEEE 12th International Conference on Computer Vision, pages 1365–1372.

Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition, pages 886–893.

de La Gorce, M., Fleet, D., and Paragios, N. (2011). Model-based 3D hand pose estimation from monocular video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(9):1793–1805.

Dollar, P., Belongie, S., and Perona, P. (2010). The fastest pedestrian detector in the west. In Proceedings of the British Machine Vision Conference, pages 1–11.
Gavrila, D. M. and Davis, L. S. (1996). 3-D model-based tracking of humans in action: a multi-view approach. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR'96), pages 73–80.

Gorelick, L., Blank, M., Shechtman, E., Irani, M., and Basri, R. (2005). Actions as space-time shapes. In ICCV, pages 1395–1402.

Gorelick, L., Blank, M., Shechtman, E., Irani, M., and Basri, R. (2007). Actions as space-time shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(12):2247–2253.

Guo, K., Ishwar, P., and Konrad, J. (2009). Action recognition in video by covariance matching of silhouette tunnels. In XXII Brazilian Symposium on Computer Graphics and Image Processing, pages 299–306.

Hiyadi, H., Ababsa, F., Bouyakhf, E. H., Regragui, F., and Montagne, C. (2015). Reconnaissance 3D des gestes pour l'interaction naturelle homme-robot [3D gesture recognition for natural human-robot interaction]. In Journées francophones des jeunes chercheurs en vision par ordinateur.

Hogg, D. (1983). Model-based vision: a program to see a walking person. Image and Vision Computing, 1(1):5–20.

Jiang, H. (2010). 3D human pose reconstruction using millions of exemplars. In 20th International Conference on Pattern Recognition (ICPR), pages 1674–1677.

Lee, M. W. and Nevatia, R. (2009). Human pose tracking in monocular sequence using multilevel structured models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1):27–38.

Maji, S., Bourdev, L., and Malik, J. (2011). Action recognition from a distributed representation of pose and appearance. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3177–3184.

Mori, G. and Malik, J. (2002). Estimating human body configurations using shape context matching. In Computer Vision – ECCV 2002, pages 666–680. Springer.

O'Rourke, J., Badler, N., et al. (1980). Model-based image analysis of human motion using constraint propagation. IEEE Transactions on Pattern Analysis and Machine Intelligence, (6):522–536.

Rehg, J. M. and Kanade, T. (1994). Visual tracking of high DOF articulated structures: an application to human hand tracking. In Computer Vision – ECCV'94, pages 35–46. Springer.

Taylor, C. (2000). Reconstruction of articulated objects from point correspondences in a single uncalibrated image. In IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 677–684.

Valmadre, J. and Lucey, S. (2010). Deterministic 3D human pose estimation using rigid structure. In Computer Vision – ECCV 2010, pages 467–480. Springer.

Wang, C., Wang, Y., and Yuille, A. (2013). An approach to pose-based action recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 915–922.

Wang, L., Wang, Y., and Gao, W. (2011). Mining layered grammar rules for action recognition. International Journal of Computer Vision, 93(2):162–182.

Wei, X. K. and Chai, J. (2009). Modeling 3D human poses from uncalibrated monocular images. In IEEE 12th International Conference on Computer Vision, pages 1873–1880.

Wren, C. R., Azarbayejani, A., Darrell, T., and Pentland, A. P. (1997). Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):780–785.

Yang, Y. and Ramanan, D. (2011). Articulated pose estimation with flexible mixtures-of-parts. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1385–1392.

Yap, P.-T., Paramesran, R., and Ong, S.-H. (2003). Image analysis by Krawtchouk moments. IEEE Transactions on Image Processing, 12(11):1367–1377.