Modeling dynamic scenes and events from multiple fixed-location vision sensors, such as video camcorders, infrared cameras, and Time-of-Flight sensors, is of broad interest in the computer vision community, with many applications including 3D TV, virtual reality, medical surgery, markerless motion capture, video games, and security surveillance. However, most existing multi-view systems are set up in a strictly controlled indoor environment, with fixed lighting conditions and simple background views. Many challenges prevent extending this technology to outdoor natural environments, including varying sunlight, shadows, reflections, background motion, and visual occlusion. In this thesis, I address these difficulties in order to reduce human preparation and manipulation and to make a robust outdoor system as automatic as possible. In particular, the main novel technical contributions of this thesis are as follows: a generic heterogeneous ...
2015 IEEE International Conference on Computer Vision (ICCV), 2015
This paper introduces a general approach to dynamic scene reconstruction from multiple moving cameras without prior knowledge or limiting constraints on the scene structure, appearance, or illumination. Existing techniques for dynamic scene reconstruction from multiple wide-baseline camera views primarily focus on accurate reconstruction in controlled environments, where the cameras are fixed and calibrated and background is known. These approaches are not robust for general dynamic scenes captured with sparse moving cameras. Previous approaches for outdoor dynamic scene reconstruction assume prior knowledge of the static background appearance and structure. The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras. Evaluation is performed on a variety of indoor and outdoor scenes with cluttered backgrounds and multiple dynamic non-rigid objects such as people. Comparison with state-of-the-art approaches demonstrates improved accuracy in both multiple view segmentation and dense reconstruction. The proposed approach also eliminates the requirement for prior knowledge of scene structure and appearance.
2004
Modeling Dynamic Scenes by Registration of Multi-Camera Image Sequences. Abstract: We present a new variational method for multi-view stereovision and non-rigid three-dimensional motion estimation from multiple video sequences. Our method minimizes the prediction error of the shape and motion estimates. Both problems then reduce to a generic image registration task. The latter is entrusted to a similarity measure chosen according to the imaging conditions and the properties of the scene. In particular, our method can be made robust to appearance changes due to non-Lambertian materials and to illumination changes. It leads to a simpler, more flexible, and more efficient implementation than existing deformable-surface approaches. The computation time on large datasets does not exceed thirty minutes. Moreover, our method is amenable to a hardware implementation on graphics cards. Our stereovision algorithm yields very good results on numerous datasets featuring specularities and transparencies. We have successfully tested our motion estimation algorithm on a multi-camera video sequence of a non-rigid scene.
Computer Vision – ACCV 2010, 2011
Dynamic scene modeling is a challenging problem in computer vision. Many techniques have been developed to address it, but most focus on achieving accurate reconstructions in controlled environments, where the background and the lighting are known and the cameras are fixed and calibrated. Recent approaches have relaxed these requirements by applying such techniques to outdoor scenarios. The problem becomes even harder, however, when the cameras are allowed to move during the recording, since no background color model can be easily inferred. In this paper we propose a new approach to modeling dynamic scenes captured in outdoor environments with moving cameras. A probabilistic framework is proposed to deal with this scenario and to provide a volumetric reconstruction of all the dynamic elements of the scene. The proposed algorithm was tested on a publicly available dataset filmed outdoors with six moving cameras. A quantitative evaluation of the method was also performed on synthetic data. The results demonstrate the effectiveness of the approach, considering the complexity of the problem.
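The volumetric step in a probabilistic framework like the one above can be illustrated by a toy fusion rule: each voxel keeps a probability of being dynamic, obtained by combining per-camera foreground likelihoods at its projections. This is only a minimal sketch under a naive independence assumption; the function names and data layouts are illustrative, not taken from the paper.

```python
def fuse_voxel(foreground_probs):
    """Combine per-camera P(pixel is dynamic) values at a voxel's projections,
    assuming (naively) that the observations are independent."""
    p = 1.0
    for q in foreground_probs:
        p *= q
    return p

def reconstruct(voxels, cameras, prob_maps, threshold=0.5):
    """Keep voxels whose fused dynamic probability exceeds a threshold.

    voxels: list of 3D points; cameras: list of callables mapping a point to a
    pixel (u, v), or None if the voxel is out of view; prob_maps: per-camera
    dicts mapping (u, v) -> foreground probability. All names are hypothetical.
    """
    kept = []
    for x in voxels:
        probs = []
        for cam, pmap in zip(cameras, prob_maps):
            uv = cam(x)
            if uv is not None:              # voxel visible in this view
                probs.append(pmap.get(uv, 0.0))
        if probs and fuse_voxel(probs) > threshold:
            kept.append(x)
    return kept
```

A real system would also model visibility, occlusion, and sensor noise rather than thresholding a raw product of likelihoods.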
Lecture Notes in Computer Science, 2007
This paper presents a novel multi-view camera system that produces, in real time, a single-view scene video that sees through static objects to observe the dynamic objects behind them. The system employs a training phase to recover the correspondences and occlusions between the views and to determine the image positions where seeing through is necessary. During the runtime phase, each dynamic object is detected and automatically registered between the views. The registered objects are learned using an appearance-based method and are later used to superimpose the occluded dynamic objects on the desired view. The occlusion detection is performed with an efficient and effective method. The system is practical and applicable to real-life scenarios including video surveillance, communication, activity analysis, and entertainment. We validated the system by running various tests in office and outdoor environments.
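The superimposition step at the end of such a pipeline can be pictured as mask-guided compositing: a registered patch of the occluded object, learned from another view, is pasted into the target view wherever its mask is set. This is a toy stand-in for the paper's appearance-based superimposition; the function and its arguments are hypothetical.

```python
def superimpose(view, obj_patch, mask, top_left):
    """Paste a registered dynamic-object patch into the target view.

    view, obj_patch: 2D lists of pixel values; mask: 2D list of 0/1 flags the
    same size as obj_patch; top_left: (row, col) where the patch lands.
    Returns a new image; the input view is left untouched.
    """
    r0, c0 = top_left
    out = [row[:] for row in view]          # copy, don't mutate the input
    for r, (prow, mrow) in enumerate(zip(obj_patch, mask)):
        for c, (p, m) in enumerate(zip(prow, mrow)):
            if m:                           # only occluded-object pixels
                out[r0 + r][c0 + c] = p
    return out
```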
2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05)
In this paper, we present a new variational method for multi-view stereovision and non-rigid three-dimensional motion estimation from multiple video sequences. Our method minimizes the prediction error of the shape and motion estimates. Both problems then translate into a generic image registration task. The latter is entrusted to a similarity measure chosen depending on imaging conditions and scene properties. In particular, our method can be made robust to appearance changes due to non-Lambertian materials and illumination changes. It results in a simpler, more flexible, and more efficient implementation than existing deformable surface approaches. The computation time on large datasets does not exceed thirty minutes. Moreover, our method is compliant with a hardware implementation with graphics processor units. Our stereovision algorithm yields very good results on a variety of datasets including specularities and translucency. We have successfully tested our scene flow algorithm on a very challenging multi-view video sequence of a non-rigid scene.
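One concrete choice of similarity measure with the robustness property described above is normalized cross-correlation, which is invariant to affine intensity changes (gain and offset) between views. The sketch below is a generic pure-Python illustration of that measure, not the paper's implementation.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity lists.

    Invariant to gain and offset (b = s*a + t with s > 0 gives NCC = 1),
    which is what makes such measures robust to illumination change.
    """
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db) if da and db else 0.0
```

In a registration framework, such a score (or one minus it) would serve as the data term being minimized over shape and motion parameters.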
Multimedia Tools and Applications, 2013
Computer Vision and Image Understanding, 2011
Handbook of Mathematical Models in Computer Vision
This chapter focuses on the problem of obtaining a complete spatio-temporal description of some objects undergoing a non-rigid motion, given several calibrated and synchronized videos of the scene. Using stereovision and scene flow methods in conjunction, the three-dimensional shape and the non-rigid three-dimensional motion field of the objects can be recovered. We review the unrealistic photometric and geometric assumptions which plague existing methods. A novel method based on deformable surfaces is proposed to alleviate some of these limitations.
Proceedings - IEEE International Conference on Video and Signal Based Surveillance 2006, AVSS 2006, 2006
In this paper a system is presented that reproduces the actions of multiple moving objects in a 3D model. A multi-camera surveillance system is used to automatically detect, track, and classify the objects. Data fusion from multiple sensors yields a more precise estimate of the position of detected moving objects and resolves occlusions. These data are then used to automatically place and animate object avatars in a 3D virtual model of the scene, allowing a human operator to remotely visualize the dynamic 3D reconstruction from an arbitrary point of view.
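A textbook way to fuse position estimates from multiple sensors, used here purely as an illustrative stand-in for the paper's fusion scheme, is inverse-variance weighting: more certain sensors pull the fused estimate harder, and the fused variance is smaller than any single input's.

```python
def fuse_positions(estimates):
    """Inverse-variance weighted fusion of 2D ground-plane position estimates.

    estimates: list of ((x, y), variance) pairs, one per sensor, with an
    isotropic scalar variance per sensor (a simplifying assumption).
    Returns the fused (x, y) and its variance.
    """
    wsum = sum(1.0 / var for _, var in estimates)
    x = sum(px / var for (px, _), var in estimates) / wsum
    y = sum(py / var for (_, py), var in estimates) / wsum
    fused_var = 1.0 / wsum      # tighter than any individual estimate
    return (x, y), fused_var
```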
1997
This paper presents our approach to retrieving a dependable three-dimensional description of a partially known indoor environment. We describe how the sensor data from a video camera are preprocessed by contour tracing to extract the boundary lines of objects, and how this information is transformed into a three-dimensional environmental model of the world.
IEEE Computer Graphics and Applications, 2002
Lecture Notes in Computer Science, 2006
… of Graphics Interface …, 2010
Digital Media Processing for Multimedia Interactive Services, 2003
IEEE Transactions on Image …, 2007
9th European Signal Processing Conference (EUSIPCO 1998), 1998
Proceedings. International Conference on Image Processing, 2002
Image and Vision Computing, 2004