2010 IEEE International Conference on Multimedia and Expo, 2010
Although the variety of desktop real-time stereo vision systems has grown considerably in the past several years, few make verifiable claims about the accuracy of the algorithms used to construct 3D data, or describe how the large volumes of data such systems generate can be distributed effectively. In this paper, we describe a system that creates an accurate (on the order of a centimeter) 3D reconstruction of an environment in real time (under 30 ms) and also allows for remote interaction between users. The paper addresses how to reconstruct, compress, and visualize the 3D environment. In contrast to most commercial desktop real-time stereo vision systems, our algorithm produces 3D meshes instead of dense point clouds, which we show allows for higher-quality visualizations. The chosen data representation also allows for high compression ratios for transfer to remote sites. We demonstrate the accuracy and speed of our results on a variety of benchmarks.
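The abstract above contrasts meshes with dense point clouds. As a rough illustration of how a depth image can be turned into a triangle mesh (rather than a bare point cloud), here is a minimal sketch assuming a pinhole camera model; the function name, parameters, and the simple discontinuity test are illustrative, not the authors' algorithm:

```python
import numpy as np

def depth_to_mesh(depth, fx, fy, cx, cy, max_jump=0.05):
    """Back-project a depth image into 3D vertices and connect
    neighboring pixels into triangles, skipping depth discontinuities."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    verts = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z],
                     axis=-1).reshape(-1, 3)

    tris = []
    for y in range(h - 1):
        for x in range(w - 1):
            i = y * w + x
            quad = depth[y:y + 2, x:x + 2]
            # only mesh surface patches without large depth jumps
            if quad.min() > 0 and quad.max() - quad.min() < max_jump:
                tris.append((i, i + 1, i + w))
                tris.append((i + 1, i + w + 1, i + w))
    return verts, np.array(tris)
```

Because connectivity is implicit in the image grid, a mesh like this can be transmitted as quantized depths plus camera intrinsics, which is one reason mesh representations compress well compared with unstructured point sets.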
1998
In this paper we address an application of computer vision that may, in the future, completely change the way we communicate over networks. We present our version of a testbed for telecollaboration. It is based on a highly accurate and precise stereo algorithm. The results demonstrate the live (online) recovery of 3D models of a dynamically changing environment and the simultaneous display and manipulation of those models.
2008 Tenth IEEE International Symposium on Multimedia, 2008
In this paper, we present a framework for immersive 3D video conferencing and geographically distributed collaboration. Our multi-camera system performs a full-body 3D reconstruction of users in real time and renders their images in a virtual space, allowing remote interaction between users and the virtual environment. The paper features an overview of the technology and algorithms used for calibration, capture, and reconstruction. We introduce stereo mapping using adaptive triangulation, which allows for fast (under 25 ms) and robust real-time 3D reconstruction. The chosen representation of the data provides high compression ratios for transfer to a remote site. The algorithm produces partial 3D meshes, instead of dense point clouds, which are combined on the renderer to create a unified model of the user. We have successfully demonstrated the use of our system in applications such as remote dancing and immersive Tai Chi learning.
Confluence of Computer Vision and Computer Graphics, 2000
In this paper we present the first implementation of a new medium for tele-collaboration. The realized testbed consists of two tele-cubicles at two Internet nodes. At each tele-cubicle, a stereo rig is used to provide an accurate, dense 3D reconstruction of a person in action. The two real dynamic worlds are exchanged over the network and visualized stereoscopically. The remote communication and the dynamic nature of tele-collaboration raise the question of an optimal representation for graphics and vision. We treat the issues of limited bandwidth, latency, and processing power with a tunable 3D representation in which the user can control the trade-off between delay and 3D resolution by tuning the spatial resolution, the size of the working volume, and the uncertainty of reconstruction. Due to the limited number of cameras and displays, our system cannot provide the user with a fully surround-immersive experience. However, it is the first system that uses real 3D data reconstructed online at another site. The system has been implemented with low-cost, off-the-shelf hardware and has been successfully demonstrated on local area networks.
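The tunable trade-off described above, where delay is balanced against spatial resolution and working-volume size, can be illustrated with a back-of-the-envelope model. This is a hedged sketch only: the function names, the voxel-grid assumption, and the bytes-per-point figure are our own illustrative choices, not the paper's actual representation:

```python
def payload_bytes(volume_m3, voxel_size_m, bytes_per_point=6):
    """Estimate per-frame payload for a volumetric reconstruction:
    smaller voxels (finer resolution) mean more points to transmit."""
    points = volume_m3 / voxel_size_m ** 3
    return int(round(points * bytes_per_point))

def pick_voxel_size(volume_m3, budget_bytes, bytes_per_point=6):
    """Invert the estimate: choose the finest voxel size that fits a
    given per-frame byte budget (i.e., a bandwidth/latency target)."""
    return (volume_m3 * bytes_per_point / budget_bytes) ** (1 / 3)
```

The cubic relationship is the key point: halving the voxel size multiplies the payload by eight, which is why exposing resolution and working volume as user-tunable knobs is an effective way to meet a latency target.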
International Journal of Digital Multimedia Broadcasting, 2010
We present a multi-camera real-time 3D modeling system that aims at enabling new immersive and interactive environments. This system, called Grimage, retrieves in real time a 3D mesh of the observed scene along with the associated textures. This information enables a strong visual presence of the user in virtual worlds. The 3D shape information is also used to compute collisions and reaction forces with virtual objects, enforcing the mechanical presence of the user in the virtual world. The innovation is a fully integrated system with both immersive and interactive capabilities. It embeds a parallel version of the EPVH modeling algorithm inside a distributed vision pipeline. It also adopts the hierarchical component approach of the FlowVR middleware to enforce software modularity and enable distributed execution. Results show high refresh rates and low latencies obtained by taking advantage of the I/O and computing resources of PC clusters. The applications we have developed demonstrate the quality of the visual and mechanical presence with a single platform, and with a dual platform that allows telecollaboration.
Lecture Notes in Computer Science, 2014
We present an approach for high-quality rendering of the 3D representation of a remote collaboration scene at real-time rendering speed, by extending the unstructured lumigraph rendering (ULR) method. ULR uses a 3D proxy, which in the simplest case is a 2D plane. We develop a dynamic proxy for ULR to obtain a better, more detailed 3D proxy in real time, which leads to the rendering of high-quality, accurate 3D scenes with motion-parallax support. The novel contribution of this work is the development of a dynamic proxy in real time. The dynamic proxy is generated from depth images rather than color images as in the Lumigraph approach.
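To make the "proxy" idea concrete: the simplest geometric proxy mentioned above is a single plane fitted to the scene. A minimal sketch of fitting such a plane to depth-derived 3D points by least squares follows; the function name is illustrative and this is not the paper's dynamic-proxy algorithm, only the baseline it improves upon:

```python
import numpy as np

def fit_plane_proxy(points):
    """Least-squares plane fit to an (N, 3) point set; the plane
    (centroid + unit normal) serves as the simplest geometric proxy."""
    centroid = points.mean(axis=0)
    # The right-singular vector with the smallest singular value of the
    # centered points is the direction of least variance: the normal.
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[-1]
```

A dynamic proxy, by contrast, would refresh a detailed depth-based surface every frame instead of reusing one static fit.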
IEEE Transactions on Multimedia, 2000
The growing popularity of 3-D movies has led to the rapid development of numerous affordable consumer 3-D displays. In contrast, the development of technology to generate 3-D content has lagged behind considerably. In spite of significant improvements to the quality of imaging devices, the accuracy of the algorithms that generate 3-D data, and the hardware available to render such data, the algorithms available to calibrate, reconstruct, and then visualize such data remain difficult to use, extremely noise-sensitive, and unreasonably slow. In this paper, we present a multi-camera system that creates a highly accurate (on the order of a centimeter) 3-D reconstruction of an environment in real time (under 30 ms) that allows for remote interaction between users. This paper focuses on addressing the aforementioned deficiencies by describing algorithms to calibrate, reconstruct, and render objects in the system. We demonstrate the accuracy and speed of our results on a variety of benchmarks and on data collected from our own system.
IEEE Transactions on Circuits and Systems for Video Technology, 2004
The processing power and network bandwidth required for true immersive telepresence applications are only now beginning to be available. We draw from our experience developing stereo-based tele-immersion prototypes to present the main issues arising when building these systems. Tele-immersion is a new medium that enables a user to share a virtual space with remote participants. The user is immersed in a rendered three-dimensional (3-D) world that is transmitted from a remote site. To acquire this 3-D description, we apply binocular and trinocular stereo techniques, which provide a view-independent scene description. Slow processing cycles or long network latencies interfere with the users' ability to communicate, so the dense stereo range data must be computed and transmitted at high frame rates. Moreover, reconstructed 3-D views of the remote scene must be as accurate as possible to achieve a sense of presence. We address both speed and accuracy using a variety of techniques, including the power of supercomputing clusters and a method for combining motion and stereo to increase speed and robustness. We present the latest prototype, which acquires a room-size environment in real time using a supercomputing cluster, and we discuss its strengths and current weaknesses.
Stereoscopic Displays and Virtual Reality Systems XIV, 2007
We present a point-based reconstruction and transmission pipeline for a collaborative tele-immersion system. Two or more users in different locations collaborate with each other in a shared, simulated environment as if they were in the same physical room. Each user perceives point-based models of the distant users along with collaborative data such as molecule models. Disparity maps, computed by a commercial stereo solution, are filtered and transformed into clouds of 3D points. The clouds are compressed and transmitted over the network to distant users. On the receiving side, the clouds are decompressed and incorporated into the 3D scene. The viewpoint used to display the 3D scene depends on the position of the user's head. Collaborative data is manipulated through natural hand gestures. We analyse the performance of the system in terms of computation time, latency, and photorealistic quality of the reconstructed models.
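The core step of the pipeline above, turning a filtered disparity map into a 3D point cloud, follows from the standard stereo relation Z = f·B/d for a rectified pair. Here is a minimal sketch under a pinhole-camera assumption; the function name and the simple minimum-disparity filter are illustrative, not the commercial solution the paper uses:

```python
import numpy as np

def disparity_to_points(disp, baseline, focal, cx, cy, min_disp=1.0):
    """Reproject a disparity map to a 3D point cloud using the
    rectified-stereo relation Z = focal * baseline / disparity."""
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disp >= min_disp          # drop unreliable (tiny) disparities
    z = focal * baseline / disp[valid]
    x = (u[valid] - cx) * z / focal
    y = (v[valid] - cy) * z / focal
    return np.stack([x, y, z], axis=-1)
```

Filtering before reprojection matters because small disparities blow up to huge, noisy depths, which is exactly the kind of outlier the paper's filtering stage removes before compression.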
The Visual Computer, 2010
Vision-based full-body 3D reconstruction for tele-immersive applications generates large amounts of point data, which must be sent through the network in real time. In this paper, we introduce a skeleton-based compression method using motion estimation, in which kinematic parameters of the human body are extracted from the point-cloud data in each frame. We first address the issues regarding data capture and transfer to a remote site for tele-immersive collaboration. We then compare the results of existing compression methods and the proposed skeleton-based compression technique. We examine the robustness and efficiency of the algorithm through experimental results with our multi-camera tele-immersion system. The proposed skeleton-based method provides high and flexible compression ratios, from 50:1 to 5000:1, with reasonable reconstruction quality (peak signal-to-noise ratio from 28 to 31 dB) while preserving real-time (10+ fps) processing.
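The two metrics quoted above, compression ratio and PSNR, can be stated concretely. This is an illustrative sketch with assumed byte counts (the function names and parameter values are ours, not the paper's): a skeleton pose replaces the full cloud with a fixed set of joint parameters per frame, and PSNR measures how faithfully the cloud is rebuilt from it.

```python
import numpy as np

def compression_ratio(raw_points, raw_bytes_per_point, joints, bytes_per_joint):
    """Ratio of the raw per-frame point-cloud size to the size of a
    per-frame skeleton pose (joint parameters instead of every point)."""
    return (raw_points * raw_bytes_per_point) / (joints * bytes_per_joint)

def psnr(reference, reconstructed, peak):
    """Peak signal-to-noise ratio (dB) between reference and
    reconstructed coordinates: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((reference - reconstructed) ** 2)
    return 10 * np.log10(peak ** 2 / mse)
```

Because the skeleton payload is fixed per frame while the raw cloud grows with capture resolution, the achievable ratio scales directly with cloud density, which is consistent with the wide 50:1 to 5000:1 range reported.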