Papers by David Antonio Gómez Jáuregui
Networked 3D virtual environments allow multiple users to interact with each other over the Internet.
Abstract (translated from French): In computer-mediated remote collaboration, communication between people is reinforced by visual contact. However, videoconferencing cannot render the actions performed on shared objects. The mutual perception of the participants and of their ...

IEEE Transactions on Visualization and Computer Graphics, Apr 2014
In this paper, we study how the visual animation of a self-avatar can be artificially modified in real time in order to generate different haptic perceptions. In our experimental setup, participants could watch their self-avatar in a virtual environment in mirror mode while performing a weight-lifting task. Users could map their gestures onto the self-animated avatar in real time using a Kinect.
We introduce three kinds of modification of the visual animation of the self-avatar according to the effort delivered by the virtual avatar: 1) changes in the spatial mapping between the user's gestures and the avatar, 2) different motion profiles of the animation, and 3) changes in the posture of the avatar (upper-body inclination). The experimental task required participants to order four virtual dumbbells according to their virtual weight. The user lifted each virtual dumbbell by means of a tangible stick, and the animation of the avatar was modulated according to the virtual weight of the dumbbell. The results showed that altering the spatial mapping delivered the best performance. Nevertheless, participants globally appreciated all the visual effects. Our results pave the way to the exploitation of such techniques in various VR applications such as sports training, exercise games, or industrial training scenarios in single-user or collaborative mode.
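The first modification, altering the spatial mapping, can be sketched as a control/display ratio that shrinks as the virtual weight grows. This is a minimal illustration under assumed constants; the ratio, weight range, and function names are not from the paper.

```python
# Hedged sketch: pseudo-haptic weight via an amplitude-scaled spatial mapping.
# The 0.8 gain and 10 kg weight cap are illustrative assumptions, not the
# paper's actual parameters.

def avatar_displacement(user_displacement: float, virtual_weight: float,
                        max_weight: float = 10.0) -> float:
    """Scale the user's gesture so heavier dumbbells yield smaller avatar motion."""
    # Control/display ratio decreases as the virtual weight grows.
    ratio = 1.0 - 0.8 * min(virtual_weight / max_weight, 1.0)
    return user_displacement * ratio

# For the same physical gesture, a heavy dumbbell moves the avatar's hand
# less than a light one, which the user reads as increased effort.
light = avatar_displacement(0.30, virtual_weight=1.0)
heavy = avatar_displacement(0.30, virtual_weight=9.0)
```

The intuition is that having to make a larger physical gesture to produce the same avatar motion is perceived as lifting something heavier.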

Proceedings of the 2013 on Emotion recognition in the wild challenge and workshop, Dec 13, 2013
Several vision-based systems for the automatic recognition of emotion have been proposed in the literature. However, most of these systems are evaluated only under controlled laboratory conditions, which poorly represent the constraints faced in real-world ecological situations. In this paper, two studies are described. In the first study, we evaluate whether two robust vision-based measures (approach-avoidance detection and quantity of motion) can discriminate between different emotions in a dataset containing acted facial expressions recorded under uncontrolled conditions. In the second study, we evaluate on the same dataset the accuracy of commercially available software designed for automatic emotion recognition under controlled conditions. Results showed that the evaluated measures are able to discriminate between different emotions in uncontrolled conditions. In addition, the accuracy of the evaluated commercial software is reported.
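The "quantity of motion" measure is commonly computed by frame differencing. The sketch below is a minimal, assumed formulation (fraction of pixels whose intensity changed beyond a threshold), not the paper's exact definition.

```python
# Hedged sketch: "quantity of motion" as the normalized count of pixels whose
# grayscale intensity changed between consecutive frames. The threshold and
# normalization are illustrative assumptions.

def quantity_of_motion(prev_frame, next_frame, threshold=10):
    """Fraction of pixels whose intensity changed by more than `threshold`."""
    changed = 0
    total = 0
    for row_a, row_b in zip(prev_frame, next_frame):
        for a, b in zip(row_a, row_b):
            total += 1
            if abs(a - b) > threshold:
                changed += 1
    return changed / total if total else 0.0

# Two tiny 2x2 grayscale frames: the right column changes, the left does not.
still = [[100, 100], [100, 100]]
moved = [[100, 180], [100, 180]]
qom = quantity_of_motion(still, moved)  # half the pixels changed -> 0.5
```

In practice this would run on silhouette or grayscale webcam frames; higher values indicate more body movement in the clip.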

ICMI '13 Proceedings of the 15th ACM on International conference on multimodal interaction, Dec 13, 2013
Analysis of non-verbal behaviors in HCI helps us understand how individuals apprehend and adapt to different interaction situations. This seems particularly relevant for tasks such as speaking in a foreign language, which is known to elicit anxiety. This is even truer for young users, for whom negative pedagogical feedback might strongly undermine their motivation to learn.
In this paper, we consider the approach-avoidance behaviors of teenagers speaking with virtual agents on an e-learning platform for learning English. We designed an algorithm for processing webcam video of these teenagers outside laboratory conditions (e.g., an ordinary classroom in a secondary school). This algorithm processes the video of the user and computes the inter-ocular distance. The anxiety of the users is also collected with questionnaires.
Results show that the inter-ocular distance can discriminate between the approach and avoidance behaviors of teenagers reacting to positive or negative stimuli: this simple metric, collected via video processing, detects an approach behavior in response to a positive stimulus and an avoidance behavior in response to a negative stimulus. Furthermore, we observed that these automatically detected approach-avoidance behaviors are correlated with anxiety.
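The metric rests on a projective fact: when the face moves toward the webcam, the pixel distance between the eyes grows. A minimal sketch, assuming eye-center coordinates from any face tracker and an illustrative 10% threshold:

```python
# Hedged sketch: approach/avoidance from inter-ocular distance in pixels.
# The eye coordinates and the relative-change threshold are assumptions.
import math

def inter_ocular_distance(left_eye, right_eye):
    """Euclidean distance between the two eye centers, in pixels."""
    return math.dist(left_eye, right_eye)

def classify(baseline: float, current: float, rel_threshold: float = 0.1):
    """'approach' if the distance grew, 'avoidance' if it shrank, else 'neutral'."""
    change = (current - baseline) / baseline
    if change > rel_threshold:
        return "approach"
    if change < -rel_threshold:
        return "avoidance"
    return "neutral"

baseline = inter_ocular_distance((100, 120), (160, 120))          # 60 px
label = classify(baseline, inter_ocular_distance((90, 120), (165, 120)))  # 75 px
```

The appeal of this measure is robustness: it needs only two tracked points per frame, so it survives classroom lighting and low-cost webcams better than full facial-expression analysis.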
Ninth Artificial Intelligence and Interactive Digital Entertainment Conference, Nov 2013
Virtual agents used in storytelling applications should display consistent and natural multimodal expressions of emotions. In this paper, we describe the method we defined to endow virtual narrators with individual gesture profiles. We explain how we collected a corpus of gestural behaviors displayed by different actors telling the same story. Videos were annotated both manually and automatically. Preliminary analyses are presented.

Affective Computing and Intelligent Interaction 2013 (ACII 2013), Sep 2, 2013
Postural control is a dynamical process that has been extensively studied in motor control research. Recent experimental work shows a direct impact of affects on human balance. However, few studies on the automatic recognition of affects in full-body expressions consider balance variables such as center-of-gravity displacements. Force plates enable the capture of balance variables with high precision. Automatic video extraction of the center of gravity is a simpler alternative that is easily accessible for a wide range of public applications. This paper presents a comparison of balance variables extracted from a force plate and from video processing. These variables are used to capture the bodily expressions of participants in a public speaking task designed to elicit stress. Results show that the variability of the center-of-gravity displacements from both the force plate and video is related to negative emotions and situation appraisals. The power spectral density broadness of the center of pressure from the force plate is related to Difficulty Describing Feelings, an important factor of the dispositional trait of alexithymia. Implications of the use of such methods are discussed.
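The two descriptor families named above can be sketched on a displacement time series: variability as a standard deviation, and spectral broadness as the bandwidth containing most of the sway power. The "95% of power" definition below is an illustrative proxy, not necessarily the paper's exact PSD measure.

```python
# Hedged sketch: balance descriptors from a center-of-pressure (or
# center-of-gravity) displacement series. The broadness definition (index of
# the frequency bin below which 95% of the power lies) is an assumption.
import cmath
import math
import statistics

def power_spectrum(signal):
    """Naive DFT power spectrum over positive frequency bins (k = 1..n/2)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(centered))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def spectral_broadness(signal, fraction=0.95):
    """Index of the frequency bin below which `fraction` of the power lies."""
    spec = power_spectrum(signal)
    total = sum(spec)
    cumulative = 0.0
    for k, power in enumerate(spec):
        cumulative += power
        if cumulative >= fraction * total:
            return k
    return len(spec) - 1

# Slow sinusoidal sway: 4 full cycles over 64 samples.
sway = [math.sin(2 * math.pi * i / 16) for i in range(64)]
variability = statistics.stdev(sway)   # displacement variability
broadness = spectral_broadness(sway)   # power concentrated in one low bin
```

A pure slow oscillation concentrates its power in a single low-frequency bin; a stressed, fidgety sway spreads power across bins, raising the broadness index.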
Affective Computing and Intelligent Interaction 2013 (ACII 2013), Sep 2, 2013
Databases of spontaneous multimodal expressions of affective states occurring during a task are few. This paper presents a protocol for eliciting stress in a public speaking task. The behaviors of 19 participants were recorded via a multimodal setup including speech, video of facial expressions and body movements, balance via a force plate, and physiological measures. Questionnaires were used to assess emotional states, personality profiles, and relevant coping behaviors, in order to study how participants cope with stressful situations. Several subjective and objective performance measures were also evaluated. Results show a significant impact of the overall task and conditions on the participants' emotional activation. The possible future use of this new multimodal emotional corpus is described.
ACM Transactions on Applied Perception (TAP) - Special issue SAP 2013, Aug 2013
We introduce Elastic Images, a novel pseudo-haptic feedback technique that enables the perception of the local elasticity of images without any haptic device. The proposed approach focuses on whether visual feedback alone can induce a sensation of stiffness when the user interacts with an image using a standard mouse. When clicking on an Elastic Image, the user is able to deform it locally according to its elastic properties. To reinforce the effect, we also propose the generation of procedural shadows and creases to simulate the compressibility of the image, as well as several mouse cursor replacements to enhance the perception of pressure and stiffness.
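The core mapping can be sketched as a linear spring: the visual indentation under a press is inversely proportional to the local stiffness stored in the image. The spring model, clamp, and parameter values below are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch: pseudo-haptic stiffness for an "Elastic Image". Soft regions
# visibly cave in under a mouse press while stiff regions barely dent, which
# the user perceives as a difference in material stiffness. The linear model
# and constants are illustrative assumptions.

def deformation_depth(press_force: float, stiffness: float,
                      max_depth: float = 20.0) -> float:
    """Visual indentation (pixels) for a press, clamped to `max_depth`."""
    depth = press_force / stiffness   # Hooke's law: x = F / k
    return min(depth, max_depth)

# Same simulated press force, two stiffness values read from the image:
soft = deformation_depth(press_force=30.0, stiffness=2.0)    # deep dent
stiff = deformation_depth(press_force=30.0, stiffness=15.0)  # shallow dent
```

The procedural shadows and creases mentioned in the abstract would then be drawn as a function of this depth, so softer regions look more compressed.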
Computer Vision – ECCV 2012. Workshops and Demonstrations, Oct 2012
Avatars in networked 3D virtual environments allow users to interact over the Internet and to get some feeling of virtual telepresence. However, avatar control may be tedious. Motion capture systems based on 3D sensors have recently reached the consumer market, but webcams and camera phones are more widespread and cheaper. The proposed demonstration aims at animating a user's avatar through real-time 3D motion capture by monoscopic computer vision, thus allowing virtual telepresence to anyone using a personal computer with a webcam or a camera phone. This kind of immersion opens new gesture-based communication channels in a virtual inhabited 3D space.

EuroHaptics'12 Proceedings of the 2012 international conference on Haptics: perception, devices, mobility, and communication, 2012
Pseudo-haptic textures make it possible to optically induce relief in textures, without a haptic device, by adjusting the speed of the mouse pointer according to the depth information encoded in the texture. In this work, we present a novel approach that uses curvature information instead of relying on depth information. The curvature of the texture is encoded in a normal map, which allows the computation of the curvature and of local changes of orientation according to the mouse position and direction. A user evaluation was conducted to compare the optically induced haptic feedback of the curvature-based approach against the original depth-based approach based on depth maps. Results showed that users, in addition to being able to efficiently recognize simulated bumps and holes with the curvature-based approach, were also able to discriminate shapes with lower frequency and amplitude.
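The principle can be sketched as follows: read the surface normal under the cursor, project it onto the motion direction to get a slope, and slow the cursor when climbing. The sign convention and gain below are assumptions; the paper's mapping from the normal map to cursor speed may differ.

```python
# Hedged sketch: curvature-based pseudo-haptic texture. The cursor's speed is
# scaled by the surface slope along the motion direction, read from a normal
# map: moving "uphill" on a bump slows the cursor, moving "downhill" speeds
# it up. The gain and the floor value are illustrative assumptions.

def speed_factor(normal, direction, gain=0.8):
    """Cursor-speed scale from a unit surface normal (nx, ny, nz) and a unit
    2D motion direction (dx, dy)."""
    nx, ny, nz = normal
    dx, dy = direction
    # Slope along the motion direction; positive means climbing the relief.
    slope = -(nx * dx + ny * dy) / nz
    return max(0.1, 1.0 - gain * slope)

flat = speed_factor((0.0, 0.0, 1.0), (1.0, 0.0))       # level surface
uphill = speed_factor((-0.5, 0.0, 0.866), (1.0, 0.0))  # climbing a bump
```

Because the slope depends on the motion direction, the same pixel feels different when crossed from different sides, which is what lets users discriminate bumps from holes.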

3D User Interfaces (3DUI), 2012 IEEE Symposium on, Mar 2012
The selection and manipulation of 3D content in desktop virtual environments is commonly achieved with 2D mouse-cursor-based interaction. However, interacting through image-based techniques introduces a conflict between the 2D space in which the cursor lies and the 3D content. For example, the 2D mouse cursor does not provide any information about the depth of the selected objects. In this situation, the user has to rely on the depth cues provided by the virtual environment, such as perspective deformation, shading, and shadows. In this paper, we explore new metaphors to improve depth perception when interacting with 3D content. Our approach focuses on the use of 3D cursors controlled with 2D input devices (the Hand Avatar and the Torch) and a pseudo-motion-parallax effect. The additional depth cues provided by the visual feedback of the 3D cursors and by the motion parallax are expected to increase users' depth perception of the environment. The evaluation of the proposed techniques showed that users' depth perception was significantly increased: users were better able to judge the depth ordering of the virtual environment. Although the 3D cursors showed a decrease in selection performance, this is compensated by the improved depth perception.

Multimedia Signal Processing (MMSP), 2010 IEEE International Workshop on, Oct 2010
Particle filtering is known as a robust approach to motion tracking by vision, at the cost of heavy computation in a high-dimensional pose space. In this work, we describe a number of heuristics that we demonstrate to jointly improve robustness and real-time performance for motion capture. Marker-less 3D human motion capture by monocular vision can be achieved in real time by registering a 3D articulated model on a video. First, we search the high-dimensional space of 3D poses by generating new hypotheses (or particles) with equivalent 2D projections through kinematic flipping. Second, we use a semi-deterministic particle prediction based on local optimization. Third, we deterministically resample the probability distribution for a more efficient selection of particles. Particles (or poses) are evaluated using a match cost function and penalized with a Gaussian pose probability distribution learned offline. To achieve real time, the measurement step is parallelized on the GPU using the OpenCL API. We present experimental results demonstrating robust real-time 3D motion capture with a consumer computer and webcam.
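The resampling heuristic can be illustrated with the standard low-variance (systematic) scheme: one random offset selects all survivors, concentrating particles on high-likelihood poses at low cost. This is a generic textbook sketch, not the paper's exact deterministic variant; the poses and weights are made up.

```python
# Hedged sketch: systematic (low-variance) resampling for a particle filter.
# Each particle is a pose hypothesis with a likelihood weight; resampling
# keeps the particle count while duplicating well-matching poses.
import random

def systematic_resample(particles, weights):
    """Resample `particles` proportionally to `weights` using one offset."""
    n = len(particles)
    step = sum(weights) / n
    offset = random.uniform(0.0, step)
    resampled, cumulative, i = [], weights[0], 0
    for k in range(n):
        target = offset + k * step
        while cumulative < target:
            i += 1
            cumulative += weights[i]
        resampled.append(particles[i])
    return resampled

random.seed(0)  # fixed seed so the example is repeatable
poses = ["pose_a", "pose_b", "pose_c", "pose_d"]
weights = [0.7, 0.1, 0.1, 0.1]  # pose_a matches the image best
survivors = systematic_resample(poses, weights)
```

Compared with multinomial resampling, the single shared offset lowers the variance of the survivor counts, which matters when the particle budget is tight for real-time tracking.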

IHCI 2010 : Second IEEE International Conference on Intelligent Human Computer Interaction, 2010
In computer-supported distant collaboration, communication between users can be enhanced with a visual channel. Plain videos of individual users unfortunately fail to render their joint actions on the objects they share, which limits their mutual perception. Remote interaction can be enhanced by immersing user representations (avatars) and the shared objects in a networked 3D virtual environment, so that user actions are rendered by avatar mimicry. Communication gestures (not actions) are captured by real-time computer vision and rendered. We have developed a system based on a single webcam for 3D motion capture of the body and face. We used a library of communication gestures to learn statistical gesture models, which serve as prior constraints for monocular motion capture, thus improving the tracking of ambiguous poses and the rendering of some motion details. We have also developed an open-source library for real-time image analysis and computer vision that supports acceleration by consumer graphics processing units (GPUs). Finally, users are rendered with low-bandwidth avatar animation, thus opening the path to low-cost remote virtual presence at home.
IT in Medicine & Education, 2009. ITIME '09. IEEE International Symposium on, Aug 2009
Virtual reality has been successfully used in domains such as real estate, urban planning, and video games. In this paper, we propose to build an interactive virtual e-learning environment using virtual reality technologies. With the help of a multimodal user interface, students can control their avatars to interactively communicate with virtual teachers and the environment. This interactive e-learning environment can increase the autonomy of students and enhance their interest in learning.
MIRAGE '09 Proceedings of the 4th International Conference on Computer Vision/Computer Graphics CollaborationTechniques, 2009
3D human motion capture by real-time monocular vision without markers can be achieved by registering a 3D articulated model on a video. Registration consists in iteratively optimizing the match between primitives extracted from the model and from the images, with respect to the model position and joint angles. We extend a previous color-based registration algorithm with a more precise edge-based registration step. We present an experimental analysis of the residual error versus the computation time, and we discuss the balance between both approaches.
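The balance between the color-based and edge-based steps can be sketched as a weighted blend of the two match costs. The blend form, the weight, and the cost values below are illustrative assumptions; the paper studies this trade-off empirically against computation time.

```python
# Hedged sketch: blending a coarse color-based registration cost with a more
# precise edge-based one as a weighted sum. Values and weighting are
# illustrative assumptions, not the paper's calibrated balance.

def combined_cost(color_cost: float, edge_cost: float, alpha: float = 0.6) -> float:
    """Blended registration cost; larger alpha favors the color term."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * color_cost + (1.0 - alpha) * edge_cost

# During optimization, the candidate pose with the lower blended cost wins.
pose_a = combined_cost(color_cost=0.20, edge_cost=0.50)
pose_b = combined_cost(color_cost=0.40, edge_cost=0.10)
```

Shifting alpha toward the edge term buys registration precision at the price of the extra edge-extraction time, which is exactly the residual-error-versus-time balance the abstract discusses.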