Papers by Radoslaw Niewiadomski

In this paper, we propose a set of algorithms to compute cues of nonverbal leadership in an unstructured joint physical activity, i.e., the joint activity of two or more interacting persons who perform movements without a predefined sequence and without a predefined leader. An example of such an activity is contact dance improvisation. The paper is composed of three parts: cue set, dataset, and algorithms. First, we propose a cue set of nonverbal leadership grounded in the existing literature and studies. It is composed of eight cues that characterize the nonverbal behaviors of the leader in a joint physical activity. We also introduce a new dataset consisting of multimodal data (video, MoCap) of contact dance improvisations. Additionally, sensory deprivation conditions (vision and/or touch restraint) were introduced to collect evidence of the various strategies used by leaders and followers during improvisation. The dataset was annotated by twenty-seven persons who carried out continuous annotation of leadership in the recorded material. In the last part of the paper, we propose a set of algorithms that work on positional 3D data (i.e., joints' positions obtained from motion capture data of dancers). Each algorithm models one of the discussed cues of nonverbal leadership.
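As an illustration of how a leadership cue could be computed from positional 3D data, here is a minimal sketch, not the paper's algorithms: it derives a per-frame quantity of motion for each dancer and uses the lag of maximum cross-correlation between the two activation profiles as a rough leader/follower indicator. The array shapes, frame rate, and function names are assumptions.

```python
# Hedged sketch (not the authors' code): one plausible nonverbal-leadership cue.
# Assumes MoCap data as numpy arrays of shape (n_frames, n_joints, 3) at a
# fixed frame rate; all names here are illustrative.
import numpy as np

def quantity_of_motion(positions, fps=100.0):
    """Mean joint speed per frame: a rough proxy for movement activation."""
    velocities = np.diff(positions, axis=0) * fps   # (n_frames-1, n_joints, 3)
    speeds = np.linalg.norm(velocities, axis=2)     # per-joint speed
    return speeds.mean(axis=1)                      # one value per frame

def leadership_by_anticipation(qom_a, qom_b, max_lag=50):
    """Cross-correlate the two activation profiles; if dancer A's motion best
    predicts dancer B's motion at a positive lag, A tends to lead."""
    a = (qom_a - qom_a.mean()) / (qom_a.std() + 1e-9)
    b = (qom_b - qom_b.mean()) / (qom_b.std() + 1e-9)
    lags = list(range(-max_lag, max_lag + 1))
    corr = [np.mean(a[max(0, -l):len(a) - max(0, l)] *
                    b[max(0, l):len(b) - max(0, -l)]) for l in lags]
    best = lags[int(np.argmax(corr))]
    return "A leads" if best > 0 else "B leads" if best < 0 else "tie"

rng = np.random.default_rng(0)
lead = rng.normal(size=(400, 20, 3)).cumsum(axis=0) * 0.01   # synthetic dancer A
follow = np.roll(lead, 10, axis=0)                           # B mirrors A slightly later
qa, qb = quantity_of_motion(lead), quantity_of_motion(follow)
print(leadership_by_anticipation(qa, qb))                    # expected: "A leads"
```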
The rising popularity of learning techniques in data analysis has recently led to an increased need for large-scale datasets. In this study, we propose a system consisting of a VR game and a software platform designed to collect the player's multimodal data, synchronized with the VR content, with the aim of creating a dataset for emotion detection and recognition. The game was implemented ad hoc to elicit joy and frustration, following the emotion elicitation process described by Roseman's appraisal theory. In this preliminary study, 5 participants played our VR game along with pre-existing ones and self-reported the emotions they experienced.

In this paper, we investigate the detection of laughter from the user's non-verbal full-body movement in social and ecological contexts. 801 laughter and non-laughter segments of full-body movement were examined from a corpus of motion capture data of subjects participating in social activities that stimulated laughter. A set of 13 full-body movement features was identified and corresponding automated extraction algorithms were developed. These features were extracted from the laughter and non-laughter segments, and the resulting data set was provided as input to supervised machine learning techniques. Both discriminative (radial basis function Support Vector Machines, k-Nearest Neighbor, and Random Forest) and probabilistic (Naive Bayes and Logistic Regression) classifiers were trained and evaluated. A comparison of automated classification with the ratings of human observers for the same laughter and non-laughter segments showed that the performance of our approach for automated laughter detection is comparable with that of humans. The highest F-score (0.74) was obtained by the Random Forest classifier, whereas the F-score obtained by human observers was 0.70. Based on the analysis techniques introduced in the paper, a vision-based system prototype for automated laughter detection was designed and evaluated. Support Vector Machines and Kohonen's Self-Organizing Maps were used for training, and the highest F-score was obtained with SVM (0.73).
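The classification stage described above can be reproduced in outline with scikit-learn. The following is a hedged sketch: the 13 movement features are not re-implemented here, so X and y are random placeholders standing in for the 801 extracted segments and their binary labels.

```python
# Sketch of the supervised-classification comparison, under placeholder data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(801, 13))     # stand-in for 801 segments x 13 movement features
y = rng.integers(0, 2, size=801)   # stand-in laughter / non-laughter labels

classifiers = {
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "RBF-SVM": SVC(kernel="rbf", gamma="scale"),
    "Naive Bayes": GaussianNB(),
}
for name, clf in classifiers.items():
    # 5-fold cross-validated F-score, the metric reported in the paper
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    print(f"{name}: mean F1 = {f1:.2f}")
```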

Full-body human movement is characterized by fine-grain expressive qualities that humans are easily capable of exhibiting and recognizing in others' movement. In sports (e.g., martial arts) and performing arts (e.g., dance), the same sequence of movements can be performed in a wide range of ways characterized by different qualities, often in terms of subtle (spatial and temporal) perturbations of the movement. Even a non-expert observer can distinguish between a top-level and an average performance by a dancer or martial artist. The difference is not in the performed movements, which are the same in both cases, but in the "quality" of their performance. In this article, we present a computational framework aimed at an automated approximate measure of movement quality in full-body physical activities. Starting from motion capture data, the framework computes low-level (e.g., a limb velocity) and high-level (e.g., synchronization between different limbs) movement features. Then, this vector of features is integrated to compute a value aimed at providing a quantitative assessment of movement quality, approximating the evaluation that an external expert observer would give of the same sequence of movements. Next, a system representing a concrete implementation of the framework is proposed. Karate is adopted as a testbed. We selected two different katas (i.e., detailed choreographies of movements in karate) characterized by different overall attitudes and expressions (aggressiveness, meditation), and we asked seven athletes of various ages and levels of experience to perform them. Motion capture data were collected from the performances and were analyzed with the system. The results of the automated analysis were compared with the scores given by 14 karate experts who rated the same performances. Results show that the movement-quality scores computed by the system and the ratings given by the human observers are highly correlated (Pearson's correlations r = 0.84, p = 0.001 and r = 0.75, p = 0.005).
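The integration-and-validation step lends itself to a short sketch. The following is illustrative only, assuming per-performance feature vectors are already available; the weights, feature count, and synthetic "expert" ratings are placeholders, with Pearson's r computed the same way the evaluation above reports it.

```python
# Illustrative sketch (not the paper's implementation): integrate a vector of
# per-performance movement features into one quality score, then compare
# against expert ratings with Pearson's r.
import numpy as np
from scipy.stats import pearsonr

def quality_score(features, weights):
    """Weighted combination of z-normalized movement features -> scalar score."""
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-9)
    return z @ weights

rng = np.random.default_rng(1)
features = rng.normal(size=(7, 5))                 # 7 performances x 5 example features
weights = np.array([0.4, 0.2, 0.2, 0.1, 0.1])      # placeholder integration weights
scores = quality_score(features, weights)
expert = scores + rng.normal(scale=0.3, size=7)    # stand-in expert ratings

r, p = pearsonr(scores, expert)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```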

Social interactions often entail complex and dynamic situations that follow non-explicit, unwritten rules. Comprehending those signals and knowing how to respond is the key to the success of any social communication. Thus, in order to integrate a robot into a social context, it should be capable of (at least) understanding others' emotional states. Nonetheless, mastering such a skill is beyond the reach of current robotics, which is why we introduce the single internal state that we believe reveals the most about interactive communications. We named it Comfortability and defined it as (disapproving of or approving of) the situation that arises as a result of a social interaction, which influences one's own desire to maintain or withdraw from it. Consequently, in this paper we aim to show that Comfortability can be evoked by robots, investigating at the same time its connection with other emotional states. To do that, we performed two online experiments with 196 participants, asking them to imagine being interviewed by a reporter on a sensitive topic. The interviewer's actions were presented in two different formats: the first experiment (the Narrative Context) presented the actions as text, whereas the second experiment (the Visual Context) presented the actions as videos performed by the humanoid robot iCub. The actions were designed to evoke different Comfortability levels. According to the experimental results, Comfortability differs from the other reported emotional and affective states and, more importantly, it can be evoked by both humans and robots in an imaginary interaction.

Interaction among humans does not always proceed without errors; situations might happen in which a wrong word or attitude causes the partner to feel uneasy. However, humans are often very sensitive to these interaction failures and may be able to fix them. Our research aims to endow robots with the same skill. Thus the first step, presented in this short paper, investigates to what extent a humanoid robot can impact someone's Comfortability [11] in a realistic setting. To capture natural reactions, a set of real interviews performed by the humanoid robot iCub (acting as the interviewer) was organized. The interviews were designed in collaboration with a journalist from the press office of our institution and are meant to appear in the official institutional online magazine. The dialogue, along with fluent human-like robotic actions, was chosen not only to gather information about the participants' personal interests and professional careers, necessary for the magazine column, but also to influence their Comfortability. Once the experiment is completed, the participants' self-reports and spontaneous reactions (physical and physiological cues) will be explored to tackle the way people's Comfortability may be manifested through non-verbal cues, and the way it may be impacted by the humanoid robot.

Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems
This work investigates classification of emotions from MoCap full-body data by using Convolutiona... more This work investigates classification of emotions from MoCap full-body data by using Convolutional Neural Networks (CNN). Rather than addressing regular day to day activities, we focus on a more complex type of full-body movement-dance. For this purpose, a new dataset was created which contains short excerpts of the performances of professional dancers who interpreted four emotional states: anger, happiness, sadness, and insecurity. Fourteen minutes of motion capture data are used to explore different CNN architectures and data representations. The results of the four-class classification task are up to 0.79 (F1 score) on test data of other performances by the same dancers. Hence, through deep learning, this paper proposes a novel and effective method of emotion classification which can be exploited in affective interfaces.
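A minimal PyTorch sketch of a CNN for this kind of four-class task follows, assuming the MoCap excerpts have been rendered as fixed-size single-channel images (one common way to feed positional time series to a 2D CNN). The architecture and sizes are illustrative, not the paper's exact model.

```python
# Sketch of a small CNN for four-class emotion classification from rendered
# MoCap images; layer sizes are assumptions, not the published architecture.
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)

    def forward(self, x):                # x: (batch, 1, 64, 64)
        h = self.features(x)             # -> (batch, 32, 16, 16)
        return self.classifier(h.flatten(1))

model = EmotionCNN()
dummy = torch.randn(8, 1, 64, 64)        # batch of 8 rendered MoCap chunks
logits = model(dummy)                    # (8, 4): anger/happiness/sadness/insecurity
print(logits.shape)
```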

The EU H2020 ICT Project DANCE investigates how affective and social qualities of human full-body movements can be expressed, represented, and analysed by sound and music performance. In this paper we focus on one of the candidate movement qualities: Fluidity. An algorithm to detect Fluidity in full-body movement and a model of interactive sonification to convey Fluidity through the auditory channel are presented. We developed a set of different sonifications: some follow the proposed sonification model, and others are based on different, in some cases opposite, rules. Our hypothesis is that our proposed sonification model is the most effective in communicating Fluidity. To confirm the hypothesis, we developed a serious game and performed an experiment with 22 participants at the MOCO 2016 conference. Results suggest that the sonifications following our proposed model are the most effective in conveying Fluidity.
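One plausible way to quantify Fluidity, sketched below, is movement smoothness as the inverse of normalized jerk of a joint trajectory, since fluid movement has low jerk. This is an assumption for illustration, not necessarily the project's detection algorithm.

```python
# Hedged sketch of a jerk-based Fluidity measure on a (n_frames, 3) trajectory.
import numpy as np

def fluidity_index(positions, fps=100.0):
    vel = np.gradient(positions, 1.0 / fps, axis=0)
    acc = np.gradient(vel, 1.0 / fps, axis=0)
    jerk = np.gradient(acc, 1.0 / fps, axis=0)
    mean_sq_jerk = np.mean(np.sum(jerk ** 2, axis=1))
    return 1.0 / (1.0 + mean_sq_jerk)        # higher = more fluid

t = np.linspace(0, 2 * np.pi, 200)
smooth = np.stack([np.sin(t), np.cos(t), t], axis=1)             # fluid arc
jittery = smooth + np.random.default_rng(2).normal(0, 0.05, smooth.shape)
print(fluidity_index(smooth), fluidity_index(jittery))           # smooth scores higher
```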
In this paper we propose an extension of the current SAIBA architecture. The new parts of the architecture should manage the generation of Embodied Conversational Agents' reactive behaviors during an interaction with users, both while speaking and listening.
In this paper we present a complete interactive system able to detect human laughs and respond appropriately, by integrating information about the human behavior and the context. Furthermore, the impact of our autonomous laughter-aware agent on the humor experience of the user and on the interaction between user and agent is evaluated by subjective and objective means. Preliminary results show that the laughter-aware agent increases the humor experience (i.e., the felt amusement of the user and the funniness rating of the film clip) and creates the notion of a shared social experience, indicating that the agent is useful for eliciting positive humor-related affect and emotional contagion.

IEEE Transactions on Affective Computing, 2021
This work investigates classification of emotions from full-body movements by using a novel Convolutional Neural Network-based architecture. The model is composed of two shallow networks processing in parallel; the inputs are 8-bit RGB images obtained from time intervals of 3D positional data. One network performs coarse-grained modelling in the time domain while the other applies fine-grained modelling. We show that combining different temporal scales into a single architecture improves the classification results on a dataset composed of short excerpts of the performances of professional dancers who interpreted four affective states: anger, happiness, sadness, and insecurity. Additionally, we investigate the effect of data chunk duration, overlapping, the size of the input images, and the contribution of several data augmentation strategies for our proposed method. Better recognition results were obtained when the duration of a data chunk was longer, and they were further improved by applying balanced data augmentation. Moreover, we test our method on other existing motion capture datasets and compare the results with prior art. In all experiments, our results surpassed the state-of-the-art approaches, showing that the method generalizes across diverse settings and contexts.
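A hedged PyTorch sketch of the two-branch idea follows: two shallow CNNs run in parallel on the same RGB image rendered from 3D positional data, one with a wide kernel along the time axis (coarse) and one with a small kernel (fine); their features are concatenated before classification. Kernel sizes, channel counts, and input dimensions are assumptions, not the published architecture.

```python
# Sketch of a two-temporal-scale CNN; sizes are illustrative.
import torch
import torch.nn as nn

class TwoScaleCNN(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.coarse = nn.Sequential(        # wide temporal receptive field
            nn.Conv2d(3, 16, kernel_size=(3, 9), padding=(1, 4)),
            nn.ReLU(), nn.AdaptiveAvgPool2d(8),
        )
        self.fine = nn.Sequential(          # narrow temporal receptive field
            nn.Conv2d(3, 16, kernel_size=(3, 3), padding=1),
            nn.ReLU(), nn.AdaptiveAvgPool2d(8),
        )
        self.head = nn.Linear(2 * 16 * 8 * 8, n_classes)

    def forward(self, x):                   # x: (batch, 3, joints, time)
        h = torch.cat([self.coarse(x).flatten(1),
                       self.fine(x).flatten(1)], dim=1)
        return self.head(h)

model = TwoScaleCNN()
print(model(torch.randn(4, 3, 20, 60)).shape)   # -> torch.Size([4, 4])
```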

Numerous studies on emotion recognition from physiological signals have been conducted in laboratory settings. However, differences have been observed between data on emotions elicited in the lab and in the wild. Thus, there is a need for systems collecting and labelling emotion-related physiological data in ecological settings. This paper proposes a new solution to collect and label such data: an open-source mobile application (app) based on appraisal theory. Our approach exploits a commercially available wearable physiological sensor connected to a smartphone. The app detects relevant events from the physiological data and prompts the users to report their emotions using a questionnaire based on the Ortony, Clore and Collins (OCC) model. We believe that the app can be used to collect emotional and physiological data in ecological settings and to ensure high-quality ground truth labels.
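The event-detection step could look like the minimal sketch below, assuming a stream of skin-conductance samples from the wearable: flag moments where the signal rises sharply above a rolling baseline, which would trigger the self-report questionnaire. The signal type, window length, and threshold are assumptions.

```python
# Sketch of threshold-based event detection on a physiological stream.
import numpy as np

def detect_arousal_events(eda, fps=4.0, window_s=30.0, k=3.0):
    """Return sample indices where EDA exceeds baseline mean + k * rolling std."""
    w = int(window_s * fps)
    events = []
    for i in range(w, len(eda)):
        baseline = eda[i - w:i]
        if eda[i] > baseline.mean() + k * baseline.std():
            events.append(i)                 # would prompt the OCC questionnaire
    return events

rng = np.random.default_rng(3)
signal = np.cumsum(rng.normal(0, 0.01, 1200))   # slow drift, ~5 min at 4 Hz
signal[600:610] += 1.5                           # simulated arousal burst
print(detect_arousal_events(signal)[:5])         # indices near sample 600
```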
The form of an action, i.e., the way it is performed, conveys important information about the performer's attitude. In this paper we investigate the spatiotemporal characteristics of different gestures performed with specific vitality forms, and we study whether it is possible to recognize these aspects of action automatically. As a first step, we created a new dataset of 7 gestures performed with a vitality form (gentle and rude) or without one (neutral, slow, and fast). One thousand repetitions were collected from 2 professional actors. Next, we identified 22 features from the motion capture data. According to the results, vitality forms are not merely characterized by a velocity/acceleration modulation but by a combination of different spatiotemporal properties. We also perform automatic classification of vitality forms, reaching an F-score of 87.3%.
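A sketch of this feature-extraction-plus-classification pipeline is shown below. The four descriptors and the classifier choice are placeholders for illustration, not the paper's 22 features or its exact method.

```python
# Illustrative sketch: spatiotemporal descriptors from a gesture's 3D wrist
# trajectory, fed to a classifier of vitality forms. All names are assumptions.
import numpy as np
from sklearn.svm import SVC

def gesture_features(positions, fps=100.0):
    vel = np.linalg.norm(np.diff(positions, axis=0), axis=1) * fps
    acc = np.diff(vel) * fps
    return np.array([
        vel.max(),             # peak speed
        vel.mean(),            # average speed
        np.abs(acc).max(),     # peak acceleration magnitude
        len(positions) / fps,  # gesture duration (s)
    ])

rng = np.random.default_rng(4)
X = np.stack([gesture_features(rng.normal(size=(100, 3))) for _ in range(40)])
y = rng.integers(0, 2, size=40)            # stand-in gentle/rude labels
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(clf.predict(X[:3]))
```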
Proceedings of the International Conference on Advanced Visual Interfaces
The term commensality refers to "sharing food and eating together in a social group". In this paper, we hypothesize that it is possible to have the same kind of experience in an HCI setting, thanks to a new type of interface that we call an Artificial Commensal Companion (ACC), which would be beneficial, for example, to people who voluntarily choose to eat alone or are constrained to do so. To this aim, we introduce an interactive system implementing an ACC in the form of a robot with non-verbal socio-affective capabilities. Tests are already planned to evaluate its influence on the eating experience of human participants.

26th International Conference on Intelligent User Interfaces
In this paper, we investigate whether information related to touches and rotations applied to an object can be effectively used to classify the emotion of the agent manipulating it. We specifically focus on sequences of basic actions (e.g., grasping, rotating), which are constituents of daily interactions. We use the iCube, a 5 cm cube covered with tactile sensors and embedded with an accelerometer, to collect a new dataset including 11 persons performing action sequences associated with 4 emotions: anger, sadness, excitement, and gratitude. Next, we propose 17 high-level hand-crafted features based on the tactile and kinematic data derived from the iCube. Twelve of these features vary significantly as a function of the emotional context in which the action sequence was performed. In particular, a larger surface of the object is engaged in physical contact for anger and excitement than for sadness. Furthermore, the average duration of interactions labeled as sad is longer than for the remaining 3 emotions. More rotations are performed for anger and excitement than for sadness and gratitude. The accuracy of a classification experiment with four emotions reaches 0.75. This result shows that emotion recognition during hand-object interactions is possible and may foster the development of new intelligent user interfaces.
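Two of the hand-crafted features mentioned above (contact surface and interaction duration) can be sketched as follows, assuming the iCube output is available as one boolean tactile map per frame; the device API, map layout, and frame rate are assumptions, not the real interface.

```python
# Hedged sketch of two tactile features; data format is assumed, not the
# actual iCube API.
import numpy as np

def contact_surface(tactile_frames):
    """Mean fraction of sensor cells touched per frame
    (reported larger for anger/excitement than sadness)."""
    return np.mean([frame.mean() for frame in tactile_frames])

def interaction_duration(tactile_frames, fps=20.0):
    """Seconds with at least one active cell (reported longer for sadness)."""
    active = sum(1 for frame in tactile_frames if frame.any())
    return active / fps

rng = np.random.default_rng(5)
frames = [rng.random((6, 16)) < 0.2 for _ in range(200)]   # 6 faces x 16 cells, assumed
print(contact_surface(frames), interaction_duration(frames))
```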

Companion Publication of the 2020 International Conference on Multimodal Interaction
Commensality is defined as "a social group that eats together", and eating in a commensality sett... more Commensality is defined as "a social group that eats together", and eating in a commensality setting has a number of positive effects on humans. The purpose of this paper is to investigate the effects of technology on commensality by presenting an experiment in which a toy robot showing non-verbal social behaviours tries to influence a participants' food choice and food taste perception. We managed to conduct both a qualitative and quantitative study with 10 participants. Results show the favourable impression of the robot on participants. It also emerged that the robot may be able to influence the food choices using its non-verbal behaviors only. However, these results are not statistically significant, perhaps due to the small sample size. In the future, we plan to collect more data using the same experimental protocol, and to verify these preliminary results. CCS CONCEPTS • Human-centered computing → Human computer interaction (HCI); Interaction paradigms.
Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems
Commensality is defined as "a social group that eats together" and eating in a commensa... more Commensality is defined as "a social group that eats together" and eating in a commensality setting has a number of positive effects on humans. In this paper, we discuss how HCI and technology in general can be exploited to replicate the benefits of commensality for people who choose or are forced to eat alone. We discuss research into and the design of Artificial Commensal Companions that can provide social interactions during food consumption. We present the design of a system, consisting of a toy robot, computer vision tracking, and a simple interaction model, that can show non-verbal social behaviors to influence a user's food choice. Finally, we discuss future studies and applications of this system, and provide suggestions for future research into Artificial Commensal Companions.

Frontiers in Psychology
Emotion, mood, and stress recognition (EMSR) has been studied in laboratory settings for decades. In particular, physiological signals are widely used to detect and classify affective states in lab conditions. However, physiological reactions to emotional stimuli have been found to differ between laboratory and natural settings. Thanks to recent technological progress (e.g., in wearables), the creation of EMSR systems for a large number of consumers during their everyday activities is increasingly possible. Therefore, datasets created in the wild are needed to ensure the validity and exploitability of EMSR models for real-life applications. In this paper, we initially present common techniques used in laboratory settings to induce emotions for the purpose of physiological dataset creation. Next, advantages and challenges of data collection in the wild are discussed. To assess the applicability of existing datasets to real-life applications, we propose a set of categories to guide and compare at a glance the different methodologies used by researchers to collect such data. For this purpose, we also introduce a visual tool called Graphical Assessment of Real-life Application-Focused Emotional Dataset (GARAFED). In the last part of the paper, we apply the proposed tool to compare existing physiological datasets for EMSR in the wild and to show possible improvements and future directions of research. We wish for this paper and GARAFED to be used as guidelines for researchers and developers who aim at collecting affect-related data for real-life EMSR-based applications.