Papers by Felix Burkhardt
Speaker Classification for Next Generation Voice Dialog Systems
Customer Relationship Management (CRM) is a growing business factor for medium and large enterprises. For cost reduction, the automation of business processes in call centers based on Interactive-Voice-Response (IVR) systems has been introduced in many companies. In state-of-the-art IVR systems, automation based on automatic speech recognition (ASR) is mainly used for pre-qualifying customers' requests, with subsequent skill-based routing to a human agent or complete automation of simple business ...
Sprachdialogsystem und Verfahren zum Betreiben
We present software to read texts with emotional expression. The software is developed as part of Emofilt, an open-source emotional speech synthesis system. The affective storyteller consists of a text editor that offers a set of emotional speaking styles which can be used to mark up the text. The system was validated in a perception experiment and, although the number of participants was not very large, demonstrated the general usability of the approach.
Method and apparatus for evaluating the emotional state of a person from speech utterances
Ontology Evolution
Handbook of Research on Social Dimensions of Semantic Technologies and Web Services, 2000
Verfahren zur Dialoganpassung und Dialogsystem zur Durchführung
Interspeech, 2010
Most paralinguistic analysis tasks lack agreed-upon evaluation procedures and comparability, in contrast to more 'traditional' disciplines in speech analysis. The INTERSPEECH 2010 Paralinguistic Challenge shall help overcome the usually low compatibility of results by addressing three selected sub-challenges. In the Age Sub-Challenge, the age of speakers has to be determined in four groups. In the Gender Sub-Challenge, a three-class classification task has to be solved, and finally, the Affect Sub-Challenge asks for speakers' interest in ordinal representation. This paper introduces the conditions, the Challenge corpora "aGender" and "TUM AVIC", and standard feature sets that may be used. Further, baseline results are given.

The most successful systems in previous comparative studies on speaker age recognition used short-term cepstral features modeled with Gaussian Mixture Models (GMMs) or applied multiple phone recognizers trained with the data of speakers of the respective class. Acoustic analyses, however, indicate that certain features such as pitch, extracted from a longer span of speech, correlate clearly with speaker age, although systems based on those features have been inferior to the aforementioned approaches. In this paper, three novel systems combining short-term cepstral features and long-term features for speaker age recognition are compared to each other. A system combining GMMs using frame-based MFCCs and Support Vector Machines using long-term pitch performs best. The results indicate that the combination of the two feature types is a promising approach, which corresponds to findings in related fields such as speaker recognition.
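The best-performing system in the abstract above fuses a frame-based GMM classifier (MFCCs) with an SVM on a long-term feature (pitch). A minimal sketch of that kind of score-level fusion, on synthetic stand-in data (not the paper's features, corpus, or fusion weights — the feature values, class means, and weight `w` here are illustrative assumptions):

```python
# Hedged sketch of GMM + SVM score-level fusion for a two-class speaker task.
# Synthetic data stands in for MFCC frames (short-term) and pitch (long-term).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_speakers(n, frame_mean, pitch_mean):
    """Per speaker: 50 cepstral-like frames (4-dim) and one long-term pitch value."""
    frames = [rng.normal(frame_mean, 1.0, size=(50, 4)) for _ in range(n)]
    pitch = rng.normal(pitch_mean, 10.0, size=(n, 1))
    return frames, pitch

# Two illustrative classes, e.g. "younger" (0) vs "older" (1) speakers.
frames_0, pitch_0 = make_speakers(20, 0.0, 200.0)
frames_1, pitch_1 = make_speakers(20, 1.5, 120.0)

# One GMM per class on the pooled frames (classic frame-based GMM classifier).
gmm_0 = GaussianMixture(n_components=2, random_state=0).fit(np.vstack(frames_0))
gmm_1 = GaussianMixture(n_components=2, random_state=0).fit(np.vstack(frames_1))

# SVM with probabilistic outputs on the long-term pitch feature.
svm = SVC(probability=True, random_state=0).fit(
    np.vstack([pitch_0, pitch_1]), np.array([0] * 20 + [1] * 20)
)

def classify(frames, pitch, w=0.5):
    """Fuse the per-speaker GMM log-likelihood ratio with the SVM posterior."""
    llr = gmm_1.score(frames) - gmm_0.score(frames)      # > 0 favors class 1
    p1 = svm.predict_proba(pitch.reshape(1, -1))[0, 1]   # SVM P(class 1)
    score = w * llr + (1 - w) * (2 * p1 - 1)             # weighted score fusion
    return int(score > 0)

# Unseen speaker drawn from the class-1 distribution.
test_frames, test_pitch = make_speakers(1, 1.5, 120.0)
print(classify(test_frames[0], test_pitch[0]))  # → 1
```

In practice the fusion weight would be tuned on a development set, and the long-term feature vector would typically contain several prosodic statistics rather than a single pitch value.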
Emotional speech synthesis is an important part of the puzzle on the long way to human-like artificial human-machine interaction. Along the way, many stations such as emotional audio messages or believable characters in gaming will be reached. This paper discusses technical aspects of emotional speech synthesis, shows practical applications based on a higher-level framework, and highlights new developments concerning the realization of affective speech with non-uniform unit selection based synthesis and voice transformation techniques.
Elements of an EmotionML 1.0
This paper deals with human-computer interaction applications that utilize emotional awareness. We confine our discussion to speech-based applications. Prerequisites (training data, annotations) as well as the state of the art in recognition and synthesis are addressed, focusing on usability in possible applications and keeping restrictions in industrial environments in mind. We will present a taxonomy of applications using ...
Annual Conference of the International Speech Communication Association, 2005
This paper explores the perceptual relevance of acoustical correlates of emotional speech by means of speech synthesis. Besides, the research aims at the development of emotion rules which enable an optimized speech synthesis system to generate emotional speech. Two investigations using this synthesizer are described: 1) the systematic variation of selected acoustical features to gain a preliminary impression regarding the importance of certain acoustical features for emotional expression, ...
This paper explores the perceptual relevance of acoustical correlates of emotional speech by means of speech synthesis. Besides, the research aims at the development of »emotion-rules« which enable an optimized speech synthesis system to generate emotional speech. Two investigations using this synthesizer are described: 1) the systematic variation of selected acoustical features to gain a preliminary impression regarding the ...
Voice search in mobile applications with the rootvole framework
Voice search in mobile applications and the use of linked open data