Papers by Felix Burkhardt
Speaker Classification for Next Generation Voice Dialog Systems
Customer Relationship Management (CRM) is a growing business factor for medium and large enterprises. For cost reduction, the automation of business processes in call centers based on Interactive-Voice-Response (IVR) systems has been introduced in many companies. In state-of-the-art IVR systems, automation based on automatic speech recognition (ASR) is mainly used for pre-qualifying customers' requests, with subsequent skill-based routing to a human agent or complete automation of simple business ...
Sprachdialogsystem und Verfahren zum Betreiben
We present software to read texts with emotional expression. The software is developed as part of Emofilt, an open-source emotional speech synthesis system. The affective storyteller consists of a text editor that offers a set of emotional speaking styles which can be used to mark up the text. The system was validated in a perception experiment and, although the number of participants was not very large, demonstrated the general usability of the approach.
Method and apparatus for evaluating the emotional state of a person from speech utterances
Ontology Evolution
Handbook of Research on Social Dimensions of Semantic Technologies and Web Services, 2000
Verfahren zur Dialoganpassung und Dialogsystem zur Durchführung
Interspeech, 2010
Most paralinguistic analysis tasks lack agreed-upon evaluation procedures and comparability, in contrast to more 'traditional' disciplines in speech analysis. The INTERSPEECH 2010 Paralinguistic Challenge shall help overcome the usually low compatibility of results by addressing three selected sub-challenges. In the Age Sub-Challenge, the age of speakers has to be determined in four groups. In the Gender Sub-Challenge, a three-class classification task has to be solved, and finally, the Affect Sub-Challenge asks for speakers' interest in ordinal representation. This paper introduces the conditions, the Challenge corpora "aGender" and "TUM AVIC", and standard feature sets that may be used. Further, baseline results are given.

The most successful systems in previous comparative studies on speaker age recognition used short-term cepstral features modeled with Gaussian Mixture Models (GMMs) or applied multiple phone recognizers trained with the data of speakers of the respective class. Acoustic analyses, however, indicate that certain features such as pitch, extracted from a longer span of speech, correlate clearly with speaker age, although systems based on those features have been inferior to the aforementioned approaches. In this paper, three novel systems combining short-term cepstral features and long-term features for speaker age recognition are compared to each other. A system combining GMMs using frame-based MFCCs and Support Vector Machines using long-term pitch performs best. The results indicate that the combination of the two feature types is a promising approach, which corresponds to findings in related fields such as speaker recognition.
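The best-performing system in the abstract above fuses a frame-based GMM classifier (MFCCs) with an SVM on a long-term feature (pitch). A minimal sketch of that kind of score-level fusion, on synthetic stand-in data (not the paper's features, corpus, or fusion weights — the feature values, class means, and weight `w` here are illustrative assumptions):

```python
# Hedged sketch of GMM + SVM score-level fusion for a two-class speaker task.
# Synthetic data stands in for MFCC frames (short-term) and pitch (long-term).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_speakers(n, frame_mean, pitch_mean):
    """Per speaker: 50 cepstral-like frames (4-dim) and one long-term pitch value."""
    frames = [rng.normal(frame_mean, 1.0, size=(50, 4)) for _ in range(n)]
    pitch = rng.normal(pitch_mean, 10.0, size=(n, 1))
    return frames, pitch

# Two illustrative classes, e.g. "younger" (0) vs "older" (1) speakers.
frames_0, pitch_0 = make_speakers(20, 0.0, 200.0)
frames_1, pitch_1 = make_speakers(20, 1.5, 120.0)

# One GMM per class on the pooled frames (classic frame-based GMM classifier).
gmm_0 = GaussianMixture(n_components=2, random_state=0).fit(np.vstack(frames_0))
gmm_1 = GaussianMixture(n_components=2, random_state=0).fit(np.vstack(frames_1))

# SVM with probabilistic outputs on the long-term pitch feature.
svm = SVC(probability=True, random_state=0).fit(
    np.vstack([pitch_0, pitch_1]), np.array([0] * 20 + [1] * 20)
)

def classify(frames, pitch, w=0.5):
    """Fuse the per-speaker GMM log-likelihood ratio with the SVM posterior."""
    llr = gmm_1.score(frames) - gmm_0.score(frames)      # > 0 favors class 1
    p1 = svm.predict_proba(pitch.reshape(1, -1))[0, 1]   # SVM P(class 1)
    score = w * llr + (1 - w) * (2 * p1 - 1)             # weighted score fusion
    return int(score > 0)

# Unseen speaker drawn from the class-1 distribution.
test_frames, test_pitch = make_speakers(1, 1.5, 120.0)
print(classify(test_frames[0], test_pitch[0]))  # → 1
```

In practice the fusion weight would be tuned on a development set, and the long-term feature vector would typically contain several prosodic statistics rather than a single pitch value.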
Emotional speech synthesis is an important part of the puzzle on the long way to human-like artificial human-machine interaction. Along the way, many stations such as emotional audio messages or believable characters in gaming will be reached. This paper discusses technical aspects of emotional speech synthesis, shows practical applications based on a higher-level framework, and highlights new developments concerning the realization of affective speech with non-uniform unit selection based synthesis and voice transformation techniques.
Elements of an EmotionML 1.0
This paper deals with human-computer interaction applications that utilize emotional awareness. We confine our discussion to speech-based applications. Prerequisites (training data, annotations) as well as the state of the art in recognition and synthesis are addressed, focusing on usability in possible applications and keeping restrictions in industrial environments in mind. We will present a taxonomy of applications using ...
Annual Conference of the International Speech Communication Association, 2005
This paper explores the perceptual relevance of acoustical correlates of emotional speech by means of speech synthesis. Besides, the research aims at the development of emotion rules which enable an optimized speech synthesis system to generate emotional speech. Two investigations using this synthesizer are described: 1) the systematic variation of selected acoustical features to gain a preliminary impression regarding the importance of certain acoustical features for emotional expression, ...
This paper explores the perceptual relevance of acoustical correlates of emotional speech by means of speech synthesis. Besides, the research aims at the development of »emotion-rules« which enable an optimized speech synthesis system to generate emotional speech. Two investigations using this synthesizer are described: 1) the systematic variation of selected acoustical features to gain a preliminary impression regarding the ...
Voice search in mobile applications with the rootvole framework
Voice search in mobile applications and the use of linked open data