


INTERSPEECH 2011: Florence, Italy
- 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011, Florence, Italy, August 27-31, 2011. ISCA 2011

Keynote Sessions
Keynote 1
- Julia Hirschberg:

Speaking More Like You: Entrainment in Conversational Speech. 4001
Keynote 2
- Tom M. Mitchell:

Neural Representations of Word Meanings. 4002
Keynote 3
- Alex Pentland:

Signals and Speech. 1-4
Keynote 4: Roundtable - Future and Applications of Speech and Language Technologies for the Good Health of Society
- Gabriele Miceli:

Language Disorders: Viewpoints on a Complex Object. - Björn Granström:

Speech Technology in (Re)Habilitation of Persons with Communication Disabilities. - Hiroshi Ishiguro:

From Teleoperated Androids to Cellphones as Surrogates.
Regular Oral Sessions
Speaker Recognition - Modeling
- Avi Matza:

Skew Gaussian Mixture Models for Speaker Recognition. 5-8 - Orith Toledo-Ronen, Hagai Aronowitz, Ron Hoory, Jason W. Pelecanos, David Nahamoo:

Towards Goat Detection in Text-Dependent Speaker Verification. 9-12 - Jean-François Bonastre, Xavier Anguera Miró, Gabriel Hernández Sierra, Pierre-Michel Bousquet:

Speaker Modeling Using Local Binary Decisions. 13-16 - Hagai Aronowitz, Ron Hoory, Jason W. Pelecanos, David Nahamoo:

New Developments in Voice Biometrics for User Authentication. 17-20 - Miranti Indar Mandasari, Mitchell McLaren, David A. van Leeuwen:

Evaluation of i-vector Speaker Recognition Systems for Forensic Application. 21-24 - Mohammed Senoussaoui, Patrick Kenny, Niko Brümmer, Edward de Villiers, Pierre Dumouchel:

Mixture of PLDA Models in i-vector Space for Gender-Independent Speaker Recognition. 25-28
Speech Perception - Speech Intelligibility
- Nandini Iyer, Douglas Brungart, Brian D. Simpson:

Segregation of Whispered Speech Interleaved with Noise or Speech Maskers. 29-32 - Roi Kliper, Hendrik Kayser, Daphna Weinshall, Israel Nelken, Jörn Anemüller:

Monaural Azimuth Localization Using Spectral Dynamics of Speech. 33-36 - Jan Rennies, Thomas Brand, Birger Kollmeier:

Prediction of Binaural Intelligibility Level Differences in Reverberation. 37-40 - Aurore Gautreau, Michel Hoen, Fanny Meunier:

Let's All Speak Together! Exploring the Impact of Various Languages on the Comprehension of Speech in Multi-Linguistic Babble. 41-44 - Valeriy Shafiro, Stanley Sheft, Robert Risley:

Cross-Rate Variation in the Intelligibility of Dual-Rate Gated Speech in Older Listeners. 45-48 - Chia-ying Lee, James R. Glass, Oded Ghitza:

An Efferent-Inspired Auditory Model Front-End for Speech Recognition. 49-52
Speech Representation and Modelling
- Faten Ben Ali, Laurent Girin, Sonia Djaziri Larbi:

A Long-Term Harmonic Plus Noise Model for Speech Signals. 53-56 - Alan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle:

A Frequency Domain Approach to ARX-LF Voiced Speech Parameterization and Synthesis. 57-60 - Vikram Ramanarayanan, Athanasios Katsamanis, Shrikanth S. Narayanan:

Automatic Data-Driven Learning of Articulatory Primitives from Real-Time MRI Data Using Convolutive NMF with Sparseness Constraints. 61-64 - Dong Wang, Ravichander Vipperla, Nicholas W. D. Evans:

Online Pattern Learning for Non-Negative Convolutive Sparse Coding. 65-68 - Nicolas Malyska, Thomas F. Quatieri, Robert B. Dunn:

Sinewave Representations of Nonmodality. 69-72 - Ch. Srikanth Raj, Thippur V. Sreenivas:

Time-Varying Signal Adaptive Transform and IHT Recovery of Compressive Sensed Speech. 73-76
Emotion, Speaking Style, and Social Behavior
- Martin Wöllmer, Felix Weninger, Florian Eyben, Björn W. Schuller:

Acoustic-Linguistic Recognition of Interest in Speech with Bottleneck-BLSTM Nets. 77-80 - Mustafa Erden, Levent M. Arslan:

Automatic Detection of Anger in Human-Human Call Center Dialogs. 81-84 - Keng-hao Chang, Howard Lei, John F. Canny:

Improved Classification of Speaking Styles for Mental Health Monitoring Using Phoneme Dynamics. 85-88 - Matthew Black, Panayiotis G. Georgiou, Athanasios Katsamanis, Brian R. Baucom, Shrikanth S. Narayanan:

"You made me do it": Classification of Blame in Married Couples' Interactions by Fusing Automatically Derived Speech and Language Information. 89-92 - Martijn Goudbeek, Marie Nilsenová:

Context and Priming Effects in the Recognition of Emotion of Old and Young Listeners. 93-96 - Agustín Gravano, Rivka Levitan, Laura Willson, Stefan Benus, Julia Hirschberg, Ani Nenkova:

Acoustic and Prosodic Correlates of Social Behavior. 97-100
HMM-based Speech Synthesis I
- Kyung Hwan Oh, June Sig Sung, Doo Hwa Hong, Nam Soo Kim:

Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis. 101-104 - Hanna Silén, Elina Helander, Moncef Gabbouj:

Prediction of Voice Aperiodicity Based on Spectral Representations in HMM Speech Synthesis. 105-108 - Takashi Nose, Takao Kobayashi:

A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM. 109-112 - Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda:

Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis. 113-116 - Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi:

Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-Based Speech Synthesis. 117-120 - Matt Shannon, Heiga Zen, William J. Byrne:

The Effect of Using Normalized Models in Statistical Speech Synthesis. 121-124
Speaker Recognition - Modeling, Automatic Procedures, Analysis I
- Ce Zhang, Rong Zheng, Bo Xu:

Restoring the Residual Speaker Information in Total Variability Modeling for Speaker Verification. 125-128 - Hagai Aronowitz, Oren Barkan:

New Developments in Joint Factor Analysis for Speaker Verification. 129-132 - Joaquin Gonzalez-Rodriguez:

Speaker Recognition Using Temporal Contours in Linguistic Units: The Case of Formant and Formant-Bandwidth Trajectories. 133-136 - Ondrej Glembek, Lukás Burget, Niko Brümmer, Oldrich Plchot, Pavel Matejka:

Discriminatively Trained i-vector Extractor for Speaker Verification. 137-140 - Michelle Hewlett Sanchez, Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke:

Constrained Cepstral Speaker Recognition Using Matched UBM and JFA Training. 141-144 - Alan McCree, Douglas E. Sturim, Douglas A. Reynolds:

A New Perspective on GMM Subspace Compensation Based on PPCA and Wiener Filtering. 145-148
Speech Perception - Perceptual Learning and Cross-Language Perception
- Odette Scharenborg, Holger Mitterer, James M. McQueen:

Perceptual Learning of Liquids. 149-152 - Annelie Tuinman, Holger Mitterer, Anne Cutler:

The Efficiency of Cross-Dialectal Word Recognition. 153-156 - Minoru Tsuzaki, Keiichi Tokuda, Hisashi Kawai, Jinfu Ni:

Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task. 157-160 - Sharon Peperkamp, Camillia Bouchon:

The Relation Between Perception and Production in L2 Phonological Processing. 161-164 - Maria Paola Bissiri, María Luisa García Lecumberri, Martin Cooke, Jan Volín:

The Role of Word-Initial Glottal Stops in Recognizing English Words. 165-168 - Caicai Zhang, Gang Peng, William S.-Y. Wang:

Effect of Language Experience on the Categorical Perception of Cantonese Vowel Duration. 169-172
Speech Analysis
- Christian Fischer Pedersen, Ove Andersen, Paul Dalsgaard:

Adaptive Estimation of Zeros of Time-Varying Z-Transforms. 173-176 - John Kane, Christer Gobl:

Identifying Regions of Non-Modal Phonation Using Features of the Wavelet Transform. 177-180 - Xing Fan, Keith W. Godin, John H. L. Hansen:

Acoustic Analysis of Whispered Speech for Phoneme and Speaker Dependency. 181-184 - Afsaneh Asaei, Mohammad Javad Taghizadeh, Hervé Bourlard, Volkan Cevher:

Multi-Party Speech Recovery Exploiting Structured Sparsity Models. 185-188 - Sri Harish Reddy Mallidi, Sriram Ganapathy, Hynek Hermansky:

Modulation Spectrum Analysis for Recognition of Reverberant Speech. 189-192 - Petko Nikolov Petkov, W. Bastiaan Kleijn, Bert de Vries:

Discrete Choice Models for Non-Intrusive Quality Assessment. 193-196
Speech Enhancement and Dereverberation
- Keisuke Kinoshita, Mehrez Souden, Marc Delcroix, Tomohiro Nakatani:

Single Channel Dereverberation Using Example-Based Speech Enhancement with Uncertainty Decoding Technique. 197-200 - Jan S. Erkelens, Richard Heusdens:

A Statistical Room Impulse Response Model with Frequency Dependent Reverberation Time for Single-Microphone Late Reverberation Suppression. 201-204 - Chenxi Zheng, Tiago H. Falk, Wai-Yip Chan:

An Assessment of the Improvement Potential of Time-Frequency Masking for Speech Dereverberation. 205-208 - Thiago de M. Prego, Amaro A. de Lima, Sergio L. Netto:

Perceptual Improvement of a Two-Stage Algorithm for Speech Dereverberation. 209-212 - Najib Hadir, Friedrich Faubel, Dietrich Klakow:

A Model-Based Spectral Envelope Wiener Filter for Perceptually Motivated Speech Enhancement. 213-216 - Jorge I. Marin-Hurtado, Devangi N. Parikh, David V. Anderson:

Binaural Noise-Reduction Method Based on Blind Source Separation and Perceptual Post Processing. 217-220
ASR - Feature Extraction II
- Tim Ng, Bing Zhang, Spyridon Matsoukas, Long Nguyen:

Region Dependent Transform on MLP Features for Speech Recognition. 221-224 - Martin Heckmann, Claudius Gläser:

Discriminant Sub-Space Projection of Spectro-Temporal Speech Features Based on Maximizing Mutual Information. 225-228 - Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura:

Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition. 229-232 - Sumit Chopra, Patrick Haffner, Dimitrios Dimitriadis:

Combining Frame and Segment Level Processing via Temporal Pooling for Phonetic Classification. 233-236 - Dong Yu, Michael L. Seltzer:

Improved Bottleneck Features Using Pretrained Deep Neural Networks. 237-240 - Yuan-Fu Liao, Chia-Hsing Lin, We-Der Fang:

Minimum Classification Error Based Spectro-Temporal Feature Extraction for Robust Audio Classification. 241-244
Speaker Recognition - Modeling, Automatic Procedures, Analysis II
- Ce Zhang, Rong Zheng, Bo Xu:

Data-Driven Gaussian Component Selection for Fast GMM-Based Speaker Verification. 245-248 - Daniel Garcia-Romero, Carol Y. Espy-Wilson:

Analysis of i-vector Length Normalization in Speaker Recognition Systems. 249-252 - Weiwu Jiang, Zhifeng Li, Helen M. Meng:

An Analysis Framework Based on Random Subspace Sampling for Speaker Verification. 253-256 - Nicolas Scheffer, Yun Lei, Luciana Ferrer:

Factor Analysis Back Ends for MLLR Transforms in Speaker Recognition. 257-260 - Craig S. Greenberg, Alvin F. Martin, Bradford Barr, George R. Doddington:

Report on Performance Results in the NIST 2010 Speaker Recognition Evaluation. 261-264 - Marcel Kockmann, Luciana Ferrer, Lukás Burget, Jan Cernocký:

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification. 265-268
Speech Production - Articulatory Measurements
- Yoon-Chul Kim, Michael I. Proctor, Shrikanth S. Narayanan, Krishna S. Nayak:

Visualization of Vocal Tract Shape Using Interleaved Real-Time MRI of Multiple Scan Planes. 269-272 - Ralf Winkler, Susanne Fuchs, Pascal Perrier, Mark Tiede:

Biomechanical Tongue Models: An Approach to Studying Inter-Speaker Variability. 273-276 - Jun Wang, Jordan R. Green, Ashok Samal, David Marx:

Quantifying Articulatory Distinctiveness of Vowels. 277-280 - Michael I. Proctor, Adam C. Lammert, Athanasios Katsamanis, Louis M. Goldstein, Christina Hagedorn, Shrikanth S. Narayanan:

Direct Estimation of Articulatory Kinematics from Real-Time Magnetic Resonance Image Sequences. 281-284 - Peter Birkholz, Christiane Neuschaefer-Rube:

Combined Optical Distance Sensing and Electropalatography to Measure Articulation. 285-288 - Santitham Prom-on, Yi Xu, Fang Liu:

Simulating Post-L F0 Bouncing by Modeling Articulatory Dynamics. 289-292
Acoustic Event Detection
- Jürgen T. Geiger, Mohamed Anouar Lakhal, Björn W. Schuller, Gerhard Rigoll:

Learning New Acoustic Events in an HMM-Based System Using MAP Adaptation. 293-296 - Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li:

Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition. 297-300 - Akinori Ito, Akihito Aiba, Masashi Ito, Shozo Makino:

Evaluation of Abnormal Sound Detection using Multi-Stage GMM in Various Environments. 301-304 - Joerg Schmalenstroeer, Markus Bartek, Reinhold Haeb-Umbach:

Unsupervised Learning of Acoustic Events Using Dynamic Time Warping and Hierarchical K-Means++ Clustering. 305-308 - Pradeep Natarajan, Stavros Tsakalidis, Vasant Manohar, Rohit Prasad, Premkumar Natarajan:

Unsupervised Audio Analysis for Categorizing Heterogeneous Consumer Domain Videos. 313-316
Speech Synthesis - Unit Selection and Hybrid Approaches
- Vivek Kumar Rangarajan Sridhar, Ann K. Syrdal, Alistair Conkie, Srinivas Bangalore:

Enriching Text-to-Speech Synthesis Using Automatic Dialog Act Tags. 317-320 - Lukas Latacz, Wesley Mattheyses, Werner Verhelst:

Joint Target and Join Cost Weight Training for Unit Selection Synthesis. 321-324 - Andreas Windmann, Igor Jauk, Fabio Tamburini, Petra Wagner:

Prominence-Based Prosody Prediction for Unit Selection Speech Synthesis. 325-328 - Sathish Pammi, Marc Schröder:

Evaluating the Meaning of Synthesized Listener Vocalizations. 329-332 - Iñaki Sainz, Daniel Erro, Eva Navas, Inma Hernáez:

A Hybrid TTS Approach for Prosody and Acoustic Modules. 333-336 - Alexander Sorin, Slava Shechtman, Vincent Pollet:

Uniform Speech Parameterization for Multi-Form Segment Synthesis. 337-340
Speech Enhancement Analysis and Evaluation
- Ryoichi Miyazaki, Hiroshi Saruwatari, Kiyohiro Shikano:

Theoretical Analysis of Musical Noise and Speech Distortion in Structure-Generalized Parametric Blind Spatial Subtraction Array. 341-344 - Yan Tang, Martin Cooke:

Subjective and Objective Evaluation of Speech Intelligibility Enhancement Under Constant Energy and Duration Constraints. 345-348 - Nagarjuna Reddy Muraka, Chandra Sekhar Seelamantula:

A Risk-Estimation-Based Comparison of Mean Square Error and Itakura-Saito Distortion Measures for Speech Enhancement. 349-352 - Mahdi Triki:

On Noise Tracking for Noise Floor Estimation. 353-356 - Ben Milner:

Maximum a posteriori Estimation of Noise from Non-Acoustic Reference Signals in Very Low Signal-to-Noise Ratio Environments. 357-360 - Ryo Wakisaka, Hiroshi Saruwatari, Kiyohiro Shikano, Tomoya Takatani:

Blind Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator. 361-364
Speaker Recognition - Analysis and Statistics I
- Kornel Laskowski, Qin Jin:

Harmonic Structure Transform for Speaker Recognition. 365-368 - Hemant A. Patil, Maulik C. Madhavi, Keshab K. Parhi:

Combining Evidence from Spectral and Source-Like Features for Person Recognition from Humming. 369-372 - Yanhua Long, Zhi-Jie Yan, Frank K. Soong, Li-Rong Dai, Wu Guo:

Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model. 373-376 - Yosef A. Solewicz, Hagai Aronowitz:

Implicit Segmentation in Two-Wire Speaker Recognition. 377-380 - Sibel Yaman, Jason W. Pelecanos, Mohamed Kamal Omar:

Boosting Speaker Recognition Performance with Compact Representations. 381-384 - Carlos Vaquero, Alfonso Ortega, Eduardo Lleida:

Partitioning of Two-Speaker Conversation Datasets. 385-388
Speech Production - Coarticulation and Speech Timing
- Stefan Benus, Marianne Pouplier:

Jaw Movement in Vowels and Liquids Forming the Syllable Nucleus. 389-392 - Barbara Gili Fivela, Antonio Stella, Sonia D'Apolito, Francesco Sigona:

Coarticulation Across Prosodic Domains in Italian: An Ultrasound Investigation. 393-396 - Juraj Simko, Fred Cummins, Stefan Benus:

Investigating the Stability of Intergestural Timing Relations. 397-400 - Claudio Zmarich, Barbara Gili Fivela, Pascal Perrier, Christophe Savariaux, Graziano Tisato:

Speech Timing Organization for the Phonological Length Contrast in Italian Consonants. 401-404 - Chiara Celata, Silvia Calamai:

Timing in Italian VNC Sequences at Different Speech Rates. 405-408 - Christina Hagedorn, Michael I. Proctor, Louis Goldstein:

Automatic Analysis of Singleton and Geminate Consonant Articulation Using Real-Time Magnetic Resonance Imaging. 409-412
Speech Segmentation
- Yih-Ru Wang:

A Two-Stage Sample-Based Phone Boundary Detector Using Segmental Similarity Features. 413-416 - Qiang Huang, Stephen J. Cox:

Iterative Improvement of Speaker Segmentation in a Noisy Environment Using High-Level Knowledge. 417-420 - Diego Castán, Carlos Vaquero, Alfonso Ortega, David Martínez González, Jesús Antonio Villalba López, Eduardo Lleida:

Hierarchical Audio Segmentation with HMM and Factor Analysis in Broadcast News Domain. 421-424 - Ozlem Kalinli:

Syllable Segmentation of Continuous Speech Using Auditory Attention Cues. 425-428 - Vijayaditya Peddinti, Kishore Prahallad:

Exploiting Phone-Class Specific Landmarks for Refinement of Segment Boundaries in TTS Databases. 429-432 - Agnès Pedone, Juan José Burred, Simon Maller, Pierre Leveau:

Phoneme-Level Text to Audio Synchronization on Speech Signals with Background Music. 433-436
ASR - Acoustic Models II
- Frank Seide, Gang Li, Dong Yu:

Conversational Speech Transcription Using Context-Dependent Deep Neural Networks. 437-440 - Guangsen Wang, Khe Chai Sim:

Sequential Classification Criteria for NNs in Automatic Speech Recognition. 441-444 - Mathew Magimai-Doss, Ramya Rasipuram, Guillermo Aradilla, Hervé Bourlard:

Grapheme-Based Automatic Speech Recognition Using KL-HMM. 445-448 - Joseph Keshet, Chih-Chieh Cheng, Mark Stoehr, David A. McAllester:

Direct Error Rate Minimization of Hidden Markov Models. 449-452 - Xie Sun, Xin Chen, Yunxin Zhao:

On the Effectiveness of Statistical Modeling Based Template Matching Approach for Continuous Speech Recognition. 453-456 - Guangsen Wang, Khe Chai Sim:

Comparison of Smoothing Techniques for Robust Context Dependent Acoustic Modelling in Hybrid NN/HMM Systems. 457-460
Robust Speech Recognition II
- Ramón Fernandez Astudillo, João Paulo da Silva Neto:

Propagation of Uncertainty Through Multilayer Perceptrons for Robust Automatic Speech Recognition. 461-464 - Katariina Mahkonen, Antti Hurmalainen, Tuomas Virtanen, Jort F. Gemmeke:

Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition. 465-468 - Heikki Kallasjoki, Ulpu Remes, Jort F. Gemmeke, Tuomas Virtanen, Kalle J. Palomäki:

Uncertainty Measures for Improving Exemplar-Based Source Separation. 469-472 - Hsien-Cheng Liao, Yuan-Fu Liao, Chin-Hui Lee:

Maximum Confidence Measure Based Interaural Phase Difference Estimation for Noise Masking in Dual-Microphone Robust Speech Recognition. 473-476 - Shirin Badiezadegan, Richard C. Rose:

A Performance Monitoring Approach to Fusing Enhanced Spectrogram Channels in Robust Speech Recognition. 477-480 - Ning Cheng, Xunying Liu, Lan Wang:

Generalized Variable Parameter HMMs for Noise Robust Speech Recognition. 481-484
Speaker Recognition - Analysis and Statistics II
- Pierre-Michel Bousquet, Driss Matrouf, Jean-François Bonastre:

Intersession Compensation and Scoring Methods in the i-vectors Space for Speaker Recognition. 485-488 - Szymon Drgas, Adam Dabrowski:

Kernel Alignment Maximization for Speaker Recognition Based on High-Level Features. 489-492 - Balaji Vasan Srinivasan, Daniel Garcia-Romero, Dmitry N. Zotkin, Ramani Duraiswami:

Kernel Partial Least Squares for Speaker Recognition. 493-496 - Mohamed Kamal Omar, Jason W. Pelecanos:

Conversational-Side-Specific Inter-Session Variability Compensation. 497-500 - David A. van Leeuwen, Niko Brümmer:

A Speaker Line-Up for the Likelihood Ratio. 501-504 - Jesús Antonio Villalba López, Niko Brümmer:

Towards Fully Bayesian Speaker Recognition: Integrating Out the Between-Speaker Covariance. 505-508
Speaker Recognition - Analysis and Statistics II
- Hemant A. Patil, Pallavi N. Baljekar:

Novel VTEO Based Mel Cepstral Features for Classification of Normal and Pathological Voices. 509-512 - Eiji Shimura, Kazuhiko Kakehi:

Temporal Performance of Dysarthric Patients in Speech and Tapping Tasks. 513-516 - Xinhui Zhou, Maureen L. Stone, Carol Y. Espy-Wilson:

A Comparative Acoustic Study on Speech of Glossectomy Patients and Normal Subjects. 517-520 - Ali Alpan, Francis Grenez, Jean Schoentgen:

Dysperiodicity Analysis of Perceptually Assessed Synthetic Speech Stimuli. 521-524 - Alain Ghio, Frédérique Weisz, Giovanna Baracca, Giovanna Cantarella, Danièle Robert, Virginie Woisard, Franco Fussi, Antoine Giovanni:

Is the Perception of Voice Quality Language-Dependant? A Comparison of French and Italian Listeners and Dysphonic Speakers. 525-528 - Juan Rafael Orozco-Arroyave, S. Murillo Rendón, Andrés Marino Álvarez-Meza, Julián D. Arias-Londoño, Edilson Delgado-Trejos, Jesús Francisco Vargas-Bonilla, César Germán Castellanos-Domínguez:

Automatic Selection of Acoustic and Non-Linear Dynamic Features in Voice Signals for Hypernasality Detection. 529-532
ASR - Lexical, Prosodic and Multi-Lingual Models
- Sravana Reddy, Evandro B. Gouvêa:

Learning from Mistakes: Expanding Pronunciation Lexicons Using Word Recognition Errors. 533-536 - David Imseng, Hervé Bourlard, John Dines, Philip N. Garner, Mathew Magimai-Doss:

Improving Non-Native ASR Through Stochastic Multilingual Phoneme Space Transformations. 537-540 - Scott Novotney, Richard M. Schwartz, Sanjeev Khudanpur:

Unsupervised Arabic Dialect Adaptation with Self-Training. 541-544 - Dino Seppi, Kris Demuynck, Dirk Van Compernolle:

Template-Based Automatic Speech Recognition Meets Prosody. 545-548 - Ibrahim Badr, Ian McGraw, James R. Glass:

Pronunciation Learning from Continuous Speech. 549-552 - Yanmin Qian, Daniel Povey, Jia Liu:

State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs. 553-556
Source Separation
- Yasmina Benabderrahmane, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy:

Blind Speech Separation in Multiple Environments Using a Frequency Oriented PCA Method for Convolutive Mixtures. 557-560 - Zbynek Koldovský, Jirí Málek, Petr Tichavský:

Blind Speech Separation in Time-Domain Using Block-Toeplitz Structure of Reconstructed Signal Matrices. 561-564 - Auxiliadora Sarmiento, Iván Durán-Díaz, Sergio Cruces, Pablo Aguilera:

Generalized Method for Solving the Permutation Problem in Frequency-Domain Blind Source Separation of Convolved Speech Signals. 565-568 - Emad M. Grais, Hakan Erdogan:

Adaptation of Speaker-Specific Bases in Non-Negative Matrix Factorization for Single Channel Speech-Music Separation. 569-572 - Shuhua Zhang, Laurent Girin:

An Informed Source Separation System for Speech Signals. 573-576 - Ngoc Thuy Tran, William G. Cowley, André Pollok:

Adaptive Blocking Beamformer for Speech Separation. 577-580
Multimodal Signal Processing
- Per Ola Kristensson, Keith Vertanen:

Asynchronous Multimodal Text Entry Using Speech and Gesture Keyboards. 581-584 - Niall McLaughlin, Ji Ming, Danny Crookes:

Robust Bimodal Person Identification Using Face and Speech with Limited Training Data and Corruption of Both Modalities. 585-588 - Atef Ben Youssef, Thomas Hueber, Pierre Badin, Gérard Bailly:

Toward a Multi-Speaker Visual Articulatory Feedback System. 589-592 - Thomas Hueber, Elie-Laurent Benaroya, Bruce Denby, Gérard Chollet:

Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-Based Silent Speech Interface. 593-596 - Joerg Schmalenstroeer, Florian Jacob, Reinhold Haeb-Umbach, Marius H. Hennecke, Gernot A. Fink:

Unsupervised Geometry Calibration of Acoustic Sensor Networks Using Source Correspondences. 597-600 - Michael Wand, Matthias Janke, Tanja Schultz:

Investigations on Speaking Mode Discrepancies in EMG-Based Speech Recognition. 601-604
ASR - Language Models II
- Tomás Mikolov, Anoop Deoras, Stefan Kombrink, Lukás Burget, Jan Cernocký:

Empirical Evaluation and Combination of Advanced Language Modeling Techniques. 605-608 - Geoffrey Zweig, Shuangyu Chang:

Personalizing Model M for Voice-Search. 609-612 - Takahiro Shinozaki, Yu Kubota, Sadaoki Furui, Eiji Utsunomiya, Yasutaka Shindoh:

Sentence Selection by Direct Likelihood Maximization for Language Model Adaptation. 613-616 - Ebru Arisoy, Bhuvana Ramabhadran, Hong-Kwang Jeff Kuo:

Feature Combination Approaches for Discriminative Language Models. 617-620 - Sankaranarayanan Ananthakrishnan, Stavros Tsakalidis, Rohit Prasad, Premkumar Natarajan:

On-Line Language Model Biasing for Multi-Pass Automatic Speech Recognition. 621-624 - Moonyoung Kang, Tim Ng, Long Nguyen:

Mandarin Word-Character Hybrid-Input Neural Network Language Model. 625-628
Phonology and Phonetics
- Vahid Sadeghi:

Laryngealization and Breathiness in Persian. 629-632 - Viola Müller, Jonathan Harrington, Felicitas Kleber, Ulrich Reubold:

Age-Dependent Differences in the Neutralization of the Intervocalic Voicing Contrast: Evidence from an Apparent-Time Study on East Franconian. 633-636 - Barbara Samlowski, Bernd Möbius, Petra Wagner:

Comparing Syllable Frequencies in Corpora of Written and Spoken Language. 637-640 - Luca Iacoponi, Renata Savy:

Sylli: Automatic Phonological Syllabification for Italian. 641-644 - André N. Xavier, Plínio A. Barbosa:

A Preliminary Study on the Production of Signs in Brazilian Sign Language when One of the Manual Articulators is Unavailable. 645-648 - Ho-hsien Pan, Mao-Hsu Chen, Shao-Ren Lyu:

Electroglottograph and Acoustic Cues for Phonation Contrasts in Taiwan Min Falling Tones. 649-652
Voice Conversion
- Daisuke Saito, Keisuke Yamamoto, Nobuaki Minematsu, Keikichi Hirose:

One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space. 653-656 - Yu Qiao, Tong Tong, Nobuaki Minematsu:

A Study on Bag of Gaussian Model with Application to Voice Conversion. 657-660 - Lei Li, Yoshihiko Nankaku, Keiichi Tokuda:

A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures. 661-664 - Mahdi Eslami, Hamid Sheikhzadeh, Abolghasem Sayadiyan:

Quality Improvement of Voice Conversion Systems Based on Trellis Structured Vector Quantization. 665-668 - Hadas Benisty, David Malah:

Voice Conversion Using GMM with Enhanced Global Variance. 669-672 - Elizabeth Godoy, Olivier Rosec, Thierry Chonavel:

Spectral Envelope Transformation Using DFW and Amplitude Scaling for Voice Conversion with Parallel or Nonparallel Corpora. 673-676
Robust Speech Recognition III
- Pejman Mowlaee, Rahim Saeidi, Zheng-Hua Tan, Mads Græsbøll Christensen, Tomi Kinnunen, Pasi Fränti, Søren Holdt Jensen:

Sinusoidal Approach for the Single-Channel Speech Separation and Recognition Challenge. 677-680 - Cemil Demir, A. Taylan Cemgil, Murat Saraclar:

Semi-Supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition. 681-684 - Hari Krishna Maganti, Marco Matassoni:

A Level-Dependent Auditory Filter-Bank for Speech Recognition in Reverberant Environments. 685-688 

