Minje Kim's Research - Minje Kim's Home

Jackie Lin at ICASSP 2025 (GenDA 2025 Workshop)

Jackie participated in the Generative Data Augmentation (GenDA) workshop as the task captain for the challenge, "Room Acoustics and Speaker Distance Estimation." She also chaired a few sessions.

April 23, 2025

Jaesung Bae at ICASSP 2025 (GenDA 2025 Workshop)

Jaesung participated in the Generative Data Augmentation (GenDA) workshop as the task captain for the challenge, "Zero-Shot TTS and personalized speech enhancement." He also chaired a few sessions.

April 22, 2025

Interspeech 2024 Tutorial on Neural Speech and Audio Coding

September 24, 2024

Haici Yang at Interspeech 2024 (Kos, Greece)

September 24, 2024

Tsun-An Hsieh at Interspeech 2024 (Kos, Greece)

September 24, 2024

Invited Talk at MERL (July 2024)

July 24, 2024

HSCMA 2024 in Seoul (Apr. 2024)

April 24, 2024

Darius Petermann at ICASSP 2024 in Seoul

April 24, 2024

Tsun-An Hsieh at ICASSP 2024 in Seoul

April 24, 2024

Haici Yang at ICASSP 2024 in Seoul

April 24, 2024

ICASSP 2024 in Seoul

April 24, 2024

At Yamaha Innovation Road (Apr. 2024)

April 24, 2024

Invited Talk at Academia Sinica (Apr. 2024)

April 24, 2024

Minje Kim at AES Convention 2023 (NYC)

October 30, 2023

Anastasia Kuznetsova at SANE 2023 (NYU)

October 30, 2023

Minje Kim at ICASSP 2023 (Rhodes Island, Greece)

June 19, 2023

Sunwoo Kim at ICASSP 2023 (Rhodes Island, Greece)

June 19, 2023

Aswin Sivaraman at ICASSP 2023 (Rhodes Island, Greece)

June 19, 2023

Darius Petermann at ICASSP 2023 (Rhodes Island, Greece)

June 19, 2023

Haici Yang at ICASSP 2023 (Rhodes Island, Greece)

June 19, 2023

Haici Yang at Luddy AI Center Open House

March 9, 2023

Darius Petermann at Luddy AI Center Open House

March 9, 2023

Aswin Sivaraman at SANE 2022 (Cambridge, MA)

October 10, 2022

Minje’s Interspeech 2022 Tutorial on Personalized Speech Enhancement

September 26, 2022

David Badger at Autonomous 2.0 (Westgate Academy)

March 11, 2022

Kai Zhen at Interspeech 2019, Graz, Austria

September 25, 2019

Minje Kim at ICASSP 2017, New Orleans, LA

July 25, 2019

Sanna Wager at ICASSP 2017, New Orleans, LA

July 25, 2019

Minje Kim at ICASSP 2018, Calgary, Canada

July 25, 2019

Sanna Wager at ICASSP 2019, Brighton, UK

July 25, 2019

Sunwoo Kim at ICASSP 2019, Brighton, UK

July 25, 2019

Minje Kim at MMAD 2019, Bloomington, IN

July 25, 2019

Introduction

My research revolves around making audio and speech AI more practical and useful. I aim to introduce concepts such as efficiency, personalization, scalability, and collaboration into the AI systems I develop. With those goals in mind and by combining signal processing, generative modeling, and machine learning, I develop adaptive systems for learning efficient data representations (e.g., neural audio coding), intelligent signal processing (e.g., speech enhancement and source separation), and generative modeling of audio.

Featured Projects

TGIF: A Family-Owned Voice AI

Overview: In everyday life, our devices run many speech/audio applications that can benefit from the target speaker…

Audio Coding for Machines

Machine-Learned Latent Features Are Codes for That Machine! When we think about compressing sound, we usually…

Personalized Neural Speech Codec

Have you ever wondered about a speech codec that’s dedicated to your speech trait? Why? Of…

Scalable and Efficient Speech Enhancement Using Modified Cold Diffusion

As we’ve proposed in the BLOOM-Net project, scalability matters. Just to reiterate the argument here once…

LaDiffCodec: Generative De-Quantization for Neural Speech Codec via Latent Diffusion

Motivation: We bring the cool generative power of a diffusion model to speech coding. We call…

Don’t Separate, Learn to Remix: End-to-End Neural Remixing

TLDR: In this project, we developed an end-to-end neural network system that takes a music mixture…

SpaIn-Net: Spatially Informed Music Source Separation

The spatial image of a music source is an essential feature in the stereophonic music listening…

BLOOM-Net: Scalability Matters

Scalability is a big deal when it comes to video coding. When you watch a movie…

Personalized Speech Enhancement

The outstanding development in modern AI has relied greatly on the…

Psychoacoustic Loss Functions for Neural Audio Coding

Neural audio coding is an area where we want to compress an audio signal down to…

Neural Pitch Correction of Singing Voice

Have you ever wished you were a good singer? Some people believe that it’s a…

Cross-Module Residual Learning for Neural Audio Coding

Speech/audio coding has traditionally involved substantial domain-specific knowledge such as speech generation models. If you haven’t…

News

New Ph.D. Candidates
April 3, 2026
Jaesung, Jackie, and Cameron passed their qual exams and became Ph.D. candidates. Congratulations!
Talk at Adobe Research
February 24, 2026
I gave a talk at Adobe Research on recent neural coding projects I’m working on. It was nice to meet good old friends, although it was a remote talk.
ICASSP 2026 Papers Accepted
January 20, 2026
I have been fortunate enough to work on the following interesting papers that were accepted for publication at ICASSP 2026.
Jaesung Passed Qual
December 11, 2025
Jaesung Bae successfully passed the qualifying exam.
Haici Yang Defended
December 11, 2025
Haici Yang defended her dissertation (“Latent Variable Learning for Generative Neural Audio Codecs”) successfully!
WASPAA 2025 in Lake Tahoe
October 19, 2025
My group in Illinois, former students at IU, and collaborators had a strong presence at WASPAA 2025, which was one of the best conference experiences I’ve had so far.…
ISMIR 2025 in Daejeon, Korea
September 28, 2025
ISMIR 2025 in Daejeon, Korea was really fun. I learned a lot from the nice papers and from how things are organized there, and enjoyed the music so much. Yutong Wen presented a…
Darius Defended
September 6, 2025
Darius Petermann successfully defended his dissertation on “Efficient Native Neural Sub-band Coding through Residual Feature Representation within Hyper-Autoencoded Reconstruction Propagation Networks.”
Fraunhofer IIS and AudioLabs in Erlangen
July 10, 2025
The visit to Fraunhofer IIS and International Audio Labs in Erlangen was so inspiring and heartwarming. Introducing my neural audio coding work to the leading experts in audio coding was…
University of Hamburg
July 8, 2025
It was such a nice visit to Timo Gerkmann’s signal processing group at the University of Hamburg. I had a great time speaking with the bright students and researchers there,…
WASPAA 2025
July 3, 2025
Four papers were accepted for publication at WASPAA 2025.
Ajou Honorary Alumni
June 30, 2025
I was recognized as “Ajou Honorary Alumni” by my alma mater, Ajou University, along with my then-girlfriend-now-wife Kahyun Choi.
Aalborg University
June 16, 2025
I will be spending June and July at Aalborg University, visiting the Audio Analysis Lab and the AI and Sound Section at the Department of Electronic Systems.
Anastasia Kuznetsova Defended
June 12, 2025
Anastasia Kuznetsova successfully defended her dissertation on “Data Efficiency and Model Complexity Reduction for Speech Processing Systems.” Congratulations!
Listed as a Teacher Ranked as Excellent by Students
June 7, 2025
I was listed in the List of Teachers Ranked as Excellent by Students for the course I taught in Spring 2025: “CS598 Generative Models for Audio.”
ISMIR 2025
June 7, 2025
Yutong Wen’s paper on “User-Guided Generative Source Separation” was accepted for publication at ISMIR 2025.
ICASSP 2025 in Hyderabad, India
April 24, 2025
ICASSP 2025 was fun! I organized the Generative Data Augmentation (GenDA) workshop along with my colleagues (Dinesh Manocha at U. of Maryland, John Hershey at Google, and Trausti Kristjansson at…
Talk at Conversational AI Reading Group
February 11, 2025
I’m giving a talk at the Conversational AI Reading Group, led by Pooneh Mousavi, a PhD student advised by Prof. Mirco Ravanelli at Mila/Concordia University. I’ll be talking about scalable…
Keynote at Codec-SUPERB
December 3, 2024
I was honored to be invited to the Codec-SUPERB workshop as a keynote speaker. My talk was on “Future Directions in Neural Speech Communication Codecs.”
“Ji Seokyoung” Patent Award
November 5, 2024
I received the “Ji Seokyoung Award” for the patent “Encoding apparatus and decoding apparatus for transforming between modified discrete cosine transform-based coder and different coder (KR101670063B1)” that I filed as…
Research Talks in Montréal
October 10, 2024
I’m visiting Montréal to give a keynote speech at ANNPR 2024 (10/11) and an invited talk at Mila (10/14) on neural speech and audio coding.
IEEE Signal Processing Magazine
July 24, 2024
My paper with Jan Skoglund on “Neural Speech and Audio Coding” was accepted for publication in the IEEE Signal Processing Magazine.
Invited Talk at MERL
July 19, 2024
I gave an invited talk at Mitsubishi Electric Research Labs (MERL) on neural speech and audio coding.
Papers Accepted for Publication at Interspeech 2024
June 10, 2024
Two papers were accepted for publication at Interspeech 2024.
Aswin Sivaraman
May 7, 2024
Aswin Sivaraman defended his dissertation on “Resource-Efficient Model Adaptation Methods for Personalized Speech Enhancement Systems.” He is joining Apple after graduation. Congratulations!

Earlier News