This repository contains the code and analysis for the paper *Advancing Test-Time Adaptation in Wild Acoustic Test Settings*.
Acoustic foundation models fine-tuned for Automatic Speech Recognition (ASR) suffer performance degradation in wild acoustic test settings when deployed in real-world scenarios. Stabilizing online Test-Time Adaptation (TTA) under these conditions remains an open question. Existing wild TTA methods from vision often fail on speech data because high-entropy speech frames can carry crucial semantic content, yet entropy-based filtering unreliably discards them. Furthermore, unlike static vision data, speech signals exhibit short-term consistency, which calls for specialized adaptation strategies. In this work, we propose a novel wild acoustic TTA method tailored to ASR fine-tuned acoustic foundation models. Our method, Confidence-Enhanced Adaptation (CEA), performs frame-level adaptation with a confidence-aware weighting scheme that avoids filtering out essential information in high-entropy frames. Additionally, we apply consistency regularization during test-time optimization to leverage the inherent short-term consistency of speech signals. Experiments on both synthetic and real-world datasets demonstrate that our approach outperforms existing baselines under various wild acoustic test settings, including Gaussian noise, environmental sounds, accent variations, and sung speech.
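The two ingredients above (confidence-aware frame weighting instead of hard entropy filtering, plus a short-term consistency penalty) can be sketched as a single test-time loss. This is a minimal illustrative NumPy sketch, not the authors' implementation; the function name `cea_style_loss`, the weighting by max-probability confidence, and the `alpha` trade-off are all assumptions for illustration.

```python
import numpy as np

def softmax(logits, axis=-1):
    # numerically stable softmax over class scores
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cea_style_loss(logits, alpha=0.3):
    """Illustrative sketch (not the paper's exact loss).

    logits: (T, C) frame-level class scores for one utterance.
    Returns a scalar combining a confidence-weighted entropy term
    with a consistency penalty between neighboring frames.
    """
    p = softmax(logits)                          # (T, C) frame posteriors
    entropy = -(p * np.log(p + 1e-12)).sum(-1)   # per-frame entropy
    conf = p.max(-1)                             # frame confidence (max prob)
    weights = conf / conf.sum()                  # soft weights: high-entropy frames
                                                 # are down-weighted, never dropped
    ent_term = (weights * entropy).sum()
    # short-term consistency: neighboring frames should predict similarly
    consist = np.square(p[1:] - p[:-1]).sum(-1).mean() if len(p) > 1 else 0.0
    return ent_term + alpha * consist
```

In an actual TTA loop this loss would be minimized with gradients through the model's logits; here it only shows that confident, temporally stable predictions score lower than uncertain ones.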
```bash
cd CEA
pip install -r requirements.txt
```

To obtain the synthesized dataset LS-P, run:

```bash
python noisyspeech_synthesizer.py
```

Our main code is in ./main_cea.py and ./utils.py.
To run experiments on different datasets, use the scripts in scripts/:

```bash
bash ./scripts/ls-c.sh
bash ./scripts/ls-p.sh
bash ./scripts/l2.sh
bash ./scripts/dsing.sh
```

See each script for more details about the parameters.