
AV-DAR: Differentiable Room Acoustic Rendering with Multi-View Vision Priors

Official implementation of the ICCV 2025 Oral paper "Differentiable Room Acoustic Rendering with Multi-View Vision Priors."

Updates

  • Oct 17, 2025: Released our training & evaluation code.

Installation

Environment

  • Tested on Python 3.10 with PyTorch 2.4.1+cu118; other recent versions should also work.
  • Install dependencies
git clone https://github.com/HuMathe/av-dar.git
cd av-dar
conda create -n av-dar python=3.10 -y
conda activate av-dar
pip install --index-url https://download.pytorch.org/whl/cu118 \
    torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1
pip install -r requirements.txt
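
As an optional sanity check (not part of the official instructions), the following one-liner should print the pinned PyTorch version and confirm that CUDA is visible:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"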

Datasets

Download the Hearing Anything Anywhere (HAA) and Real Acoustic Field (RAF) datasets following their official instructions.

Then update two lines in config/base.yaml:

haa_data_dir: /path/to/HAA
raf_data_dir: /path/to/RAF/archive

Preprocessed vision features

We use precomputed multi-view image features and unproject them onto the room's sample points. By default, they are expected under:

preprocess/image-features/{haa,raf}/...

For EmptyRoom and FurnishedRoom, unzip the features.npy.zip files before training:

unzip features.npy.zip
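
To verify the unzipped features before training, here is a minimal sketch; the exact directory layout and array shape are determined by the preprocessing scripts, and the path below is only an example:

import numpy as np

# Example path only; adjust it to the room you downloaded.
features = np.load("preprocess/image-features/raf/EmptyRoom/features.npy", mmap_mode="r")
print(features.shape, features.dtype)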

Preprocessing scripts are provided:

preprocess/preprocess-haa.py
preprocess/preprocess-raf.py

We plan to release a fully automatic preprocessing pipeline.
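
Until then, the scripts can be run directly; check each script for the dataset paths and arguments it expects, as the exact interface may differ:

python preprocess/preprocess-haa.py
python preprocess/preprocess-raf.py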

Usage

Train (Hydra)

We use Hydra to manage configs:

# HAA — ClassroomBase @ 16 kHz
python train.py dataset=classroomBase-16K train=HAA-ClassroomBase-16K device=cuda:0

# Other HAA room types
# dataset=complexBase-16K | dampenedBase-16K | hallwayBase-16K
# train=HAA-ComplexBase-16K | HAA-DampenedBase-16K | HAA-HallwayBase-16K

RAF examples (16 kHz; configs for different sparsity splits are provided):

# Empty room
python train.py dataset=EmptyRoom-16K-0.1% train=RAF-Empty-16K-0.1% device=cuda:0
python train.py dataset=EmptyRoom-16K-1% train=RAF-Empty-16K-1% device=cuda:0

# Furnished room
python train.py dataset=FurnishedRoom-16K-0.1% train=RAF-Furnished-16K-0.1% device=cuda:0
python train.py dataset=FurnishedRoom-16K-1% train=RAF-Furnished-16K-1% device=cuda:0

Tip: setting the environment variable HYDRA_FULL_ERROR=1 prints full stack traces, which helps when debugging config merges.
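
For example, re-running a failing launch with full error traces enabled:

HYDRA_FULL_ERROR=1 python train.py dataset=classroomBase-16K train=HAA-ClassroomBase-16K device=cuda:0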

Evaluate

# evaluate a trained run directory
python evaluate.py --config_dir /path/to/your/training/run

Repository Structure

|-- av-dar
|   |-- core/   # io/run/typing...
|   |-- data/   # dataset loaders
|   |-- geometry/ # beam tracer
|   |-- model/  # renderer & sub-components...
|   `-- utils/
|-- config
|   |-- base.yaml
|   |-- dataset/ # data configs
|   `-- train/  # training configs (including model configs)
|-- data-split/ # data split JSON files
|-- evaluate.py
|-- mesh/*.obj  # room geometries for beam tracing
|-- preprocess/ # preprocess image features...
|-- README.md
`-- train.py

TODOs

  • Release the checkpoints for trained models.

Data Attribution & Licenses

  • RAF-derived meshes → CC BY-NC 4.0 (non-commercial). See details and change notes in ATTRIBUTION.md.
  • HAA-derived meshes (format conversion to .obj only) → CC BY 4.0. See details in ATTRIBUTION.md.

No endorsement by the original authors or licensors is implied.

License (Code)

This repository’s code is released under the MIT License. See LICENSE.

Citation

If you find this work useful, please cite our paper:

@InProceedings{Jin_2025_ICCV,
    author    = {Jin, Derong and Gao, Ruohan},
    title     = {Differentiable Room Acoustic Rendering with Multi-View Vision Priors},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {37-47}
}
