Interspeech Tutorial - Resource Efficient and Cross-Modal Learning Toward Foundation Modeling

huckiyang/Interspeech23-Tutorial-Para-Efficient-Cross-Modal-Tutorial


Interspeech 2023 Tutorial

  • Interspeech 23 Resource Efficient and Cross-Modal Learning Toward Foundation Modeling Tutorial - Video

  • ICASSP 22 Tutorial Neural Model Reprogramming and Prompting for Speech Modeling - Video | Slide

  • ICASSP 23 Tutorial Parameter-Efficient Learning (PEL) for Speech and NLP: Adapters, Prompts, and Reprogramming - Slide

Part 1. Overview of Resource Efficient Learning, Dr. Huck Yang

9:00

1.1. Parameter-Efficient Learning

  • Background of Frozen Model Adaptation
  • Neural Adapter, Reprogramming, Prompting, and Low-Rank Adaptation (LoRA)
| Title | Authors | Code | Year |
| --- | --- | --- | --- |
| Differentially Private Adapters for Parameter Efficient Acoustic Modeling | C.-W. Ho et al. | code | Interspeech 2023 |
| Parameter-Efficient Learning for Text-to-Speech Accent Adaptation | L.-J. Yang et al. | code | Interspeech 2023 |
| A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model | S. Radhakrishnan et al. | code | Interspeech 2023 |
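The adaptation methods above share one idea: keep the pre-trained weights frozen and train only a small number of new parameters. A minimal NumPy sketch of the LoRA formulation (illustrative only; the dimensions, rank r=4, and variable names are our assumptions, not from the tutorial materials):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 64, 64, 4, 8     # rank r << d; alpha is a scaling factor

W = rng.standard_normal((d_out, d_in))    # frozen pre-trained weight (never updated)
A = rng.standard_normal((r, d_in)) * 0.01 # trainable low-rank factor
B = np.zeros((d_out, r))                  # zero init: adaptation starts as a no-op

def lora_forward(x):
    """y = (W + (alpha / r) * B @ A) @ x, without materializing the merged matrix."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0, the adapted model reproduces the frozen base model exactly.
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B (2 * r * d parameters instead of d * d) receive gradients during fine-tuning, and at inference time the product B @ A can be merged back into W, adding no extra latency.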

1.2. Memory-Efficient Learning

  • Reducing GPU/TPU Memory During Training (e.g., Activation Memory)
  • Model Serialization
  • Efficient On-Device Learning via Feature Reprogramming (CVPR 2022)
  • Ladder-Side Tuning (NeurIPS 2022)
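Ladder-Side Tuning illustrates the memory-efficiency angle: a small trainable side network reads intermediate backbone activations through "ladder" shortcuts, so the backward pass never has to traverse the large frozen backbone. A rough forward-pass sketch in NumPy (layer sizes, the gating scheme, and all names are our assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_side, n_layers = 32, 8, 3  # side network is much narrower than the backbone

# Frozen backbone: a stack of tanh layers (no gradients ever flow here).
W_backbone = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_layers)]
# Trainable side network: narrow layers plus down-projections from the backbone.
W_side = [rng.standard_normal((d_side, d_side)) / np.sqrt(d_side) for _ in range(n_layers)]
W_down = [rng.standard_normal((d_side, d)) / np.sqrt(d) for _ in range(n_layers)]
gate = np.full(n_layers, 0.5)  # learned per-layer mixing weight

def forward(x):
    h = x
    s = np.zeros(d_side)
    for W, Ws, Wd, g in zip(W_backbone, W_side, W_down, gate):
        h = np.tanh(W @ h)                               # frozen backbone layer
        s = np.tanh(g * (Wd @ h) + (1 - g) * (Ws @ s))   # side layer mixes ladder input
    return s  # the prediction head sits on the side stream only

y = forward(rng.standard_normal(d))
```

Because only W_side, W_down, and gate require gradients, backpropagation (and the activations it must store) stays confined to the narrow side stream.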

1.3 How to Estimate Which Layer or Which Model to Tune?

  • Universal Approximation Theorem (IEEE TIP 1993)
  • LogME: Practical Assessment of Pre-trained Models for Transfer Learning (ICML 2021)
  • Latent Space Alignment in "Reprogramming Acoustic Models for Time Series Classification" (ICML 2021)
| Title | Authors | Code | Year |
| --- | --- | --- | --- |
| How to Estimate Model Transferability of Pre-Trained Speech Models? | Z.-C. Chen et al. | code | Interspeech 2023 |
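Methods like LogME follow a common recipe: score a candidate pre-trained model by how well its frozen features predict the target labels, without any fine-tuning. As a flavor of that recipe, here is a much cruder stand-in (a ridge-regression probe; this is not LogME's evidence maximization, and the data and names below are synthetic assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def probe_score(features, labels, lam=1.0):
    """Crude transferability proxy: R^2 of a ridge probe fit on frozen features.
    Higher means the features predict the target more linearly."""
    F = features - features.mean(axis=0)
    y = labels - labels.mean()
    w = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ y)
    resid = y - F @ w
    return 1.0 - resid @ resid / (y @ y)

n, d = 200, 16
F_good = rng.standard_normal((n, d))
y = F_good @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)  # labels aligned with F_good
F_bad = rng.standard_normal((n, d))  # features unrelated to the labels

# The model whose features align with the task gets the higher score.
assert probe_score(F_good, y) > probe_score(F_bad, y)
```

Ranking candidate checkpoints by such a score is far cheaper than fine-tuning each one; LogME replaces the ad-hoc R^2 here with a principled log marginal evidence.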

1.4 Advanced Low-Rank Adaptation (LoRA) Techniques

  • Cross-Modal Merging
  • Low-Rank Adaptation (LoRA) for Foundation Modeling and Pre-Training
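One reason LoRA composes well across tasks and modalities is that each adapter is just a low-rank matrix that can be merged into, or interpolated with, the shared frozen base weight. A minimal NumPy sketch of merging and mixing two hypothetical adapters (the weights and the 50/50 mix are assumptions for illustration, not a method from the tutorial):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 32, 4

W = rng.standard_normal((d, d))  # shared frozen base weight

def delta(A, B, alpha=8):
    """Low-rank update contributed by one adapter."""
    return (alpha / r) * (B @ A)

# Two independently trained adapters (e.g., one per task or modality).
A1, B1 = rng.standard_normal((r, d)), rng.standard_normal((d, r))
A2, B2 = rng.standard_normal((r, d)), rng.standard_normal((d, r))

W_task1 = W + delta(A1, B1)                               # merge: zero inference overhead
W_mixed = W + 0.5 * delta(A1, B1) + 0.5 * delta(A2, B2)   # simple adapter interpolation

# Merging is exact: the merged matrix reproduces the adapted forward pass.
x = rng.standard_normal(d)
assert np.allclose(W_task1 @ x, W @ x + delta(A1, B1) @ x)
```

Since every adapter lives in the same weight space as W, adapters trained on different modalities can be stored, swapped, or averaged without touching the base model.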

1.5 Community Service

  • Special Session in ICASSP 2024: In-Context Learning for Speech and Language Processing
  • [email protected]

Break: Hands-on Session 1 (5 min)

  • How to Train Your Whisper with Neural Adapter and LoRA

Part 2: Trustworthy AI and Cross-Modal Learning in the Era of Foundation Models, Dr. Pin-Yu Chen

11:00 to 11:45

Part 3: Multimodal Pre-Training for Automatic Speech Recognition and Vision Sharing, Dr. Shalini Ghosh

11:45 to 12:20

Spotlight Invited Talk, "Prompting LLM for ASR," by Dr. Chunyang Wu, Meta AI

12:20 to 12:30

Q&A and Plenary Discussion

12:40 to 12:45
