ICLR'24 Conference Memo
For details of each chapter, please refer to the corresponding subpage.
Overview
1. Sparse Attention and KV Cache Compression
(window attention?) LONGLORA: EFFICIENT FINE-TUNING OF LONG-CONTEXT LARGE LANGUAGE MODELS
(attention sink) Efficient Streaming Language Models with Attention Sinks
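The attention-sink entry above can be illustrated as a simple KV-cache eviction rule: keep the first few "sink" tokens plus a sliding window of recent tokens. This is a minimal sketch of that policy, not the paper's official code; `evict_kv_cache`, `n_sink`, and `window` are hypothetical names chosen here.

```python
def evict_kv_cache(cache, n_sink=4, window=8):
    """Return the KV-cache entries retained after eviction.

    cache: list of per-token KV entries, oldest first.
    Keeps the first n_sink entries (the attention sinks) and the
    most recent `window` entries; everything in between is dropped.
    """
    if len(cache) <= n_sink + window:
        return list(cache)  # nothing to evict yet
    return cache[:n_sink] + cache[-window:]

# Token positions 0..19 with 4 sink tokens and an 8-token window:
kept = evict_kv_cache(list(range(20)))
# positions 0-3 (sinks) survive alongside the most recent 8 tokens, 12-19
```

The memory cost is thus bounded by `n_sink + window` regardless of stream length, which is the property that makes streaming decoding feasible.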
2. Continual Learning
3. Hyperparameter Optimization (HPO)
4. Continuous Shifts
Latent Trajectory Learning for Limited Timestamps under Distribution Shift over Time
1. Param Optimization
👍👍 [Teleportation] Improving Convergence and Generalization Using Parameter Symmetries
2. OOD Generalization
3. Augmentation
👍👍 (Generalization bound & augmentation complexity?) Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
4. Data Selection
Improved Active Learning via Dependent Leverage Score Sampling (score sampling)
ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation (segmentation uncertainty for OOD/Active etc.); code
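For context on the leverage-score entry above: standard (independent) leverage-score sampling, which that paper extends to a dependent variant, can be sketched as below. This is a generic NumPy illustration of the classical scores, not the paper's pivotal/dependent sampler; `leverage_scores` is a name chosen here.

```python
import numpy as np

def leverage_scores(A):
    # The leverage score of row i is the squared norm of row i of U,
    # where A = U S V^T is a thin SVD; scores sum to rank(A).
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U ** 2, axis=1)

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))
scores = leverage_scores(A)
# each score lies in [0, 1]; sampling rows proportionally to them
# favors the rows that most influence the least-squares fit
probs = scores / scores.sum()
sample = rng.choice(len(A), size=5, replace=False, p=probs)
```

Active-learning use amounts to querying labels for the sampled rows; the dependent scheme in the paper correlates these draws instead of sampling independently.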
5. Pre-training