
Unveiling The Matthew Effect Across Channels: Assessing Layer Width Sufficiency via Weight Norm Variance

This is the official repo for the NeurIPS2024 paper "Unveiling The Matthew Effect Across Channels: Assessing Layer Width Sufficiency via Weight Norm Variance"

Brief Introduction to The Paper

In this paper, we show that the Matthew effect exists among similar channels: channels with larger weight norms receive larger gradients and are trained faster. We further show that wide and narrow layers exhibit two different patterns from the weight norm variance perspective. For narrow layers trained from scratch, the weight norm variance first increases and then decreases. For wide layers, the weight norm variance increases continuously until convergence.

(Figure: weight norm variance during training for narrow vs. wide layers)

Experiments are conducted across various architectures and datasets to verify this observation. We will provide the code we use to track the weight norm variance during training.
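
As a concrete illustration, the quantity discussed above can be computed as the variance of the per-output-channel L2 norms of a weight matrix. Below is a minimal PyTorch sketch; the function name `channel_norm_variance`, the choice of the L2 norm, and the example layer sizes are illustrative assumptions, not the repository's exact implementation.

```python
import torch

def channel_norm_variance(weight: torch.Tensor) -> float:
    """Variance of per-output-channel weight norms.

    Assumes the first dimension of `weight` indexes output channels
    (e.g. rows of a Linear layer or output filters of a Conv layer).
    """
    # Flatten everything except the channel dimension, then take the L2 norm per channel.
    channel_norms = weight.flatten(start_dim=1).norm(dim=1)
    # The variance of these norms across channels is the quantity tracked during training.
    return channel_norms.var().item()

# Example: a narrow vs. a wide linear layer (hypothetical sizes).
narrow = torch.nn.Linear(256, 128)
wide = torch.nn.Linear(256, 2048)
print(channel_norm_variance(narrow.weight), channel_norm_variance(wide.weight))
```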

Quick Start

We provide an example in awd-lstm-lm-master, modified from the awd-lstm-lm repo. In main.py, lines 330-430 calculate the channel norm variance and plot it. By changing the nhid argument, the different patterns can be observed. A rough outline of this tracking loop is sketched below.
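
For reference, here is a minimal, self-contained sketch of how tracking and plotting the variance over training might look. It is not the exact code from main.py; the toy model, the tracked layer, the epoch loop, and the output file name are illustrative assumptions.

```python
import torch
import matplotlib.pyplot as plt

# Toy stand-in model; in the repo this would be the AWD-LSTM whose width is set by --nhid.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 1024),  # hidden width plays the role of nhid
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 10),
)

def channel_norm_variance(weight: torch.Tensor) -> float:
    # Variance of per-output-channel L2 norms (see the sketch above).
    return weight.flatten(start_dim=1).norm(dim=1).var().item()

variance_history = []
for epoch in range(100):  # placeholder for the real training loop
    # ... one epoch of training would run here ...
    variance_history.append(channel_norm_variance(model[0].weight))

plt.plot(variance_history)
plt.xlabel("epoch")
plt.ylabel("channel norm variance")
plt.savefig("channel_norm_variance.png")  # illustrative output file name
```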

Code Update on the Way

The remaining code will be cleaned up and released soon.
