The Annals of Applied Probability
This article studies the Gram random matrix model G = (1/T) Σ^T Σ, Σ = σ(WX), classically found in the analysis of random feature maps and random neural networks, where X = [x_1, ..., x_T] ∈ R^{p×T} is a (data) matrix of bounded norm, W ∈ R^{n×p} is a matrix of independent zero-mean unit-variance entries, and σ : R → R is a Lipschitz continuous (activation) function, σ(WX) being understood entrywise. By means of a key concentration of measure lemma arising from non-asymptotic random matrix arguments, we prove that, as n, p, T grow large at the same rate, the resolvent Q = (G + γ I_T)^{-1}, for γ > 0, behaves similarly to its counterpart in sample covariance matrix models, involving notably the moment Φ = (T/n) E[G], which in passing provides a deterministic equivalent for the empirical spectral measure of G. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the underlying mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters. (Couillet's work is supported by the ANR Project RMT4GRAPH, ANR-14-CE28-0006.)
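As a point of reference for the notation above, the following minimal NumPy sketch builds the Gram matrix G = (1/T) σ(WX)^T σ(WX), its resolvent Q = (G + γ I_T)^{-1}, and the empirical spectral measure of G. The dimensions, the ReLU activation, and the Gaussian data are illustrative choices, not prescribed by the abstract.

```python
# Minimal sketch (not the authors' code) of the Gram model and its resolvent.
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 512, 256, 1024           # neurons, data dimension, number of samples
gamma = 1.0                        # resolvent regularization gamma > 0

X = rng.standard_normal((p, T)) / np.sqrt(p)   # data matrix (illustrative, bounded norm)
W = rng.standard_normal((n, p))                # i.i.d. zero-mean unit-variance weights
Sigma = np.maximum(W @ X, 0.0)                 # sigma(WX), entrywise Lipschitz activation (ReLU)

G = Sigma.T @ Sigma / T                        # Gram matrix, size T x T
Q = np.linalg.inv(G + gamma * np.eye(T))       # resolvent Q = (G + gamma I_T)^{-1}

eigvals = np.linalg.eigvalsh(G)                # empirical spectral measure of G
print("normalized trace of Q (Stieltjes transform at -gamma):", np.trace(Q) / T)
print("spectrum range of G:", eigvals.min(), eigvals.max())
```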
2018
This article provides a theoretical analysis of the asymptotic performance of a regression or classification task performed by a simple random neural network. This result is obtained by leveraging a new framework at the crossroads between random matrix theory and concentration of measure theory. This approach is of utmost interest for neural network analysis at large in that it naturally dismisses the difficulty induced by the non-linear activation functions, so long as these are Lipschitz functions. As an application, we provide formulas for the limiting law of the random neural network output and compare them conclusively to those obtained in practice on handwritten digit databases.
Random Matrices: Theory and Applications
We study the eigenvalue distribution of random matrices pertinent to the analysis of deep neural networks. The matrices resemble products of sample covariance matrices; an important difference, however, is that the analog of the population covariance matrix is now a function of random data matrices (synaptic weight matrices in the deep neural network terminology). The problem has been treated in recent work [J. Pennington, S. Schoenholz and S. Ganguli, The emergence of spectral universality in deep networks, Proc. Mach. Learn. Res. 84 (2018) 1924–1932, arXiv:1802.09979] by using the techniques of free probability theory. Since, however, free probability theory deals with population covariance matrices which are independent of the data matrices, its applicability in this case has to be justified. The justification has been given in [L. Pastur, On random matrices arising in deep neural networks: Gaussian case, Pure Appl. Funct. Anal. (2020), in press, arXiv:2001.06188] for Gauss...
arXiv: Mathematical Physics, 2020
The paper deals with the distribution of singular values of products of random matrices arising in the analysis of deep neural networks. The matrices resemble product analogs of sample covariance matrices; an important difference, however, is that the population covariance matrices, which are assumed to be non-random in the standard setting of statistics and random matrix theory, are now random and, moreover, are certain functions of random data matrices. The problem has been considered in recent work [21] by using the techniques of free probability theory. Since, however, free probability theory deals with population matrices which are independent of the data matrices, its applicability in this case requires an additional justification. We present this justification by using a version of the standard techniques of random matrix theory under the assumption that the entries of data matrices are independent Gaussian random variables. In the subsequent paper [18] we extend our results to...
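For concreteness, the sketch below propagates a Gaussian data matrix through a few random layers and collects the singular values of the resulting product-type matrix. The widths, depth, and tanh activation are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: singular values of data propagated through random layers.
import numpy as np

rng = np.random.default_rng(2)
p, T, L = 300, 600, 3              # layer width, number of samples, depth (illustrative)

X = rng.standard_normal((p, T)) / np.sqrt(p)       # input data matrix
for _ in range(L):
    W = rng.standard_normal((p, p)) / np.sqrt(p)   # i.i.d. Gaussian weight matrix
    X = np.tanh(W @ X)                             # entrywise (Lipschitz) activation

sv = np.linalg.svd(X / np.sqrt(T), compute_uv=False)
print("largest / smallest singular value:", sv[0], sv[-1])
```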
Physica A: Statistical Mechanics and its Applications, 2022
We investigate the local spectral statistics of the loss surface Hessians of artificial neural networks, where we discover agreement with Gaussian Orthogonal Ensemble statistics across several network architectures and datasets. These results shed new light on the applicability of Random Matrix Theory to modelling neural networks and suggest a role for it in the study of loss surfaces in deep learning.
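For readers unfamiliar with the baseline invoked here, the sketch below computes the adjacent-gap ratios of a sampled Gaussian Orthogonal Ensemble matrix, whose mean is known to be roughly 0.53; this is only the reference statistic, not the authors' Hessian pipeline.

```python
# Illustrative GOE reference for local spectral (gap-ratio) statistics.
import numpy as np

rng = np.random.default_rng(4)
N = 1000
A = rng.standard_normal((N, N))
H = (A + A.T) / np.sqrt(2 * N)                 # symmetric GOE-normalized matrix

eig = np.sort(np.linalg.eigvalsh(H))
gaps = np.diff(eig)
r = np.minimum(gaps[1:], gaps[:-1]) / np.maximum(gaps[1:], gaps[:-1])
print("mean adjacent-gap ratio (GOE reference ~0.53):", r.mean())
```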
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017
This article proposes an original approach to the performance understanding of large dimensional neural networks. In this preliminary study, we study a single hidden layer feed-forward network with random input connections (also called extreme learning machine) which performs a simple regression task. By means of a new random matrix result, we prove that, as the size and cardinality of the input data and the number of neurons grow large, the network performance is asymptotically deterministic. This entails a better comprehension of the effects of the hyper-parameters (activation function, number of neurons, etc.) under this simple setting, thereby paving the path to the harnessing of more involved structures.
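Below is a minimal sketch of the network class studied in this line of work: input weights are random and fixed, and only the output layer is learned by ridge regression. The ReLU features, regularization value, and synthetic data are assumptions for illustration, not choices made in the article.

```python
# Sketch of a single-hidden-layer network with random input weights ("extreme learning machine").
import numpy as np

rng = np.random.default_rng(1)
n, p, T = 256, 64, 2000            # neurons, input dimension, training samples (illustrative)
gamma = 1e-1                       # ridge regularization

X = rng.standard_normal((p, T))                    # training inputs
y = np.sign(X[0])                                  # toy regression/classification target
W = rng.standard_normal((n, p)) / np.sqrt(p)       # fixed random input weights
Sigma = np.maximum(W @ X, 0.0)                     # hidden-layer features sigma(WX)

# Only the output weights beta are learned (ridge regression on the random features).
beta = np.linalg.solve(Sigma @ Sigma.T / T + gamma * np.eye(n), Sigma @ y / T)

y_hat = beta @ Sigma                               # in-sample predictions
print("training MSE:", np.mean((y_hat - y) ** 2))
```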
arXiv: Probability, 2019
The present work provides an original framework for random matrix analysis based on revisiting the concentration of measure theory for random vectors. By providing various notions of vector concentration (q-exponential, linear, Lipschitz, convex), a set of elementary tools is laid out that allows for the immediate extension of classical results from random matrix theory involving random concentrated vectors in place of vectors with independent entries. These findings are exemplified here in the context of sample covariance matrices but find a large range of applications in statistical learning and beyond, starting with the capacity to easily analyze the performance of artificial neural networks and random feature maps.
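A toy illustration of the Lipschitz concentration property this framework revolves around (the example and parameters are ours, not the paper's): a 1-Lipschitz functional of a high-dimensional Gaussian vector has dimension-free fluctuations.

```python
# Toy demonstration of Lipschitz concentration for Gaussian vectors.
import numpy as np

rng = np.random.default_rng(5)
for p in (100, 1000, 10000):
    z = rng.standard_normal((2000, p))      # 2000 samples of a standard Gaussian in R^p
    f = np.linalg.norm(z, axis=1)           # the Euclidean norm is 1-Lipschitz
    print(p, "std of f(z):", f.std())       # stays near 0.7 while E[f] grows like sqrt(p)
```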
2016
We study the asymptotic law of a network of interacting neurons when the number of neurons becomes infinite. Given a completely connected network of neurons in which the synaptic weights are Gaussian correlated random variables, we describe the asymptotic law of the network when the number of neurons goes to infinity. We introduce the process-level empirical measure of the trajectories of the solutions to the equations of the finite network of neurons and the averaged law (with respect to the synaptic weights) of the trajectories of the solutions to the equations of the network of neurons. The main result of this article is that the image law through the empirical measure satisfies a large deviation principle with a good rate function which is shown to have a unique global minimum. Our analysis of the rate function allows us also to characterize the limit measure as the image of a stationary Gaussian measure defined on a transformed set of trajectories.
Entropy, 2015
We study the asymptotic law of a network of interacting neurons when the number of neurons becomes infinite. Given a completely connected network of neurons in which the synaptic weights are Gaussian correlated random variables, we describe the asymptotic law of the network when the number of neurons goes to infinity. All previous works assumed that the weights were i.i.d. random variables, thereby making the analysis much simpler. This hypothesis is not realistic from the biological viewpoint. In order to cope with this extra complexity we introduce the process-level empirical measure of the trajectories of the solutions to the equations of the finite network of neurons and the averaged law (with respect to the synaptic weights) of the trajectories of the solutions to the equations of the network of neurons. The main result of this article is that the image law through the empirical measure satisfies a large deviation principle with a good rate function which is shown to have a unique global minimum. Finally, our analysis of the rate function allows us also to describe this minimum as a stationary Gaussian measure which completely characterizes the activity of the infinite size network.
Comptes Rendus Mathematique, 2014
arXiv (Cornell University), 2023
In this paper we provide explicit upper bounds on some distances between the (law of the) output of a random Gaussian neural network and (the law of) a random Gaussian vector. Our main results concern deep random Gaussian neural networks, with a rather general activation function. The upper bounds show how the widths of the layers, the activation function and other architecture parameters affect the Gaussian approximation of the output. Our techniques, relying on Stein's method and integration by parts formulas for the Gaussian law, yield estimates on distances which are indeed integral probability metrics, and include the convex distance. This latter metric is defined by testing against indicator functions of measurable convex sets, and so allows for accurate estimates of the probability that the output is localized in some region of the space. Such estimates have a significant interest both from a practitioner's and a theorist's perspective.
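The sketch below samples the output of a deep random Gaussian network at a fixed input, as a starting point for the kind of empirical comparison with a Gaussian vector discussed above. The widths, depth, tanh activation, and bias scale are illustrative assumptions, not the paper's setting.

```python
# Hedged sketch: repeated draws of a random Gaussian network's output at a fixed input.
import numpy as np

rng = np.random.default_rng(3)
widths = [50, 200, 200, 200, 10]       # input dim, three hidden widths, output dim
x = rng.standard_normal(widths[0])     # fixed input

def random_gaussian_network(x, widths, rng):
    h = x
    for i, (n_in, n_out) in enumerate(zip(widths[:-1], widths[1:])):
        W = rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)  # Gaussian weights
        b = 0.1 * rng.standard_normal(n_out)                    # Gaussian biases
        h = W @ h + b
        if i < len(widths) - 2:        # activation on hidden layers only
            h = np.tanh(h)
    return h

samples = np.array([random_gaussian_network(x, widths, rng) for _ in range(2000)])
print("empirical output mean:", np.round(samples.mean(axis=0), 3))
print("empirical output variances:", np.round(np.cov(samples.T).diagonal(), 3))
```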