Overview
- Neural architectures
- Training neural nets
  - Forward pass: tensor operations
  - Backward pass: backpropagation
- Neural network design
  - Activation functions
  - Weight initialization
  - Optimizers
- Neural networks in practice
  - Model selection
  - Early stopping
  - Memorization capacity and information bottleneck
  - L1/L2 regularization
  - Dropout
  - Batch normalization
Architecture
- Logistic regression, drawn in a different, neuro-inspired way
- Linear model: the weighted sum $z$ is the inner product of the input vector $\mathbf{x}$ and the weight vector $\mathbf{w}$, plus a bias $w_0$
- Logistic (or sigmoid) function maps $z$ to a probability in $[0,1]$
- Uses log loss (cross-entropy) and gradient descent to learn the weights (see the sketch after the formula below)
$$\hat{y}(\mathbf{x}) = \text{sigmoid}(z) = \text{sigmoid}(w_0 + \mathbf{w} \cdot \mathbf{x}) = \text{sigmoid}(w_0 + w_1 x_1 + w_2 x_2 + \ldots + w_p x_p)$$
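To make this concrete, below is a minimal NumPy sketch of logistic regression trained with batch gradient descent on the log loss $L = -\frac{1}{n}\sum_{i} \left[ y_i \log \hat{y}_i + (1-y_i) \log (1-\hat{y}_i) \right]$. The toy data, learning rate, and step count are illustrative assumptions, not from the lecture.

```python
# Minimal sketch: logistic regression trained with batch gradient descent.
# The toy data, learning rate, and step count are illustrative assumptions.
import numpy as np

def sigmoid(z):
    # Map raw scores z to probabilities in [0, 1]
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))              # 100 samples, p = 2 features
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # linearly separable toy labels

w = np.zeros(X.shape[1])                   # weight vector w
w0 = 0.0                                   # bias w0
lr, n_steps = 0.1, 500

for _ in range(n_steps):
    y_hat = sigmoid(w0 + X @ w)            # forward pass: sigmoid(w0 + w . x)
    grad_z = y_hat - y                     # dL/dz of the log loss, per sample
    w -= lr * X.T @ grad_z / len(X)        # gradient descent step on the weights
    w0 -= lr * grad_z.mean()               # ... and on the bias

eps = 1e-12                                # avoid log(0) when reporting the loss
loss = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
print(f"log loss after training: {loss:.3f}")
```

The update is this compact because, for the sigmoid combined with the log loss, the gradient with respect to the raw score simplifies to $\partial L / \partial z = \hat{y} - y$.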