Lecture 6. Neural Networks

How to train your neurons

Joaquin Vanschoren

Overview

  • Neural architectures
  • Training neural nets
    • Forward pass: Tensor operations
    • Backward pass: Backpropagation
  • Neural network design
    • Activation functions
    • Weight initialization
    • Optimizers
  • Neural networks in practice
  • Model selection
    • Early stopping
    • Memorization capacity and information bottleneck
    • L1/L2 regularization
    • Dropout
    • Batch normalization

Architecture

  • Logistic regression, drawn in a different, neuro-inspired way
    • Linear model: inner product ($z$) of input vector $\mathbf{x}$ and weight vector $\mathbf{w}$, plus bias $w_0$
    • Logistic (or sigmoid) function maps the output to a probability between 0 and 1
    • Uses log loss (cross-entropy) and gradient descent to learn the weights

$$\hat{y}(\mathbf{x}) = \text{sigmoid}(z) = \text{sigmoid}(w_0 + \mathbf{w}^{T}\mathbf{x}) = \text{sigmoid}(w_0 + w_1 x_1 + w_2 x_2 + \ldots + w_p x_p)$$
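
To make this concrete, here is a minimal NumPy sketch of the forward pass and of one gradient-descent step on the log loss. The helper names (`sigmoid`, `predict`), the learning rate, and the toy numbers are ours, chosen to mirror the formula above; this is an illustration, not code from the lecture.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w, w0):
    """Forward pass: inner product of inputs x and weights w,
    plus bias w0, passed through the sigmoid."""
    return sigmoid(w0 + np.dot(w, x))

# Toy data (made-up numbers): one example with p = 3 inputs and label y = 1
x, y = np.array([1.0, 2.0, -1.0]), 1.0
w, w0 = np.array([0.5, -0.25, 0.1]), 0.2

y_hat = predict(x, w, w0)        # ~0.525 before the update

# One gradient-descent step on the log loss:
#   dL/dw = (y_hat - y) * x,   dL/dw0 = (y_hat - y)
lr = 0.1
w -= lr * (y_hat - y) * x
w0 -= lr * (y_hat - y)

print(predict(x, w, w0))         # ~0.607: the probability moves toward y = 1
```

Averaging the same update over a (mini-)batch of examples gives the gradient descent procedure referred to above.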