Convolutional Neural Networks (AlexNet)
Krizhevsky et al. (2012)
Prepared for Academic Purposes
Abstract & Background
In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton introduced AlexNet, a deep
convolutional neural network that won the ImageNet Large Scale Visual Recognition Challenge
(ILSVRC) by a margin of more than 10 percentage points in top-5 error. This result marked a turning
point in computer vision, where hand-engineered features were overtaken by features learned end to
end with deep learning.
Key Contributions
AlexNet was notable for several innovations: the use of Rectified Linear Units (ReLU) as activation
functions, which trained faster than saturating units such as tanh; dropout regularization in the
fully connected layers to combat overfitting; and GPU computation to accelerate training. Its
architecture consisted of five convolutional layers followed by three fully connected layers, with
max pooling and local response normalization after some of the convolutional layers. AlexNet
achieved a top-5 error rate of 15.3% in ILSVRC 2012, more than 10 percentage points below the 26.2%
of the second-best entry.
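The two element-wise operations named above can be illustrated with a minimal sketch in plain Python. This is not the authors' GPU code: the function names relu and dropout, the list-based inputs, and the rng parameter are assumptions made for illustration. The dropout variant follows the scheme described in the paper (zero each unit with probability 0.5 during training, keep every unit and scale outputs by 0.5 at test time).

```python
import random


def relu(values):
    """Rectified Linear Unit: max(0, x) applied to each element."""
    return [max(0.0, v) for v in values]


def dropout(values, p=0.5, training=True, rng=random):
    """Dropout in the style described by Krizhevsky et al. (2012).

    Training: each unit is set to zero with probability p.
    Test time: every unit is kept and its output is scaled by (1 - p),
    so the expected activation matches what the next layer saw in training.
    """
    if training:
        return [0.0 if rng.random() < p else v for v in values]
    return [v * (1.0 - p) for v in values]


if __name__ == "__main__":
    activations = relu([-1.0, 0.0, 2.5])
    print(activations)                            # negative inputs become 0.0
    print(dropout(activations, training=True))    # random units zeroed
    print(dropout(activations, training=False))   # all units kept, scaled by 0.5
```

Many current frameworks instead use "inverted" dropout, which rescales the surviving units during training and leaves test-time outputs untouched. The two schemes agree in expectation.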
Critical Analysis
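AlexNet’s success demonstrated the scalability of deep learning. It showed that with large datasets,
appropriate regularization, and computational power, neural networks could outperform traditional
methods. However, the model was extremely resource-intensive for its time: it had roughly 60
million parameters and took five to six days to train on two GPUs. Its interpretability also
remained limited. Later architectures built on these foundations: VGG [2] went deeper using small
3x3 filters, ResNet [3] added residual connections that made much deeper networks trainable, and
EfficientNet scaled depth, width, and input resolution together for better efficiency.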
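To make the resource cost concrete, the following sketch walks through the five convolutional and three fully connected layers and counts learnable parameters. Several details are assumptions rather than statements from this report: the 227x227 input size (the paper states 224, but 227 is what makes the layer arithmetic work out), the padding values, and the grouping of conv2, conv4, and conv5 into two halves to mirror the paper's two-GPU split. The names conv_out, CONV_LAYERS, FC_LAYERS, and count_parameters are hypothetical.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, out_channels, kernel, stride, pad, groups, max_pool_after)
# groups=2 models layers whose kernels only see the feature maps on their own GPU.
CONV_LAYERS = [
    ("conv1", 96, 11, 4, 0, 1, True),
    ("conv2", 256, 5, 1, 2, 2, True),
    ("conv3", 384, 3, 1, 1, 1, False),
    ("conv4", 384, 3, 1, 1, 2, False),
    ("conv5", 256, 3, 1, 1, 2, True),
]
FC_LAYERS = [("fc6", 4096), ("fc7", 4096), ("fc8", 1000)]


def count_parameters(input_size=227, in_channels=3):
    """Return per-layer (name, spatial size, channels, params) and the total."""
    size, channels, total = input_size, in_channels, 0
    report = []
    for name, out_ch, kernel, stride, pad, groups, pool in CONV_LAYERS:
        # Each kernel spans kernel x kernel x (input channels visible to its group).
        params = kernel * kernel * (channels // groups) * out_ch + out_ch
        size = conv_out(size, kernel, stride, pad)
        if pool:
            size = conv_out(size, 3, 2)  # overlapping 3x3 max pooling, stride 2
        channels = out_ch
        total += params
        report.append((name, size, channels, params))
    features = channels * size * size  # flattened input to the first dense layer
    for name, units in FC_LAYERS:
        params = features * units + units  # weights plus biases
        total += params
        report.append((name, 1, units, params))
        features = units
    return report, total


if __name__ == "__main__":
    layers, total = count_parameters()
    for name, size, channels, params in layers:
        print(f"{name}: {size}x{size}x{channels}, {params:,} parameters")
    print(f"total: {total:,}")
```

Under these assumptions the total comes to roughly 61 million parameters, in line with the approximately 60 million reported in the paper, and the three fully connected layers account for the large majority of them.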
Personal Reflection
To me, AlexNet symbolizes a paradigm shift in AI research. It showed that deep learning was not
just an academic curiosity but a practical tool capable of solving large-scale problems. Its influence
can still be felt today, as most modern convolutional vision models trace their lineage back to AlexNet. I
believe its greatest lesson is the importance of combining algorithmic innovation with computational
resources and large datasets.
References
[1] A. Krizhevsky, I. Sutskever and G. E. Hinton, 'ImageNet Classification with Deep Convolutional
Neural Networks,' Advances in Neural Information Processing Systems (NeurIPS), 2012.
[2] K. Simonyan and A. Zisserman, 'Very Deep Convolutional Networks for Large-Scale Image
Recognition,' International Conference on Learning Representations (ICLR), 2015.
[3] K. He, X. Zhang, S. Ren and J. Sun, 'Deep Residual Learning for Image Recognition,' IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), 2016.