DL Lecture Part1v1
A two-week lecture
Part 1
Why GPU? (I)
More layers and more training samples mean more execution time.
[Figure: success rate of big, medium, and small NNs compared with classic ML]
[Figure: inputs X1, X2 passed through feature extraction into an ML algorithm]
https://github.com/terryum/awesome-deep-learning-papers
State-of-the-art
Computer Vision
● Image Segmentation
● Image Classification
● Object Detection
● Image Generation
Speech
● Speech Recognition
● Speech Synthesis
● Speech Enhancement
● Speaker Verification
Music
● Music Generation
● Music Information Retrieval
● Music Source Separation
● Music Modeling
AI in marketing & sales: Propensity to buy
The problem
● A lack of knowledge about a customer’s propensity to buy.
● “Propensity to buy” is the likelihood of a customer to purchase a particular product.
What can be achieved?
● Classify potential customers by their likelihood to purchase a particular product.
● This can be integrated into marketing and sales strategies.
Opportunity for DL
● Model using a combination of semantic analysis:
○ Text written by the customer,
○ Demographic information,
○ Purchase history,
○ Information about how they navigate the website
to make a prediction for that customer’s propensity to buy.
Data requirements
● A model like this would need historical data on the demographics and pre-purchase behavior of customers, linked to whether a purchase was made.
Source: https://peltarion.com/use-cases/propensity-to-buy
Using AI to detect fraud
Source: https://peltarion.com/use-cases/fraud-detection
Automated defect detection
The problem
● Product quality testing is slow and inefficient (bottlenecks).
● Traditional automated systems are both expensive and difficult to
Opportunity for DL
● DL for a fully automated production line, enabling more accurate analysis of the quality of each individual part.
Data requirements
● Trained on images of manufactured parts,
Source: https://peltarion.com/use-cases/defect-detection
Audio analysis for industrial maintenance
A key part of smart manufacturing and a modern factory approach involves real-time monitoring of machinery operating conditions.
What can be achieved?
● Using DL to detect malfunctioning machinery in real time can increase productivity and decrease costs.
Data requirements
● Audio recordings
○ of functioning or malfunctioning machinery.
● Microphones mounted in key parts of each machine.
[Figure: Mel-spectrogram of an industrial solenoid valve]
Source: https://peltarion.com/use-cases/machinery-operating-conditions
Improving customer service through sentiment
Automated customer service phone calls and chatbots are becoming increasingly easy to interact with.
The problem
● Frustration associated with bad experiences can have a significant impact on customer retention.
Opportunity for DL
● Natural language processing (NLP) is ideal for gaining insight into the user experience in customer service interactions.
Data requirements
● Text or audio from historical examples
○ successful and unsuccessful automated customer service interactions
Source: https://peltarion.com/use-cases/customer-service-sentiment-analysis
Main researchers in Deep Learning
● Samy Bengio https://research.google.com/pubs/bengio.html
● Yoshua Bengio http://www.iro.umontreal.ca/~bengioy/yoshua_en/research.html
● Thomas Dean https://research.google.com/pubs/author189.html
● Jeffrey Dean https://research.google.com/pubs/jeff.html
● Nando de Freitas https://www.cs.ox.ac.uk/people/nando.defreitas/
● Geoff Hinton http://www.cs.toronto.edu/~hinton/
● Yann LeCun http://yann.lecun.com/
● Andrew Ng http://www.andrewng.org/
● Quoc Le, Honglak Lee, Tommy Poggio, ...
Resources
● Aurélien Géron. Hands-On Machine Learning with Scikit-Learn and TensorFlow. 2017 (✰✰✰✰✰)
● François Chollet. Deep Learning with Python. 2017 (✰✰✰✰)
○ Practitioner’s approach. Keras implementation per topic
● Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning (Adaptive Computation and Machine Learning series). 2015 (✰✰)
○ Theoretical book. There is no code covered in the book.
● Michael Nielsen. Neural Networks and Deep Learning
○ Theory-based learning approach. Some code snippets.
● Gulli and Kapoor. TensorFlow Deep Learning Cookbook.
○ Lots of code and explanations of what the code is doing
● Adrian Rosebrock. Deep Learning for Computer Vision with Python.
● Sandro Skansi. Introduction to Deep Learning: From Logical Calculus to Artificial Intelligence. 2018
● Andriy Burkov. The Hundred-Page Machine Learning Book.
○ All started because of challenge accepted
● Andrew Ng. Machine Learning Yearning: Technical strategy for AI engineers, in the era of Deep Learning.
● Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin. Learning from Data: A short course.
○ Supplement with lectures and videos.
Libraries for Deep Learning
Lecture 01
Sebastian Raschka
http://stat.wisc.edu/~sraschka/teaching/stat453-ss2020/
Supervised Learning
• Labeled data
• Direct feedback
• Predict outcome/future
Unsupervised Learning
• No labels/targets
• No feedback
• Find hidden structure in data
Reinforcement Learning
• Decision process
• Reward system
• Learn series of actions
Source: Raschka and Mirjalily (2019). Python Machine Learning, 3rd Edition
Sebastian Raschka STAT 453: Intro to Deep Learning SS 2020 29
Machine Learning
Terminology and Notation
• supervised learning: learn a function to map input x (features) to output y (targets)
• structured data: databases, spreadsheets/csv files
• unstructured data: features like image pixels, audio signals, text sentences (before DL, extensive feature engineering was required)
"training examples"
Classification: $h : \mathbb{R}^m \to \mathcal{Y}$, $\mathcal{Y} = \{1, ..., k\}$
Regression: $h : \mathbb{R}^m \to \mathbb{R}$
m= _____
n= _____
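To make the notation concrete, here is a minimal Python/NumPy sketch. It assumes the convention used later in these slides, where n is the number of training examples and m the number of features. The toy data and the two hypothesis functions `h_classify` and `h_regress` are hypothetical, chosen only to illustrate the two output types.

```python
import numpy as np

# Toy design matrix (made-up values): n = 4 training examples, m = 2 features.
X = np.array([[0.5, 1.0],
              [1.5, 2.0],
              [3.0, 0.5],
              [2.0, 2.5]])
n, m = X.shape

def h_classify(x):
    """Hypothetical classifier h: R^m -> {1, 2} (k = 2 class labels)."""
    return 1 if x[0] > x[1] else 2

def h_regress(x):
    """Hypothetical regressor h: R^m -> R (a real-valued target)."""
    return float(0.5 * x[0] + 2.0 * x[1])

labels = [h_classify(x) for x in X]   # discrete class labels
targets = [h_regress(x) for x in X]   # continuous targets

print(n, m)
print(labels)
print(targets)
```

The only difference between the two tasks here is the codomain of h: a finite label set for classification, the real numbers for regression.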
"traditional methods"
[Figure: machine learning workflow. Elements: Labels, Training Dataset, Learning Algorithm (Model Selection, Cross-Validation, Performance Metrics, Hyperparameter Optimization), Final Model, New Data, Labels]
Source: Raschka and Mirjalily (2019). Python Machine Learning, 3rd Edition
Lecture 05
Sebastian Raschka STAT 453: Intro to Deep Learning and Generative Models SS 2020 1
Perceptron Recap
[Figure: perceptron with inputs $x_1, x_2, \dots, x_m$, weights $w_1, w_2, \dots, w_m$, and bias $b$, feeding the net input, the activation, and the output $\hat{y}$]
Net input: $z = \sum_{i=1}^{m} x_i w_i + b = \mathbf{x}^T \mathbf{w} + b$
Activation: $\sigma(z) = \begin{cases} 0, & \text{if } z \le 0 \\ 1, & \text{if } z > 0 \end{cases}$
Output: $\hat{y} = \sigma(z)$, with $b = -\theta$
Let $\mathcal{D} = (\langle \mathbf{x}^{[1]}, y^{[1]} \rangle, \langle \mathbf{x}^{[2]}, y^{[2]} \rangle, ..., \langle \mathbf{x}^{[n]}, y^{[n]} \rangle) \in (\mathbb{R}^m \times \{0, 1\})^n$
"On-line" mode
1. Initialize $\mathbf{w} := \mathbf{0} \in \mathbb{R}^m$, $b := 0$
This applies to all common neuron models and (deep) neural network
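The perceptron's forward pass recapped above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the lecture's own code: the function names and the example weights are hypothetical, and the threshold convention (output 0 for z ≤ 0, 1 for z > 0) follows the activation as reconstructed from the slide.

```python
import numpy as np

def net_input(x, w, b):
    """Net input z = x^T w + b."""
    return float(np.dot(x, w) + b)

def threshold(z):
    """Threshold activation: 0 if z <= 0, 1 if z > 0."""
    return 1 if z > 0 else 0

def perceptron_predict(x, w, b):
    """Perceptron output y_hat = sigma(x^T w + b)."""
    return threshold(net_input(x, w, b))

# Hypothetical parameters: threshold theta = 0.5, so the bias is b = -theta.
w = np.array([1.0, -1.0])
b = -0.5

print(perceptron_predict(np.array([2.0, 1.0]), w, b))  # net input z = 0.5
print(perceptron_predict(np.array([1.0, 1.0]), w, b))  # net input z = -0.5
```

Folding the threshold into the bias (b = −θ) is what lets the activation compare the net input against 0 rather than against θ.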
General Learning Principle
Let $\mathcal{D} = (\langle \mathbf{x}^{[1]}, y^{[1]} \rangle, \langle \mathbf{x}^{[2]}, y^{[2]} \rangle, ..., \langle \mathbf{x}^{[n]}, y^{[n]} \rangle) \in (\mathbb{R}^m \times \{0, 1\})^n$
[Figure: linear neuron with inputs $x_1, x_2, \dots, x_m$, weights $w_1, w_2, \dots, w_m$, and bias $b$, feeding the net input, the activation, and the output $\hat{y}$]
You can think of linear regression as a linear neuron!
Linear Regression: Activation function is the identity function, $\sigma(x) = x$
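As a small illustration of the "linear neuron" view, the sketch below computes the net input and passes it through the identity activation, so the output is a real number rather than a class label. The function names and parameter values are hypothetical.

```python
import numpy as np

def identity(z):
    """Identity activation: sigma(z) = z."""
    return z

def linear_neuron(x, w, b):
    """Linear regression as a neuron: y_hat = sigma(x^T w + b) with sigma = identity."""
    return identity(np.dot(x, w) + b)

# Hypothetical weights, bias, and input.
w = np.array([2.0, -1.0])
b = 0.5
x = np.array([1.0, 3.0])

print(linear_neuron(x, w, b))  # 2*1 + (-1)*3 + 0.5
```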
(Least-Squares) Linear Regression iteratively
• A very naive way to fit a linear regression model (and any neural net) is to start by initializing the parameters to 0s or small random values
• Then, for k rounds
• Choose another random set of weights
• If the model performs better, keep those weights
• If the model performs worse, discard the weights
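The naive procedure above can be sketched as a random search. This is a minimal illustration under stated assumptions: "performs better" is measured with the mean squared error, candidate weights are drawn from a normal distribution with a hypothetical scale, and the function names, seed, and toy data are made up for the example.

```python
import numpy as np

def mse(X, y, w, b):
    """Mean squared error of the linear model X @ w + b."""
    return float(np.mean((X @ w + b - y) ** 2))

def random_search_fit(X, y, k=2000, scale=3.0, seed=0):
    """Keep the best of k randomly drawn parameter sets."""
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0      # start from zeros
    best = mse(X, y, w, b)
    for _ in range(k):
        # Choose another random set of weights (assumed: normal draws).
        w_new = rng.normal(0.0, scale, size=X.shape[1])
        b_new = rng.normal(0.0, scale)
        loss = mse(X, y, w_new, b_new)
        if loss < best:                   # better: keep those weights
            w, b, best = w_new, b_new, loss
        # worse: the candidate is simply discarded
    return w, b, best

# Toy data generated from y = 2x + 1.
X = np.arange(5, dtype=float).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0

initial_loss = mse(X, y, np.zeros(1), 0.0)
w, b, best = random_search_fit(X, y)
print(initial_loss, best, w, b)
```

Each round ignores everything learned so far except the current best, which is why this approach scales poorly as the number of parameters grows.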
(Least-Squares) Linear Regression
The update rule turns out to be this:
"On-line" mode
Perceptron learning rule vs. stochastic gradient descent
1. Initialize $\mathbf{w} := \mathbf{0} \in \mathbb{R}^m$, $b := 0$
(a) $\hat{y}^{[i]} := \mathbf{x}^{[i]T} \mathbf{w} + b$
$\nabla_b \mathcal{L} = -(y^{[i]} - \hat{y}^{[i]})$
$b := b + \eta \times (-\nabla_b \mathcal{L})$
where $\eta$ is the learning rate and $-\nabla_b \mathcal{L}$ is the negative gradient
(Least-Squares) Linear Regression
The update rule turns out to be this:
"On-line" mode
(a) $\hat{y}^{[i]} := \mathbf{x}^{[i]T} \mathbf{w} + b$
B. For weight $j$ in $\{1, ..., m\}$:
(b) $\frac{\partial \mathcal{L}}{\partial w_j} = -(y^{[i]} - \hat{y}^{[i]}) x_j^{[i]}$
(c) $w_j := w_j + \eta \times (-\frac{\partial \mathcal{L}}{\partial w_j})$
C. $\frac{\partial \mathcal{L}}{\partial b} = -(y^{[i]} - \hat{y}^{[i]})$
$b := b + \eta \times (-\frac{\partial \mathcal{L}}{\partial b})$
Same as the perceptron rule, except that the prediction is a real number and we have a learning rate.
Sebastian Raschka STAT 453: Intro to Deep Learning and Generative Models SS 2020 23
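The update rule above can be sketched in a few lines of plain Python. This is only an illustration, not code from the lecture: the function name `sgd_linear_regression`, the zero initialisation, the shuffling of examples, and the toy data are all assumptions.

```python
import random

def sgd_linear_regression(X, y, eta=0.05, epochs=2000, seed=0):
    """On-line (stochastic) gradient descent for least-squares linear regression."""
    rng = random.Random(seed)
    m = len(X[0])
    w = [0.0] * m          # assumed zero initialisation
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # assumption: visit the examples in random order
        for i in order:
            # (a) the prediction is a real number, no thresholding
            y_hat = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            err = y[i] - y_hat
            # (b) dL/dw_j = -(y - y_hat) * x_j, (c) step along the negative gradient
            for j in range(m):
                grad_wj = -err * X[i][j]
                w[j] += eta * (-grad_wj)
            # C. the same update for the bias
            grad_b = -err
            b += eta * (-grad_b)
    return w, b

# Hypothetical noise-free toy data generated from y = 2*x1 - 3*x2 + 1
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
y = [2 * a - 3 * c + 1 for a, c in X]
w, b = sgd_linear_regression(X, y)
```

On this noise-free data the learned `w` and `b` should approach the generating coefficients. With noisy data a constant learning rate would only hover around the minimum.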
This learning rule (from the previous slide)
is called (stochastic) gradient descent.
So, how did we get there?
Back to Linear Regression
[Figure: linear regression drawn as a single-layer network. Inputs $x_1, x_2, \dots, x_m$ with weights $w_1, w_2, \dots, w_m$ and bias $b$ are summed ($\Sigma$) into the net input; the activation maps the net input to the output $\hat{y}$.]
[Figure: loss $L$ plotted against weight $w_1$.]
Gradient Descent
[Figure: loss $L$ plotted against weight $w_1$.]
Gradient Descent
[Figure: two plots of loss $L$ against weight $w_1$.]
Linear Regression Loss Derivative
Sum Squared Error (SSE) loss:
\[
L(w, b) = \sum_i \left(\hat{y}^{[i]} - y^{[i]}\right)^2
\]
\[
\begin{aligned}
\frac{\partial L}{\partial w_j}
&= \frac{\partial}{\partial w_j} \sum_i \left(\hat{y}^{[i]} - y^{[i]}\right)^2 \\
&= \frac{\partial}{\partial w_j} \sum_i \left(\sigma(w^\top x^{[i]}) - y^{[i]}\right)^2 \\
&= \sum_i 2\left(\sigma(w^\top x^{[i]}) - y^{[i]}\right) \frac{\partial}{\partial w_j}\left(\sigma(w^\top x^{[i]}) - y^{[i]}\right) \\
&= \sum_i 2\left(\sigma(w^\top x^{[i]}) - y^{[i]}\right) \frac{d\sigma}{d(w^\top x^{[i]})} \frac{\partial}{\partial w_j} w^\top x^{[i]} \\
&= \sum_i 2\left(\sigma(w^\top x^{[i]}) - y^{[i]}\right) \frac{d\sigma}{d(w^\top x^{[i]})} x_j^{[i]} \\
&= \sum_i 2\left(\sigma(w^\top x^{[i]}) - y^{[i]}\right) x_j^{[i]}
\end{aligned}
\]
(Note that the activation function is the identity function in linear regression)
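A quick way to gain confidence in a derivative like the one above is to compare it with a finite-difference estimate. The sketch below does that for the SSE loss with the identity activation. The helper names (`sse_loss`, `sse_grad`, `numeric_grad`) and the small data set are made up for illustration, and the bias is left out, as in the derivation.

```python
def predict(w, x):
    # identity activation: sigma(w^T x) = w^T x
    return sum(wj * xj for wj, xj in zip(w, x))

def sse_loss(w, X, y):
    return sum((predict(w, x) - t) ** 2 for x, t in zip(X, y))

def sse_grad(w, X, y):
    """Analytic gradient: dL/dw_j = sum_i 2 * (w^T x[i] - y[i]) * x_j[i]."""
    grad = [0.0] * len(w)
    for x, t in zip(X, y):
        residual = predict(w, x) - t
        for j, xj in enumerate(x):
            grad[j] += 2.0 * residual * xj
    return grad

def numeric_grad(w, X, y, h=1e-6):
    """Central finite differences, one weight at a time."""
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_plus[j] += h
        w_minus = list(w)
        w_minus[j] -= h
        grad.append((sse_loss(w_plus, X, y) - sse_loss(w_minus, X, y)) / (2 * h))
    return grad

# Hypothetical data, chosen only to exercise the formula
X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
y = [1.0, 0.0, 2.0]
w = [0.3, -0.2]
analytic = sse_grad(w, X, y)
numeric = numeric_grad(w, X, y)
```

The two gradients should agree up to floating-point error, since the loss is quadratic in the weights.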
Linear Regression Loss Derivative (alt.)
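\[
\begin{aligned}
\frac{\partial L}{\partial w_j}
&= \frac{\partial}{\partial w_j} \frac{1}{2n} \sum_i \left(\hat{y}^{[i]} - y^{[i]}\right)^2 \\
&= \frac{\partial}{\partial w_j} \sum_i \frac{1}{2n} \left(\sigma(w^\top x^{[i]}) - y^{[i]}\right)^2 \\
&= \sum_i \frac{1}{n} \left(\sigma(w^\top x^{[i]}) - y^{[i]}\right) \frac{\partial}{\partial w_j}\left(\sigma(w^\top x^{[i]}) - y^{[i]}\right)
\end{aligned}
\]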
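The slide stops after the third step. Carrying the remaining steps through in the same way as for the SSE loss (chain rule, then the identity activation) should give the following, written here as a sketch rather than as a quote from the lecture:

\[
\frac{\partial L}{\partial w_j}
= \sum_i \frac{1}{n} \left(\sigma(w^\top x^{[i]}) - y^{[i]}\right) x_j^{[i]}
= \frac{1}{n} \sum_i \left(\hat{y}^{[i]} - y^{[i]}\right) x_j^{[i]}
\]

The factor $\frac{1}{2}$ in the loss cancels the $2$ from the power rule, and for a single example ($n = 1$) the expression reduces to $-(y^{[i]} - \hat{y}^{[i]})\, x_j^{[i]}$.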
Batch Gradient Descent as Surface Plot
[Figure: loss surface over the weights $w_1$ and $w_2$ with the minimum $L_{\min}$ marked. Updates are perpendicular to the contour lines.]
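For comparison with the on-line rule, here is a minimal sketch of batch gradient descent, where each update uses the gradient averaged over the whole training set. It assumes the $\frac{1}{2n}$-scaled squared-error loss; the function name, learning rate, and toy data are illustrative choices, not taken from the slides.

```python
def batch_gradient_descent(X, y, eta=0.1, epochs=2000):
    """Batch gradient descent for linear regression with loss (1/2n) * sum (y_hat - y)^2."""
    n, m = len(X), len(X[0])
    w, b = [0.0] * m, 0.0
    history = []                      # loss before each update
    for _ in range(epochs):
        preds = [sum(wj * xj for wj, xj in zip(w, x)) + b for x in X]
        errors = [p - t for p, t in zip(preds, y)]        # y_hat - y
        history.append(sum(e * e for e in errors) / (2 * n))
        # gradient over the full training set: one update per epoch
        grad_w = [sum(e * x[j] for e, x in zip(errors, X)) / n for j in range(m)]
        grad_b = sum(errors) / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b, history

# Hypothetical noise-free toy data generated from y = 2*x1 - 3*x2 + 1
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
y = [2 * a - 3 * c + 1 for a, c in X]
w, b, history = batch_gradient_descent(X, y)
```

Because every step follows the exact gradient of the full loss, the recorded loss decreases smoothly from epoch to epoch for a sufficiently small learning rate, in contrast to the per-example updates of the stochastic variant.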
Stochastic Gradient Descent as Surface Plot
[Figure: loss surface with the minimum $L_{\min}$ marked and the path of the stochastic updates; axis label $w_1$.]
Batch Gradient Descent as Surface Plot
[Figure: loss surface over the weights $w_1$ and $w_2$ with the minimum $L_{\min}$ marked.]
Multilayer Perceptrons
http://playground.tensorflow.org/
Source: Prof. Dalcimar Casanova. Curso Deep Learning - UTFPR - 2020
With 1 layer and N neurons
http://vision.stanford.edu/teaching/cs231n-demos/linear-classify/
Multi-Layer Perceptron
Central Idea
The errors of the processing elements in the output layer (known from supervised training) are back-propagated to the intermediate (hidden) layers.
Learning process
Phase 1: Feed-Forward
Phase 2: Feed-Backward
● Compute the error of the output layer
● Update the weights of the output layer
● Compute the error of the 2nd hidden layer
● Update the weights of the 2nd hidden layer
● Compute the error of the 1st hidden layer
● Update the weights of the 1st hidden layer
MLP Example
Learning Algorithm:
Output Layer
Hidden Layer
Feed-Forward:
y1 = 1*0 + 1*0 + 0*0 = 0
x1 = F(y1) = 0.5
y2 = 1*0 + 1*0 + 0*0 = 0
x2 = F(y2) = 0.5
y3 = 1*0 + 0.5*0 + 0.5*0 = 0
x3 = F(y3) = 0.5
Feed-Backward (error of the output layer):
t3 - x3 = 1 - 0.5 = 0.5
e3 = 0.5*0.25 = 0.125
Feed-Backward (weight update of the output layer):
w203 = 0 + 0.5*1*0.125 = 0.0625
w213 = 0 + 0.5*0.5*0.125 = 0.0313
w223 = 0 + 0.5*0.5*0.125 = 0.0313
Feed-Backward (errors of the hidden layer):
e1 = 0.25*(0.125*0.0313) = 0.00097813
e2 = 0.25*(0.125*0.0313) = 0.00097813
Feed-Backward (weight update of the hidden layer):
w101 = 0 + 0.5*1*0.00097813 = 0.00048907
w102 = 0 + 0.5*1*0.00097813 = 0.00048907
w111 = 0 + 0.5*1*0.00097813 = 0.00048907
w112 = 0 + 0.5*1*0.00097813 = 0.00048907
w121 = 0 + 0.5*0*0.00097813 = 0
w122 = 0 + 0.5*0*0.00097813 = 0
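The arithmetic of the worked example can be replayed in a short script. Several things here are inferred from the numbers and not stated on the slides, so treat them as assumptions: F is taken to be the logistic sigmoid (since F(0) = 0.5 and the factor 0.25 equals F(0)·(1 − F(0))), the learning rate is 0.5, the inputs are [bias = 1, 1, 0] with target t3 = 1, and all weights start at 0. The variable names are made up for this sketch.

```python
import math

def F(v):
    """Logistic sigmoid (assumed activation); F(0) = 0.5."""
    return 1.0 / (1.0 + math.exp(-v))

eta = 0.5                     # learning rate inferred from the 0.5 factor in the updates
inputs = [1.0, 1.0, 0.0]      # [bias, input 1, input 2]
target = 1.0                  # t3

# w_hidden[k][i]: weight from input i to hidden neuron k+1; all weights start at 0
w_hidden = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
w_out = [0.0, 0.0, 0.0]       # weights from [bias, x1, x2] to output neuron 3

# Feed-forward
x_hidden = [F(sum(w * v for w, v in zip(ws, inputs))) for ws in w_hidden]  # x1, x2
out_in = [1.0] + x_hidden
x3 = F(sum(w * v for w, v in zip(w_out, out_in)))

# Feed-backward, output layer: e3 = (t3 - x3) * F'(y3), with F'(y) = F(y) * (1 - F(y))
e3 = (target - x3) * x3 * (1.0 - x3)
w_out = [w + eta * v * e3 for w, v in zip(w_out, out_in)]

# Feed-backward, hidden layer: like the example, this uses the already-updated
# output weights (many textbook formulations use the pre-update weights instead)
e_hidden = [xh * (1.0 - xh) * e3 * w_out[k + 1] for k, xh in enumerate(x_hidden)]
w_hidden = [[w + eta * v * e for w, v in zip(ws, inputs)]
            for ws, e in zip(w_hidden, e_hidden)]
```

Without intermediate rounding the hidden errors come out as 0.0009765625; the example's 0.00097813 follows from rounding the output weights to 0.0313 before multiplying.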
XOR Problem
Decision boundary built by the 1st hidden neuron
Decision boundary built by the 2nd hidden neuron
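To illustrate how two hidden neurons, each contributing one linear decision boundary, can solve XOR together, here is a small sketch with hand-picked weights and a step activation. The weights and thresholds are chosen for illustration only and are not the ones from the slides.

```python
def step(v):
    """Threshold activation: 1 if the net input is positive, else 0."""
    return 1 if v > 0 else 0

def xor_mlp(a, b):
    h1 = step(a + b - 0.5)       # 1st hidden neuron: boundary a + b = 0.5 (fires for OR)
    h2 = step(a + b - 1.5)       # 2nd hidden neuron: boundary a + b = 1.5 (fires for AND)
    return step(h1 - h2 - 0.5)   # output fires when h1 is on and h2 is off

truth_table = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
```

No single line separates the XOR classes, but the region between the two hidden-neuron boundaries contains exactly the points (0, 1) and (1, 0).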
With N layers and 1 neuron
http://playground.tensorflow.org/
Lecture 09
Multilayer Perceptrons
Sebastian Raschka STAT 453: Intro to Deep Learning and Generative Models SS 2020 1
Topics
Graph with Fully-Connected Layers
= Multilayer Perceptron
Nothing new, really
(bias not shown)
[Figure: fully-connected network graph. Inputs $x_1, x_2$ feed a first hidden layer $a_1^{(1)}, a_2^{(1)}, a_3^{(1)}$, then a second hidden layer $a_1^{(2)}, a_2^{(2)}$, then the output $o$, which is compared with the label $y$ through the loss $L(y, o) = l$. Edges carry weights, e.g. $w_{1,1}^{(1)}$, $w_{1,1}^{(2)}$, $w_{1,1}^{(3)}$.]
where $a := \sigma(z) = \sigma(w^\top x + b)$
(Assume network for binary classification)
\[
\begin{aligned}
\frac{\partial l}{\partial w_{1,1}^{(1)}}
&= \frac{\partial l}{\partial o} \cdot \frac{\partial o}{\partial a_1^{(2)}} \cdot \frac{\partial a_1^{(2)}}{\partial a_1^{(1)}} \cdot \frac{\partial a_1^{(1)}}{\partial w_{1,1}^{(1)}} \\
&\quad + \frac{\partial l}{\partial o} \cdot \frac{\partial o}{\partial a_2^{(2)}} \cdot \frac{\partial a_2^{(2)}}{\partial a_1^{(1)}} \cdot \frac{\partial a_1^{(1)}}{\partial w_{1,1}^{(1)}}
\end{aligned}
\]
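The two-path chain rule for $w_{1,1}^{(1)}$ can be checked numerically on a tiny network of the same shape (2 inputs, 3 and 2 hidden units, 1 output). The sketch below assumes sigmoid activations everywhere, a binary cross-entropy loss, no biases (as in the figure), and arbitrary weight values; the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(W1, W2, W3, x):
    a1 = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in W1]    # 3 units
    a2 = [sigmoid(sum(w * v for w, v in zip(row, a1))) for row in W2]   # 2 units
    o = sigmoid(sum(w * v for w, v in zip(W3, a2)))                     # output
    return a1, a2, o

def loss(y, o):
    """Binary cross-entropy (assumed loss for binary classification)."""
    return -(y * math.log(o) + (1 - y) * math.log(1 - o))

def grad_w11_layer1(W1, W2, W3, x, y):
    """dl/dw_{1,1}^{(1)} as a sum over the two paths through a_1^(2) and a_2^(2)."""
    a1, a2, o = forward(W1, W2, W3, x)
    dl_do = -(y / o) + (1 - y) / (1 - o)
    da1_dw = a1[0] * (1 - a1[0]) * x[0]            # d a_1^(1) / d w_{1,1}^(1)
    total = 0.0
    for k in range(2):                             # one term per unit a_k^(2)
        do_da2 = o * (1 - o) * W3[k]               # d o / d a_k^(2)
        da2_da1 = a2[k] * (1 - a2[k]) * W2[k][0]   # d a_k^(2) / d a_1^(1)
        total += dl_do * do_da2 * da2_da1 * da1_dw
    return total

def numeric_grad_w11(W1, W2, W3, x, y, h=1e-6):
    def loss_at(delta):
        W1p = [row[:] for row in W1]
        W1p[0][0] += delta
        return loss(y, forward(W1p, W2, W3, x)[2])
    return (loss_at(h) - loss_at(-h)) / (2 * h)

# Arbitrary illustrative weights; W[j][k] connects unit k of the previous layer to unit j
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
W2 = [[0.3, -0.1, 0.2], [-0.4, 0.5, 0.1]]
W3 = [0.7, -0.6]
x, y = [1.0, 2.0], 1.0
analytic = grad_w11_layer1(W1, W2, W3, x, y)
numeric = numeric_grad_w11(W1, W2, W3, x, y)
```

The loop over `k` is the sum in the formula: $a_1^{(1)}$ influences the output through both second-layer units, so both contributions have to be added.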
Graph with Fully-Connected Layers
= Multilayer Perceptron
[Figure: the same fully-connected network graph: inputs $x_1, x_2$, hidden activations $a^{(1)}$ and $a^{(2)}$, output $o$, label $y$, and loss $L(y, o) = l$.]
(1)
w1,2 a2
<latexit sha1_base64="UEIEXkJI4Qcu+777LfA5dwpJBR0=">AAAB8HicbVBNSwMxEJ2tX7V+VT16CRahXspuK+ix6MVjBfsh7VqyabYNTbJLkhXK0l/hxYMiXv053vw3pu0etPXBwOO9GWbmBTFn2rjut5NbW9/Y3MpvF3Z29/YPiodHLR0litAmiXikOgHWlDNJm4YZTjuxolgEnLaD8c3Mbz9RpVkk780kpr7AQ8lCRrCx0gPuVx/Tsnc+7RdLbsWdA60SLyMlyNDoF796g4gkgkpDONa667mx8VOsDCOcTgu9RNMYkzEe0q6lEguq/XR+8BSdWWWAwkjZkgbN1d8TKRZaT0RgOwU2I73szcT/vG5iwis/ZTJODJVksShMODIRmn2PBkxRYvjEEkwUs7ciMsIKE2MzKtgQvOWXV0mrWvFqFffuolS/zuLIwwmcQhk8uIQ63EIDmkBAwDO8wpujnBfn3flYtOacbOYY/sD5/AGeYI+g</latexit>
<latexit sha1_base64="zmvhV5w6wvufBjgJnplzs3qmpp8=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69BIvgqSQq6LHoxWML9gPaUDbbSbt2sxt2N0IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZemHCmjed9O4W19Y3NreJ2aWd3b/+gfHjU0jJVFJtUcqk6IdHImcCmYYZjJ1FI4pBjOxzfzfz2EyrNpHgwkwSDmAwFixglxkoN2S9XvKo3h7tK/JxUIEe9X/7qDSRNYxSGcqJ11/cSE2REGUY5Tku9VGNC6JgMsWupIDHqIJsfOnXPrDJwI6lsCePO1d8TGYm1nsSh7YyJGellbyb+53VTE90EGRNJalDQxaIo5a6R7uxrd8AUUsMnlhCqmL3VpSOiCDU2m5INwV9+eZW0Lqr+ZdVrXFVqt3kcRTiBUzgHH66hBvdQhyZQQHiGV3hzHp0X5935WLQWnHzmGP7A+fwB2T+M9Q==</latexit>
l <latexit sha1_base64="E5Kc1ZKr520j8ga7QDzfGA0mefk=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lU0GPRi8cW7Ae0oWy2k3btZhN2N0IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4bua3n1BpHssHM0nQj+hQ8pAzaqzUEP1yxa26c5BV4uWkAjnq/fJXbxCzNEJpmKBadz03MX5GleFM4LTUSzUmlI3pELuWShqh9rP5oVNyZpUBCWNlSxoyV39PZDTSehIFtjOiZqSXvZn4n9dNTXjjZ1wmqUHJFovCVBATk9nXZMAVMiMmllCmuL2VsBFVlBmbTcmG4C2/vEpaF1Xvsuo2riq12zyOIpzAKZyDB9dQg3uoQxMYIDzDK7w5j86L8+58LFoLTj5zDH/gfP4A1LOM8g==</latexit>
(2)
<latexit sha1_base64="94K/4WUXlcb/JabfmqCJ0lfwAyA=">AAAB9XicbVBNS8NAEJ3Ur1q/qh69LBahgpSkCnosevFYwX5Am5bNdtMu3WzC7sZSQv6HFw+KePW/ePPfuG1z0NYHA4/3ZpiZ50WcKW3b31ZubX1jcyu/XdjZ3ds/KB4eNVUYS0IbJOShbHtYUc4EbWimOW1HkuLA47Tlje9mfuuJSsVC8ainEXUDPBTMZwRrI/Um/cS5QNW0l5Sd87RfLNkVew60SpyMlCBDvV/86g5CEgdUaMKxUh3HjrSbYKkZ4TQtdGNFI0zGeEg7hgocUOUm86tTdGaUAfJDaUpoNFd/TyQ4UGoaeKYzwHqklr2Z+J/XibV/4yZMRLGmgiwW+TFHOkSzCNCASUo0nxqCiWTmVkRGWGKiTVAFE4Kz/PIqaVYrzmXFfrgq1W6zOPJwAqdQBgeuoQb3UIcGEJDwDK/wZk2sF+vd+li05qxs5hj+wPr8Ab/OkV0=</latexit>
x2
<latexit sha1_base64="gBTwEt+X3BPX1KgMo6lYVWIC09o=">AAAB6nicbVBNS8NAEJ34WetX1aOXxSJ4KkkV9Fj04rGi/YA2lM120y7dbMLuRCyhP8GLB0W8+ou8+W/ctjlo64OBx3szzMwLEikMuu63s7K6tr6xWdgqbu/s7u2XDg6bJk414w0Wy1i3A2q4FIo3UKDk7URzGgWSt4LRzdRvPXJtRKwecJxwP6IDJULBKFrp/qlX7ZXKbsWdgSwTLydlyFHvlb66/ZilEVfIJDWm47kJ+hnVKJjkk2I3NTyhbEQHvGOpohE3fjY7dUJOrdInYaxtKSQz9fdERiNjxlFgOyOKQ7PoTcX/vE6K4ZWfCZWkyBWbLwpTSTAm079JX2jOUI4toUwLeythQ6opQ5tO0YbgLb68TJrVindece8uyrXrPI4CHMMJnIEHl1CDW6hDAxgM4Ble4c2Rzovz7nzMW1ecfOYI/sD5/AEN3o2j</latexit>
(2)
w1,3
<latexit sha1_base64="AnHRMEgiEAO14EMHGCDfABkTtl0=">AAAB9XicbVBNS8NAEJ34WetX1aOXxSJUkJK0gh6LXjxWsB/QpmWz3bRLN5uwu7GUkP/hxYMiXv0v3vw3btsctPXBwOO9GWbmeRFnStv2t7W2vrG5tZ3bye/u7R8cFo6OmyqMJaENEvJQtj2sKGeCNjTTnLYjSXHgcdryxnczv/VEpWKheNTTiLoBHgrmM4K1kXqTfuJcomraS0qVi7RfKNplew60SpyMFCFDvV/46g5CEgdUaMKxUh3HjrSbYKkZ4TTNd2NFI0zGeEg7hgocUOUm86tTdG6UAfJDaUpoNFd/TyQ4UGoaeKYzwHqklr2Z+J/XibV/4yZMRLGmgiwW+TFHOkSzCNCASUo0nxqCiWTmVkRGWGKiTVB5E4Kz/PIqaVbKTrVcebgq1m6zOHJwCmdQAgeuoQb3UIcGEJDwDK/wZk2sF+vd+li0rlnZzAn8gfX5A8ODkWE=</latexit>
a2
<latexit sha1_base64="Rx/RXsiT+s/v11w3kFUY/JZyKRU=">AAAB8HicbVBNSwMxEJ2tX7V+VT16CRahXspuK+ix6MVjBfsh7VqyabYNTbJLkhXK0l/hxYMiXv053vw3pu0etPXBwOO9GWbmBTFn2rjut5NbW9/Y3MpvF3Z29/YPiodHLR0litAmiXikOgHWlDNJm4YZTjuxolgEnLaD8c3Mbz9RpVkk780kpr7AQ8lCRrCx0gPuVx/TcvV82i+W3Io7B1olXkZKkKHRL371BhFJBJWGcKx113Nj46dYGUY4nRZ6iaYxJmM8pF1LJRZU++n84Ck6s8oAhZGyJQ2aq78nUiy0nojAdgpsRnrZm4n/ed3EhFd+ymScGCrJYlGYcGQiNPseDZiixPCJJZgoZm9FZIQVJsZmVLAheMsvr5JWteLVKu7dRal+ncWRhxM4hTJ4cAl1uIUGNIGAgGd4hTdHOS/Ou/OxaM052cwx/IHz+QOf5o+h</latexit>
(1) (1)
w3,2 a3
output layer
<latexit sha1_base64="PK5wtTnoxgAbdbB3wQfMpe3ws7s=">AAAB9XicbVBNS8NAEJ34WetX1aOXxSJUkJK0gh6LXjxWsB/QpmWz3bRLN5uwu7GUkP/hxYMiXv0v3vw3btsctPXBwOO9GWbmeRFnStv2t7W2vrG5tZ3bye/u7R8cFo6OmyqMJaENEvJQtj2sKGeCNjTTnLYjSXHgcdryxnczv/VEpWKheNTTiLoBHgrmM4K1kXqTflK9RJW0l5Sci7RfKNplew60SpyMFCFDvV/46g5CEgdUaMKxUh3HjrSbYKkZ4TTNd2NFI0zGeEg7hgocUOUm86tTdG6UAfJDaUpoNFd/TyQ4UGoaeKYzwHqklr2Z+J/XibV/4yZMRLGmgiwW+TFHOkSzCNCASUo0nxqCiWTmVkRGWGKiTVB5E4Kz/PIqaVbKTrVcebgq1m6zOHJwCmdQAgeuoQb3UIcGEJDwDK/wZk2sF+vd+li0rlnZzAn8gfX5A8OOkWE=</latexit>
<latexit sha1_base64="F0cJIqijoEg/scv4wVZxoymO2Dc=">AAAB8HicbVBNSwMxEJ2tX7V+VT16CRahXsquLeix6MVjBfsh7VqyabYNTbJLkhXK0l/hxYMiXv053vw3pu0etPXBwOO9GWbmBTFn2rjut5NbW9/Y3MpvF3Z29/YPiodHLR0litAmiXikOgHWlDNJm4YZTjuxolgEnLaD8c3Mbz9RpVkk780kpr7AQ8lCRrCx0gPuVx/Tsnc+7RdLbsWdA60SLyMlyNDoF796g4gkgkpDONa667mx8VOsDCOcTgu9RNMYkzEe0q6lEguq/XR+8BSdWWWAwkjZkgbN1d8TKRZaT0RgOwU2I73szcT/vG5iwis/ZTJODJVksShMODIRmn2PBkxRYvjEEkwUs7ciMsIKE2MzKtgQvOWXV0nrouJVK+5drVS/zuLIwwmcQhk8uIQ63EIDmkBAwDO8wpujnBfn3flYtOacbOYY/sD5/AGf6o+h</latexit>
(layer 4)
Sebastian Raschka STAT 453: Intro to Deep Learning and Generative Models SS 2020 8
Graph with Fully-Connected Layers
= Multilayer Perceptron
A more common counting/naming scheme (the input layer is not counted), because then a perceptron/Adaline/logistic regression model can be called a "1-layer neural network"
[Figure: the same fully connected network graph with inputs x_1, x_2; activations a^{(1)}_1, a^{(1)}_2, a^{(1)}_3, a^{(2)}_1, a^{(2)}_2; output o, target y, and loss L(y, o) = l. The output layer label changes from "(layer 4)" to "layer 3".]
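The counting convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the layer sizes (2 inputs, 3 and 2 hidden units, 1 output) are read off the figure, the helper names `count_layers` and `forward` are hypothetical, the sigmoid hidden activation is a placeholder choice, and the all-zero parameters exist only to make the result easy to check.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, used here as an assumed activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def count_layers(layer_sizes):
    """Count layers without the input layer (one layer per weight matrix)."""
    return len(layer_sizes) - 1

def forward(x, weights, biases):
    """Propagate x through fully connected layers.

    weights[k][j][i] connects unit i of one layer to unit j of the next.
    """
    a = x
    for W, b in zip(weights, biases):
        a = [sigmoid(sum(w_ji * a_i for w_ji, a_i in zip(row, a)) + b_j)
             for row, b_j in zip(W, b)]
    return a

# Assumed sizes: 2 inputs, hidden layers of 3 and 2 units, 1 output.
mlp_sizes = [2, 3, 2, 1]
# Logistic regression: inputs connected directly to a single output unit.
logreg_sizes = [2, 1]

# Illustrative all-zero parameters, so every unit outputs sigmoid(0) = 0.5.
weights = [[[0.0] * n_in for _ in range(n_out)]
           for n_in, n_out in zip(mlp_sizes[:-1], mlp_sizes[1:])]
biases = [[0.0] * n_out for n_out in mlp_sizes[1:]]

n_mlp = count_layers(mlp_sizes)        # 3 layers under this convention
n_logreg = count_layers(logreg_sizes)  # 1 layer: a "1-layer neural network"
output = forward([1.0, 2.0], weights, biases)
print(n_mlp, n_logreg, output)
```

Counting weight matrices rather than node columns is what makes the single-weight-layer models (perceptron, Adaline, logistic regression) come out as "1-layer" networks.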
[Figure: the same fully connected network graph with inputs x_1, x_2; activations a^{(1)}_1, a^{(1)}_2, a^{(1)}_3, a^{(2)}_1, a^{(2)}_2; output o, target y, and loss L(y, o) = l. Annotation: "could use sigmoid here".]
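To make the "could use sigmoid here" annotation concrete, the sketch below applies a logistic sigmoid to a net input so that the output o lies in (0, 1), then evaluates a loss L(y, o). The slides do not say which loss is meant; binary cross-entropy is an assumption here, and the net input value is hypothetical.

```python
import math

def sigmoid(z):
    """Map a real-valued net input to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y, o):
    """Assumed loss L(y, o) for a label y in {0, 1} and a probability o."""
    return -(y * math.log(o) + (1 - y) * math.log(1 - o))

z = 0.0                             # hypothetical net input of the output unit
o = sigmoid(z)                      # can be read as an estimate of P(y = 1 | x)
loss = binary_cross_entropy(1, o)   # loss when the true label is y = 1
print(o, loss)
```

A sigmoid output suits binary targets because the single value o can be interpreted as a class-membership probability.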
Graph with Fully-Connected Layers
= Multilayer Perceptron
[Figure: fully connected network graph with inputs x_1, x_2; activations a^{(1)}_1 through a^{(1)}_5 and a^{(2)}_1 through a^{(2)}_4; outputs o_1, o_2, o_3; targets y_1, y_2, y_3; and loss L(y, o).]
Use softmax if this is a multi-class problem with mutually exclusive classes.
Sebastian Raschka STAT 453: Intro to Deep Learning and Generative Models SS 2020 11
Activation Functions
Solving the XOR Problem with Non-Linear Activations
https://github.com/rasbt/stat453-deep-learning-ss20/blob/master/L08-mlp/code/xor-problem.ipynb
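To make the point of this slide concrete, here is a minimal sketch in plain Python (not the linked notebook). It assumes a 2-2-1 network with ReLU hidden units and hand-set weights; the weights and the function names `xor_mlp` and `xor_linear` are illustrative assumptions, not learned values. With the ReLU in place the network reproduces XOR; with the ReLU removed, the same weights collapse to a linear function that cannot.

```python
# Hand-set weights (an illustrative assumption, not learned from data).

def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def xor_mlp(x1, x2):
    """2-2-1 MLP with ReLU hidden units and hand-picked weights."""
    h1 = relu(x1 + x2)          # hidden unit 1: weights (1, 1), bias 0
    h2 = relu(x1 + x2 - 1.0)    # hidden unit 2: weights (1, 1), bias -1
    return h1 - 2.0 * h2        # output unit: weights (1, -2), bias 0


def xor_linear(x1, x2):
    """Same weights without the nonlinearity: a purely linear model."""
    h1 = x1 + x2
    h2 = x1 + x2 - 1.0
    return h1 - 2.0 * h2        # simplifies to 2 - (x1 + x2)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_mlp(a, b) for a, b in inputs]
linear_outputs = [xor_linear(a, b) for a, b in inputs]
print(xor_outputs)
print(linear_outputs)
```

The linear variant assigns different outputs to (0, 0) and (1, 1), so it cannot match the XOR targets, whereas the ReLU version does.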
A Selection of Common Activation Functions (1)
Identity: $\sigma(z) = z$
(Logistic) Sigmoid: $\sigma(z) = \frac{1}{1 + \exp(-z)}$
Tanh: $\mathrm{Tanh}(z) = \frac{\exp(z) - \exp(-z)}{\exp(z) + \exp(-z)}$
HardTanh: $\mathrm{HardTanh}(z) = \begin{cases} 1 & \text{if } z > 1 \\ -1 & \text{if } z < -1 \\ z & \text{otherwise} \end{cases}$
A Selection of Common Activation Functions (1)
[Plots: (Logistic) Sigmoid; Tanh ("tanH")]
• Advantages of Tanh
  • Mean centering
  • Positive and negative values
  • Larger gradients
Additional tip: It is also good to normalize inputs to mean zero and to use random weight initialization with the average weight centered at zero.
Tanh also has a simple derivative: $\frac{d}{dz}\mathrm{Tanh}(z) = 1 - \mathrm{Tanh}(z)^2$
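The activation functions on these two slides, and the tanh derivative identity, can be sketched in plain Python. This is only an illustration: the helper `numeric_grad` and the test points are assumptions introduced here, and the derivative is checked against a central finite difference rather than proven.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent written out from its definition."""
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))


def hardtanh(z):
    """Piecewise-linear approximation of tanh, clipped to [-1, 1]."""
    if z > 1.0:
        return 1.0
    if z < -1.0:
        return -1.0
    return z


def tanh_grad(z):
    """Analytic derivative: 1 - tanh(z)^2."""
    return 1.0 - tanh(z) ** 2


def numeric_grad(f, z, eps=1e-6):
    """Central finite-difference approximation of df/dz (hypothetical helper)."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)


points = [-2.0, -0.5, 0.0, 0.5, 2.0]
max_gap = max(abs(tanh_grad(z) - numeric_grad(tanh, z)) for z in points)
print(max_gap)
```

Note that tanh is centered at zero (tanh(0) = 0), while the sigmoid is centered at 0.5, which is the "mean centering" advantage listed above.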
A Selection of Common Activation Functions (2)
ReLU (Rectified Linear Unit): $\mathrm{ReLU}(z) = \begin{cases} z & \text{if } z \geq 0 \\ 0 & \text{otherwise} \end{cases} = \max(0, z)$
Leaky ReLU: $\mathrm{LeakyReLU}(z) = \begin{cases} z & \text{if } z \geq 0 \\ \alpha \times z & \text{otherwise} \end{cases}$
PReLU: $\mathrm{PReLU}(z) = \begin{cases} z & \text{if } z \geq 0 \\ \alpha z & \text{otherwise} \end{cases}$
[Plot annotations on the slide: α = 0.025; α = 1]
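The three piecewise definitions above differ only in how they treat negative inputs. A minimal plain-Python sketch follows. Assumptions: the Leaky ReLU default of 0.025 is taken from the plot annotation on the slide; the `PReLU` class, its initial value of 0.25, and the `grad_alpha` helper are illustrative choices made here to show that, in PReLU, the slope α is a parameter that gets a gradient of its own.

```python
def relu(z):
    """ReLU(z) = max(0, z)."""
    return max(0.0, z)


def leaky_relu(z, alpha=0.025):
    """Leaky ReLU with a fixed, hand-chosen negative slope alpha."""
    return z if z >= 0 else alpha * z


class PReLU:
    """Parametric ReLU: same formula, but alpha is treated as learnable."""

    def __init__(self, alpha=0.25):
        # Initial value is an assumption for this sketch.
        self.alpha = alpha

    def __call__(self, z):
        return z if z >= 0 else self.alpha * z

    def grad_alpha(self, z):
        """Derivative of the output with respect to alpha."""
        return 0.0 if z >= 0 else z


prelu = PReLU(alpha=0.1)
print(relu(-2.0), leaky_relu(-2.0), prelu(-4.0), prelu.grad_alpha(-4.0))
```

Because `grad_alpha` is nonzero for negative inputs, a gradient-descent step could update `prelu.alpha` alongside the weights, which is the difference from Leaky ReLU.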
Model Evaluation
Recommended Practice: Looking at Some Failure Cases
Failure cases of a ~93% accuracy (not very good, but beside the point)
2-layer (1-hidden layer) MLP on MNIST
(where t=target class and p=predicted class)
Overfitting and Underfitting
[Figure: error (y-axis) versus model capacity (x-axis), showing the training error and generalization error curves, with the overfitting region marked.]
Bias-Variance Decomposition
$\mathrm{Bias}[\hat{\theta}] = E[\hat{\theta}] - \theta$
$\mathrm{Var}[\hat{\theta}] = E[\hat{\theta}^2] - (E[\hat{\theta}])^2$
$\mathrm{Var}[\hat{\theta}] = E\left[(E[\hat{\theta}] - \hat{\theta})^2\right]$
[Figure: bias, variance, training error, and generalization error versus model capacity, with arrows marking the directions in which underfitting and overfitting increase.]
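The bias and variance definitions can be checked on a small set of numbers. In this sketch the expectation E[·] is replaced by an average over a handful of hypothetical estimates (as if the same estimator had been fit on five different training sets); the values, the true parameter `theta`, and the function names are all assumptions for illustration.

```python
def mean(values):
    """Arithmetic mean, standing in for the expectation E[.]."""
    return sum(values) / len(values)


def bias(estimates, theta):
    """Bias = E[theta_hat] - theta."""
    return mean(estimates) - theta


def variance_moment_form(estimates):
    """Var = E[theta_hat^2] - (E[theta_hat])^2."""
    return mean([e ** 2 for e in estimates]) - mean(estimates) ** 2


def variance_deviation_form(estimates):
    """Var = E[(E[theta_hat] - theta_hat)^2]."""
    m = mean(estimates)
    return mean([(m - e) ** 2 for e in estimates])


theta = 2.0                                # hypothetical true parameter
estimates = [2.5, 3.5, 3.0, 4.0, 2.0]      # hypothetical estimates of theta

print(bias(estimates, theta))
print(variance_moment_form(estimates))
print(variance_deviation_form(estimates))
```

Both variance forms give the same number, and neither depends on `theta`: variance measures the spread of the estimates, while bias measures how far their average is from the true value.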
Deep Learning Works Best with Large Datasets
Bias & Variance vs Overfitting & Underfitting
Multilayer Perceptron Architecture
Nonlinear Activation Functions
Multilayer Perceptron Code Examples
Overfitting and Underfitting
Cats & Dogs and Custom Data Loaders
VGG16 Convolutional Neural Network for
Kaggle's Cats and Dogs Images
A "real world" example
Training/Validation/Test splits
The ratio depends on the dataset size, but an 80/5/15 split is usually a good idea.
• The training set is used for training. It is not necessary to plot the training accuracy during training, but it can be useful.
• The validation set accuracy provides a rough estimate of the generalization performance. It can be optimistically biased if you design the network to do well on the validation set ("information leakage").
• The test set should be used only once, to get an unbiased estimate of the generalization performance.
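An 80/5/15 split can be sketched with shuffled indices. This is a minimal standard-library illustration, not the course's data-loading code; the function name `split_indices`, its parameters, and the seed are assumptions. In PyTorch the resulting index lists would typically be handed to a sampler or subset wrapper.

```python
import random


def split_indices(n, train_frac=0.80, valid_frac=0.05, seed=123):
    """Shuffle indices 0..n-1 and cut them into train/validation/test parts.

    The test set receives whatever remains after the first two parts.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # fixed seed for reproducibility
    n_train = round(n * train_frac)
    n_valid = round(n * valid_frac)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]
    return train, valid, test


train_idx, valid_idx, test_idx = split_indices(1000)
print(len(train_idx), len(valid_idx), len(test_idx))
```

Shuffling before cutting matters when the dataset is stored sorted by class; otherwise one split could end up with only a few of the classes.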
Parameters vs Hyperparameters
Parameters
• weights (weight parameters)
• biases (bias units)
Hyperparameters
• minibatch size
• data normalization schemes
• number of epochs
• number of hidden layers
• number of hidden units
• learning rates
• (random seed, why?)
• loss function
• various weights (weighting terms)
• activation function types
• regularization schemes (more later)
• weight initialization schemes (more later)
• optimization algorithm type (more later)
• ...
Custom DataLoader Classes ...
• Example showing how you can create your own data loader to efficiently iterate through your own collection of images (pretend that the MNIST images there are a custom image collection)
https://github.com/rasbt/stat453-deep-learning-ss20/blob/master/L08-mlp/code/custom-dataloader/custom-dataloader-example.ipynb
DataLoader with Train/Validation/Test splits
https://github.com/rasbt/stat453-deep-learning-ss20/blob/master/L08-mlp/code/mnist-validation-split.ipynb
Lecture 09
Regularization
Goal: Reduce Overfitting
Regularization
Regularization / Regularizing Effects
Lecture Overview
4. Dropout
General Strategies to Avoid Overfitting
Best Way to Reduce Overfitting is Collecting More Data
[Figure 3: Illustration of bias and variance.]
[Figure 4: Learning curves of softmax classifiers fit to MNIST. Softmax on MNIST subset (test set size is kept constant).]
[Figure: data augmentation examples. Panels: Original; Randomly Augmented; Randomly Augmented without resample=PIL.Image.BILINEAR]
https://github.com/rasbt/stat453-deep-learning-ss20/blob/master/L09-regularization/code/data-augmentation.ipynb
Use (0.5, 0.5, 0.5) for RGB images
Other Ways for Dealing with Overfitting
if Collecting More Data is not Feasible
=> Reducing Network's Capacity by Other Means
Early Stopping
Step 1: Split your dataset into 3 parts (always recommended)
• use test set only once at the end (for unbiased estimate of
generalization performance)
• use validation accuracy for tuning (always recommended)
[Diagram: the dataset is split into a training dataset, a validation dataset, and a test dataset.]
[Figure: performance curves over epochs (x-axis: Epochs), including the training set curve.]
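The slides do not spell out a concrete stopping rule, so the following is only one possible sketch. It assumes a patience-based rule: track the validation accuracy after each epoch and stop once it has not improved for `patience` epochs, keeping the best epoch seen so far. The function name, the `patience` parameter, and the accuracy values are hypothetical.

```python
def early_stopping_epoch(valid_accuracies, patience=2):
    """Return (best_epoch, best_accuracy) under a patience-based stopping rule.

    Iteration stops as soon as `patience` epochs pass without a new best
    validation accuracy; later epochs are never looked at.
    """
    best_epoch, best_acc = 0, float("-inf")
    for epoch, acc in enumerate(valid_accuracies):
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, best_acc


# Hypothetical validation accuracies, one per epoch.
valid_acc = [0.80, 0.86, 0.90, 0.91, 0.90, 0.89, 0.93]
best_epoch, best_acc = early_stopping_epoch(valid_acc, patience=2)
print(best_epoch, best_acc)
```

In this made-up run, training stops before the last epoch, which would have been better. That is the trade-off of the rule: a small `patience` saves compute but can stop too soon. Note that only the validation set is used for this decision; the test set stays untouched.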
1. Avoiding overfitting with more data and data augmentation
2. Reducing network capacity & early stopping
3. Adding norm penalties to the loss: L1 & L2 regularization
4. Dropout
L1/L2 Regularization
L1/L2 Regularization
for Linear Models (e.g., Logistic Regression)
$\mathrm{Cost}_{\mathbf{w},b} = \frac{1}{n} \sum_{i=1}^{n} L(y^{[i]}, \hat{y}^{[i]})$
$\text{L2-Regularized-Cost}_{\mathbf{w},b} = \frac{1}{n} \sum_{i=1}^{n} L(y^{[i]}, \hat{y}^{[i]}) + \frac{\lambda}{n} \sum_{j} w_j^2$
where: $\sum_{j} w_j^2 = ||\mathbf{w}||_2^2$
and λ is a hyperparameter
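The L2-regularized cost for a linear model can be computed by hand for a tiny case. This is a sketch under assumptions: the per-example loss values, the weights, and the value of λ (`lam`) are made up, and the function names are introduced here for illustration only.

```python
def mean_loss(losses):
    """Unregularized cost: average of the per-example losses L(y, y_hat)."""
    return sum(losses) / len(losses)


def l2_regularized_cost(losses, weights, lam):
    """Average loss plus (lambda / n) times the squared L2 norm of the weights."""
    n = len(losses)
    penalty = lam / n * sum(w ** 2 for w in weights)
    return mean_loss(losses) + penalty


per_example_losses = [0.2, 0.4, 0.6, 0.8]   # hypothetical L(y, y_hat) values
weights = [1.0, -2.0, 0.5]                  # hypothetical weights w_j
cost = l2_regularized_cost(per_example_losses, weights, lam=0.1)
print(cost)
```

Here the average loss is 0.5 and the penalty is 0.1 / 4 × 5.25, so larger weights raise the cost even when the predictions do not change. Setting `lam=0` recovers the unregularized cost.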
L2 Regularization for Multilayer Neural Networks
$\text{L2-Regularized-Cost}_{\mathbf{w},b} = \frac{1}{n} \sum_{i=1}^{n} L(y^{[i]}, \hat{y}^{[i]}) + \frac{\lambda}{n} \sum_{l=1}^{L} ||\mathbf{w}^{(l)}||_F^2$
where $||\mathbf{w}^{(l)}||_F^2$ is the Frobenius norm (squared):
$||\mathbf{w}^{(l)}||_F^2 = \sum_{i} \sum_{j} (w_{i,j}^{(l)})^2$
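For a multilayer network the penalty sums the squared Frobenius norm over all layers. A minimal sketch, with weight matrices stored as nested lists; the matrices, λ (`lam`), the sample count `n`, and the function names are hypothetical values chosen so the arithmetic is easy to follow.

```python
def squared_frobenius_norm(matrix):
    """Sum of the squares of all entries w_ij of one weight matrix."""
    return sum(w ** 2 for row in matrix for w in row)


def l2_penalty(layer_weights, lam, n):
    """(lambda / n) * sum over layers l of the squared Frobenius norm of W^(l)."""
    return lam / n * sum(squared_frobenius_norm(W) for W in layer_weights)


W1 = [[1.0, -1.0], [2.0, 0.0]]   # hypothetical weights of layer 1
W2 = [[0.5, 0.5]]                # hypothetical weights of layer 2
penalty = l2_penalty([W1, W2], lam=0.5, n=10)
print(penalty)
```

The squared norms are 6.0 and 0.5, so the penalty is 0.5 / 10 × 6.5. This is the same computation the PyTorch loop over `model.parameters()` performs with `(p**2).sum()`.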
L2 Regularization for Neural Nets
Gradient descent update: $w_{i,j} := w_{i,j} - \eta \frac{\partial L}{\partial w_{i,j}}$
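Adding the penalty term (λ/n) Σ w² from the regularized cost to the loss changes this update: differentiating the penalty with respect to a weight gives (2λ/n)·w, so each step also shrinks the weight toward zero. The sketch below works through that derivation for a single scalar weight; the function names and all numeric values are assumptions for illustration.

```python
def sgd_step(w, grad, lr):
    """Plain gradient descent: w := w - lr * dL/dw."""
    return w - lr * grad


def sgd_step_l2(w, grad, lr, lam, n):
    """Gradient descent on the L2-regularized cost.

    The penalty (lam / n) * w^2 contributes (2 * lam / n) * w to the gradient.
    """
    return w - lr * (grad + 2.0 * lam / n * w)


w_plain = sgd_step(1.0, grad=0.5, lr=0.1)
w_l2 = sgd_step_l2(1.0, grad=0.5, lr=0.1, lam=1.0, n=4)
w_decay_only = sgd_step_l2(1.0, grad=0.0, lr=0.1, lam=1.0, n=4)
print(w_plain, w_l2, w_decay_only)
```

Even with a zero loss gradient (`w_decay_only`), the weight moves toward zero, which is why this form of regularization is often called weight decay.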
L2 Regularization for Neural Nets in PyTorch
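# regularize loss: add the squared L2 norm of all parameters to the cost
# (assumes model, cost, targets, LAMBDA, and optimizer exist in the training loop)
L2 = 0.
for p in model.parameters():
    L2 = L2 + (p**2).sum()
# the constant factor 2 only rescales LAMBDA relative to the lambda/n formula
cost = cost + 2./targets.size(0) * LAMBDA * L2
optimizer.zero_grad()
cost.backward()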
• Or, if you only want to regularize the weights, not the biases:
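# regularize loss: penalize only parameters whose name contains 'weight'
L2 = 0.
for name, p in model.named_parameters():
    if 'weight' in name:
        L2 = L2 + (p**2).sum()
# add the penalty to the cost, as in the previous example
cost = cost + 2./targets.size(0) * LAMBDA * L2
optimizer.zero_grad()
cost.backward()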
Effect of Norm Penalties on the Decision Boundary
Assume a nonlinear model
Dropout*
*Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way
to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.
http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
Dropout in a Nutshell: Dropping Nodes
[Figure: multilayer perceptron with inputs x1 and x2, two hidden layers with activations a^(1) and a^(2), outputs o1 to o3, and loss L(y, o), illustrating dropped nodes.]
Originally, drop probability 0.5 (but 0.2-0.8 also common now)
Dropout: Co-Adaptation Interpretation
Inverted Dropout
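Inverted dropout rescales the surviving activations by 1 / (1 - p) during training, so that no rescaling is needed at inference time. The following is a minimal pure-Python sketch of that idea, assuming a flat list of activations; the function name `inverted_dropout` and its parameters are illustrative and not taken from the lecture code.

```python
import random


def inverted_dropout(activations, drop_proba, training=True, rng=random):
    """Apply inverted dropout to a list of activations (illustrative helper)."""
    if not 0.0 <= drop_proba < 1.0:
        raise ValueError("drop_proba must be in [0, 1)")
    if not training or drop_proba == 0.0:
        # Inference: no masking and no rescaling
        return list(activations)
    keep_proba = 1.0 - drop_proba
    out = []
    for a in activations:
        if rng.random() < drop_proba:
            out.append(0.0)             # unit is dropped
        else:
            out.append(a / keep_proba)  # survivor is scaled up by 1 / (1 - p)
    return out


random.seed(0)
acts = [1.0] * 1000
train_out = inverted_dropout(acts, drop_proba=0.5)                   # zeros and 2.0s
eval_out = inverted_dropout(acts, drop_proba=0.5, training=False)    # unchanged
```

Because the survivors are scaled up, the expected value of each activation is the same in training and inference, which is why the inference path can return the input unchanged.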
Dropout in PyTorch (Functional API)
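import torch
import torch.nn.functional as F

class MultilayerPerceptron(torch.nn.Module):
    def __init__(self, num_features, num_classes, drop_proba, num_hidden_1, num_hidden_2):  # signature assumed
        super().__init__()
        self.drop_proba = drop_proba
        self.linear_1 = torch.nn.Linear(num_features, num_hidden_1)
        self.linear_2 = torch.nn.Linear(num_hidden_1, num_hidden_2)
        self.linear_out = torch.nn.Linear(num_hidden_2, num_classes)

    def forward(self, x):  # reconstructed (assumed); F.dropout needs the mode passed explicitly
        out = F.dropout(F.relu(self.linear_1(x)), p=self.drop_proba, training=self.training)
        out = F.dropout(F.relu(self.linear_2(out)), p=self.drop_proba, training=self.training)
        return self.linear_out(out)  # logits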
Dropout in PyTorch ([more] Object-Oriented API)
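import torch

class MultilayerPerceptron(torch.nn.Module):
    def __init__(self, num_features, num_classes, drop_proba, num_hidden_1, num_hidden_2):  # signature assumed
        super().__init__()
        self.my_network = torch.nn.Sequential(
            torch.nn.Linear(num_features, num_hidden_1),
            torch.nn.ReLU(),
            torch.nn.Dropout(drop_proba),
            torch.nn.Linear(num_hidden_1, num_hidden_2),
            torch.nn.ReLU(),
            torch.nn.Dropout(drop_proba),
            torch.nn.Linear(num_hidden_2, num_classes)
        )

    def forward(self, x):  # reconstructed (assumed)
        # torch.nn.Dropout modules follow model.train() / model.eval() automatically
        return self.my_network(x)  # logits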
Dropout in PyTorch
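Here, it is very important that you use model.train() and model.eval()!
for epoch in range(NUM_EPOCHS):
    model.train()  # training mode: dropout is active
    for batch_idx, (features, targets) in enumerate(train_loader):
        ### FORWARD AND BACK PROP (reconstructed; assumes the model returns logits)
        logits = model(features)
        cost = torch.nn.functional.cross_entropy(logits, targets)
        optimizer.zero_grad()
        cost.backward()
        minibatch_cost.append(cost.item())  # store a float, not the graph-attached tensor
        ### UPDATE MODEL PARAMETERS
        optimizer.step()

    model.eval()  # evaluation mode: dropout is disabled
    with torch.no_grad():
        cost = compute_loss(model, train_loader)
        epoch_cost.append(cost)
        print('Epoch: %03d/%03d Train Cost: %.4f' % (
            epoch+1, NUM_EPOCHS, cost))
        print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))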
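The effect of the two mode switches can be seen on a single layer. A minimal sketch, assuming a standalone `torch.nn.Dropout` layer and a toy all-ones input (both chosen for illustration, not taken from the lecture notebook):

```python
import torch

torch.manual_seed(0)
drop = torch.nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()          # training mode: units are zeroed at random
train_out = drop(x)   # survivors are scaled by 1 / (1 - p) = 2

drop.eval()           # evaluation mode: dropout acts as the identity
eval_out = drop(x)

print(train_out)
print(eval_out)
```

Calling `model.train()` or `model.eval()` on a whole model sets the same flag on every submodule, which is why forgetting `model.eval()` leaves dropout active while the loss is being evaluated.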
Dropout in PyTorch (Functional API)
https://github.com/rasbt/stat453-deep-learning-ss20/blob/master/L09-regularization/code/dropout.ipynb
Dropout: More Practical Tips
Lecture 10
Slides:
https://github.com/rasbt/stat453-deep-learning-ss20/10_norm-and-init/
"Tricks" for Improving
Deep Neural Network Training
Today:
1. Feature/Input Normalization
(BatchNorm, InstanceNorm, GroupNorm, LayerNorm)
2. Weight Initialization (Xavier Glorot, Kaiming He)
Next Lecture:
3. Optimization Algorithms (RMSProp, Adagrad, ADAM)
Part 1: Input Normalization
Recap: Why We Normalize Inputs for Gradient Descent
[Figure: two panels showing the surface of a convex cost function (convex for simplicity) over the weights w_1 and w_2, with the minimum marked]
$$x_j^{\prime\,[i]} = \frac{x_j^{[i]} - \mu_j}{\sigma_j}$$
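The standardization formula above can be sketched in a few lines. This is a minimal pure-Python illustration, assuming the population standard deviation (division by n) and a small toy dataset; the function name `standardize_columns` and the data are hypothetical.

```python
import math


def standardize_columns(X):
    """Standardize each feature (column) of X to zero mean and unit variance."""
    n = len(X)
    num_features = len(X[0])
    # Per-feature mean mu_j
    mu = [sum(row[j] for row in X) / n for j in range(num_features)]
    # Per-feature population standard deviation sigma_j (assumes sigma_j > 0)
    sigma = [math.sqrt(sum((row[j] - mu[j]) ** 2 for row in X) / n)
             for j in range(num_features)]
    X_std = [[(row[j] - mu[j]) / sigma[j] for j in range(num_features)]
             for row in X]
    return X_std, mu, sigma


# Toy data: two features on very different scales
X = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
X_std, mu, sigma = standardize_columns(X)
print(X_std)
```

After standardization both features live on the same scale, so a single learning rate suits both weight directions.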
However, normalizing the inputs to the network only affects the first hidden layer ...
What about the other hidden layers?
Batch Normalization ("BatchNorm")
Suppose we have the net input $z_1^{(2)}$, i.e., the net input of the first unit in layer 2.
[Figure: multilayer perceptron with inputs x_1, x_2; first hidden layer activations a_1^{(1)}, ..., a_5^{(1)}; second hidden layer activations a_1^{(2)}, ..., a_4^{(2)}; outputs o_1, o_2, o_3; loss L(y, o)]
Now, consider all examples in a minibatch such that the net input of a given training example at layer 2 is written as $z_1^{(2)[i]}$.
In the next slides, let's omit the layer index, as it may be distracting...
[Figure: multilayer perceptron diagram with inputs x_1, x_2, hidden activations a^{(1)} and a^{(2)}, and outputs o_1, o_2, o_3]
BatchNorm Step 1: Normalize Net Inputs
$$\mu_j = \frac{1}{n} \sum_i z_j^{[i]}$$
$$\sigma_j^2 = \frac{1}{n} \sum_i \left( z_j^{[i]} - \mu_j \right)^2$$
$$z_j^{\prime\,[i]} = \frac{z_j^{[i]} - \mu_j}{\sigma_j}$$
In practice:
$$z_j^{\prime\,[i]} = \frac{z_j^{[i]} - \mu_j}{\sqrt{\sigma_j^2 + \epsilon}}$$
where $\epsilon$ is a small constant added for numerical stability.
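Step 1 can be traced by hand for a single feature j. The sketch below is a minimal pure-Python version, assuming one list of net inputs per feature and a default of eps = 1e-5; the function name `batchnorm_normalize` and the toy values are illustrative.

```python
import math


def batchnorm_normalize(z, eps=1e-5):
    """Step 1 for one feature j: z holds one net input per minibatch example."""
    n = len(z)
    mu = sum(z) / n                                # minibatch mean mu_j
    var = sum((z_i - mu) ** 2 for z_i in z) / n    # minibatch variance sigma_j^2 (1/n)
    z_prime = [(z_i - mu) / math.sqrt(var + eps) for z_i in z]
    return z_prime, mu, var


z = [0.5, 2.0, -1.0, 4.5]          # toy net inputs for a minibatch of n = 4
z_prime, mu, var = batchnorm_normalize(z)
print(mu, var)
print(z_prime)

# eps keeps the division well-defined even if all net inputs are identical
constant_prime, _, _ = batchnorm_normalize([3.0, 3.0, 3.0])
```

The last call shows why eps is there: with identical net inputs the variance is zero, and without eps the normalization would divide by zero.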
BatchNorm Step 2: Pre-Activation Scaling
$$z_j^{\prime\,[i]} = \frac{z_j^{[i]} - \mu_j}{\sqrt{\sigma_j^2 + \epsilon}}$$
$$a_j^{\prime\,[i]} = \gamma_j \cdot z_j^{\prime\,[i]} + \beta_j$$
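Step 2 is a learnable scale and shift per feature. A minimal pure-Python sketch, assuming scalar parameters gamma and beta for a single feature j; the function name `batchnorm_scale_shift` and the numbers are illustrative.

```python
def batchnorm_scale_shift(z_prime, gamma, beta):
    """Step 2 for one feature j: a'[i] = gamma * z'[i] + beta."""
    return [gamma * z_i + beta for z_i in z_prime]


z_prime = [-1.0, 0.0, 1.0]   # already normalized net inputs
identity = batchnorm_scale_shift(z_prime, gamma=1.0, beta=0.0)
shifted = batchnorm_scale_shift(z_prime, gamma=2.0, beta=0.5)

# With gamma = sigma_j and beta = mu_j the layer can undo the normalization
z = [1.0, 3.0]                               # mu = 2, sigma = 1 (population std)
z_norm = [(v - 2.0) / 1.0 for v in z]
restored = batchnorm_scale_shift(z_norm, gamma=1.0, beta=2.0)
print(identity, shifted, restored)
```

Here gamma controls the spread and beta the mean of the pre-activations, so the network is free to learn whether, and how strongly, to keep the normalization.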
BatchNorm Step 1 & 2 Summarized
$$z_j^{\prime\,[i]} = \frac{z_j^{[i]} - \mu_j}{\sigma_j}, \qquad a_j^{\prime\,[i]} = \gamma_j \cdot z_j^{\prime\,[i]} + \beta_j$$
[Figure: the chain z_1 → z'_1 → a'_1 → a_1 for a unit in the first hidden layer]
Backpropagation for BatchNorm Parameters
Let's consider a simpler case ...
(Figure: computation graph for a single path through the network, with activations a_j^{(1)} and a_j^{(2)}, output o, target y, and loss l.)
\[
\frac{\partial l}{\partial w_j^{(3)}}
= \frac{\partial l}{\partial o} \cdot \frac{\partial o}{\partial w_j^{(3)}}
\]
\[
\frac{\partial l}{\partial w_j^{(2)}}
= \frac{\partial l}{\partial o} \cdot \frac{\partial o}{\partial a_j^{(2)}} \cdot \frac{\partial a_j^{(2)}}{\partial w_j^{(2)}}
\]
\[
\frac{\partial l}{\partial w_j^{(1)}}
= \frac{\partial l}{\partial o} \cdot \frac{\partial o}{\partial a_j^{(2)}} \cdot \frac{\partial a_j^{(2)}}{\partial a_j^{(1)}} \cdot \frac{\partial a_j^{(1)}}{\partial w_j^{(1)}}
\]
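The chain-rule products above can be checked numerically. The sketch below assumes a hypothetical network with one unit per layer, sigmoid activations, and a squared-error loss (none of these choices are specified on the slide). It computes the gradient for the first-layer weight as the product of the four local derivatives and compares it with a finite-difference estimate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2, w3):
    """One unit per layer: x -> a1 -> a2 -> o (sigmoid activations assumed)."""
    a1 = sigmoid(w1 * x)
    a2 = sigmoid(w2 * a1)
    o = sigmoid(w3 * a2)
    return a1, a2, o

def loss(o, y):
    # Squared error, chosen only for illustration.
    return (o - y) ** 2

def grad_w1(x, y, w1, w2, w3):
    """dl/dw1 = dl/do * do/da2 * da2/da1 * da1/dw1."""
    a1, a2, o = forward(x, w1, w2, w3)
    dl_do = 2.0 * (o - y)
    do_da2 = o * (1.0 - o) * w3
    da2_da1 = a2 * (1.0 - a2) * w2
    da1_dw1 = a1 * (1.0 - a1) * x
    return dl_do * do_da2 * da2_da1 * da1_dw1

x, y = 0.5, 1.0
w1, w2, w3 = 0.3, -0.8, 1.2

analytic = grad_w1(x, y, w1, w2, w3)

# Central finite difference as an independent check.
eps = 1e-6
l_plus = loss(forward(x, w1 + eps, w2, w3)[2], y)
l_minus = loss(forward(x, w1 - eps, w2, w3)[2], y)
numeric = (l_plus - l_minus) / (2 * eps)

print(analytic, numeric)
```

The two printed values should agree to several decimal places; the deeper the weight sits in the network, the more local derivatives enter the product.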
(Previously, we didn't write the net input explicitly in the computation graph.)
(Figure: computation graph with the net input z_j drawn explicitly before the activation a_j; target y and loss l as before.)
Adding a BatchNorm layer ...
(Figure: the computation graph with a BatchNorm layer inserted after the net input z_j.)
\[
{z'}_j^{(2)} = \frac{z_j^{(2)} - \mu_j}{\sigma_j},
\qquad
{a'}_j^{(2)} = \gamma_j \cdot {z'}_j^{(2)} + \beta_j
\]
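As a minimal sketch of the two BatchNorm equations, the NumPy snippet below standardizes a mini-batch of net inputs feature by feature and then applies the learnable scale and shift. The array names, the synthetic batch, and the small constant `eps` added for numerical stability are assumptions for illustration; the slide formula divides by σ_j directly.

```python
import numpy as np

def batchnorm_forward(z, gamma, beta, eps=1e-5):
    """z: (n, m) mini-batch of net inputs; gamma, beta: (m,) learnable parameters."""
    mu = z.mean(axis=0)                      # per-feature mean over the batch
    var = z.var(axis=0)                      # per-feature (biased) variance
    z_prime = (z - mu) / np.sqrt(var + eps)  # standardized net input z'
    a_prime = gamma * z_prime + beta         # scaled and shifted value a'
    return z_prime, a_prime

rng = np.random.default_rng(0)
z = rng.normal(loc=3.0, scale=2.0, size=(8, 4))  # synthetic net inputs
gamma = np.full(4, 1.5)
beta = np.full(4, -0.5)

z_prime, a_prime = batchnorm_forward(z, gamma, beta)
print(z_prime.mean(axis=0))  # per-feature mean of z'
print(a_prime.mean(axis=0))  # per-feature mean of a'
```

Because z' has zero mean per feature, the mean of a' is determined by β_j alone, and its spread by γ_j.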
Backprop for BatchNorm Parameters
\[
\frac{\partial l}{\partial \beta_j}
= \sum_{i=1}^{n} \frac{\partial l}{\partial {a'}_j^{(2)[i]}} \cdot \frac{\partial {a'}_j^{(2)[i]}}{\partial \beta_j}
= \sum_{i=1}^{n} \frac{\partial l}{\partial {a'}_j^{(2)[i]}}
\]
\[
\frac{\partial l}{\partial \gamma_j}
= \sum_{i=1}^{n} \frac{\partial l}{\partial {a'}_j^{(2)[i]}} \cdot \frac{\partial {a'}_j^{(2)[i]}}{\partial \gamma_j}
= \sum_{i=1}^{n} \frac{\partial l}{\partial {a'}_j^{(2)[i]}} \cdot {z'}_j^{(2)[i]}
\]
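The two sums over the n training examples in the mini-batch can be checked with a small numerical experiment. The sketch below assumes a toy loss l = ½ Σ (a')² for a single feature j (an arbitrary choice, so that ∂l/∂a' = a'), evaluates both parameter gradients from the sums, and compares them with finite differences.

```python
import numpy as np

def toy_loss(z_prime, gamma, beta):
    """Illustrative loss on a' = gamma * z' + beta for one feature (an assumption)."""
    a_prime = gamma * z_prime + beta
    return 0.5 * np.sum(a_prime ** 2)

rng = np.random.default_rng(1)
z = rng.normal(size=6)                 # net inputs for n = 6 examples
z_prime = (z - z.mean()) / z.std()     # standardized net inputs z'
gamma, beta = 1.3, 0.2

# Upstream gradient dl/da' for each example i (equals a' for this toy loss).
dl_da = gamma * z_prime + beta

grad_beta = np.sum(dl_da)              # sum_i dl/da'[i]
grad_gamma = np.sum(dl_da * z_prime)   # sum_i dl/da'[i] * z'[i]

# Central finite differences as an independent check.
eps = 1e-6
num_beta = (toy_loss(z_prime, gamma, beta + eps)
            - toy_loss(z_prime, gamma, beta - eps)) / (2 * eps)
num_gamma = (toy_loss(z_prime, gamma + eps, beta)
             - toy_loss(z_prime, gamma - eps, beta)) / (2 * eps)

print(grad_beta, num_beta)
print(grad_gamma, num_gamma)
```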
Backprop Beyond the BatchNorm Layer
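\[
\frac{\partial l}{\partial z_j^{(2)[i]}}
= \frac{\partial l}{\partial {z'}_j^{(2)[i]}} \cdot \frac{1}{\sigma_j}
+ \frac{\partial l}{\partial \mu_j} \cdot \frac{1}{n}
+ \frac{\partial l}{\partial \sigma_j^2} \cdot \frac{2\left(z_j^{(2)[i]} - \mu_j\right)}{n}
\]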
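The three terms correspond to the three ways z_j^{(2)[i]} influences the loss: directly through its own standardized value, through the batch mean, and through the batch variance. The sketch below checks the formula numerically for one feature. The expressions used for ∂l/∂μ_j and ∂l/∂σ_j² come from the standard BatchNorm derivation and are not shown in this excerpt; the toy loss and data are assumptions, and the stabilizing ε is omitted to match the slide.

```python
import numpy as np

def standardize(z):
    mu = z.mean()
    var = z.var()                      # biased variance: mean of squared deviations
    return (z - mu) / np.sqrt(var), mu, var

def toy_loss(z, c):
    """Illustrative loss l = sum_i c[i] * z'[i], so that dl/dz'[i] = c[i]."""
    z_prime, _, _ = standardize(z)
    return np.sum(c * z_prime)

rng = np.random.default_rng(2)
n = 5
z = rng.normal(size=n)
c = rng.normal(size=n)                 # plays the role of the upstream gradient dl/dz'

z_prime, mu, var = standardize(z)
sigma = np.sqrt(var)
dl_dzp = c

# Intermediate gradients (standard BatchNorm derivation, assumed here).
dl_dvar = np.sum(dl_dzp * (z - mu)) * (-0.5) * var ** (-1.5)
dl_dmu = np.sum(dl_dzp) * (-1.0 / sigma) + dl_dvar * np.mean(-2.0 * (z - mu))

# The three-term formula for dl/dz[i].
dl_dz = dl_dzp / sigma + dl_dmu / n + dl_dvar * 2.0 * (z - mu) / n

# Central finite differences as an independent check.
eps = 1e-6
numeric = np.zeros(n)
for i in range(n):
    step = np.zeros(n)
    step[i] = eps
    numeric[i] = (toy_loss(z + step, c) - toy_loss(z - step, c)) / (2 * eps)

print(np.max(np.abs(dl_dz - numeric)))
```

A useful sanity check is that the entries of `dl_dz` sum to approximately zero: shifting every input by the same constant leaves the standardized values, and therefore the loss, unchanged.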
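BatchNorm in PyTorch

import torch
import torch.nn.functional as F

class MultilayerPerceptron(torch.nn.Module):
    def __init__(self, num_features, num_classes):
        super(MultilayerPerceptron, self).__init__()
        # ... (layer definitions are not shown in this excerpt)

    def forward(self, x):
        # ... (first hidden layer is not shown in this excerpt)
        out = self.linear_2(out)
        out = self.linear_2_bn(out)  # BatchNorm before the activation
        out = F.relu(out)
        logits = self.linear_out(out)
        probas = F.softmax(logits, dim=1)
        return logits, probas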
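The listing above omits the layer definitions. A self-contained sketch of how such a model could be completed follows. The hidden-layer sizes (`num_hidden_1`, `num_hidden_2`), the `linear_1` and `linear_1_bn` attribute names, and the input shapes are assumptions chosen for illustration, not values from the lecture.

```python
import torch
import torch.nn.functional as F


class MultilayerPerceptron(torch.nn.Module):

    def __init__(self, num_features, num_classes,
                 num_hidden_1=16, num_hidden_2=8):  # hypothetical layer sizes
        super().__init__()
        self.linear_1 = torch.nn.Linear(num_features, num_hidden_1)
        self.linear_1_bn = torch.nn.BatchNorm1d(num_hidden_1)
        self.linear_2 = torch.nn.Linear(num_hidden_1, num_hidden_2)
        self.linear_2_bn = torch.nn.BatchNorm1d(num_hidden_2)
        self.linear_out = torch.nn.Linear(num_hidden_2, num_classes)

    def forward(self, x):
        out = self.linear_1(x)
        out = self.linear_1_bn(out)   # BatchNorm on the net input, before ReLU
        out = F.relu(out)
        out = self.linear_2(out)
        out = self.linear_2_bn(out)
        out = F.relu(out)
        logits = self.linear_out(out)
        probas = F.softmax(logits, dim=1)
        return logits, probas


torch.manual_seed(0)
model = MultilayerPerceptron(num_features=20, num_classes=3)
x = torch.randn(32, 20)               # random mini-batch, for shape checking only
logits, probas = model(x)
print(logits.shape, probas.shape)
```

Each `BatchNorm1d` layer takes the number of features produced by the preceding linear layer, since it keeps one γ and one β per feature.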
BatchNorm During Prediction ("Inference")
• Alternatively, the global training set mean and variance can also be used
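For context on prediction-time behavior: PyTorch's `torch.nn.BatchNorm1d` by default tracks running estimates of the mean and variance during training and uses those stored estimates in evaluation mode. A minimal sketch with synthetic data follows; the feature count, batch size, and data distribution are assumptions for illustration.

```python
import torch

torch.manual_seed(0)
bn = torch.nn.BatchNorm1d(num_features=3)   # default momentum is 0.1

# Training mode: normalize with mini-batch statistics and update running estimates.
bn.train()
for _ in range(200):
    batch = 5.0 + 2.0 * torch.randn(64, 3)  # synthetic features: mean 5, std 2
    bn(batch)

print(bn.running_mean)  # running estimate of the per-feature mean
print(bn.running_var)   # running estimate of the per-feature variance

# Evaluation mode: the stored running statistics are used instead of
# batch statistics, so even a single example can be normalized.
bn.eval()
with torch.no_grad():
    single = torch.full((1, 3), 5.0)
    out = bn(single)
print(out)
```

Forgetting to call `model.eval()` before prediction is a common source of inconsistent results, because the output for one example would then depend on the other examples in the batch.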
BatchNorm Variants
Pre-Activation: "Original" version, as discussed in previous slides
Post-Activation: May make more sense, but less common
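The two variants differ only in where the BatchNorm layer sits relative to the activation function. The sketch below builds both orderings for a single hidden layer; the helper name `make_block`, the layer sizes, and the choice of ReLU are assumptions for illustration.

```python
import torch

def make_block(in_features, out_features, bn_before_activation=True):
    """Hypothetical helper: one hidden layer with BatchNorm before or after ReLU."""
    linear = torch.nn.Linear(in_features, out_features)
    bn = torch.nn.BatchNorm1d(out_features)
    relu = torch.nn.ReLU()
    if bn_before_activation:
        layers = [linear, bn, relu]   # pre-activation ("original") ordering
    else:
        layers = [linear, relu, bn]   # post-activation ordering
    return torch.nn.Sequential(*layers)

torch.manual_seed(0)
x = torch.randn(16, 10)
pre = make_block(10, 4, bn_before_activation=True)
post = make_block(10, 4, bn_before_activation=False)

out_pre = pre(x)    # non-negative, because ReLU is applied last
out_post = post(x)  # zero mean per feature, because BatchNorm is applied last
print(out_pre.min().item(), out_post.mean(dim=0))
```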
Some Benchmarks
https://github.com/ducha-aiki/caffenet-benchmark/blob/master/batchnorm.md#bn----before-or-after-relu
Other Normalization Methods for Hidden Activations
(Figure 2 from Wu and He (2018): four feature map tensors, each drawn with axes N, C, and (H, W).)
Figure 2. Normalization methods. Each subplot shows a feature map tensor. The
pixels in blue are normalized by the same mean and variance, computed by aggregating
the values of these pixels. Group Norm is illustrated using a group number of 2.
Wu, Y., & He, K. (2018). Group normalization. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 3-19).
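The caption of Figure 2 says that each method differs only in which values share a mean and variance. A sketch of that idea on a toy tensor with layout (N, C, H, W); the tensor sizes and the helper `standardize` are assumptions for illustration, and the learnable scale and shift parameters are left out.

```python
import torch

torch.manual_seed(0)
N, C, H, W = 8, 6, 5, 5           # toy feature-map tensor, layout (N, C, H, W)
G = 2                             # number of groups, as in the figure
x = torch.randn(N, C, H, W)


def standardize(t, dims, eps=1e-5):
    """Subtract the mean and divide by the std computed over `dims`."""
    mean = t.mean(dim=dims, keepdim=True)
    var = ((t - mean) ** 2).mean(dim=dims, keepdim=True)
    return (t - mean) / torch.sqrt(var + eps)


batch_norm = standardize(x, dims=(0, 2, 3))       # per channel, over N, H, W
layer_norm = standardize(x, dims=(1, 2, 3))       # per sample, over C, H, W
instance_norm = standardize(x, dims=(2, 3))       # per sample and channel, over H, W
group_norm = standardize(                         # per sample and channel group
    x.view(N, G, C // G, H, W), dims=(2, 3, 4)
).view(N, C, H, W)

# Cross-check against PyTorch's built-in layer (its affine parameters start
# at scale 1 and shift 0, so it should match the manual computation)
reference = torch.nn.GroupNorm(num_groups=G, num_channels=C)(x)
print(torch.allclose(group_norm, reference, atol=1e-5))
```

Only Batch Norm aggregates over the batch axis N, which is why the other three methods do not depend on the batch size.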
Group-wise computation. Group convolutions have been presented by AlexNet [28] for distributing a model into two GPUs. The concept of groups as a dimension for model design has been more widely studied recently. The work of ResNeXt [7] investigates the trade-off between depth, width, and groups, and it suggests that a larger number of groups can improve accuracy under similar computational cost. MobileNet [38] and Xception [39] exploit channel-wise (also called “depth-wise”) convolutions, which are group convolutions with a group number equal to the channel number. ShuffleNet [40] proposes a channel shuffle operation that permutes the axes of grouped features. These methods all involve dividing the channel dimension into groups. Despite the relation to these ...
(will revisit after introducing Convolutional Neural Networks)
Part 2: Weight Initialization
Sidenote: Vanishing/Exploding Gradient Problems
[Figure: a multilayer perceptron with inputs $x_1, x_2$; first-hidden-layer activations $a_1^{(1)}, a_2^{(1)}, a_3^{(1)}$; second-hidden-layer activations $a_1^{(2)}, a_2^{(2)}$; output $o$ and target $y$ with loss $L(y, o) = l$; labeled weights include $w_{1,1}^{(1)}, w_{1,2}^{(1)}, w_{1,3}^{(1)}, w_{1,1}^{(2)}, w_{3,1}^{(2)}, w_{1,1}^{(3)}$]
Sidenote: Vanishing/Exploding Gradient Problems
$$\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \frac{d}{dz}\sigma(z) = \frac{e^{-z}}{(1 + e^{-z})^2} = \sigma(z)\,(1 - \sigma(z))$$
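A short numerical check of the logistic sigmoid derivative, $\sigma(z)(1 - \sigma(z))$, and of why it matters for vanishing gradients. The sketch looks only at the activation-derivative factors; in a real network the weights also enter the chain-rule product, so this is an illustration under that simplifying assumption.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)


# Check the closed form against a central finite difference
z, h = 0.7, 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
print(abs(numeric - sigmoid_grad(z)))

# The derivative peaks at z = 0 with value 0.25, so a product of such
# factors (one per layer) shrinks at least geometrically with depth
max_grad = sigmoid_grad(0.0)
for num_layers in (1, 5, 10, 20):
    print(num_layers, max_grad ** num_layers)
```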
Weight Initialization
Custom Weight Initialization in PyTorch
class MLP(torch.nn.Module):
    def __init__(self, num_classes):  # other arguments and layers elided
        super().__init__()
        self.num_classes = num_classes
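Only a fragment of the class survives on this slide. One common way to apply a custom initialization is to loop over the submodules at the end of `__init__` and overwrite the default weights in place. The sketch below assumes a hypothetical two-layer architecture and a small Gaussian initialization (`std=0.01`); neither is taken from the slide.

```python
import torch


class MLP(torch.nn.Module):
    def __init__(self, num_features, num_hidden, num_classes):
        super().__init__()
        self.num_classes = num_classes
        # Hypothetical architecture, for illustration only
        self.network = torch.nn.Sequential(
            torch.nn.Linear(num_features, num_hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(num_hidden, num_classes),
        )
        # Custom initialization: overwrite PyTorch's defaults in place
        for m in self.modules():
            if isinstance(m, torch.nn.Linear):
                torch.nn.init.normal_(m.weight, mean=0.0, std=0.01)
                torch.nn.init.zeros_(m.bias)

    def forward(self, x):
        return self.network(x)


torch.manual_seed(0)
model = MLP(num_features=784, num_hidden=50, num_classes=10)
first_layer = model.network[0]
print(first_layer.weight.std().item(), first_layer.bias.abs().sum().item())
```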
Weight Initialization -- Xavier Initialization
Glorot, Xavier, and Yoshua Bengio. "Understanding the difficulty of training deep feedforward neural networks."
Proceedings of the thirteenth international conference on artificial intelligence and statistics. 2010.
Method:
Step 1: Initialize the weights from a Gaussian or uniform distribution (see previous slide)
Step 2: Scale the weights according to the number of inputs to the layer, multiplying by $\sqrt{1/m^{[l-1]}}$, where $m^{[l-1]}$ is the number of inputs to layer $l$
(For the first hidden layer, that is the number of features in the dataset; for the second hidden layer, that is the number of units in the first hidden layer, etc.)
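The two steps can be sketched in plain Python to see what the scaling buys. The example assumes standardized inputs, standard-normal weights in Step 1, and a toy layer width `m`; it compares the variance of the net inputs with and without the factor $\sqrt{1/m}$.

```python
import math
import random

random.seed(0)


def linear_forward(x, fan_out, scale):
    """Net inputs z = W x, with W drawn from N(0, 1) and multiplied by `scale`."""
    return [
        sum(random.gauss(0.0, 1.0) * scale * x_j for x_j in x)
        for _ in range(fan_out)
    ]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


m = 300                                            # units per layer (toy value)
x = [random.gauss(0.0, 1.0) for _ in range(m)]     # standardized inputs

# Step 1 only: unscaled weights, so the variance grows by a factor of about m
z_unscaled = linear_forward(x, fan_out=m, scale=1.0)
# Step 1 + Step 2: scaling by sqrt(1 / m) keeps the variance close to that of x
z_scaled = linear_forward(x, fan_out=m, scale=math.sqrt(1.0 / m))

print(variance(x), variance(z_unscaled), variance(z_scaled))
```

Keeping the variance of the net inputs roughly constant from layer to layer is what prevents tanh or sigmoid units from being pushed into their saturated, small-gradient regions at the start of training.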
Xavier Initialization in PyTorch
Semi-Automatic:
...
self.linear = torch.nn.Linear(...)
torch.nn.init.xavier_uniform_(self.linear.weight)
...
model.apply(weights_init)
...
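The `model.apply(weights_init)` call on the slide relies on a function that is not shown. A minimal sketch of what such a function might look like, assuming a hypothetical toy model made of `Linear` layers and zero-initialized biases:

```python
import torch


def weights_init(m):
    # model.apply(...) calls this once for every submodule
    if isinstance(m, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(m.weight)
        torch.nn.init.zeros_(m.bias)


# Hypothetical toy model; the layer sizes are illustrative only
model = torch.nn.Sequential(
    torch.nn.Linear(100, 50),
    torch.nn.Tanh(),
    torch.nn.Linear(50, 10),
)
torch.manual_seed(0)
model.apply(weights_init)

# xavier_uniform_ samples from U(-a, a) with a = sqrt(6 / (fan_in + fan_out))
bound = (6.0 / (100 + 50)) ** 0.5
print(model[0].weight.abs().max().item(), bound)
```

Note that `torch.nn.init.xavier_uniform_` uses both the number of inputs and the number of outputs of the layer, whereas the simplified scaling rule in these slides uses the number of inputs only.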
Weight Initialization -- Xavier Initialization
$$W^{(l)} := W^{(l)} \cdot \sqrt{\frac{1}{m^{[l-1]}}}$$
Weight Initialization -- Xavier Initialization
Glorot, Xavier, and Yoshua Bengio. "Understanding the difficulty of training deep feedforward neural networks."
Proceedings of the thirteenth international conference on artificial intelligence and statistics. 2010.
[Excerpt from the paper, cropped on the slide; it includes Eq. (17), which involves $\partial z^{i+1} / \partial z^{i}$. Surviving text fragments: "... ve the same dimension, the av- ... onds to the average ratio of in- ... d from $z^i$ to $z^{i+1}$, as well as ... vation variance going from $z^i$ ... zed initialization, this ratio is ... standard initialization, it drops ..."]
Figure 7: Back-propagated gradients normalized histograms with hyperbolic tangent activation, with standard (top) vs normalized (bottom) initialization. Top: 0-peak decreases for higher layers.
What was initially really surprising is that even when the back-propagated gradients become smaller (standard initialization), the variance of the weights gradients is roughly constant across layers, as shown on Figure 8. However, this ...
Weight Initialization -- He Initialization
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Delving deep into rectifiers: Surpassing human-level
performance on imagenet classification." In Proceedings of the IEEE international conference on computer vision, pp.
1026-1034. 2015.
• For ReLU, the assumption behind Xavier initialization no longer holds, as the activations are not centered at zero anymore
• He initialization takes this into account (see the paper for the derivation)
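A plain-Python sketch of the effect, assuming Gaussian weights, a toy width `m`, and a toy depth. It pushes the same input through a stack of ReLU layers twice: once with the Xavier-style scale $\sqrt{1/m}$ used in these slides, and once with the He scale $\sqrt{2/m}$ from the paper, where `m` is the number of inputs to the layer.

```python
import math
import random

random.seed(0)


def relu_layer(a, fan_out, std):
    """One layer: z = W a with W drawn from N(0, std^2), followed by ReLU."""
    return [
        max(0.0, sum(random.gauss(0.0, std) * a_j for a_j in a))
        for _ in range(fan_out)
    ]


def mean_square(values):
    return sum(v * v for v in values) / len(values)


m, num_layers = 200, 6                       # toy width and depth (assumptions)
a_xavier = a_he = [random.gauss(0.0, 1.0) for _ in range(m)]

for _ in range(num_layers):
    a_xavier = relu_layer(a_xavier, m, math.sqrt(1.0 / m))   # Xavier-style scale
    a_he = relu_layer(a_he, m, math.sqrt(2.0 / m))           # He scale

# ReLU zeroes out roughly half of the units, so with the Xavier-style scale
# the signal roughly halves per layer; the extra factor of 2 compensates
print(mean_square(a_xavier), mean_square(a_he))
```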
PyTorch Default Weights
PyTorch uses the following scheme by default, which is somewhat similar to Xavier initialization and works reasonably well in practice most of the time
https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/linear.py#L148
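A quick way to inspect the default is to create a layer and look at the range of its parameters. According to the `torch.nn.Linear` documentation, weights and biases are drawn from a uniform distribution on $[-1/\sqrt{m}, 1/\sqrt{m}]$, where $m$ is the number of input features; the exact implementation may differ between PyTorch versions, and the layer sizes below are illustrative.

```python
import math

import torch

torch.manual_seed(0)
fan_in, fan_out = 400, 100            # illustrative layer sizes
layer = torch.nn.Linear(fan_in, fan_out)

# Documented default range for both weight and bias
bound = 1.0 / math.sqrt(fan_in)
max_weight = layer.weight.abs().max().item()
max_bias = layer.bias.abs().max().item()
print(max_weight, max_bias, bound)
```

Like the Xavier rule in these slides, the range shrinks with the square root of the number of inputs, which is where the similarity comes from.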
PyTorch Default Weights
https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/conv.py#L55
Note that if BatchNorm is used,
the choice of initial weights is less important anyway
Hands-on Fashion-MNIST
Exercise in pairs:
● Output: A report of ~3 pages describing a solution to the self-sorting
wardrobe problem using the Fashion-MNIST dataset, to be delivered in 3
weeks.
○ Describe the problem, analysis, and results obtained, following the Data Science
process cycle (see file hands-on Excercise.pptx, slides 4-5)
● Read/run the suggested tutorial using Google Colab
○ https://www.tensorflow.org/tutorials/keras/classification?hl=pt-br
○ Try other neural network architectures
● Read/run the tutorial Self sorting wardrobe using Peltarion platform
○ https://peltarion.com/knowledge-center/documentation/tutorials/self-sorting-wardrobe
○ Optional