Conv2d Padding, Strides
Convolution: zero padding
Half padding
Input image (4×4):
3 1 0 1
1 1 2 0
1 2 2 1
0 1 0 2

Filter’s kernel (3×3):
1 0 2
1 2 0
0 1 1

Input image, CONV with the filter’s kernel under half padding → Output = ?

padding size = ⌊3/2⌋ × ⌊3/2⌋ = 1 × 1
The input is enlarged by one pixel on the left, right, top, and bottom.
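A minimal sketch of the zero-padding step for the 4×4 input above. The helper name `zero_pad` and its argument order are assumptions made for illustration and are not part of the slides; the image is stored as a plain list of rows.

```python
def zero_pad(image, pad_rows, pad_cols):
    """Return a copy of `image` (a list of rows) surrounded by zeros."""
    width = len(image[0]) + 2 * pad_cols
    padded = [[0] * width for _ in range(pad_rows)]        # top border
    for row in image:
        padded.append([0] * pad_cols + list(row) + [0] * pad_cols)
    padded += [[0] * width for _ in range(pad_rows)]       # bottom border
    return padded


image = [
    [3, 1, 0, 1],
    [1, 1, 2, 0],
    [1, 2, 2, 1],
    [0, 1, 0, 2],
]
k1 = k2 = 3                    # kernel size from the slide
p1, p2 = k1 // 2, k2 // 2      # half padding: floor(k / 2) = 1 in each direction

padded = zero_pad(image, pad_rows=p2, pad_cols=p1)
for row in padded:
    print(*row)                # 6 rows of 6 values; the outer ring is all zeros
```

Integer division (`//`) plays the role of the floor brackets in the padding-size formula.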
padding size = ⌊3/2⌋ × ⌊3/2⌋ = 1 × 1

Padded image (6×6):
0 0 0 0 0 0
0 3 1 0 1 0
0 1 1 2 0 0
0 1 2 2 1 0
0 0 1 0 2 0
0 0 0 0 0 0

Rotated kernel (3×3):
1 1 0
0 2 1
2 0 1

Output size = ?
The first valid position: the rotated kernel sits in the top-left corner of the padded image, so its top row meets only padding zeros (1×0, 1×0, 0×0).
The last valid position (in the horizontal direction): the rotated kernel sits in the top-right corner of the padded image.
From the first to the last valid position the kernel takes i1 positions horizontally and, likewise, i2 positions vertically, so the output is i1 × i2.
Half padding (with unit stride): input and output have the same size (for odd kernel sizes such as the 3×3 kernel here).
Input image (i1 × i2), CONV with the filter’s kernel (k1 × k2) under half padding → Output of size i1 × i2
Example: input 4×4, kernel 3×3 → output 4×4

Step-by-step computation (see next slides)
Step 1: rotate the kernel by 180°, then flatten it to a vector

Filter’s kernel:
1 0 2
1 2 0
0 1 1

Rotated by 180°:
1 1 0
0 2 1
2 0 1

Flatten to vector:
1 1 0 0 2 1 2 0 1
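The rotation and flattening can be sketched in a few lines. The function names `rotate_180` and `flatten` are illustrative choices, not names used in the slides.

```python
def rotate_180(kernel):
    """Rotate a 2-D kernel by 180 degrees: reverse the row order, then each row."""
    return [row[::-1] for row in kernel[::-1]]


def flatten(matrix):
    """Concatenate the rows of a 2-D list into one vector (row-major order)."""
    return [value for row in matrix for value in row]


kernel = [
    [1, 0, 2],
    [1, 2, 0],
    [0, 1, 1],
]

rotated = rotate_180(kernel)
weights = flatten(rotated)

for row in rotated:
    print(*row)
print(*weights)    # 1 1 0 0 2 1 2 0 1
```

Rotating first is what turns the sliding dot product (a cross-correlation) into a true convolution.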
Step 2: zero-pad the input image
padding size = ⌊3/2⌋ × ⌊3/2⌋ = 1 × 1
Input image (4×4) → padded image (6×6): one ring of zeros is added around the input.

Step 3: start the cross-correlation process
The rotated kernel slides over the padded image, beginning at the top-left corner.
Step 4: at each position, collect the sub-image, flatten it, and take its dot product with the flattened rotated kernel (1 1 0 0 2 1 2 0 1). Sub-images that reach past the input border include the padding zeros.

Output position (row, col)   Flattened sub-image   Dot product
(1, 1)                       0 0 0 0 3 1 0 1 1     8
(1, 2)                       0 0 0 3 1 0 1 1 2     6
(1, 3)                       0 0 0 1 0 1 1 2 0     3
(1, 4)                       0 0 0 0 1 0 2 0 0     6
(2, 1)                       0 3 1 0 1 1 0 1 2     8
(2, 2)                       3 1 0 1 1 2 1 2 2     12
(2, 3)                       1 0 1 1 2 0 2 2 1     10
(2, 4)                       0 1 0 2 0 0 2 1 0     5
(3, 1)                       0 1 1 0 1 2 0 0 1     6
(3, 2)                       1 1 2 1 2 2 0 1 0     8
(3, 3)                       1 2 0 2 2 1 1 0 2     12
(3, 4)                       2 0 0 2 1 0 0 2 0     4
(4, 1)                       0 1 2 0 0 1 0 0 0     2
(4, 2)                       1 2 2 0 1 0 0 0 0     5
(4, 3)                       2 2 1 1 0 2 0 0 0     6
(4, 4)                       2 1 0 0 2 0 0 0 0     7
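The four steps on these slides (rotate and flatten the kernel, zero-pad the input, slide the kernel, take a dot product per sub-image) can be sketched as one function. The name `conv2d_half_padding` is an assumption for illustration, and the sketch assumes an odd kernel size and unit stride, as in the slides.

```python
def conv2d_half_padding(image, kernel):
    """Convolve `image` with `kernel` using half (same) zero padding, stride 1."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    p_rows, p_cols = k_rows // 2, k_cols // 2

    # Step 1: rotate the kernel by 180 degrees and flatten it.
    weights = [v for row in kernel[::-1] for v in row[::-1]]

    # Step 2: surround the image with zeros.
    width = len(image[0]) + 2 * p_cols
    padded = ([[0] * width for _ in range(p_rows)]
              + [[0] * p_cols + list(row) + [0] * p_cols for row in image]
              + [[0] * width for _ in range(p_rows)])

    # Steps 3 and 4: slide over every position, flatten the sub-image,
    # and take its dot product with the flattened rotated kernel.
    output = []
    for r in range(len(image)):
        out_row = []
        for c in range(len(image[0])):
            sub_image = [padded[r + dr][c + dc]
                         for dr in range(k_rows) for dc in range(k_cols)]
            out_row.append(sum(x * w for x, w in zip(sub_image, weights)))
        output.append(out_row)
    return output


image = [[3, 1, 0, 1], [1, 1, 2, 0], [1, 2, 2, 1], [0, 1, 0, 2]]
kernel = [[1, 0, 2], [1, 2, 0], [0, 1, 1]]

result = conv2d_half_padding(image, kernel)
for row in result:
    print(*row)
```

For the slide's input and kernel this prints the 4×4 output built up step by step above, starting with the row 8 6 3 6.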
Final result

Input image (4×4), CONV with the filter’s kernel (3×3) under half padding → Output (4×4):
8 6 3 6
8 12 10 5
6 8 12 4
2 5 6 7
Convolution: zero padding
Full padding
Dr. Thanh-Sach LE
[email protected]
Same input image (4×4) and filter’s kernel (3×3) as in the half-padding example.
Input image, CONV with the filter’s kernel under full padding → Output = ?

padding size = (3 − 1) × (3 − 1) = 2 × 2
The input is enlarged by two pixels on the left, right, top, and bottom.
Padded image (8×8):
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 3 1 0 1 0 0
0 0 1 1 2 0 0 0
0 0 1 2 2 1 0 0
0 0 0 1 0 2 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Rotated kernel (3×3):
1 1 0
0 2 1
2 0 1

Output size = ?

The first valid position: only the bottom-right cell of the rotated kernel overlaps the input (the top-left input pixel).
The last valid position (in the horizontal direction): only the bottom-left cell of the rotated kernel overlaps the input (the top-right input pixel).
Number of valid positions in the horizontal direction: i1 + (k1 − 1)
Input image (i1 × i2), CONV with the filter’s kernel (k1 × k2) under full padding → Output of size (i1 + (k1 − 1)) × (i2 + (k2 − 1))
Example: input 4×4, kernel 3×3 → output 6×6
Final result

Input image (4×4), CONV with the filter’s kernel (3×3) under full padding → Output (6×6):
3 1 6 3 0 2
4 8 6 3 6 0
2 8 12 10 5 3
1 6 8 12 4 4
0 2 5 6 7 1
0 0 1 1 2 2
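As a cross-check of the 6×6 result, the sketch below uses the textbook definition of 2-D convolution instead of the pad, rotate, and slide procedure of the slides: every input pixel adds a scaled copy of the unrotated kernel to the output. The two formulations are equivalent. The function name `conv2d_full` is an illustrative assumption.

```python
def conv2d_full(image, kernel):
    """Full 2-D convolution: out[a + m][b + n] += image[a][b] * kernel[m][n]."""
    i_rows, i_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    # Output size is (i + k - 1) in each direction, as on the slide.
    output = [[0] * (i_cols + k_cols - 1) for _ in range(i_rows + k_rows - 1)]
    for a in range(i_rows):
        for b in range(i_cols):
            for m in range(k_rows):
                for n in range(k_cols):
                    output[a + m][b + n] += image[a][b] * kernel[m][n]
    return output


image = [[3, 1, 0, 1], [1, 1, 2, 0], [1, 2, 2, 1], [0, 1, 0, 2]]
kernel = [[1, 0, 2], [1, 2, 0], [0, 1, 1]]

full = conv2d_full(image, kernel)
for row in full:
    print(*row)

# The central 4x4 block of the full result is the half-padding result.
center = [row[1:-1] for row in full[1:-1]]
```

Cropping one ring off the full output recovers the half-padding output, which shows that the two padding modes differ only in how much of the border is kept.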
✴ By default, the kernel moves to the right and down with a stride of 1 unit.
✴ A stride > 1 can be used to reduce the size of the output feature map, which reduces the computation in the next layers.
Convolution: non-unit strides

Padded image (8 pixels wide under full padding), rotated kernel (3×3), stride = 2
1st valid position: the kernel covers columns 1 to 3 of the padded image
2nd valid position: the kernel covers columns 3 to 5
3rd valid position: the kernel covers columns 5 to 7
input size: i1 × i2, kernel size: k1 × k2
padding: p1 on the left and right, p2 on the top and bottom
stride on x = s1, stride on y = s2

number of valid positions on x = ⌊(i1 + 2p1 − k1) / s1⌋ + 1
number of valid positions on y = ⌊(i2 + 2p2 − k2) / s2⌋ + 1
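The output-size formula translates directly into code. The function name `conv_output_size` and the error check are assumptions added for illustration; Python's `//` operator provides the floor.

```python
def conv_output_size(i, k, p, s=1):
    """Number of valid kernel positions along one axis.

    i: input size, k: kernel size, p: padding on each side, s: stride.
    """
    if i + 2 * p < k:
        raise ValueError("kernel does not fit inside the padded input")
    return (i + 2 * p - k) // s + 1


# Cases from the slides, for a 4-pixel input and a 3-pixel kernel:
half = conv_output_size(4, 3, p=3 // 2)           # half padding, unit stride
full = conv_output_size(4, 3, p=3 - 1)            # full padding, unit stride
strided = conv_output_size(4, 3, p=3 - 1, s=2)    # full padding, stride 2

print(half, full, strided)    # 4 6 3
```

The same function is applied once per axis, with (i1, k1, p1, s1) for x and (i2, k2, p2, s2) for y.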
Example:
i1 = i2 = 4
k1 = k2 = 3
full padding: p1 = p2 = 2
non-unit strides: s1 = s2 = 2

⌊(i1 + 2p1 − k1) / s1⌋ + 1 = ⌊(4 + 4 − 3) / 2⌋ + 1 = 3
⌊(i2 + 2p2 − k2) / s2⌋ + 1 = ⌊(4 + 4 − 3) / 2⌋ + 1 = 3

Output size: 3 × 3
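A minimal sketch of the strided case for this example. The function `conv2d` below, with a single symmetric `padding` value and a single `stride` for both axes, is a simplification assumed for illustration; the slides allow different values per axis.

```python
def conv2d(image, kernel, padding=0, stride=1):
    """Convolution with symmetric zero padding and the same stride on x and y."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    # Rotate the kernel by 180 degrees and flatten it.
    weights = [v for row in kernel[::-1] for v in row[::-1]]

    # Zero-pad the image on all four sides.
    width = len(image[0]) + 2 * padding
    padded = ([[0] * width for _ in range(padding)]
              + [[0] * padding + list(row) + [0] * padding for row in image]
              + [[0] * width for _ in range(padding)])

    # Number of valid positions: floor((i + 2p - k) / s) + 1 on each axis.
    out_rows = (len(padded) - k_rows) // stride + 1
    out_cols = (width - k_cols) // stride + 1

    output = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            top, left = r * stride, c * stride    # the kernel jumps `stride` pixels
            window = [padded[top + dr][left + dc]
                      for dr in range(k_rows) for dc in range(k_cols)]
            out_row.append(sum(x * w for x, w in zip(window, weights)))
        output.append(out_row)
    return output


image = [[3, 1, 0, 1], [1, 1, 2, 0], [1, 2, 2, 1], [0, 1, 0, 2]]
kernel = [[1, 0, 2], [1, 2, 0], [0, 1, 1]]

strided = conv2d(image, kernel, padding=2, stride=2)   # full padding, stride 2
for row in strided:
    print(*row)
```

With full padding and a stride of 2 the output has 3 rows and 3 columns, matching the formula; its values are every second row and column of the unit-stride full-padding output.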