Densely Connected Convolutional Networks (DenseNet)

CVPR 2017

Presented by: Joshua Juste NIKIEMA

Original Authors: Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger

August 4, 2025

Outline

1 Abstract

2 Architecture

3 Tasks and Applications

4 Baselines and Comparisons

5 Metrics, Results, and Datasets

6 Future Work

Abstract

Problem: Traditional CNNs suffer from the vanishing-gradient problem as they get deeper
Key Insight: Networks can be substantially deeper, more accurate, and more efficient to train when they contain shorter connections between layers
Solution: DenseNet connects each layer to every other layer in a feed-forward fashion
Connections: A traditional L-layer network has L connections; DenseNet has L(L+1)/2 direct connections (see the sketch after this list)
Benefits:
Alleviates vanishing-gradient problem
Strengthens feature propagation
Encourages feature reuse
Substantially reduces parameters
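
As a quick illustration of this connection count (a plain-Python sketch; the chosen depths are arbitrary examples):

```python
# Direct connections in an L-layer dense block vs. a traditional L-layer chain.
def dense_connections(L: int) -> int:
    return L * (L + 1) // 2  # every layer is wired to all preceding feature-maps

for L in (5, 12, 100):
    print(f"L={L}: traditional={L}, dense={dense_connections(L)}")
# L=5: traditional=5, dense=15
# L=12: traditional=12, dense=78
# L=100: traditional=100, dense=5050
```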

DenseNet Architecture - Core Concept

Dense Connectivity Pattern:


Each layer receives feature-maps
from ALL preceding layers
Each layer passes its
feature-maps to ALL subsequent
layers
Features combined by
concatenation (not summation
like ResNet)
Mathematical Formulation: xℓ = Hℓ([x0, x1, ..., xℓ−1])

Figure: 5-layer dense block with growth rate k = 4
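
A minimal PyTorch sketch of this connectivity (an illustrative re-implementation, not the authors' reference code; the class name, channel counts, and the plain BN-ReLU-Conv layer are assumptions, but the concatenation mirrors the formula above):

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Toy dense block: each layer consumes the concatenation of all earlier feature-maps."""
    def __init__(self, num_layers, in_channels, growth_rate):
        super().__init__()
        self.layers = nn.ModuleList()
        for l in range(num_layers):
            # Layer l sees in_channels + l * growth_rate input feature-maps.
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_channels + l * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + l * growth_rate, growth_rate,
                          kernel_size=3, padding=1, bias=False),
            ))

    def forward(self, x):
        features = [x]                                # x0
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))   # H_l([x0, x1, ..., x_{l-1}])
            features.append(out)                      # x_l is reused by all later layers
        return torch.cat(features, dim=1)

# Example: a 5-layer block with growth rate k = 4, as in the figure.
block = DenseBlock(num_layers=5, in_channels=16, growth_rate=4)
y = block(torch.randn(1, 16, 32, 32))
print(y.shape)  # torch.Size([1, 36, 32, 32]) -> 16 + 5 * 4 channels
```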

DenseNet Architecture - Key Components

Dense Blocks: Groups of densely connected layers


Transition Layers: Between blocks for down-sampling
Batch Normalization + 1×1 Conv + 2×2 Average Pooling
Composite Function Hℓ : BN → ReLU → 3×3 Conv
Growth Rate (k): Number of feature-maps each layer produces
Layer ℓ has k0 + k × (ℓ − 1) input feature-maps
Small growth rates (k=12) are sufficient
Architecture Variants:
DenseNet-B: With bottleneck layers (1×1 conv before 3×3)
DenseNet-C: With compression (θ < 1 compression factor)
DenseNet-BC: Both bottleneck and compression
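
A sketch of how the bottleneck composite function (DenseNet-B) and a compressed transition layer (DenseNet-C) could be written in PyTorch; the 4k bottleneck width and θ = 0.5 follow the paper's choices, while the class names are illustrative:

```python
import torch
import torch.nn as nn

class BottleneckLayer(nn.Module):
    """DenseNet-B layer: BN-ReLU-1x1 Conv (4k maps), then BN-ReLU-3x3 Conv (k maps)."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        inter = 4 * growth_rate  # bottleneck width used in the paper
        self.net = nn.Sequential(
            nn.BatchNorm2d(in_channels), nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, inter, kernel_size=1, bias=False),
            nn.BatchNorm2d(inter), nn.ReLU(inplace=True),
            nn.Conv2d(inter, growth_rate, kernel_size=3, padding=1, bias=False),
        )

    def forward(self, x):
        return self.net(x)

class TransitionLayer(nn.Module):
    """Between dense blocks: BN + 1x1 Conv (compression θ) + 2x2 average pooling."""
    def __init__(self, in_channels, theta=0.5):
        super().__init__()
        out_channels = int(in_channels * theta)  # θ < 1 gives DenseNet-C / DenseNet-BC
        self.net = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.AvgPool2d(kernel_size=2, stride=2),
        )

    def forward(self, x):
        return self.net(x)

# Usage: halve channel count and spatial resolution between blocks.
t = TransitionLayer(in_channels=36)
print(t(torch.randn(1, 36, 32, 32)).shape)  # torch.Size([1, 18, 16, 16])
```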

Tasks They Are Solving
Primary Task: Image Classification
Datasets Evaluated:
CIFAR-10: 10 classes, 32×32 color images
CIFAR-100: 100 classes, 32×32 color images
SVHN: Street View House Numbers, 32×32 digit images
ImageNet: 1000 classes, large-scale image recognition
Key Challenges Addressed:
Vanishing gradient problem in very deep networks
Parameter efficiency vs. accuracy trade-off
Feature reuse and information flow
Overfitting in smaller datasets
Broader Applications Mentioned:
Feature extraction for various computer vision tasks
Transfer learning scenarios
Baseline Methods
Primary Comparison: ResNet and ResNet variants
ResNet-110, ResNet-1001
Pre-activation ResNet-164 and ResNet-1001
Wide ResNet (depth 16 and 28)
ResNet with stochastic depth
Other State-of-the-Art Methods:
Network in Network (NIN)
All-CNN
Deeply Supervised Net (DSN)
Highway Networks
FractalNet (with/without Dropout/Drop-path)
Fair Comparison Strategy:
Used publicly available ResNet implementation
Kept all experimental settings identical
Same data preprocessing and optimization settings
Experimental Setup

Training Configuration:
Optimizer: SGD with Nesterov momentum (0.9)
Weight decay: 10⁻⁴
CIFAR/SVHN: batch size 64, 300 epochs (CIFAR) / 40 epochs (SVHN)
ImageNet: batch size 256, 90 epochs
Learning rate: 0.1 initially, divided by 10 at 50% and 75% of training (see the sketch below)
Evaluation Metrics:
Classification Error Rate (%)
Top-1 and Top-5 Error (ImageNet)
Parameter Efficiency: Accuracy vs. number of parameters
Computational Efficiency: Accuracy vs. FLOPs
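
A minimal sketch of this optimization recipe in PyTorch (the stand-in model and single fake batch are placeholders for a DenseNet and a CIFAR/SVHN data loader):

```python
import torch
import torch.nn as nn

# Placeholders: in practice, use a DenseNet-BC and a real CIFAR/SVHN loader.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
train_loader = [(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))]

epochs = 300  # 300 for CIFAR, 40 for SVHN (90 for ImageNet with batch size 256)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, nesterov=True, weight_decay=1e-4)
# Learning rate 0.1, divided by 10 at 50% and 75% of training.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[epochs // 2, epochs * 3 // 4], gamma=0.1)
criterion = nn.CrossEntropyLoss()

for epoch in range(epochs):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```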

Key Results - CIFAR and SVHN

Method Params C10+ C100+ SVHN


ResNet-110 1.7M 6.41 27.22 2.01
ResNet-1001 10.2M 4.62 22.71 -
Wide ResNet-28 36.5M 4.17 20.50 -
FractalNet 38.6M 4.60 23.73 1.87
DenseNet-BC (k=24) 15.3M 3.62 17.60 1.74
DenseNet-BC (k=40) 25.6M 3.46 17.18 -

Key Achievements:
30% error reduction on C100 compared to previous best
Significantly fewer parameters than competing methods
State-of-the-art results across all datasets

Understanding Top-1 and Top-5 Error Metrics
What are Top-1 and Top-5 Errors?
Top-1 Error: Percentage of test samples where the highest
confidence prediction is wrong
Top-5 Error: Percentage of test samples where the correct class is
NOT among the top 5 predictions
Lower percentages = Better performance
Example: For an image of a "cat"
Model predicts: [1st: dog 40%, 2nd: cat 35%, 3rd: wolf 15%, ...]
Top-1: WRONG (predicted dog, not cat) → contributes to the Top-1 error
Top-5: CORRECT (cat is among the top 5) → does NOT contribute to the Top-5 error (see the sketch below)
Why Two Metrics?
ImageNet has 1000 classes - many visually similar
Top-5 gives credit for ”reasonable” mistakes
Both metrics standard in computer vision research
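
A small sketch of how these two metrics can be computed from model logits (PyTorch; the random logits and labels are only for illustration):

```python
import torch

def topk_error(logits, targets, k=5):
    """Fraction of samples whose true class is NOT among the k highest-scoring classes."""
    topk = logits.topk(k, dim=1).indices             # (N, k) predicted class indices
    hit = (topk == targets.unsqueeze(1)).any(dim=1)  # True if the label is in the top k
    return 1.0 - hit.float().mean().item()

# Toy example: 1000-way logits for a batch of 8 images.
logits = torch.randn(8, 1000)
targets = torch.randint(0, 1000, (8,))
print(f"Top-1 error: {topk_error(logits, targets, k=1):.2%}")
print(f"Top-5 error: {topk_error(logits, targets, k=5):.2%}")
```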
Key Results - ImageNet

Model Top-1 Error (%) Top-5 Error (%)


DenseNet-121 25.02 / 23.61 7.71 / 6.66
DenseNet-169 23.80 / 22.08 6.85 / 5.92
DenseNet-201 22.58 / 21.46 6.34 / 5.54
DenseNet-264 22.15 / 20.80 6.12 / 5.29
Single-crop / 10-crop testing
Parameter Efficiency:
DenseNet-201 (20M params) performs comparably to ResNet-101 (40M+ params)
A DenseNet requiring only as much computation as ResNet-50 matches ResNet-101 performance
3× fewer parameters than ResNet for comparable accuracy

Parameter and Computational Efficiency

Figure: ImageNet validation error vs. parameters
Figure: ImageNet validation error vs. FLOPs
Key Observations:
DenseNets achieve similar accuracy with significantly fewer parameters
Computational efficiency (FLOPs) also favors DenseNets
Parameter Efficiency Analysis

Figure: DenseNet variants comparison
Figure: DenseNet vs ResNet efficiency
Figure: Training curves comparison
Key Findings:
DenseNet-BC is the most parameter-efficient variant
Roughly 3× fewer parameters than ResNet for the same accuracy
A 100-layer DenseNet-BC (0.8M params) matches a 1001-layer ResNet (10.2M params)

Future Work and Research Gaps
Authors’ Proposed Future Directions:
Feature Transfer: Study DenseNets as feature extractors for other
computer vision tasks
Hyperparameter Optimization: More extensive hyperparameter
search specifically for DenseNets (current settings optimized for
ResNets)
Memory Efficiency: Further improvements in memory-efficient
implementations
Identified Research Gaps:
Scalability: How do DenseNets perform with even deeper
architectures?
Other Vision Tasks: Object detection, semantic segmentation, etc.
Architectural Variations: Different connectivity patterns within
dense blocks
Theoretical Understanding: Why does dense connectivity work so
well?
Computational Optimization: Hardware-specific optimizations for dense connectivity
Conclusion
Key Contributions:
Novel Architecture: Dense connectivity pattern with L(L+1)/2 connections
Parameter Efficiency: Substantially fewer parameters than ResNets
State-of-the-Art Results: Superior performance on multiple
benchmarks
Theoretical Insights: Feature reuse and implicit deep supervision
Impact:
Challenges the assumption that deeper networks need more
parameters
Opens new research directions in network connectivity patterns
Provides a strong baseline for future architectural innovations
Practical Value:
More efficient models for resource-constrained environments
Better feature representations for transfer learning
Stable training for very deep networks
Questions?

Thank you for your attention!
