3D CNN
❖A 3D Convolutional Neural Network (3D CNN) is a type of deep learning model
used for tasks such as segmentation and classification on three-dimensional data, such as medical
volumetric images (e.g., CT scans, MRI scans) or video sequences.
❖3D CNNs process volumetric data and are designed to capture spatial
dependencies in 3D images and temporal dependencies in video.
Why 3D CNN?
❖A 3D CNN captures both temporal and spatial characteristics.
❖This lets a 3D CNN analyse the relationships between frames over time,
making it a good choice for video analysis.
❖It is useful in medical imaging and in computer vision tasks such as action recognition and scene
understanding.
INPUT DATA
3D input such as volumetric images (CT or MRI scans) or a sequence of 2D images (video).
Structure
❖ 3D Convolution Layers- These layers use 3D filters
(kernels) to scan the input data. These
filters slide through the data in three
dimensions. As they move, they detect
features like edges, textures, and
shapes. The output of this layer is a set
of 3D feature maps.
❖ Activation Functions- After convolution,
the network applies an activation
function to the feature maps. Common
choices include ReLU and sigmoid
functions. These functions introduce
non-linearity, enabling the network to
learn more complex patterns.
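The two activation functions named above can be sketched in plain Python. This is a minimal illustration, assuming the feature map is stored as a nested list indexed as volume[depth][height][width]; the helper name apply_activation and the sample values are hypothetical.

```python
import math

def relu(x: float) -> float:
    """ReLU keeps positive values and clamps negatives to zero."""
    return max(0.0, x)

def sigmoid(x: float) -> float:
    """Sigmoid squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def apply_activation(volume, fn):
    """Apply fn element-wise to a 3D feature map volume[d][h][w]."""
    return [[[fn(v) for v in row] for row in plane] for plane in volume]

# Hypothetical 2x2x2 feature map produced by a convolution layer.
feature_map = [[[-1.5, 0.0], [2.0, -0.25]],
               [[0.5, -3.0], [1.0, 4.0]]]

activated = apply_activation(feature_map, relu)
```

The shape of the feature map is unchanged; only the values are transformed.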
3D Convolution
The network applies 3D convolutional filters to the input data. These filters move through the
height, width, and depth dimensions, detecting various features.
• A 3D convolution applies a 3-dimensional filter
to the data, and the filter moves in three
directions (x, y, z) to compute low-level
feature representations.
• The output is a 3-dimensional volume,
such as a cube or cuboid. 3D convolutions
are helpful for event detection in videos, 3D
medical images, etc. They are not limited to 3D
inputs and can also be applied to 2D
inputs such as images.
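The sliding-filter operation described above can be written out as a naive loop. This is a sketch under stated assumptions: a single input channel, no padding, stride 1, and the cross-correlation form that deep learning libraries commonly call "convolution". The function name conv3d and the sample data are illustrative, not taken from the slides.

```python
def conv3d(volume, kernel):
    """Naive 3D convolution (no padding, stride 1) on nested lists.

    volume is indexed as volume[z][y][x], kernel as kernel[dz][dy][dx].
    """
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(D - kd + 1):              # slide along depth / time
        plane = []
        for y in range(H - kh + 1):          # slide along height
            row = []
            for x in range(W - kw + 1):      # slide along width
                acc = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[z + dz][y + dy][x + dx] * kernel[dz][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# A 4x4x4 volume of ones and a 3x3x3 kernel of ones (a simple sum filter).
volume = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]

features = conv3d(volume, kernel)  # a 2x2x2 feature map
```

Each output value here is the sum of the 27 input values under the kernel, and the output is itself a 3D volume, smaller than the input by (kernel size - 1) along each axis.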
Structure
❖3D Pooling Layer- The pooling
layer reduces the size of the
feature maps. This process,
called downsampling, makes
the network more efficient.
❖It also helps in reducing the
computational load. Common
pooling methods include max
pooling and average pooling.
In 3D CNNs, pooling operates
in three dimensions.
3D Pooling- The network performs pooling to reduce the feature map size, making the
processing more manageable.
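Max pooling in three dimensions can be sketched as follows. The example assumes non-overlapping windows (stride equal to the pool size) and a nested-list volume indexed as volume[z][y][x]; the function name max_pool3d is a hypothetical helper.

```python
def max_pool3d(volume, pool=(2, 2, 2)):
    """Non-overlapping 3D max pooling: keep the largest value in each window."""
    pd, ph, pw = pool
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    return [[[max(volume[z + dz][y + dy][x + dx]
                  for dz in range(pd)
                  for dy in range(ph)
                  for dx in range(pw))
              for x in range(0, W - pw + 1, pw)]
             for y in range(0, H - ph + 1, ph)]
            for z in range(0, D - pd + 1, pd)]

# A 4x4x4 volume whose value encodes its own position: 16*z + 4*y + x.
volume = [[[16 * z + 4 * y + x for x in range(4)] for y in range(4)] for z in range(4)]

pooled = max_pool3d(volume)  # a 2x2x2 volume
```

A 2 × 2 × 2 pool halves each dimension, so the pooled volume holds one eighth of the original values.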
❖ Fully Connected Layers- The final
stages of a 3D CNN involve fully
connected layers. These layers take
the flattened feature maps and
process them further.
❖ They perform high-level reasoning
and generate predictions. The output
is typically a classification or
regression result.
Input: 64 × 64 × 240
3D Convolution Kernel Size: (kx, ky, kz), where kx and ky are the spatial dimensions and
kz is the temporal or depth dimension, e.g. a 3 × 3 × 3 kernel.
After the convolution layer comes a pooling layer, e.g. MaxPooling3D with pool size 2 × 2 × 2.
The network repeats the convolution, activation, and pooling steps multiple
times. This process builds a hierarchy of features.
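To see how repeated convolution and pooling shrink the data, the feature-map shape can be traced by hand. The sketch below assumes no padding and stride 1 for each 3 × 3 × 3 convolution, a 2 × 2 × 2 pool that rounds down, and three repetitions; none of these settings are stated in the slides, and the channel dimension is ignored.

```python
def conv_out(shape, kernel=(3, 3, 3)):
    """Output shape of a convolution with no padding and stride 1."""
    return tuple(s - k + 1 for s, k in zip(shape, kernel))

def pool_out(shape, pool=(2, 2, 2)):
    """Output shape of non-overlapping pooling (sizes round down)."""
    return tuple(s // p for s, p in zip(shape, pool))

shape = (64, 64, 240)      # input size used in this example
trace = [shape]
for _ in range(3):         # assumed: three convolution + pooling blocks
    shape = conv_out(shape)
    trace.append(shape)
    shape = pool_out(shape)
    trace.append(shape)

for step in trace:
    print(step)
```

Under these assumptions the shape goes from (64, 64, 240) to (62, 62, 238) after the first convolution, (31, 31, 119) after the first pooling, and ends at (6, 6, 28) after the third block.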
Flattening layer- The network flattens the final set of feature maps into a
one-dimensional vector.
Dense/ Fully connected layer- The fully connected layers process the
flattened data, making high-level inferences. The network generates the
final output, which could be a classification label or a regression value.
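The flattening and fully connected steps can be illustrated with a small forward pass. This is only a sketch: the weights and biases are made-up numbers rather than trained values, the two-class softmax output is an assumption, and the helper names are hypothetical.

```python
import math

def flatten(volume):
    """Turn a 3D feature map volume[z][y][x] into a one-dimensional list."""
    return [v for plane in volume for row in plane for v in row]

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(ws, inputs)) + b
            for ws, b in zip(weights, biases)]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical final 2x2x2 feature map.
feature_maps = [[[1.0, 2.0], [3.0, 4.0]],
                [[5.0, 6.0], [7.0, 8.0]]]

vector = flatten(feature_maps)          # 8 values
weights = [[0.1] * 8, [-0.1] * 8]       # made-up weights for 2 classes
biases = [0.0, 0.0]

logits = dense(vector, weights, biases)
probs = softmax(logits)
predicted = probs.index(max(probs))     # index of the most likely class
```

For a regression output, the softmax step would be dropped and a single output unit used instead.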
2D CNN vs 3D CNN
2D CNN | 3D CNN
The convolutional kernel moves in 2 directions (x, y) to calculate the convolutional output. | The convolutional kernel moves in 3 directions (x, y, z) to calculate the convolutional output.
The output is a 2D matrix. | The output is a 3D volume.
Captures spatial dependencies. | Captures both spatial and temporal dependencies.
Use cases: image classification, generating new images, image inpainting, image colorization, etc. | Use cases: Conv3D is mostly used with 3D image data such as Magnetic Resonance Imaging (MRI) or Computed Tomography (CT) scans.
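The difference in output shape and kernel size can be checked with a little arithmetic. The sketch below assumes a single input channel, no padding, and stride 1; the input sizes reuse the 64 × 64 × 240 example, and the function names are illustrative.

```python
def conv_output_shape(input_shape, kernel_shape):
    """Output shape of a convolution with no padding and stride 1."""
    if len(input_shape) != len(kernel_shape):
        raise ValueError("input and kernel must have the same number of dimensions")
    return tuple(s - k + 1 for s, k in zip(input_shape, kernel_shape))

def weights_per_filter(kernel_shape):
    """Number of weights in one filter for a single input channel."""
    count = 1
    for k in kernel_shape:
        count *= k
    return count

out_2d = conv_output_shape((64, 64), (3, 3))            # 2D matrix
out_3d = conv_output_shape((64, 64, 240), (3, 3, 3))    # 3D volume

w_2d = weights_per_filter((3, 3))       # 9 weights
w_3d = weights_per_filter((3, 3, 3))    # 27 weights
```

A 3 × 3 × 3 filter carries three times as many weights as a 3 × 3 filter, which is part of why 3D CNNs are more expensive to train.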
3D CNN Architectures
•V-Net: The V-Net architecture is designed for medical image segmentation and employs a
U-Net-like architecture for 3D data. It includes skip connections and is known for its
segmentation accuracy.
•3D U-Net: Similar to the V-Net, the 3D U-Net extends the popular 2D U-Net architecture
to 3D data, making use of skip connections.
•3D ResNet: 3D Residual Networks are adaptations of the well-known ResNet architecture
for 3D data, incorporating residual blocks to handle deep networks.
•3D DenseNet: DenseNet for 3D data connects each layer to every subsequent layer within a
dense block, promoting feature reuse and gradient flow.
•3D Inception: Inspired by Google’s Inception models, 3D Inception networks utilize
multi-scale convolutional filters to capture features at various resolutions.
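The residual idea used by 3D ResNet, adding a block's input back to its output, can be sketched on a small volume. This is a simplified illustration: the halve function is a hypothetical stand-in for the convolution layers inside a real residual block, and details such as the activation applied after the addition are left out.

```python
def add_volumes(a, b):
    """Element-wise sum of two 3D volumes with identical shapes."""
    return [[[va + vb for va, vb in zip(ra, rb)]
             for ra, rb in zip(pa, pb)]
            for pa, pb in zip(a, b)]

def residual_block(x, transform):
    """Return transform(x) + x, the shortcut connection of a residual block."""
    return add_volumes(transform(x), x)

def halve(volume):
    """Hypothetical stand-in for the block's convolution layers."""
    return [[[0.5 * v for v in row] for row in plane] for plane in volume]

x = [[[2.0, 4.0]], [[6.0, 8.0]]]    # a tiny 2x1x2 volume
y = residual_block(x, halve)
```

Because the input is added back unchanged, the block only has to learn a correction to its input, which is what makes very deep networks easier to train.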
HUMAN ACTIVITY RECOGNITION
Action recognition is used in assistive
technologies such as smart homes, in
automated surveillance and security
systems, and in virtual reality
applications such as decentralized
meeting spaces.
Medical Imaging
Medical scanners such as CT and MRI capture
the tissue as a stack of slices through its
depth. Because the body is made of 3D
structures that move, the slices must be
viewed in context to be useful.
By combining these static images with volumetric
or spatial context, tasks such as
identification of cancerous cells, evaluation of
arterial health, and structural mapping of
brain tissue can be given a first pass by a 3D
CNN, reducing the time needed for human
evaluation and allowing faster patient care.
Thank You
For more information, please visit the following links:
https://www.linkedin.com/in/gauravsingal789/
http://www.gauravsingal.in
15 February 2025