MODULE - IV
Segmentation by Fitting a Model: The Hough
Transform, Fitting Lines, Fitting Curves, Fitting as a
Probabilistic Inference Problem, Robustness
Geometric Camera Models: Elements of Analytical
Euclidean Geometry, Camera Parameters and the
Perspective Projection, Affine Cameras and Affine
Projection Equations
Geometric Camera Calibration: Least-Squares Parameter
Estimation, A Linear Approach to Camera
Calibration, Taking Radial Distortion into Account,
Analytical Photogrammetry, An Application: Mobile
Robot Localization
Segmentation by Fitting a Model in Computer Vision
Model fitting is a segmentation approach that assumes an
image contains objects or structures that follow a certain
parametric model. The goal is to fit a mathematical model
to the image data and use it to segment meaningful regions.
This method is widely used in medical imaging, object
detection, and motion analysis.
1. Why Use Model Fitting for Image Segmentation?
Handles Complex Structures: Works well for objects
that follow a known shape (e.g., circles, ellipses, planes).
Robust to Noise: Many techniques can ignore outliers
using robust fitting.
Strong Theoretical Foundation: Uses geometric and
probabilistic models for segmentation.
2. Common Model Fitting Techniques for Segmentation
(a) Hough Transform
Detects geometric shapes like lines, circles, and ellipses
in an image.
Uses parametric equations to describe shapes.
Accumulates votes in a parameter space to find the best-
fitting model.
📌 Example Use Case: Lane detection in self-driving cars.
(b) RANSAC (Random Sample Consensus)
A robust fitting algorithm that finds the best model
while ignoring outliers.
Randomly samples a subset of data points and fits a
model.
Iteratively refines the model by maximizing inliers.
📌 Example Use Case: Plane segmentation in 3D point
clouds.
(c) Active Contours (Snakes)
Uses an energy-minimization approach to fit a flexible
contour to object boundaries.
The contour evolves over time to match the object's
edges.
Works well for medical image segmentation.
📌 Example Use Case: Segmenting organs in MRI scans.
(d) Level Set Method
A more advanced version of active contours that can
handle topological changes (e.g., merging or splitting of
regions).
Represents contours as implicit functions rather than
explicit curves.
📌 Example Use Case: Tumor segmentation in medical
imaging.
(e) Gaussian Mixture Model (GMM) + Expectation
Maximization (EM)
Assumes that an image consists of multiple Gaussian-
distributed regions.
Uses the Expectation-Maximization (EM) algorithm to
estimate parameters.
Often combined with Markov Random Fields (MRF)
for spatial coherence.
📌 Example Use Case: Skin segmentation in face
recognition.
3. Applications of Model-Based Segmentation
Medical Imaging: Tumor detection using Active
Contours and Level Sets.
Autonomous Vehicles: Lane and road sign detection
with Hough Transform.
Augmented Reality: Object tracking using RANSAC-
based feature matching.
Remote Sensing: Land-use classification using Gaussian
Mixture Models.
4. Challenges and Limitations
Model Assumption: Requires prior knowledge of object
shape.
Computational Complexity: Some methods (e.g., Level
Sets) can be slow.
Parameter Sensitivity: Performance depends on the
choice of hyperparameters.
The Hough Transform in Computer Vision
The Hough Transform is a powerful technique used in
computer vision for detecting shapes (such as lines, circles,
and ellipses) in images. It is particularly useful when objects
have well-defined geometric forms, even in the presence of
noise or partial occlusion.
1. Why Use the Hough Transform?
✅ Detects Global Structures – Works well even when edges
are broken or noisy.
✅ Robust to Occlusion – Can detect shapes even if parts of
them are missing.
✅ Efficient for Line and Circle Detection – Finds
geometric features directly in an image.
2. How Does the Hough Transform Work?
The Hough Transform converts shape detection into a voting
problem in parameter space. Each shape is described by a
parametric equation (for a line, ρ = x cos θ + y sin θ); every
edge pixel votes for all parameter combinations consistent
with it, and peaks in the resulting accumulator array identify
the shapes supported by the most pixels.
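The voting scheme can be sketched in a few lines of NumPy. This is a minimal illustration on a handful of synthetic edge points (the bin sizes and the toy line y = x are invented for the example), not a production detector:

```python
import numpy as np

# Toy edge map: ten edge pixels on the line y = x, plus one stray pixel.
points = [(x, x) for x in range(10)] + [(7, 2)]

# Discretized parameter space: theta in [0, 180) degrees (5-degree bins),
# rho in [-20, 20] (1-pixel bins).  A line is rho = x*cos(theta) + y*sin(theta).
thetas_deg = np.arange(0, 180, 5)
diag = 20
acc = np.zeros((2 * diag + 1, len(thetas_deg)), dtype=int)

# Voting: each edge pixel votes for every (theta, rho) line passing through it.
for x, y in points:
    for t_idx, theta in enumerate(np.deg2rad(thetas_deg)):
        rho = x * np.cos(theta) + y * np.sin(theta)
        acc[int(round(rho)) + diag, t_idx] += 1

# The accumulator peak is the best-supported line.
r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape)
best_theta_deg = int(thetas_deg[t_idx])
best_rho = int(r_idx) - diag
print(best_theta_deg, best_rho)   # 135 0 -> the line y = x
```

All ten collinear pixels agree on a single (θ, ρ) cell, while the stray pixel's votes are scattered, which is why the transform tolerates noise and gaps.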
3. Applications of the Hough Transform
💡 Autonomous Vehicles: Lane detection in self-driving cars.
💡 Medical Imaging: Blood vessel and tumor detection.
💡 Robotics: Shape recognition for object grasping.
💡 Industrial Inspection: Detecting cracks in materials.
💡 Astronomy: Detecting celestial objects like craters or
galaxies.
4. Challenges and Limitations
⚠ Computational Complexity: High for large images with
many edge points.
⚠ Parameter Sensitivity: Requires fine-tuning of threshold
values.
⚠ Overlapping Shapes: Can struggle when shapes overlap in
noisy environments.
⚠ Gradient Sensitivity: Hough gradient circle detection
may fail with weak edges.
Line Fitting
To segment an image by fitting lines, common techniques
include:
1. Hough Transform – Detects straight lines in the image.
2. RANSAC (Random Sample Consensus) – Fits lines
while handling noise and outliers.
3. Least Squares Regression – Fits a line to edge points by
minimizing squared residuals.
Fitting Curves
For segmentation by fitting curves, we can use various
curve-fitting techniques to detect and model edges or regions
in the image. Some common approaches include:
1. Polynomial Curve Fitting – Fits curves to detected
edges using polynomial regression.
2. Active Contours (Snakes) – Uses an iterative energy
minimization approach to find object boundaries.
3. Spline Fitting – Fits smooth curves through edge points.
4. RANSAC for Curves – Finds robust curves while
handling noise.
5. Elliptical or Circular Fitting – If the objects have
circular or elliptical shapes.
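The first approach above can be sketched with NumPy's `polyfit`. The boundary data below is synthetic (a made-up quadratic edge plus noise), so the numbers are purely illustrative:

```python
import numpy as np

# Hypothetical boundary pixels sampled along an object edge:
# roughly y = 0.5*x^2 + 1, with small noise.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 0.5 * x**2 + 1 + rng.normal(0, 0.05, x.size)

# Fit a degree-2 polynomial to the edge points (least squares).
coeffs = np.polyfit(x, y, deg=2)
fitted = np.polyval(coeffs, x)

# Pixels far from the fitted curve can be assigned to a different segment.
residuals = np.abs(y - fitted)
print(np.round(coeffs, 2))   # close to [0.5, 0, 1]
```

The fitted curve then serves as a segment boundary; thresholding the residuals separates points that belong to the modeled edge from those that do not.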
Fitting as a Probabilistic Inference Problem
Fitting a model for segmentation as a probabilistic inference
problem involves framing the segmentation task in a
probabilistic framework. This means estimating the most
likely segmentation given the observed image data. Here’s
how it can be approached:
1. Bayesian Formulation (MAP Estimation)
Treat the segment labels as hidden variables and choose
the labeling that maximizes the posterior
P(labels | image) ∝ P(image | labels) · P(labels).
2. Graph-Based Models (Markov & Conditional Random
Fields)
Markov Random Fields (MRFs): Model spatial
dependencies between pixels, encouraging smooth
segmentations.
Conditional Random Fields (CRFs): Improve
boundary accuracy by conditioning on image features.
3. Expectation-Maximization (EM) for Mixture Models
Model pixel intensities as a Gaussian Mixture Model
(GMM) and use the EM algorithm to iteratively estimate
segment assignments.
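A minimal 1D sketch of GMM + EM on synthetic "intensities" (two invented Gaussian regions; no spatial term, so none of the MRF coherence mentioned elsewhere in these notes):

```python
import numpy as np

# Toy "pixel intensities": two Gaussian regions (background ~0.2, object ~0.8).
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0.2, 0.05, 300), rng.normal(0.8, 0.05, 300)])

# Initialize two components.
mu = np.array([0.0, 1.0])
sigma = np.array([0.2, 0.2])
pi = np.array([0.5, 0.5])

for _ in range(30):
    # E-step: responsibility of each component for each pixel
    # (the 1/sqrt(2*pi) constant cancels in the normalization).
    lik = pi * np.exp(-(data[:, None] - mu) ** 2 / (2 * sigma**2)) / sigma
    resp = lik / lik.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances from responsibilities.
    nk = resp.sum(axis=0)
    pi = nk / len(data)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk)

# Hard segmentation: assign each pixel to its most responsible component.
labels = resp.argmax(axis=1)
print(np.round(mu, 2))   # close to the true region means 0.2 and 0.8
```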
4. Variational Inference & Monte Carlo Sampling
Use Variational Bayes to approximate the posterior
distribution of segment labels.
Use MCMC (Markov Chain Monte Carlo) sampling (e.g.,
Gibbs sampling) to draw samples from the posterior and
refine the segmentation.
Robustness
To improve robustness in segmentation and curve fitting
using probabilistic inference, we can consider the following
techniques:
1. Handling Noise with RANSAC (Robust Fitting)
Instead of directly fitting a polynomial curve, we use
RANSAC (Random Sample Consensus) to filter out
noise and outliers.
RANSAC selects subsets of points iteratively and finds
the best fit while ignoring outliers.
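The sample-and-count loop described above can be sketched as follows. The data, inlier threshold, and iteration count are illustrative choices for a synthetic line-fitting problem, not tuned values:

```python
import numpy as np

# Points on y = 2x + 1, contaminated with 30% gross outliers.
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 2 * x + 1 + rng.normal(0, 0.1, 100)
y[:30] += rng.uniform(5, 20, 30)          # push 30 points far off the line

best_inliers = None
for _ in range(200):
    # 1. Randomly sample the minimal set (2 points) and fit a candidate line.
    i, j = rng.choice(100, 2, replace=False)
    if x[i] == x[j]:
        continue
    slope = (y[j] - y[i]) / (x[j] - x[i])
    intercept = y[i] - slope * x[i]
    # 2. Count inliers: points within a residual threshold of the candidate.
    inliers = np.abs(y - (slope * x + intercept)) < 0.5
    if best_inliers is None or inliers.sum() > best_inliers.sum():
        best_inliers = inliers

# 3. Refit by least squares on the inliers only.
A = np.vstack([x[best_inliers], np.ones(best_inliers.sum())]).T
slope, intercept = np.linalg.lstsq(A, y[best_inliers], rcond=None)[0]
print(round(slope, 2), round(intercept, 2))   # near the true 2 and 1
```

A plain least-squares fit on the same data would be dragged toward the outliers; RANSAC recovers the underlying line because the outliers never agree on a common model.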
2. Bayesian Uncertainty Estimation (MCMC Sampling)
Instead of a single deterministic fit, we can use Markov
Chain Monte Carlo (MCMC) sampling to estimate the
posterior distribution of curve parameters.
This approach provides confidence intervals and
identifies uncertain regions in the segmentation.
3. Regularization and Priors
Adding a prior distribution (e.g., Gaussian prior on
curve smoothness) ensures that the segmentation remains
stable even with noise.
4. Edge Refinement with Conditional Random Fields
(CRFs)
CRFs improve segmentation boundaries by modeling
dependencies between neighboring pixels.
Geometric Camera Models and Analytical Euclidean
Geometry
Geometric camera models describe how a 3D world is
projected onto a 2D image plane using mathematical
transformations. These models are fundamental in computer
vision, photogrammetry, and robotics.
1. Elements of Analytical Euclidean Geometry
Analytical Euclidean Geometry provides a framework for
describing the position, orientation, and projection of points
in space using coordinate systems.
2. Camera Projection Models
Cameras capture 3D scenes and convert them into 2D images.
The geometric projection model explains this process.
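As an illustration, the standard pinhole (perspective) projection maps a 3D point (X, Y, Z) in camera coordinates to image coordinates (u, v) = (f·X/Z + cx, f·Y/Z + cy). A small sketch with invented intrinsics (the values of f, cx, cy below are illustrative, not from any real camera):

```python
import numpy as np

# Illustrative intrinsics: focal length f (pixels) and principal point (cx, cy).
f, cx, cy = 800.0, 320.0, 240.0

def project(point_3d):
    # Perspective division: farther points land closer to the principal point.
    X, Y, Z = point_3d
    return np.array([f * X / Z + cx, f * Y / Z + cy])

# A point 4 units in front of the camera and 1 unit to the right:
uv = project([1.0, 0.0, 4.0])
print(uv)   # [520. 240.]
```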
3. Applications
3D Reconstruction
Augmented Reality
Stereo Vision
SLAM (Simultaneous Localization and Mapping)
Affine Cameras and Affine Projection Equations
Affine cameras and affine projection equations are
fundamental concepts in computer vision and 3D graphics,
especially in the context of geometric transformations
between 3D world coordinates and 2D image coordinates.
Affine Camera Model
In computer vision, an affine camera model is a simplified
camera model that assumes linear transformations from 3D
world coordinates to 2D image coordinates. It doesn't account
for perspective distortion but is often used in situations where
the camera's viewpoint is close to an orthographic projection
or where simplicity is desired.
The affine camera model maps 3D points in the world
coordinate system to 2D points in the image plane, using an
affine transformation:

x = M X + t

where X is a 3D point, M is a 2×3 matrix encoding rotation,
scaling, and shear, and t is a 2D translation vector.
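A small numeric sketch of the affine mapping x = M X + t (the entries of M and t below are made-up values, not calibrated camera parameters):

```python
import numpy as np

# Affine camera: 2D image point = M @ (3D point) + t.
M = np.array([[1.0, 0.0, 0.2],
              [0.0, 1.0, 0.1]])   # illustrative 2x3 matrix
t = np.array([5.0, 3.0])          # illustrative 2D offset

X = np.array([2.0, 4.0, 10.0])    # a 3D world point
x_img = M @ X + t
print(x_img)   # [9. 8.]
```

Note that depth enters only linearly through the third column of M; there is no division by Z, which is exactly why the affine model cannot represent perspective distortion.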
Segmentation by Fitting Curves
Segmentation by fitting curves uses mathematical curves
(lines, polynomials, splines, ellipses) to model the boundaries
or shapes of regions or objects, and then uses those fitted
curves to separate different parts of an image or dataset.
How it works:
Data Representation: The image or data is represented as a
set of points or features.
Curve Fitting: Curve fitting algorithms find the "best fit"
curve (or curves) that passes through or closely approximates
the data points.
Segmentation: The fitted curves are then used to define
boundaries or regions, effectively segmenting the image or
data.
Types of Curves:
Polynomials: Simple curves that can be used to model
smooth shapes.
Splines: More flexible curves that can model complex shapes.
Other Curves: Depending on the application, other types of
curves can be used, such as ellipses, circles, or Bézier curves.
Examples of Applications:
Medical Imaging: Segmenting organs or tissues in medical
images.
Image Analysis: Identifying objects or features in images.
Data Visualization: Creating visualizations of data by fitting
curves to represent trends or patterns.
Advantages:
Robustness: Curve fitting can be robust to noise and
variations in the data.
Efficiency: Curve fitting algorithms can be computationally
efficient.
Flexibility: Different types of curves can be used to model
different types of shapes.
Challenges:
Parameter Selection: Choosing the right type of curve and
its parameters can be challenging.
Overfitting: Fitting the curve too closely to the data can lead
to overfitting, where the model doesn't generalize well to new
data.
Computational Cost: For very large datasets, curve fitting
can be computationally expensive.
Geometric Camera Calibration
Least-squares parameter estimation
Least-squares parameter estimation is a method for finding
the best values for parameters in a model by minimizing
the sum of squared differences between observed data and the
model's predictions.
This technique is widely used in regression analysis and other
areas where a model needs to be fitted to a set of data.
1. The Goal:
The core principle is to find the parameter values that make
the model's predictions (or fitted values) as close as possible
to the actual observed values.
This is achieved by minimizing the "residuals," which are the
differences between the observed and predicted values.
Specifically, the least-squares method minimizes the sum of
the squared residuals.
2. How it Works:
Linear Models:
For linear models, the minimization is often done analytically,
using calculus.
Non-linear Models:
For non-linear models, numerical optimization algorithms are
typically used to find the parameter values that minimize the
sum of squared residuals.
Example:
Consider a linear regression model where you want to find the
best-fit line for a set of data points. The least-squares method
would find the slope and intercept of the line that minimizes
the sum of the squared vertical distances between the data
points and the line.
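The worked example above can be sketched in NumPy (the data points below are invented; `np.linalg.lstsq` computes the minimizer of the sum of squared residuals directly):

```python
import numpy as np

# Observed data lying roughly on y = 3x + 2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 10.9, 14.1])

# Design matrix [x, 1]; lstsq minimizes ||A p - y||^2 over p = (slope, intercept).
A = np.vstack([x, np.ones_like(x)]).T
(slope, intercept), residuals, *_ = np.linalg.lstsq(A, y, rcond=None)
print(round(slope, 2), round(intercept, 2))   # 3.0 2.04
```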
3. Advantages:
Ease of Calculation:
In many cases, least-squares calculations can be done
relatively easily, especially for linear models.
Good Properties:
Under certain assumptions (such as normally distributed
errors), least-squares estimators are known to be unbiased and
have the minimum variance among all unbiased estimators.
Versatility:
Least-squares can be applied to various types of models,
including linear regression, multiple linear regression, and
polynomial regression.
4. Applications:
Regression Analysis: Finding relationships between
variables.
Model Fitting: Fitting probability distributions to data.
Signal Processing: Estimating frequency spectra.
System Identification: Determining the parameters of a
system based on input-output data.
5. Limitations:
Sensitivity to Outliers:
Least-squares can be sensitive to outliers in the data, as the
squared differences are magnified for larger errors.
Radial distortion
Radial distortion, a common issue in optics and image
processing, occurs when straight lines in an image appear
curved, especially near the edges.
It's caused by the lens not perfectly projecting straight lines
onto the image plane.
To account for radial distortion, one typically needs
to calibrate the camera and apply a correction based on the
identified distortion coefficients.
Explanation:
Cause:
Radial distortion arises from the lens not being perfectly
rectilinear, meaning it doesn't map straight lines in the real
world to straight lines in the image. The distortion can be
either barrel-shaped (lines bulge outwards) or pincushion-
shaped (lines bulge inwards).
Calibration:
Camera calibration involves determining the camera's intrinsic
parameters (e.g., focal length, principal point) and extrinsic
parameters (e.g., rotation and translation of the camera in the
world). It also includes identifying the radial distortion
coefficients.
Correction:
Once the camera is calibrated, a correction can be applied to
undistort the image. This involves using the distortion
coefficients to map the distorted image coordinates to their
corresponding undistorted coordinates.
Mathematical Models:
Radial distortion is often modeled using a polynomial
equation with a few coefficients, typically 2 or 3 (k1, k2, k3)
in camera calibration.
These coefficients quantify the magnitude of the distortion.
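The polynomial model can be sketched in the forward (distorting) direction as follows. The coefficients k1 and k2 below are illustrative, not calibrated values; in practice they come from camera calibration, and *undistorting* an image requires inverting this mapping, usually iteratively:

```python
# Radial distortion of a normalized image point (x, y):
# (x_d, y_d) = (x, y) * (1 + k1*r^2 + k2*r^4), with r^2 = x^2 + y^2.
k1, k2 = -0.2, 0.05          # illustrative coefficients (k1 < 0: barrel)

def distort(x, y):
    r2 = x * x + y * y
    factor = 1 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor

# A point near the image edge is pulled inward under barrel distortion:
xd, yd = distort(1.0, 0.0)
print(xd, yd)   # xd is about 0.85, i.e. 1 * (1 - 0.2 + 0.05)
```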
Real-World Applications:
Understanding and correcting for radial distortion is crucial in
various computer vision and image processing applications,
such as object detection, semantic recognition, and 3D
reconstruction.
In essence, accounting for radial distortion involves:
• Identifying the presence of radial distortion in images.
• Calibrating the camera to determine the distortion
coefficients.
• Applying the correction to undistort the image and
restore the straight lines.
Analytical photogrammetry
Analytical photogrammetry is a method that uses
mathematical calculations to determine the three-dimensional
coordinates of points in the real world based on camera
parameters, image coordinates, and ground control points.
It rigorously accounts for any camera tilts and distortions,
unlike simpler methods.
Key Concepts:
Perspective Geometry:
Analytical photogrammetry relies on the principles of
perspective geometry, where points in the real world project
onto the image plane.
Camera Parameters:
These include the focal length, principal point, and camera
orientation (rotation and translation).
Image Coordinates:
These are the x and y coordinates of a point in the image.
Ground Control Points:
These are points with known coordinates in the real world,
used to establish a tie between the image and the real world.
Collinearity Condition:
This principle states that the exposure station (the point where
the camera lens is located), a real-world point, and its image
point all lie on a straight line.
How it works:
Data Acquisition: Images are taken of the object or scene
from multiple viewpoints.
Image Coordinate Measurement: Precise measurements of
the x and y coordinates of points of interest are made on the
images.
Ground Control Measurement: Coordinates of ground
control points are established in the real world.
Mathematical Calculations: Using the camera parameters,
image coordinates, and ground control points, mathematical
equations are used to calculate the three-dimensional
coordinates of the points in the real world.
Orientation: The camera's position and orientation are
determined using the ground control points.
Geometrical Reconstruction: The calculated coordinates are
used to create a three-dimensional model of the object or
scene.
Benefits of Analytical Photogrammetry:
Accurate Measurements:
Analytical photogrammetry provides precise and accurate
measurements of real-world objects and features.
Detailed Models:
It allows for the creation of detailed three-dimensional models
of complex scenes.
Automated Processes:
The use of software and mathematical calculations allows for
automated and efficient processing of images.
Applications:
Mapping and Surveying: Creating accurate maps and
surveys of land areas.
Engineering and Architecture: Analyzing and visualizing
structures and designs.
Forensic Science: Measuring and documenting crime scenes.
Archaeology and Cultural Heritage: Documenting and
preserving ancient sites and artifacts.
Geological Studies: Monitoring and mapping landforms and
geological features.
An Application: Mobile Robot Localization
Mobile robot localization is the process of determining a
robot's position and orientation within its environment, crucial
for autonomous navigation and task execution.
It involves using sensors and algorithms to estimate the
robot's location relative to a known map or coordinate system,
allowing it to navigate, interact with its surroundings, and
perform various tasks.
How Localization Works:
Sensors:
Mobile robots utilize various sensors like odometry, vision,
LiDAR, and ultrasonic sensors to gather information about
their environment.
Algorithms:
This sensor data is processed by localization algorithms,
which can be broadly categorized as:
• Passive Localization: Relies on environmental features
(landmarks, beacons) and sensor data without actively
controlling the robot's movement.
• Active Localization: Incorporates the robot's movement
and control capabilities to enhance localization
accuracy.
Map:
Localization often involves integrating sensor data with a
predefined map or a dynamically created map through
techniques like Simultaneous Localization and Mapping
(SLAM).
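The sense-and-move cycle can be illustrated with a toy 1D histogram (Markov) localization filter. The corridor, its door landmarks, the sensor probabilities, and the exact cyclic odometry are all invented for the sketch:

```python
import numpy as np

# A robot in a cyclic corridor of 5 cells; cells 0 and 1 contain a door
# landmark that the robot's sensor can detect (imperfectly).
doors = np.array([1, 1, 0, 0, 0], dtype=float)
belief = np.full(5, 0.2)                  # uniform prior: position unknown

def sense(belief, saw_door, p_hit=0.9, p_miss=0.1):
    # Measurement update: up-weight cells consistent with the observation.
    like = np.where(doors == saw_door, p_hit, p_miss)
    belief = belief * like
    return belief / belief.sum()

def move(belief):
    # Motion update: shift belief one cell right (exact odometry, cyclic world).
    return np.roll(belief, 1)

belief = sense(belief, saw_door=1)   # sees a door: cells 0 and 1 likely
belief = move(belief)                # moves one cell to the right
belief = sense(belief, saw_door=1)   # sees a door again: must now be at cell 1
print(belief.argmax())               # 1
```

Real localizers follow the same Bayes-filter structure but with continuous state, noisy motion models, and richer sensors (particle filters, Kalman filters).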
Importance of Localization:
Autonomous Navigation:
Accurate localization is fundamental for robots to navigate
complex environments, avoid obstacles, and reach their
destinations efficiently.
Task Execution:
Localization enables robots to perform tasks like material
handling, inspection, and maintenance by accurately locating
objects and specific areas within their workspace.
Collaboration and Coordination:
Localization allows robots to work together effectively in
shared spaces, ensuring safety and efficient collaboration.
Applications:
Industrial Automation:
Localization is essential for mobile robots used in
manufacturing, logistics, and other industrial settings for tasks
like material transportation, assembly, and quality control.
Service Robotics:
Localization enables robots to navigate in various indoor
environments, such as homes, hospitals, and warehouses, for
tasks like cleaning, security, and healthcare assistance.
Exploration and Research:
Localization is crucial for robots deployed in challenging
environments like disaster zones, underwater settings, or
space exploration, enabling them to navigate and gather data.