EC-803 Computer Vision
Lecture-1:
Course Introduction
Basic Transformations- Translation,
Scaling and Rotation, both in 2D & 3D
MATLAB or OpenCV
NUST College of E&ME, Spring 2015
Course Introduction
Instructor:
Mahmood Akhtar, PhD
(MAHMOOD@[Link])
Lecture Timing: Thu 1730-2030 hrs, CR (DCE)-16
Topics:
Basic Transformations, Camera Model and Imaging
Geometry, Camera Calibration, Multiview Geometry,
Stereopsis, Structure From Motion, Linear Filters, Edges,
Texture, Segmentation by: Clustering Pixels; Split and
Merge; Mean Shift Algorithm; Graph-Theoretic
Clustering; Fitting a Model- Hough Transform; etc,
Tracking, Model-Based Vision, Finding Templates Using
Classifiers
Geometric Transformations- to change sets of
points representing some object (study about
translation, scaling, rotation, etc)
Camera Model and Imaging Geometry- image
formation process, camera coordinates and 3D
world coordinates aligned / not aligned, how to
deal with different situations
Camera Calibration- process of estimating the
parameters of a pinhole camera model,
approximating the camera that produced a
given photograph or video, camera matrix
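As a minimal sketch of the pinhole model mentioned above: assuming the camera sits at the world origin, looks along +Z, and has camera and world coordinates aligned, a 3D point (X, Y, Z) projects to (fX/Z, fY/Z) on an image plane placed in front of the camera. The function name `project_pinhole` and the focal length value are illustrative assumptions, not part of the course material.

```python
def project_pinhole(point, f=1.0):
    """Project a 3D point onto the image plane of an ideal pinhole camera.

    Assumes aligned camera/world coordinates and an image plane at Z = f.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


# The same lateral offset appears smaller when the point is farther away.
near = project_pinhole((1.0, 2.0, 4.0), f=2.0)
far = project_pinhole((1.0, 2.0, 8.0), f=2.0)
print(near, far)
```

Camera calibration is the inverse problem: estimating parameters such as `f` from observed images rather than assuming them.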
Multiview Geometry- to understand how
several views of the same scene constrain its 3D
structure and camera configurations
Stereopsis- algorithms that mimic our ability to
fuse the pictures recorded by our two eyes and
exploit the difference between them to gain
a strong sense of depth
Structure From Motion- to estimate the 3D
shape of a scene from multiple pictures when
camera positions and parameters are a priori
unknown and may change over time
Linear Filters- smoothing by averaging, Gaussian,
derivatives and finite differences, filters and
templates, scale and image pyramids
Edges and Texture- noise and edge detectors: Laplacian and gradient-based; extracting image
structure, analysis and synthesis using oriented
pyramids
Segmentation- subdivides an image or video into
its constituent regions or objects as required,
applications: summarising videos, finding machine
parts, finding people, finding buildings in satellite
images and searching a collection of images
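The two simplest linear filters named above, smoothing by averaging and derivatives by finite differences, can be sketched on a 1D signal. This is only an illustration: the helper names `box_smooth` and `central_difference` are hypothetical, and clipping the window at the borders is one assumed choice among several.

```python
def box_smooth(signal, radius=1):
    """Average each sample with its neighbours (window clipped at the borders)."""
    out = []
    for i in range(len(signal)):
        lo = max(0, i - radius)
        hi = min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def central_difference(signal):
    """Approximate the derivative as (s[i+1] - s[i-1]) / 2 at interior samples."""
    return [(signal[i + 1] - signal[i - 1]) / 2 for i in range(1, len(signal) - 1)]


step = [0, 0, 0, 9, 9, 9]          # an ideal step edge
smoothed = box_smooth(step)        # the step becomes a ramp
edges = central_difference(step)   # large values mark the edge location
print(smoothed)
print(edges)
```

Smoothing spreads the step over several samples, while the finite difference responds only near the step, which is the basic idea behind gradient-based edge detectors.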
Tracking- problem of generating an inference
about the motion of an object given a sequence
of images. Major applications: motion capture,
recognition from motion, surveillance, and
targeting
Model-Based Vision- object recognition as a
correspondence problem- understanding of the
relationship between the position of image
features, and the position and orientation of an
object; application: registration of a volume of interest (VOI)
in medical imaging systems
Finding Templates Using Classifiers- a classifier is
anything that takes a feature set as an input and
produces a class label. Here, we would learn
about techniques for building classifiers with
examples of their use in vision applications
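To make the definition concrete, here is a minimal sketch of one possible classifier, a nearest-centroid rule: it takes a feature vector as input and returns a class label. The technique, the function names, and the toy feature values are illustrative assumptions, not the specific methods covered later in the course.

```python
def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=sq_dist)


# Toy two-feature training set (values are made up for illustration).
training = [((1.0, 1.0), "dark"), ((2.0, 1.0), "dark"),
            ((8.0, 9.0), "bright"), ((9.0, 8.0), "bright")]
centroids = train_centroids(training)
label_a = classify(centroids, (2.0, 2.0))
label_b = classify(centroids, (7.0, 7.0))
print(label_a, label_b)
```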
Text Book & References:
David A. Forsyth and Jean Ponce,
Computer Vision: A Modern Approach,
2002 Ed (available from local market)
Class slides & selected research papers
to be distributed by the instructor
Mubarak Shah, Fundamentals of Computer Vision, 1997
(soft copy available online)
Linda Shapiro and George Stockman, Computer Vision, 2000
(soft copy available online)
Rafael C. Gonzalez and Richard E. Woods, Digital Image
Processing, 3rd Edition, 2009 (available from local market)
Prerequisites:
Digital image processing
Working knowledge of C++ programming
Knowledge related to:
Euclidean and projective geometry
Linear Algebra
Vector calculus
Probability & Statistics
Yahoo Group: CV_CEME_S2015
Grading Policy*:
Surprise quizzes (Min 6): 8%
Programming assignments (Min 3): 7%
Sessional exam I: 15%
Sessional exam II: 15%
Project: 15%
Final exam: 40%
*Relative final grading policy applies
Quizzes & Assignments:
Please make sure you visit the CV_CEME_S2015 group every
day for notifications about assignments and other related
material, which will be uploaded from time to time
Quizzes: 6 to 8, carrying 8% weight in the total marks
(the best x out of y may be counted, to the benefit of
students)
Assignments: min 3, carrying 7% weight in the total
marks. These may be written or programming
assignments. The submission deadline will be given with each
assignment. Assignments submitted after the deadline
will not be accepted and will carry ZERO MARKS. Copied
(i.e., matching) assignments will get ZERO MARKS.
Project:
Project will carry 15% weight in the total marks
Project is supposed to be conducted individually (i.e., no
grouping)
Your project is most likely going to be an OpenCV
implementation of a recent CV related algorithm / work
Students are encouraged to visit IEEE Xplore for the 27th IEEE Conference
on Computer Vision and Pattern Recognition (CVPR) and should start
looking into different research articles (published in 2014)
Project topics / problems should be selected and approval
should be obtained within the first four weeks of the course.
Project presentations will commence from week 13 onwards and
projects (i.e., CD containing draft of proposed novel work,
implementation code, presentation, etc) will not be accepted
after the submission deadline.
Projects consisting of downloaded code or presentations will
not be accepted and will carry ZERO MARKS
Vision
Process of discovering what is present in the world
and where it is by looking
What is Computer Vision?
Given one or more images, extract properties of the 3D
world:
- Traffic scene
- Number of vehicles
- Type of vehicles
- Location of closest obstacle
- Assessment of congestion
- Location of the scene captured
Computer Vision
The goal is to emulate human vision (which is limited to
the visual band of electromagnetic (EM) spectrum),
including learning and being able to make inferences
and take actions based on visual inputs
Why Computer Vision?
An image is worth 1000 words
Many biological systems rely on vision
The world is 3D and dynamic
Cameras and computers are cheap
Applications of Computer Vision
Autonomous cars, Planes, Missiles, Robots, ...
Space exploration
Aid to the blind, Sign language recognition
Manufacturing, Quality control
Surveillance, Security, Biometrics
Image retrieval
Medical imaging & analysis
...
Overview
Real World
-> Image Formation and Camera Geometry: Modeling and Calibration, Image rectification
-> Processing on
   Single Image: Linear Filters, Edge detection, Texture
   Multiple Images: Multi-view geometry, Stereo imaging, Structure from motion
-> Segmentation: impose some order on groups of pixels to separate them from each other or infer shape information
-> Recognition: recognize objects using probabilistic techniques
   Interpretation: interpret objects using geometric information
-> Action
Computer Vision Focuses on:
What information should be extracted?
How can it be extracted?
How should it be represented?
How can it be used to achieve the goal?
Related Disciplines
Image processing
Pattern recognition
Computer graphics
Artificial intelligence
Machine learning
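Related Disciplines
[Figure: fields classified by input and output. DATA to DATA: Data Processing; IMAGES to DATA: Computer Vision; DATA to IMAGES: Computer Graphics; IMAGES to IMAGES: Image Processing]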
Active Research Topics
Object recognition
Human behavior analysis
Internet and computer vision
Biometrics and soft biometrics
Large scale 3D reconstruction (city level)
Medical image processing
Vision for robotics
Computer Vision Publications
Journals
IEEE Trans. on Pattern Analysis and Machine
Intelligence (TPAMI)
International Journal of Computer Vision (IJCV)
IEEE Trans. on Image Processing
Computer Vision Publications
Conferences
International Conference on Computer Vision
(ICCV), once every two years
IEEE Conf. on Computer Vision and Pattern
Recognition (CVPR), once a year
European Conference on Computer Vision (ECCV),
once every two years
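Basic Transformations
Translation:
( x' = x + x0 , y' = y + y0 , z' = z + z0 )
(2D)
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & x_0 \\ 0 & 1 & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
(3D)
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 & x_0 \\ 0 & 1 & 0 & y_0 \\ 0 & 0 & 1 & z_0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]
Images courtesy of Dr Imtiaz A Taj (MAJU)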
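The translation equations x' = x + x0, y' = y + y0, z' = z + z0 can be checked numerically by writing them as matrix-vector products in homogeneous form. The sketch below uses plain Python lists; the helper names `mat_vec`, `translation_2d`, and `translation_3d` are illustrative, not from any library.

```python
def mat_vec(M, v):
    """Multiply matrix M (list of rows) by column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def translation_2d(x0, y0):
    """3x3 homogeneous matrix that adds (x0, y0) to a 2D point."""
    return [[1, 0, x0],
            [0, 1, y0],
            [0, 0, 1]]


def translation_3d(x0, y0, z0):
    """4x4 homogeneous matrix that adds (x0, y0, z0) to a 3D point."""
    return [[1, 0, 0, x0],
            [0, 1, 0, y0],
            [0, 0, 1, z0],
            [0, 0, 0, 1]]


p2 = mat_vec(translation_2d(3, -2), [1, 1, 1])        # point (1, 1) moved by (3, -2)
p3 = mat_vec(translation_3d(1, 2, 3), [0, 0, 0, 1])   # origin moved by (1, 2, 3)
print(p2, p3)
```

The appended 1 is what lets a translation, which is not a linear map on (x, y), be expressed as a single matrix multiplication.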
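Cartesian Coordinate System (Euclidean Geometry)
\[
W = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
\]
Homogeneous Coordinate System (Projective Geometry)
\[
W_h = \begin{bmatrix} kX \\ kY \\ kZ \\ k \end{bmatrix}, \quad k \neq 0
\]
Recovering Cartesian from homogeneous coordinates:
\[
W = \begin{bmatrix} W_1 \\ W_2 \\ W_3 \end{bmatrix}
  = \begin{bmatrix} W_{h1} / W_{h4} \\ W_{h2} / W_{h4} \\ W_{h3} / W_{h4} \end{bmatrix}
\]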
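A short sketch of the conversion between the two coordinate systems: multiplying by any nonzero k and appending k gives a homogeneous vector, and dividing by the last component recovers the Cartesian point. The function names are illustrative.

```python
def to_homogeneous(w, k=1.0):
    """Cartesian (X, Y, Z) -> homogeneous (kX, kY, kZ, k), with k != 0."""
    if k == 0:
        raise ValueError("k must be nonzero")
    return [k * c for c in w] + [k]


def to_cartesian(wh):
    """Homogeneous vector -> Cartesian by dividing by the last component."""
    k = wh[-1]
    if k == 0:
        raise ValueError("last component is zero: no finite Cartesian point")
    return [c / k for c in wh[:-1]]


wh = to_homogeneous([2.0, 4.0, 6.0], k=2.0)
w = to_cartesian(wh)
print(wh, w)
```

Any nonzero k represents the same Cartesian point, which is why homogeneous coordinates are said to be defined only up to scale.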
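Basic Transformations
Scaling:
( x' = s_x x , y' = s_y y , z' = s_z z )
(2D)
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
(3D)
\[
\begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix}
=
\begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\]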
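Because scaling and translation are both matrices in homogeneous form, they can be composed by matrix multiplication, and the order matters. The sketch below, with illustrative helper names and made-up values, applies a 3D scaling and then compares the two possible orders of a 2D scale and translate.

```python
def mat_mul(A, B):
    """Matrix product of A and B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def scaling_2d(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def scaling_3d(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


def translation_2d(x0, y0):
    return [[1, 0, x0], [0, 1, y0], [0, 0, 1]]


scaled_3d = mat_vec(scaling_3d(2, 3, 4), [1, 1, 1, 1])

S, T, p = scaling_2d(2, 3), translation_2d(1, 1), [1, 1, 1]
# The rightmost matrix acts on the point first.
scale_then_translate = mat_vec(mat_mul(T, S), p)
translate_then_scale = mat_vec(mat_mul(S, T), p)
print(scaled_3d, scale_then_translate, translate_then_scale)
```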
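Basic Transformations
Rotation (2D):
- around origin, by angle \theta (counterclockwise)
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
- around an arbitrary point r (not origin)
\[
p' = T_{-r} \left( R_\theta \left( T_r \, p \right) \right)
\]
where T_r translates the point r to the origin, R_\theta rotates about the origin, and T_{-r} translates the origin back to r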
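Rotation about an arbitrary point can be sketched as three steps: move the centre of rotation to the origin, rotate, and move back. The code below assumes the counterclockwise sign convention for positive angles; the function names are illustrative.

```python
import math


def mat_mul(A, B):
    """Matrix product of A and B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def translation_2d(x0, y0):
    return [[1, 0, x0], [0, 1, y0], [0, 0, 1]]


def rotation_2d(theta):
    """Counterclockwise rotation about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def rotate_about(point, center, theta):
    """Rotate `point` about `center`: translate to origin, rotate, translate back."""
    cx, cy = center
    to_origin = translation_2d(-cx, -cy)
    back = translation_2d(cx, cy)
    M = mat_mul(back, mat_mul(rotation_2d(theta), to_origin))
    x, y, _ = mat_vec(M, [point[0], point[1], 1])
    return x, y


# Rotating (2, 1) by 90 degrees about (1, 1) should give approximately (1, 2).
rx, ry = rotate_about((2.0, 1.0), (1.0, 1.0), math.pi / 2)
print(rx, ry)
```

Results are floating point, so comparisons should allow a small tolerance rather than test for exact equality.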
MATLAB, or OpenCV
Image processing: the process of manipulating
image data to make it suitable for computer
vision applications or for presentation
to humans
Computer vision goes beyond image
processing: it helps to obtain relevant information
from images and to make decisions based on that
information
Steps for a typical computer vision application:
Image acquisition -> Image manipulation -> Obtaining
relevant information -> Decision making
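The four steps can be illustrated with a toy pipeline on a tiny synthetic grayscale image. Everything here is a hypothetical stand-in: `acquire_image` replaces a real camera or file read, thresholding is just one possible manipulation, and the threshold and area values are made up.

```python
def acquire_image():
    """Image acquisition: stand-in for a camera, a 4x4 grayscale image (0-255)."""
    return [[10, 10, 10, 10],
            [10, 200, 210, 10],
            [10, 220, 205, 10],
            [10, 10, 10, 10]]


def threshold(image, level):
    """Image manipulation: binarize so bright pixels become 1, others 0."""
    return [[1 if px >= level else 0 for px in row] for row in image]


def object_area(mask):
    """Obtaining relevant information: count the foreground pixels."""
    return sum(sum(row) for row in mask)


def decide(area, min_area):
    """Decision making: report whether a large enough bright object exists."""
    return "object present" if area >= min_area else "no object"


image = acquire_image()
mask = threshold(image, 128)
area = object_area(mask)
decision = decide(area, min_area=3)
print(area, decision)
```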
Most popular methods to develop computer vision
applications: OpenCV with C/C++, MATLAB and AForge
MATLAB is the easiest but least efficient way to
process images
- an interpreter, not made to go fast, but it gives you the
opportunity to play with its functionality
OpenCV is computationally the most efficient framework
- designed for real time applications
- code written in optimized C / C++
- can take advantage of multicore processors
- further automatic optimization possible using IPP libraries
AForge has qualities in between OpenCV and MATLAB
MATLAB is a kind of sandbox for "playing" and learning (and
relatively slow). OpenCV is dedicated and specific (and fast)
OpenCV is the hardest to learn, mainly because it lacks
proper documentation and error handling code
But OpenCV has lots of basic built-in image
processing functions (over 500 functions)
It is worthwhile to learn computer vision with OpenCV
Useful webpages on this topic:
[Link]
[Link]
Assignment- 1
Download and install the latest release of OpenCV.
Build and run your first OpenCV program.
Related Tutorials:
- Installing OpenCV 3 on Ubuntu:
[Link]
- Using OpenCV 3 with Eclipse:
[Link]