0% found this document useful (0 votes)
58 views23 pages

Digital Image Processing: Thematic Information Extraction III: Unsupervised Classification

This document discusses unsupervised classification techniques in digital image processing, including the Chain Method, ISODATA, and cluster busting. The Chain Method examines data in two passes to build spectral clusters and assign pixels to clusters. ISODATA is an iterative technique that distributes initial cluster means and assigns pixels to the nearest cluster mean over multiple iterations. Cluster busting extracts mixed pixels from an initial classification to run additional clustering and improve classification accuracy.

Uploaded by

idhas
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
58 views23 pages

Digital Image Processing: Thematic Information Extraction III: Unsupervised Classification

This document discusses unsupervised classification techniques in digital image processing, including the Chain Method, ISODATA, and cluster busting. The Chain Method examines data in two passes to build spectral clusters and assign pixels to clusters. ISODATA is an iterative technique that distributes initial cluster means and assigns pixels to the nearest cluster mean over multiple iterations. Cluster busting extracts mixed pixels from an initial classification to run additional clustering and improve classification accuracy.

Uploaded by

idhas
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd

Digital Image Processing

Thematic Information Extraction III:


Unsupervised Classification

Arnis Asmat
Environmental information System
EVT 552
Outline
• Introduction

• The “Chain Method”

• ISODATA

• Unsupervised “Cluster Busting”


Introduction
• Unsupervised classification requires minimal
input from the analyst.

• Numerical operators search for natural groupings


of the spectral properties of pixels in
multispectral feature space.

• After classification, the analyst must assign


spectral classes to information classes of interest.
Introduction
• This is not always an easy task.

• Some clusters are meaningless because they


represent mixed classes of Earth surface
materials.

• An analyst must understand the spatial and


spectral characteristics of the terrain in order to
accurately label spectral clusters.
The “Chain Method”
• This algorithm works in two passes (examines
the data set twice).

• The first pass reads through the data and builds


spectral clusters (calculates mean vectors).

• The second pass uses a minimum distance


decision rule to assign pixels to a cluster.
The “Chain Method”
• During the first pass, the analyst must supply four pieces of
information:

– R, radius in multispectral space used to determine when a new


cluster should be formed (e.g. 15);

– C, a spectral space distance used when merging clusters (e.g. 30);

– N, the number of pixels to be evaluated between each major


merging (e.g. 2000 pixels).

– Cmax, the maximum number of clusters to be identified (e.g. 20).


Consider an image with 2
bands only.

Algorithm looks at one pixel


at a time beginning with the
first line of data (this is
where the term “chain”
comes from).

Pixel values are plotted in


multispectral space.
Adapted from Jensen (1996)
Pixel 1 (10,10) becomes
Cluster 1.
R is set to 15.

Pixel 2 is plotted in
multispectral space and
compared to Cluster 1.

The distance (D) between


Pixel 2 and Cluster 1 is
14.14.

Thus, Pixel 2 does not


satisfy the criteria of being Adapted from Jensen (1996)
its own cluster, as D < 15.
The mean data vectors for
Pixel 1 and Pixel 2 are
averaged, yielding a new
center for Cluster 1.

Next, Pixel 3 is evaluated.

D between Pixel 3 and


Cluster 1 is > 15, so Pixel 3
becomes the center of a new
cluster (Cluster 2).
Adapted from Jensen (1996)
The next pixel will be
compared to both clusters 1
and 2.
The cluster accumulation continues until the number of
pixels evaluated is greater than N.
At that point, the program stops evaluating individual
pixels and looks at the nature of the clusters formed so far.
A distance between each cluster and every other cluster is
calculated.
Any two clusters separated by a distance less than C are
merged into a single cluster.
This process proceeds until there are no clusters with
separation less than C.
Then, the next pixel is considered, and the process iterates
until the entire data set is examined.
The “Chain Method”
• During the second pass, final cluster means are used
to classify all image pixels into one of the Cmax
clusters.

• The process follows the minimum distance approach


(minimizes the distance between cluster mean and
pixel value).

• Then comes the difficult task … assigning class


names (informational classes) to spectral clusters.
ISODATA
• ISODATA stands for Iterative Self-
Organizing Data Analysis Technique.

• ISODATA is iterative because it makes


several passes through the data set.

• ISODATA is self-organizing because it


requires little human input.
ISODATA
• ISODATA does not allocate its initial mean
vectors based upon the analysis of pixels in
the first line of data.

• Rather, an initial arbitrary assignment of


Cmax cluster means takes place in feature
space (based on band means and standard
deviations).
Here, five mean
vectors are distributed
along the vector
beginning at 3 - 3, 4
- 4 and ending at 3 +
3, 4 + 4.

This ensures that the


first few lines of data
do not bias the
creation of cluster
means.

Note that the


parallelpiped does not
capture all pixel Adapted from Jensen (1996)
values in this example.
ISODATA
• First iteration:

– When Cmax clusters are in place, each pixel in


the image is compared to each cluster mean and
assigned to a cluster based upon minimum
distance.

– Those pixels with values outside a distance


threshold are excluded from clustering.
Adapted from Jensen (1996)
ISODATA
• Second through nth iterations:

– After the first iteration, a new mean for each cluster is


calculated based upon actual spectral locations of pixels
included in each cluster.

– The process is repeated with each candidate pixel again


compared and assigned to the nearest cluster mean.

– Some pixels switch class alignment.


Adapted from Jensen (1996)
ISODATA

• The process continues until:

– 1) There is little change in class assignment


between iterations (a threshold is reached); or

– 2) A maximum number of iterations is


reached.
Adapted from Jensen (1996)
Unsupervised “Cluster Busting”

• It is not uncommon to be able to label only


about 60-70% of spectral clusters with
confidence.

• Confused classes result from terrain


distortion, soil moisture and soil type
variation, and spectral mixing within pixels.
Unsupervised “Cluster Busting”
• When this occurs, it is possible to perform
cluster busting.

• First, mixed clusters are recoded to a value


of 1 to create a binary mask.

• The mask is used to extract confused pixels


from the original data set.
Unsupervised “Cluster Busting”

• The output file is a new multi-band image


file consisting of only those pixels that
could not be labeled with confidence.

• A new unsupervised classification is run on


only those pixels, yielding an additional
Cmax number of clusters.

You might also like