Report for the IS Project
Fingerprint Classification through Self Organization
Maps Modified to Treat Uncertainties
Wu Zhili
(99050056)
SCI 3740 Intelligent Systems
Lecturer: Dr. Jiming Liu
May 3, 2002
Table of Contents

Project Overview
Project Objective
Algorithms and Implementation
Experiment
User Manual
References
Project Overview:

In this mini-project, I attempt fingerprint classification using the self-organizing map (SOM). Guided mainly by [1], both a conventional and a modified SOM algorithm are used to carry out the task. Different approaches drawn from the literature for extracting fingerprint feature vectors, together with full use of the available SOM toolbox [2], play a critical role in keeping the classifier system efficient and robust.

The performance reported in [1] is partially reproduced by my classifier system. The SOM proves genuinely useful for the difficult fingerprint classification problem. The classifier can also be extended by introducing more elaborate SOM architectures or by applying advanced statistical methods such as principal component analysis (PCA).
Project Objectives:

1. To explore industrial applications of the SOM.
2. To extend my honors project, which concentrates on fingerprint preprocessing and matching, into the new field of classification.
Algorithm and Implementation Details:

1. Feature Vector Generation

A 256-dimensional feature vector is generated for each fingerprint image. The feature vector generation stage is critical for the accuracy of the classification system; nearly half of the project time was spent on it.

The idea can be summarized in four steps. First, the block directional image and the certainty value associated with each block are calculated. Then the fingerprint image is segmented so that noisy and corrupted parts carrying no valid information are removed. Meanwhile, the core point, which serves as the reference center, is extracted. Finally, the 16x16 block directions surrounding the core are assembled into the feature vector.
A. Block Direction Estimation

The block direction estimation program operates on the gray-level fingerprint image and obtains an approximate direction for each image block of size WxW (16x16 by default). The method used in my project originates from [3]: it calculates the horizontal and vertical gradients of each pixel and then combines all gradients within a block to obtain an estimated direction. The estimated directions are largely consistent with the ridge flows of the original fingerprint image in most test cases. A further reason for choosing this algorithm over the one used by [1] is that the certainty values can be generated simultaneously during direction estimation, which reduces the computation cost compared with the two independent procedures adopted by [1]. All directions are then normalized into the interval from 0 to pi, and the certainty values into the interval from 0 to 1.
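The gradient-combination step can be sketched in Python (a sketch only; the actual implementation is in MATLAB, and the function name and variable names are illustrative). Following the least-squares estimate of [3], each block direction comes from 0.5 * atan2(2*sum(Gx*Gy), sum(Gx^2 - Gy^2)), and the certainty is the coherence of the gradients, which lies in [0, 1]:

```python
import numpy as np

def block_directions(img, w=16):
    """Estimate one ridge direction (in [0, pi)) and one certainty (in [0, 1])
    per w x w block of a gray-level image, by combining pixel gradients."""
    gy, gx = np.gradient(img.astype(float))          # vertical / horizontal gradients
    rows, cols = img.shape[0] // w, img.shape[1] // w
    theta = np.zeros((rows, cols))
    cert = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            bx = gx[i * w:(i + 1) * w, j * w:(j + 1) * w]
            by = gy[i * w:(i + 1) * w, j * w:(j + 1) * w]
            vx = 2.0 * np.sum(bx * by)
            vy = np.sum(bx ** 2 - by ** 2)
            # atan2 gives (-pi, pi]; halving and shifting maps into [0, pi)
            theta[i, j] = 0.5 * (np.arctan2(vx, vy) + np.pi)
            denom = np.sum(bx ** 2 + by ** 2)
            # gradient coherence: 1 for perfectly parallel gradients, 0 for isotropic noise
            cert[i, j] = np.hypot(vx, vy) / denom if denom > 0 else 0.0
    return theta, cert
```

On a synthetic image of vertical stripes, every block direction comes out as pi/2 (vertical ridges) with certainty 1, matching the normalization described above.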
B. Segmentation

An unavoidable problem after step A is that the background noise cannot be eliminated completely, even though image areas with very small certainty values have already been discarded. We also need to extract a tightly bounded fingerprint region from the image to benefit the subsequent steps.

A series of operations is used to carry out the segmentation task. These are entirely different from the ones used by [1] but remain efficient and accurate: histogram equalization, image enhancement and coarse segmentation by Fourier transform, image binarization, and region-of-interest location by morphological operations. They come from multiple sources, including the literature, student projects, and my own implementation for the honors project. They are not explained in detail here, since this report concentrates on the SOM part; interested readers can refer to [3][4][6].
C. Core point detection
The one remaining fundamental step before classification is the core point
extraction, that is, the automatic detection of the core point of the
fingerprint. This step is particularly important since a reference center is
required in order to correctly compare two fingerprints.
But several assumptions and restrictions have to be clarified first.
Automated core detection can only find the most likely center of the image
5
without regard whether there is a meaningful core exists or not.
Furthermore, the alignment according to the core point only partially
remedies the misalignment of two fingerprints since we do not consider
the scaling, rotation, and clipping factors.
An algorithm developed by [3] is used to detect the core point but with a
slight difference. [3] maps all the block directions to the interval from –0.5
to 0.5 and then simply regards the value 0.5 corresponds to the core. But in
my test, values through such a mapping do not have an exact 0.5 in most
case, so the region with a largest value in center, which also continuously
connects with a largest neighborhood area before the value reduces to a
threshold, is used to locate the core. Proved by testing, 90% of fingerprint
images can be located the core correctly.
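The largest-connected-region heuristic described above can be sketched as follows (a Python sketch with illustrative names; A stands for the per-block response map obtained from the direction mapping, and the threshold tau is an assumed parameter):

```python
import numpy as np

def locate_core(A, tau=0.8):
    """Keep blocks whose response is at least tau * max(A), flood-fill the
    4-connected regions among them, and return the position of the largest
    value inside the largest region as the core block."""
    mask = A >= tau * A.max()
    labels = np.zeros(A.shape, int)
    cur = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        cur += 1
        stack = [seed]
        while stack:                                  # iterative flood fill
            r, c = stack.pop()
            if labels[r, c]:
                continue
            labels[r, c] = cur
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < A.shape[0] and 0 <= cc < A.shape[1]
                        and mask[rr, cc] and not labels[rr, cc]):
                    stack.append((rr, cc))
    # pick the largest connected region, then the peak inside it
    best = max(range(1, cur + 1), key=lambda k: int((labels == k).sum()))
    region = np.where(labels == best, A, -np.inf)
    return np.unravel_index(np.argmax(region), A.shape)
```

An isolated high-valued block is thus ignored in favor of the peak of the largest connected high-response region.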
D. Regulating the Feature Vector

Preparing the feature vector is a straightforward task once the three steps above have been completed. We simply extract the 16x16 block of directions centered at the core and reshape it into a 1x256 vector. Parts of the block that fall in the background region or outside the image are padded with NaN, which is treated as a missing value in further processing. The same operations are applied to the certainty vector.
2. SOM Construction and Training

A free SOM toolbox [2] is adopted to assist in the mini-project, so the discussion of the SOM at the algorithm level is combined with an introduction to the toolbox to make the explanation concrete.
A. The Conventional SOM

1. Construct a SOM of size mxm and assign random values to all the weights of the mxm map neurons. This is easily achieved by calling the routine SOM_RANDINIT.

2. Train on the feature vectors sequentially:

In total, N feature vectors of 256 dimensions each are fed into the SOM. For simplicity, the N vectors are presented one by one, and one pass through all N vectors is regarded as one run of training. K such runs are taken.

For each input vector x, the detailed learning procedure is as follows:

1. Find the winning output node dmin by the criterion:

dmin = argmin_j ||x - wj||

where ||.|| denotes the Euclidean norm and wj is the weight vector connecting the input nodes to output node j.

2. Adjust the weight vectors according to the update formula:

wij(t+1) = wij(t) + L(t) N(j,t) [xi(t) - wij(t)]

where wij is the ith component of the weight vector wj, L(t) is the learning rate, and N(j,t) is the neighborhood function.

The neighborhood function is a window centered on the winning node dmin, whose radius decreases with time. In the implementation, the radii at the beginning and the end can be set explicitly when calling SOM_SEQTRAIN; the radius is simply set to decrease from the map size m to 1 over the K runs. The learning rate L(t) is also a decaying function: it is kept large at the beginning of training and decreases gradually as learning proceeds.
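A minimal Python sketch of this sequential training loop may clarify the procedure (in my system the real work is done by the toolbox's SOM_RANDINIT and SOM_SEQTRAIN; the names, the Gaussian window for N(j,t), and the linear decay schedules below are illustrative assumptions):

```python
import numpy as np

def train_som(X, m, runs=10, seed=0):
    """Sequentially train an m x m SOM on the N x d data matrix X.
    The neighborhood radius decays linearly from m to 1 and the learning
    rate from 0.5 to 0.01 over all runs."""
    rng = np.random.default_rng(seed)
    W = rng.random((m * m, X.shape[1]))               # random init, cf. SOM_RANDINIT
    grid = np.array([(i, j) for i in range(m) for j in range(m)], float)
    T = runs * len(X)
    t = 0
    for _ in range(runs):
        for x in X:                                   # one pass over X = one run
            frac = t / max(T - 1, 1)
            radius = m + (1.0 - m) * frac             # m -> 1
            lr = 0.5 + (0.01 - 0.5) * frac            # 0.5 -> 0.01
            win = np.argmin(np.linalg.norm(W - x, axis=1))        # dmin
            dist2 = np.sum((grid - grid[win]) ** 2, axis=1)
            nbh = np.exp(-dist2 / (2 * radius ** 2))  # Gaussian window N(j, t)
            W += lr * nbh[:, None] * (x - W)          # update formula
            t += 1
    return W
```

After training, each feature vector is indexed by its best-matching map unit, which defines its class bin.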
B. Modified SOM (MSOM)

The modified SOM also accepts the certainty vector and takes the certainties into account during the training stage. The toolbox does not provide such a modified SOM, so some manual work was needed to implement it.

The detailed MSOM procedure is as follows:

1. Initialize an mxm SOM. To reduce variation, all weights of the SOM neurons are set to zero.

2. Construct a structure containing a feature vector x and its certainty vector c, and feed the structure into the SOM.

As described in [1], this step would reform the feature vector x into xc according to:
xc = c x + (1 - c) xavg

where the products are component-wise and xavg is the vector holding the average values xk (k = 1 to 256) over the whole training sample space.

In my approach, however, this distortion of the input data is not applied, because xc after the interpolation may be drastically reformed. It cannot actually "stimulate the components of the input vectors of which we are certain and inhibit the less certain ones", as claimed in [1], because attenuating a block direction with small certainty or boosting one with high certainty does not ensure that the best-matching winning node is selected, owing to the large variety of weight values among the SOM neurons.
My way of binding the certainty values is explained in the following step.
3. Find the winning output node dmin by the criterion:

dmin = argmin_j ||c (x - wj)||

where ||.|| denotes the Euclidean norm, the product with c is component-wise, and wj is the weight vector connecting the input nodes to output node j.

Note that this criterion differs slightly from the conventional SOM case: the Euclidean distance is now weighted by the certainty values. This is more reasonable than simply distorting the input vector by interpolating the certainty values.
4. Adjust the weight vectors of the SOM neurons in the neighborhood area according to:

wij(t+1) = wij(t) + L(t) N(j,t) ci [xi(t) - wij(t)]

where wij is the ith component of the weight vector wj, L(t) is the learning rate, N(j,t) is the neighborhood function, and ci is the certainty of component i.
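Steps 3 and 4 can be sketched together as one certainty-weighted update (a Python sketch; msom_step is an illustrative name, the Gaussian window is an assumed form of N(j,t), and NaN feature components are given zero certainty, matching the padding of step D above):

```python
import numpy as np

def msom_step(W, grid, x, c, lr, radius):
    """One MSOM update: the winner minimizes ||c * (x - wj)||, and each weight
    component is pulled toward xi in proportion to its certainty ci.
    NaN entries of x (background blocks) get zero certainty, so they neither
    choose the winner nor move the weights."""
    c = np.where(np.isnan(x), 0.0, c)
    x = np.nan_to_num(x)
    win = np.argmin(np.linalg.norm(c * (x - W), axis=1))   # certainty-weighted dmin
    dist2 = np.sum((grid - grid[win]) ** 2, axis=1)
    nbh = np.exp(-dist2 / (2 * radius ** 2))               # neighborhood N(j, t)
    W += lr * nbh[:, None] * c * (x - W)                   # update weighted by ci
    return W, win
```

With the zero-weight initialization of step 1, components with zero certainty simply stay at zero and never influence the competition.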
Experiment Details:

1. Fingerprint images:
Format: TIF.
Source: all fingerprint images are from the FVC2000 and FVC2002 databases.

2. Database size: 60 fingers, each with three different rollings. Two rollings of each finger are used as training samples; the remaining one is used as the testing sample.
3. Results

Search % vs. recognition % (column headings give the map size; blank entries mean 100% recognition was already reached at a smaller search percentage):

MSOM
Search %   3x3    4x4    5x5    8x8    10x10
10         12.1   18.8   28.8   91.8   100
20         26.0   38.7   54.0   100
30         40.1   61.3   80.5
40         55.9   86.6   100
50         71.9   100
60         88.5
70         100

SOM
Search %   3x3    4x4    5x5    8x8    10x10
10         19.6   16.4   26.4   40.0   62.7
20         39.6   38.1   53.1   100    100
30         57.3   60.9   82.5
40         72.9   84.8   100
50         86.3   100
60         97.5
70         100
The above table gives the worst-case search performance for various sizes of the SOM and MSOM networks; the column headings show the map sizes of the networks used.
Each entry of the table denotes the worst-case percentage of the testing database that is indexed into some class through the trained SOM when the search-space percentage given in the row heading is allowed. The worst case is that all elements in the class indexed by a testing fingerprint image are searched, so the search price simply equals the size of the currently visited class.

Take the SOM part of the table as an example. The entry 19.6 in the 3x3 column means that for 19.6% of the fingerprint images in the testing database, searching at most 10% of the (already classified) training database suffices to locate the fingerprints sharing the same origin.
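This worst-case measure can be sketched as follows (a Python sketch with illustrative names; the worst-case cost of a test image is the size of the training bin it is hashed into, divided by the training set size):

```python
import numpy as np

def worst_case_curve(test_bins, train_bin_sizes, n_train, budgets=(0.1, 0.2, 0.3)):
    """For each search-space budget (fraction of the training set), return the
    fraction of test images whose worst-case search cost fits within it.
    test_bins: the training bin each test image is hashed into.
    train_bin_sizes: mapping from bin id to number of training images in it."""
    costs = np.array([train_bin_sizes[b] for b in test_bins]) / n_train
    return {b: float(np.mean(costs <= b)) for b in budgets}
```

For instance, with 50 training images split into bins of sizes 5, 15, and 30, test images falling into the size-5 bin cost 10% of the training set in the worst case, while those in the size-30 bin cost 60%.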
A smaller searched percentage of the training database implies that the class categories are more uniformly distributed; a uniform distribution clearly reduces the worst-case search and improves the classification. Another evaluation index is the percentage of fingerprint images that are hashed into exactly the same class as their identical rollings. In my tests, 40%-60% of fingerprints coming from the same finger fall within the same bin, and they occupy 8-20 classes depending on the map size.

From the table we can also observe that the MSOM performs better than the SOM in most cases, and that the results improve with increasing network size.
User Manual

1. GUI mode for inspecting the feature vector generation:
1. Add the directories source and somtoolbox2 to the MATLAB PATH, and change the current directory of MATLAB to c9050056/db.
2. Type the command start_gui_single_mode.
3. Click Load to load an image file, which can be found in the folder c9050056/db.
4. Click Direction to get the block directions.
5. Click ROIArea to extract the foreground fingerprint image.
6. Click Getcore to watch the core detection.

2. Batch processing:
1. Open batch_getTrainingVector.m and batch_getTestVector.m and modify the path from which the image files are loaded.
2. Run batch_getTrainingVector.m and batch_getTestVector.m to generate the data files [Link], [Link], [Link], and cV-dat.
3. Open batch_test_som.m and set the variables sizem and stepL to the desired values.
4. Run batch_test_som.m to get the experiment result; the data matrix stores the result.
5. Repeat steps 3 and 4 with different parameters to obtain a set of experiment results.
6. Repeat steps 3-5 with batch_test_msom.m as the tested file.
References

[1] Halici, U. and Ongun, G., "Fingerprint Classification through Self Organization Maps Modified to Treat Uncertainties", Proceedings of the IEEE, Vol. 84, No. 10, pp. 1497-1512, October 1996.
[2] SOM Matlab Toolbox. [Link]
[3] Lin Hong, "Automatic Personal Identification Using Fingerprints", Ph.D. Thesis, 1998.
[4] L.C. Jain, [Link], I. Hayashi, S.B. Lee and [Link], Intelligent Biometric Techniques in Fingerprint and Face Recognition, CRC Press, 1999.
[5] Lakhmi [Link], [Link] Vemuri, Industrial Applications of Neural Networks, CRC Press, 1999.
[6] Image Systems Engineering Program, Stanford University. Student project by Thomas Yeo, Wee Peng Tay, Ying Yu Tai. [Link]