Novel Gradient-Based Texture Features for Texture Classification
by
G M Mashrur E Elahi
Master of Science
University of Alberta
Abstract

Texture analysis is an important topic in computer vision and image processing and has many applications. For example, texture features are used in image segmentation and classification, of which there are many methods. Among them, gradient-based methods have become popular in classification problems. One of the gradient-based methods is the Co-occurrence Histograms of Oriented Gradients (CoHOG) method, which uses gradient orientations and differential properties for a texture. But it discards some important texture
information due to the use of sub-regions. In this thesis, based on the original
CoHOG method, three novel feature extraction methods are proposed. All the
methods use the whole image instead of sub-regions for feature calculation. We also use a larger neighborhood size for the methods. The first method, named S-CoHOG, uses Sobel operators for gradient calculation. The second method, named GD-CoHOG, uses Gaussian Derivative (GD) operators, and the third method, named LFDG-CoHOG, uses Local Frequency Descriptor Gradient (LFDG) operators for gradient calculations. The extracted feature vector size is very large, and classification using a large number of similar features does not provide the best results. In
our proposed methods, only a minimum number of significant features are selected
using area under the receiver operating characteristic (ROC) curve (AUC) thresholds. The selected features are used in a linear support vector machine (SVM) classifier. The classification results of the proposed methods are compared with those of the original CoHOG method using
three well-known texture datasets. The classification results show that the proposed
methods achieve the best classification results in all the datasets. The proposed
methods are also evaluated for medical image classification. Three different cohort
datasets of 2D Magnetic Resonance Images (MRI) are used along with a multicen-
ter dataset to compare the classification results of the proposed methods with that
of the gray level co-occurrence matrix (GLCM) method. The experimental results
show that the proposed methods outperform the GLCM method.
Acknowledgements
Contents
Abstract ii
Acknowledgements iv
List of Tables viii
List of Figures xi
1 Introduction 1
1.1 Texture Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 The Thesis Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Related Works 7
2.1 2D Texture Analysis Methods . . . . . . . . . . . . . . . . . . . . . 7
2.1.1 Local Binary Patterns (LBP) . . . . . . . . . . . . . . . . . 8
2.1.2 Gray Level Co-occurrence Matrix (GLCM) . . . . . . . . . . 9
2.1.3 The Run Length Matrix (RLM) . . . . . . . . . . . . . . . . 11
2.1.4 Gradient Orientation based Texture Methods . . . . . . . . 11
3 Proposed Methodology 15
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2.1 Gradient Orientation and Quantization . . . . . . . . . . . . 17
3.3 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.3.1 Co-occurrence Matrix (CM) Calculation . . . . . . . . . . . 19
3.3.2 Feature Vector Generation . . . . . . . . . . . . . . . . . . . 22
3.4 Feature Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.5 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4 Experimental Results 26
4.1 Texture Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.1.1 INRIA Person Dataset . . . . . . . . . . . . . . . . . . . . . 27
4.1.2 CUReT Dataset . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.1.3 UIUC Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.2 Human MRI Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2.1 Amyotrophic Lateral Sclerosis (ALS) . . . . . . . . . . . . . 30
4.2.2 ROI Selection . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2.3 Downsampling . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2.4 MR Dataset 1 . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.5 MR Dataset 2 . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.6 MR Dataset 3 . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 Classification Results of Texture Datasets . . . . . . . . . . . . . . 36
4.3.1 Classification of INRIA Person Dataset . . . . . . . . . . . . 36
4.3.2 Classification of CUReT Dataset . . . . . . . . . . . . . . . 38
4.3.3 Classification of the UIUC Dataset . . . . . . . . . . . . . . 39
4.3.4 Comparison with other CoHOG methods . . . . . . . . . . . 40
4.4 Classification Results of MRI Datasets . . . . . . . . . . . . . . . . 41
4.4.1 ROC Analysis of MR Dataset 1 . . . . . . . . . . . . . . . . 41
4.4.2 ROC Analysis of MR Dataset 2 . . . . . . . . . . . . . . . . 42
4.4.3 ROC Analysis of MR Dataset 3 . . . . . . . . . . . . . . . . 43
4.4.4 ROC Analysis using different Gradient Operators . . . . . . 46
4.4.5 Region Based Analysis . . . . . . . . . . . . . . . . . . . . . 47
4.4.6 Comparison with the GLCM Method . . . . . . . . . . . . . 48
4.4.7 ROC Analysis using Randomly Selected Slices . . . . . . . . 49
4.5 ROC Analysis of Multicenter Dataset . . . . . . . . . . . . . . . . . 50
4.5.1 Classification using Different Centers for Training and Testing 53
5 Conclusion 55
5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.3 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Bibliography 60
List of Tables
4.10 Number of features selected using AUC threshold for UIUC dataset
using two neighborhood sizes. . . . . . . . . . . . . . . . . . . . . . 40
4.11 Comparison of the classification accuracies (CA) of the proposed
methods with the original CoHOG and the Eig(Hess)-CoHOG meth-
ods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.12 ROC analysis of MR Dataset 1 using S-CoHOG features extracted
for neighborhood radius of 4. . . . . . . . . . . . . . . . . . . . . . . 42
4.13 ROC analysis of MR Dataset 1 using S-CoHOG features extracted
for neighborhood radius of 8. . . . . . . . . . . . . . . . . . . . . . . 42
4.14 ROC analysis of MR Dataset 2 using S-CoHOG features extracted
for neighborhood radius of 4. . . . . . . . . . . . . . . . . . . . . . . 43
4.15 ROC analysis of MR Dataset 2 using S-CoHOG features extracted
for neighborhood radius of 8. . . . . . . . . . . . . . . . . . . . . . . 43
4.16 ROC analysis of MR Dataset 3 using S-CoHOG features extracted
for neighborhood radius of 4. . . . . . . . . . . . . . . . . . . . . . . 44
4.17 ROC analysis of MR Dataset 3 using S-CoHOG features extracted
for neighborhood radius of 8. . . . . . . . . . . . . . . . . . . . . . . 44
4.18 ROC analysis of MR Dataset 1 using three proposed methods. Co-
HOG features extracted using a neighborhood radius of 8. . . . . . 46
4.19 ROC analysis of MR Dataset 2 using three proposed methods. Co-
HOG features extracted using a neighborhood radius of 8. . . . . . 47
4.20 Comparison of ROC analysis between GD-CoHOG and GLCM meth-
ods using MR dataset 1. . . . . . . . . . . . . . . . . . . . . . . . . 48
4.21 Comparison of ROC analysis between GD-CoHOG and GLCM meth-
ods using MR dataset 2. . . . . . . . . . . . . . . . . . . . . . . . . 49
4.22 MRI acquisition parameters for five different centers. . . . . . . . . 51
4.23 Multicenter (MC) dataset details. . . . . . . . . . . . . . . . . . . . 52
4.24 ROC analysis of MC datasets using proposed methods with a neigh-
boring radius of 8 and GLCM. . . . . . . . . . . . . . . . . . . . . . 53
4.25 Classification accuracy using data from one center for training and
the other center for testing. . . . . . . . . . . . . . . . . . . . . . . 54
List of Figures
2.1 LBP process (R = 1, P = 8). (a) A gray level image, (b) neighbors’
values after thresholding, (c) Binary encoding and the corresponding
decimal value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 GLCM computation process (d = 1, θ = 90°). (a) A gray level image,
(b) corresponding GLCM. . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Generation of (b) RLM matrix from (a) a gray level image using the 0°
run direction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Basic HOG calculation process. (a) Gradient orientation of the image
pixels, (b) histogram of gradient orientation of each sub-region, (c)
feature vector. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Basic CoHOG calculation process. (a) Gradient orientation of the
image pixels, (b) Combination of sub-regions and offsets for CM
calculation, (c) CM for each sub-region, and (d) feature vector. . . . 13
4.1 Three sample images of (a) human and (b) nonhuman classes from
the INRIA Person dataset. . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Three sample images of (a) class 1 (b) class 3 and (c) class 5 from
the CUReT dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3 Three sample images of (a) class 1 (b) class 3 and (c) class 5 from
the UIUC dataset. . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4 (a) Sagittal, (b) Axial and (c) Coronal image slices. Coronal imaging
is used in texture feature extraction. . . . . . . . . . . . . . . . . . 31
4.5 ROI selection from a coronal image slice. The highlighted regions
are selected as ROI. . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.6 Three sample image slices of (a) controls and (b) patients from MR
Dataset 1. Patients and controls are not distinguishable by visual
inspection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.7 Three sample image slices of (a) controls and (b) patients from MR
Dataset 2. Patients and controls are not distinguishable by visual
inspection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.8 Three sample image slices of (a) controls and (b) patients from MR
Dataset 3. Patients and controls are not distinguishable by visual
inspection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.9 Classification accuracy for selected features using different AUC thresh-
olds for MR dataset 1. . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.10 Mean feature values with standard deviations of the mean feature
values between patients and controls of ten selected features for (a)
MR Dataset 1 and (b) MR Dataset 2. . . . . . . . . . . . . . . . . . 45
4.11 Region based analysis of the subjects of MR dataset 1. Significant
regions are marked by colored boxes, with the classification accuracy
of each box. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.12 Classification accuracies for 10 random slice selection experiments
and the mean and the standard deviation of the classification accu-
racies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.13 Sample image slices of (a) controls and (b) patients of each center
from Multicenter dataset. Patients and controls are not distinguish-
able by visual inspection. . . . . . . . . . . . . . . . . . . . . . . . . 52
Chapter 1
Introduction
1.1 Texture Analysis
In the real world, textures on the surface of objects can be classified as either
micro structures or macro structures. The arrangement of bricks on a wall is an
example of macro structures while the graininess pattern on a brick is an example
of micro structures.
Texture analysis characterizes and quantifies pattern variations in images, in-
cluding those that are imperceptible to the human eye. Textures are used as visual
cues to differentiate among different image regions or different images.
Texture analysis has been a major research topic for the last four decades. It has
been used in different applications which include document processing [1], remote
sensing [2], automated inspection [3], fingerprint identification [4], and medical
image analysis [5], [6].
In texture analysis, texture features that are invariant to geometric transfor-
mation, noise, blurriness, and illumination changes are desired. Some well-known
2D texture methods are the Local Binary Patterns (LBP) [7] and its variants, gray
level co-occurrence matrix (GLCM) [8], the gray level Aura matrix (GLAM) [9],
the Run Length Matrix (RLM) [10], filter responses in the frequency [11] and spa-
tial [12], [13] domains, Wavelets [14], Orientation Pyramid [15], Markov Random
Fields (MRFs) [16], Gaussian MRFs [17], and spin images [18]. However, all of the above methods are sensitive to illumination changes.
Recently, gradient orientation based texture methods have become popular in
computer vision and image processing. Among them Scale Invariant Feature Trans-
form (SIFT) [19], Histograms of Oriented Gradients (HOG) [20] and Co-occurrence
Histograms of Oriented gradients (CoHOG) [21] are commonly used in object de-
tection. The CoHOG method and its variants [22], [23] have been successfully used
in pedestrian detection [21], face recognition [24], fine-grained activity recognition
[22], etc.
In recent years, texture analysis methods have also found applications in med-
ical imaging that include texture analysis of MRI images. In particular, diagnosis
of dementia using GLCM and Gabor filter responses [25], the study of pathological
changes of the hippocampus in patients with Alzheimer's disease and mild cognitive
impairment using GLCM and RLM [26], brain tumor detection [27] and the study
of epilepsy using Wavelet features [28] are some important contributions. Unfor-
tunately, there is still no well-known feature descriptor for Amyotrophic Lateral
Sclerosis (ALS) detection. A 2D region of interest based approach is used to find
texture changes in ALS [29]. GLCM has been used for 3D texture analysis in ALS
[30]. Unfortunately, the sensitivity and specificity of these methods are sub-optimal
for clinical utility. Therefore, a method with improved sensitivity and specificity as
well as classification accuracy is much desired.
1.2 The Thesis Work

The original CoHOG method subdivides an image into a number of sub-regions for feature calculation. The use of sub-regions limits the accuracy of the co-occurrence matrix for boundary pixels, and thus some information is incomplete for each sub-region. For the first time in this thesis, we are applying CoHOG to the whole image for texture feature extraction without subdividing the image into sub-regions.
For each of the three proposed methods, the gradient orientations of each pixel
of an image are calculated using the respective gradient operators. The gradient
orientations of the pixels are then quantized into N bins. The co-occurrences of
the gradients are summed for each offset and stored into an N × N co-occurrence
matrix. An offset is defined by a distance and a direction. Offsets are limited to
a radius specified by the distance from the pixel. All the co-occurrence matrices
calculated from all the offsets are combined to create the feature vector (FV).
The size of the FV can be very large depending on the number of offsets, and not all features are significant. Using this large number of similar features creates ambiguity in the creation of an optimal hyperplane and leads to incorrect classification by a classifier. So, we apply receiver operating characteristic (ROC) curve analysis to select significant features. The selected feature vector (SFV) size is smaller than the original FV size. Most importantly, using the selected features, a significant difference between two classes is obtained. Particularly for medical datasets, it is difficult to find differences between the patients and the controls without feature selection.
We employ a linear support vector machine (SVM) [34] classifier to calculate the
classification accuracy between two classes.
1.3 Summary of Contributions

The main contributions of this thesis are summarized as follows.
• The proposed three methods use the whole image instead of subdividing it
into sub-regions. The use of sub-regions limits the accuracy of co-occurrence
matrix for boundary pixels, and thus some information is incomplete for each sub-region. It also increases the size of the feature vector. Thus, using the whole image not only reduces the boundary pixel problem of sub-regions but also reduces the size of the feature vector.
• The original CoHOG method uses Sobel operators for gradient calculation.
For the first time, we adopt two gradient operators, GD and LFDG, for the proposed GD-CoHOG and LFDG-CoHOG methods, respectively. The proposed
methods are compared to determine the impact of the gradient operators on
classification accuracy using the whole image.
• Texture features are extracted using two different neighborhood sizes. The
original CoHOG method uses a maximum neighborhood size of 4. We use
a larger neighborhood size of 8 to evaluate the effect of using more global
information for co-occurrence calculation on classification accuracy.
• The extracted feature vector size using the CoHOG method is very large with
many similar features. Using this large number of similar features creates ambiguity in optimal hyperplane creation and leads to incorrect classification by an SVM classifier. We are the first to use a feature selection method to reduce
the number of features used in CoHOG. In particular, we select significant
features using area under the ROC curve (AUC) analysis for classification.
Only features that contain significant differences between classes are selected
using an AUC threshold. The experimental results show that the performance
with feature selection outperforms that without feature selection.
1.4 Thesis Outline
The rest of the thesis is organized as follows. Some important related works are
discussed in Chapter 2. In Chapter 3, we explain the proposed approach of feature
extraction and selection. The experimental results and discussions are presented in
Chapter 4. Chapter 5 concludes the thesis.
Chapter 2
Related Works
Texture analysis is a promising topic in computer vision and image processing. Two
dimensional texture analysis methods have been used for document processing [1],
remote sensing [2], automated inspection [3], fingerprint identification [4], medical
image analysis [6], etc. Some of the representative 2D methods are discussed below.
2.1 2D Texture Analysis Methods
2.1.1 Local Binary Patterns (LBP)
LBP uses the gray level differences between the center pixel and its neighbors and
assigns either 0 or 1 to each of its neighbors depending on the difference as shown
in Eq. 2.1 [7],
S(x) = \begin{cases} 1 & \text{if } x \geq 0 \\ 0 & \text{if } x < 0, \end{cases}  (2.1)
where x = (g_i - g_c). Here g_c and g_i are the gray levels of the center pixel and its neighbor pixel i, respectively. These values are used to form a binary local pattern. Then this binary pattern is converted into the corresponding decimal value using Eq. 2.2 [7],
LBP_{P,R} = \sum_{p=0}^{P-1} S(g_p - g_c)\, 2^p,  (2.2)
where P is the number of neighbors and R the radius of the neighboring pixels. An
example of calculating the LBP decimal code for a neighboring radius R = 1 and
number of neighbors P = 8 is shown in Fig. 2.1.
Figure 2.1: LBP process (R = 1, P = 8). (a) A gray level image, (b) neighbors’
values after thresholding, (c) Binary encoding and the corresponding decimal value.
The center gray value of the window is compared with all the neighboring gray
values (see Fig. 2.1(a)) and 1 or 0 is assigned to the corresponding neighbor based
on Eq. 2.1 (see Fig. 2.1(b)). Finally these bits are encoded into a binary code and
converted into the corresponding decimal code (see Fig. 2.1(c)).
After computing the LBP codes for all the pixels in the image, a histogram is built from these decimal codes to represent the texture image. To achieve rotation invariance, the minimum right-shifted binary pattern is used.
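As an illustration of the LBP computation described above, a minimal sketch in Python/NumPy is given below; the function names and the fixed neighbor ordering are our own assumptions, not an implementation from the thesis:

import numpy as np

def lbp_codes(img):
    # Offsets of the P = 8 neighbors at radius R = 1, in a fixed order.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    center = img[1:h-1, 1:w-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for p, (dy, dx) in enumerate(offsets):
        neighbor = img[1+dy:h-1+dy, 1+dx:w-1+dx]
        # S(g_p - g_c) of Eq. 2.1: 1 where the neighbor >= the center pixel.
        codes += (neighbor >= center).astype(np.uint8) << p
    return codes

def lbp_histogram(img):
    # Histogram of the decimal codes (Eq. 2.2) represents the texture image.
    return np.bincount(lbp_codes(img).ravel(), minlength=256)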
2.1.2 Gray Level Co-occurrence Matrix (GLCM)

In GLCM [8], image intensities are quantized into a fixed number of gray levels
and a co-occurrence matrix is formed by summing the co-occurrences of a specific
pair of gray levels. The process of GLCM can be divided into three steps. First,
each pixel value of a given gray image is quantized into one of G gray levels.
Then, using this gray level information a GLCM is formed. A GLCM is defined
for a given direction (θ) and distance (d). A vector with distance d and direction
angle θ connects image pixel I(x1, y1) to I(x2, y2) such that x2 = x1 + d cos(θ) and
y2 = y1 + d sin(θ). GLCMd,θ for distance d and direction angle θ is a G × G matrix
where each entry GLCMd,θ (i, j) shows the number of times that I(x1, y1) = i and
I(x2, y2) = j, where i and j are the gray levels at the corresponding locations.
Simply, GLCM counts the number of times a particular gray level pair co-occurs.
An example of the process of computing the GLCM is shown in Fig. 2.2.
Figure 2.2: GLCM computation process (d = 1, θ = 90°). (a) A gray level image,
(b) corresponding GLCM.
Usually, GLCM uses one of the eight directions (0°, ±45°, ±90°, ±135°, 180°).
Symmetric GLCM uses four directions instead of eight as diagonally opposite direc-
tions are symmetric. Finally, the GLCM is normalized to compute texture features.
The normalization can be done using Eq. 2.3,
GLCM^{\mathrm{norm}}_{d,\theta}(i, j) = \frac{GLCM_{d,\theta}(i, j)}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} GLCM_{d,\theta}(i, j)}.  (2.3)
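To make the three GLCM steps concrete, the following is a small illustrative sketch in Python/NumPy; the parameter names and the quantization rule are our assumptions, not the reference implementation:

import numpy as np

def glcm_normalized(img, G=32, d=1, theta_deg=90):
    # Step 1: quantize pixel values into G gray levels.
    q = np.floor(img.astype(float) / (img.max() + 1) * G).astype(int)
    dx = int(round(d * np.cos(np.deg2rad(theta_deg))))
    dy = int(round(d * np.sin(np.deg2rad(theta_deg))))
    # Step 2: count co-occurring gray-level pairs (i, j) at the given offset.
    M = np.zeros((G, G), dtype=float)
    h, w = q.shape
    for y1 in range(h):
        for x1 in range(w):
            y2, x2 = y1 + dy, x1 + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                M[q[y1, x1], q[y2, x2]] += 1
    # Step 3: normalize so the entries sum to one (Eq. 2.3).
    return M / M.sum()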
Twelve well-known features of GLCM are defined and used. The texture features are listed in Table 2.1. Here P is the normalized GLCM and G the number of gray levels. \mu_x, \mu_y, \sigma_x, and \sigma_y indicate the means and standard deviations of the row and column sums of P, P_{x+y}(k) = \sum_{i=1}^{G} \sum_{j=1}^{G} P(i, j)\big|_{i+j=k}, and P_{x-y}(k) = \sum_{i=1}^{G} \sum_{j=1}^{G} P(i, j)\big|_{|i-j|=k}.
Recently, GLCM has been used in medical imaging for the diagnosis of dementia [25], the study of pathological changes of the hippocampus in patients with Alzheimer's disease and mild cognitive impairment [26], and brain tumor detection [27]. A 3D variant of GLCM has been used for 3D texture analysis in Amyotrophic Lateral Sclerosis (ALS) [30]. The major limitation of GLCM is that it works directly with the intensity levels of gray scale images, which leads to unpredictable performance when the acquisition equipment or the scanning protocol changes.
2.1.3 The Run Length Matrix (RLM)
The RLM uses the gray level runs. A set of consecutive, co-linear pixels in an image
having the same gray level value is called a gray level run [10]. The length of the
run is defined by the number of pixels in the run. For a given run direction, the
run length matrix of an image can be calculated. Fig. 2.3 shows an example of
creating an RLM for the 0° run direction.
Figure 2.3: Generation of (b) RLM matrix from (a) a gray level image using the 0° run direction.
An RLM element (i, j) is the number of times gray level i appears in the image with run length j in a specific run direction. The number of run lengths depends on the given gray level image size, and the size of the RLM is equal to the number of run lengths × the number of gray levels of the image.
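A brief sketch of building the 0° RLM by scanning each row for maximal runs might look as follows (Python/NumPy; our own illustration, assuming a gray-level-quantized integer image with values below G):

import numpy as np

def rlm_0deg(img, G, max_run):
    # rlm[i, j-1] counts runs of gray level i having length j (0-degree runs).
    rlm = np.zeros((G, max_run), dtype=int)
    for row in img:
        run_val, run_len = row[0], 1
        for v in row[1:]:
            if v == run_val:
                run_len += 1
            else:
                rlm[run_val, min(run_len, max_run) - 1] += 1
                run_val, run_len = v, 1
        rlm[run_val, min(run_len, max_run) - 1] += 1  # close the last run in the row
    return rlm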
The numerical texture features are computed using some well-known functions
that are used in the Gray Level Co-occurrence Matrix (GLCM) [8] method for
feature calculation.
2.1.4 Gradient Orientation based Texture Methods

Gradient orientation based texture methods have become popular in recent years
for their robustness against image intensity changes, blurriness and deformations.
Moreover, gradient orientation based methods have better classification accuracy
than LBP-like methods [35]. This is because LBP-like methods merely count the
number of patterns around pixels and lack gradient orientation related information
[36].
Histograms of Oriented Gradients (HOG) [20] and Co-occurrence Histograms of Oriented Gradients (CoHOG) [21] are two such commonly used methods that have been applied to object detection [20], pedestrian detection [21], face recognition [24], fine-grained activity recognition [22], etc. We give a brief description of HOG and CoHOG below.
The HOG method uses a gradient oriented image as input. The gradient orienta-
tions are quantized into N bins. Then the image is subdivided into M number of
equal sub-regions. For each sub-region, a histogram of orientations is computed.
The histogram is formed by simply counting different groups of orientations. The
size of the histogram is N . For M sub-regions, there are M different histograms
each of size N . Finally, these histograms are concatenated to form the feature vec-
tor histogram of size M × N. An overview of the HOG calculation process is shown in Fig. 2.4 [20].
Figure 2.4: Basic HOG calculation process. (a) Gradient orientation of the image
pixels, (b) histogram of gradient orientation of each sub-region, (c) feature vector.
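As a hedged sketch of the HOG histogram construction just described (assuming an image whose pixels already hold quantized orientation bins; the grid size and names are ours):

import numpy as np

def hog_features(orient, n_bins=8, grid=(4, 4)):
    # 'orient' holds the quantized orientation bin (0..n_bins-1) of each pixel.
    h, w = orient.shape
    gh, gw = grid
    histograms = []
    for r in range(gh):
        for c in range(gw):
            sub = orient[r*h//gh:(r+1)*h//gh, c*w//gw:(c+1)*w//gw]
            # Histogram of orientations for this sub-region (size N).
            histograms.append(np.bincount(sub.ravel(), minlength=n_bins))
    # Concatenated feature vector of size M x N.
    return np.concatenate(histograms)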
The limitation of HOG is that it only counts the orientations for a local re-
gion. Inter-relationship information between orientations is not used. To overcome
this limitation, an improvement of HOG called the Co-occurrence HOG (CoHOG)
method is proposed.
CoHOG is an extension of HOG. It also uses the quantized gradients as input and
subdivides the image into a number of sub-regions. The CoHOG method uses a
circular neighborhood with a given radius in which each pixel with the center pixel
forms a pair called an offset. Now for each sub-region and for each offset, the
co-occurrences of an orientation pair are computed by scanning all the pixels in the
sub-region to form a co-occurrence matrix (CM). The size of the CM is N × N ,
where N is the number of distinct orientations. The total number of CMs for a
sub-region depends on the number of offsets. Finally, these CMs are concatenated
to form the histogram for the sub-region. Then, the histograms of all the sub-
regions are concatenated to form the CoHOG feature vector for the given image.
An overview of the CoHOG calculation process is shown in Fig. 2.5 [21].
Figure 2.5: Basic CoHOG calculation process. (a) Gradient orientation of the image
pixels, (b) Combination of of sub-regions and offsets for CM calculation, (c) CM
for each sub-region, and (d) feature vector.
The feature vector size depends on the number of orientations, the number
of offsets and the number of sub-regions. For example, if an image has M sub-
regions and K offsets with a CM size of N × N , then the final feature vector size
is M × K × (N × N ).
The CoHOG method has an advantage over HOG in preserving the inter-relationship among neighboring pixel orientations; moreover, by using different offsets, the co-occurrence matrices can better represent the local and global orientation information. However, the use of sub-regions limits the accuracy of the CM for boundary pixels, and thus some information is incomplete for each sub-region.
In this thesis, we present a modified CoHOG method for texture feature extrac-
tion of the whole image which can overcome the sub-region issue mentioned above
and can reduce the feature vector size.
Chapter 3
Proposed Methodology
3.1 Overview
In this chapter we discuss the proposed approaches of extracting texture features
using the CoHOG method.
The original CoHOG method subdivides the original image and, for each sub-region, calculates the co-occurrence matrices for all the offsets. Finally, all the co-occurrence matrices of each sub-region are combined to form the feature vector histogram. The histogram can be very large, depending on the number of sub-regions and the number of offsets. Image classes that differ only in small regions are almost identical in all other regions, and the features extracted from those regions are also similar. Using this large number of similar features creates ambiguity in defining the optimal hyperplane and leads to incorrect classification by a classifier.
In this thesis, based on the original CoHOG method, we propose three novel
texture feature extraction methods. Since one of the key components in CoHOG
is gradient calculation, three well-known operators, Sobel [31], Gaussian Derivative (GD) [23] and Local Frequency Descriptor Gradient (LFDG) [32], [33], are used to calculate the gradient orientations of the image pixels. In the first method, named S-CoHOG, we use Sobel operators for gradient calculation.
The second method, named GD-CoHOG, uses the GD operators, and the third method, named LFDG-CoHOG, uses the LFDG operators for gradient calculations.
The original CoHOG method uses sub-regions of an image and calculates the sum of the co-occurrences of orientation pairs. The use of sub-regions limits the accuracy of co-occurrence calculation for the boundary pixels, and thus some information is incomplete for each sub-region. While it is a simple idea, to the best of the author's knowledge, this is the first time CoHOG has been applied to the whole image for texture feature extraction, as in the three proposed methods.
The CoHOG features are extracted using two different neighborhood sizes. The
original CoHOG method uses a maximum neighborhood size of 4. We use a larger
neighborhood size of 8 to see the effect of using more distance information for
co-occurrence calculation on classification accuracy.
Finally, we select significant features using area under the ROC curve (AUC) analysis for classification. Only features that contain significant differences between the classes are selected using an AUC threshold.
An overview of the proposed approach is shown in Fig. 3.1. For all three proposed methods, the approach consists of four steps: pre-processing, texture feature extraction, feature selection and classification. These steps are discussed below.
3.2 Pre-processing
Texture features are extracted from pre-processed images. Pre-processing involves
gradient orientation (GO) calculation and quantization. The proposed S-CoHOG,
GD-CoHOG and LFDG-CoHOG methods use Sobel [31], Gaussian Derivative (GD)
[23] and Local Frequency Descriptor Gradient (LFDG) [32] operators for gradient
orientation calculation, respectively. The GO calculation and quantization steps
are discussed below.
3.2.1 Gradient Orientation and Quantization
The gradient orientations of image pixels are computed by convolving the gradient
operators with the image. Horizontal and vertical gradient operators are used
to calculate the corresponding gradient images and then gradient orientations are
calculated from the gradient images. The details of each of the three gradient
operators are discussed below.
The Sobel Operators

Sobel uses two 3 × 3 kernels to estimate the horizontal and vertical derivatives. The
two operators used in this method are shown in Eq. 3.1,
G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix},  (3.1)
where Gx and Gy are the corresponding horizontal and vertical gradient operators.
The GD Operators
The GD operators that use two basic one-dimensional derivative filters are given in
Eq. 3.2 [23], [37],
f_1(t) = \frac{-2t}{\sigma^2}\, e^{-t^2/\sigma^2}, \quad f_2(t) = e^{-t^2/\sigma^2},  (3.2)
where t is the width of the derivative filter and σ the standard deviation. These
one-dimensional derivatives are used to calculate the two horizontal and vertical
derivative filters as shown below [23], [37].
Basic filters Filter in x Filter in y
Gx f1 f2
Gy f2 f1
Here, f1 and f2 are two vectors defined in Eq. 3.2. For both Gx and Gy filters,
filter in x and filter in y are convolved with each column and each row of image I,
respectively, to form the corresponding gradient image.
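A possible sketch of this separable GD gradient computation is shown below (assuming SciPy's convolve1d; the filter half-width is our own choice for illustration):

import numpy as np
from scipy.ndimage import convolve1d

def gd_gradients(img, sigma=1.0, half_width=3):
    t = np.arange(-half_width, half_width + 1, dtype=float)
    f1 = (-2.0 * t / sigma**2) * np.exp(-t**2 / sigma**2)  # derivative filter (Eq. 3.2)
    f2 = np.exp(-t**2 / sigma**2)                          # smoothing filter (Eq. 3.2)
    img = img.astype(float)
    # Gx: f1 along x (columns) and f2 along y (rows), per the table above;
    # Gy: the two filters are swapped.
    gx = convolve1d(convolve1d(img, f1, axis=1), f2, axis=0)
    gy = convolve1d(convolve1d(img, f2, axis=1), f1, axis=0)
    return gx, gy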
The LFDG Operators

The LFDG operators can be calculated using the representation shown in Eq. 3.3 and Eq. 3.4 [32],
G_x = \sum_{k=1}^{p} f_k \cos\left(\frac{2\pi(k-1)}{p}\right),  (3.3)

G_y = \sum_{k=1}^{p} f_k \cos\left(\frac{\pi}{2} + \frac{2\pi(k-1)}{p}\right),  (3.4)
where p is the number of neighboring points and fk the corresponding gray level of
the kth neighbor.
The kernel size depends on the specified radius. For a kernel with radius R, the
LFDG operator has the kernel size of N ×N , where N = 2R+1. In our experiments,
we use R = 1 and 34 neighboring points to calculate the kernel operators.
For all of the operators discussed above, Gx and Gy are convolved with the
original image to compute the horizontal and vertical gradient images, respectively.
Gradient orientations are computed using Eq. 3.5,
GO = \arctan\left(\frac{G_y}{G_x}\right).  (3.5)
Finally, the orientations are quantized into 8 bins. In particular, the 0°–360° range of orientations is divided into eight bins of 45° each, and each pixel's orientation is assigned to the nearest bin. The orientation bins are shown in Fig. 3.2. The blue lines are the boundaries of the orientation bins.
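A small sketch of this pre-processing step for S-CoHOG (Sobel gradients, orientation via Eq. 3.5, and 8-bin quantization) in Python/NumPy/SciPy might look as follows; note that we use arctan2 to obtain the full 0°–360° range, which Eq. 3.5 leaves implicit, and we floor into 45° sectors rather than rounding to the nearest bin center:

import numpy as np
from scipy.ndimage import convolve

def quantized_orientation(img, n_bins=8):
    # Sobel kernels of Eq. 3.1.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
    gx = convolve(img.astype(float), kx)
    gy = convolve(img.astype(float), ky)
    # Orientation in [0, 360) degrees, then quantized into n_bins sectors.
    go = np.degrees(np.arctan2(gy, gx)) % 360.0
    return (go // (360.0 / n_bins)).astype(int) % n_bins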
3.3 Feature Extraction

3.3.1 Co-occurrence Matrix (CM) Calculation
Figure 3.3: (a) Offsets for different radii and (b) a specific offset at distance (x, y)
from pixel (p, q).
In Fig. 3.3(a), each offset pairs a center pixel with one neighboring pixel; the zero offset pairs the green pixel with itself. Increasing the neighborhood size increases the number of offsets and thus the number of CMs. The upper half of the circular neighborhood is not considered because those offsets are redundant: pixels are scanned starting from the top left corner of the image, so each such pair has already been counted.
In the proposed approach, we use the whole image for co-occurrence matrix
calculation instead of using sub-regions as in the original CoHOG method.
Neighborhood sizes of 4 and 8 are separately used for feature extraction. In
CoHOG, the co-occurrence matrix is obtained by summing the co-occurrences of
each pair of orientations for each offset. The size of the co-occurrence matrix
is N × N , where N is the number of distinct orientations which is pre-defined.
For a specific offset (x, y) and a specific orientation at pixel (p, q) = i and pixel
(p + x, q + y) = j, the equation for calculating the CM is shown in Eq. 3.6 [21],
CM_{x,y}(i, j) = \sum_{p=1}^{m} \sum_{q=1}^{n} \begin{cases} 1 & \text{if } I(p, q) = i \text{ and } I(p+x, q+y) = j \\ 0 & \text{otherwise,} \end{cases}  (3.6)

where I is the m × n image of quantized gradient orientations.
Fig. 3.4 shows the workflow of CoHOG. Using the gradient oriented image, which is quantized into four different orientations (0°, 90°, 180° and 270°), a co-occurrence matrix of size 4 × 4 is created for each offset within the specified radius. The method scans each pixel for all the offsets, sums the co-occurrences of the orientations for each offset, and stores the sums in the entry that corresponds to the pair of orientations of the specified co-occurrence matrix. After scanning all the pixels, all the co-occurrence matrices are complete. The algorithm
for computing the CMs is given in Algorithm 1.
Algorithm 1: Algorithm for CM calculation of the proposed methods
Given I : Gradient oriented Image;
initialize: CM ← 0;
for all positions (p, q) inside of the image do
i ← I(p, q);
for all offsets (x, y) corresponding to valid neighbors do
if (p + x, q + y) is inside of the image then
j ← I(p + x, q + y);
CM (i, j, x, y) ← CM (i, j, x, y) + 1;
end
end
end
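A direct Python/NumPy translation of Algorithm 1, including a simple half-disc offset rule, is sketched below. The exact offset masks used in the thesis (31 offsets for a neighborhood size of 4, 109 for 8) may differ from this simple rule, so the mask construction should be read as an assumption:

import numpy as np

def half_disc_offsets(radius):
    # Offsets (x, y) in the lower half of the disc of the given radius
    # (a simple rule; the thesis's 31/61/109-offset masks may differ).
    offs = []
    for y in range(radius + 1):
        for x in range(-radius, radius + 1):
            if x * x + y * y <= radius * radius and (y > 0 or x > 0):
                offs.append((x, y))
    return offs

def cohog_feature_vector(orient, radius, n_bins=8):
    h, w = orient.shape
    offsets = half_disc_offsets(radius)
    cms = np.zeros((len(offsets), n_bins, n_bins), dtype=int)
    for p in range(h):
        for q in range(w):
            i = orient[p, q]
            for k, (x, y) in enumerate(offsets):
                pp, qq = p + x, q + y
                if 0 <= pp < h and 0 <= qq < w:
                    cms[k, i, orient[pp, qq]] += 1  # Eq. 3.6
    # Concatenating all CMs gives the feature vector FV of Eq. 3.7.
    return cms.reshape(-1)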
3.3.2 Feature Vector Generation

Now all the created CMs are used to generate the feature vector for the selected
image. A feature vector (FV) is generated by simply concatenating the CMs as
shown in Eq. 3.7,
FV = \bigoplus_{i=1}^{O} \mathrm{vec}(CM_i),  (3.7)

where \oplus is the concatenation operator, O the number of offsets, and vec the vector representation of a CM. The FV is a histogram of the co-occurrences of orientations of different offsets in the image (see Fig. 3.4(c)).
The size of the feature vector depends on the number of offsets used and the size of the CM, as shown in Eq. 3.8,

\mathrm{FV\ size} = O \times (N \times N).  (3.8)
One can see that the size of the feature vector is very large if the number of
offsets is large. For example, with a neighborhood size of 4, the total number
of offsets is 31. Then the CM size is 8×8 = 64, and the FV size = 31×64 =
1984. With the same CM size, if the radius is increased to 6 with 61 offsets and
8 with 109 offsets then FV size = 3904 and FV size = 6976, respectively. When
the FV size is large, it is difficult to distinguish between two classes when the changes occur in only a small portion of the images and the features of all other portions are similar. Therefore, it is important to select significant features that are extracted from the changed portion of the images. These selected features have significant differences between the classes and are used for classification.
3.4 Feature Selection

Our experimental results in Chapter 4 demonstrate that better classification accuracy is obtained using the ROC feature selection approach.
In this thesis, feature selection is performed using ROC analysis. It is noteworthy that we are the first to use a feature selection method to extract significant CoHOG texture features to further improve classification accuracy. Significant features are selected using area under the ROC curve (AUC) analysis for classification. Only features that contain significant differences between the classes are selected using an AUC threshold.
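A hedged sketch of this per-feature AUC selection is given below (assuming scikit-learn's roc_auc_score; whether AUC values below 0.5 are folded upward is our assumption):

import numpy as np
from sklearn.metrics import roc_auc_score

def select_features(X, y, auc_threshold=0.8):
    # X: (n_samples, n_features) feature matrix; y: binary class labels.
    aucs = np.array([roc_auc_score(y, X[:, j]) for j in range(X.shape[1])])
    # A feature is equally discriminative whichever class scores higher,
    # so AUC values below 0.5 are folded upward (our assumption).
    aucs = np.maximum(aucs, 1.0 - aucs)
    selected = np.flatnonzero(aucs >= auc_threshold)
    return selected, X[:, selected]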
3.5 Classification
For classification we use a linear support vector machine (SVM) [34]. A two-stage process of training and testing is used.
For a two class classification, the SVM computes the optimal hyperplane to
partition the feature space of the training samples into two halves. Samples from
both classes are used for training. Each training sample consists of a feature vector
and a label of its class.
Finally, the trained SVM is used to predict the class of a test sample using Eq.
3.9,
\mathrm{class}(\vec{x}_t) = \mathrm{Sgn}\left\{\sum_{\forall k,\, l_k \in (p,c)} y(l_k)\, \alpha_k\, K(\vec{x}_t, \vec{x}_k) + b\right\},  (3.9)
where class(\vec{x}_t) is the class label of the test sample \vec{x}_t, and \vec{x}_k is the feature vector of the kth training sample. y(l_k) is the class label function of the kth sample, which is either +1 or −1; \alpha_k is the Lagrangian multiplier for training sample k, K the kernel function, and b the bias parameter of the optimal hyperplane of the SVM.
A kernel function maps data into a higher dimensional space in the hope that the data can be better separated there; a linear kernel simply uses the dot product of two vectors.
The classification is performed using the LIBSVM [42] version 3.20 package. The SVM classifier is trained with a random selection of half of the dataset from each class and, using the trained model, the classification accuracy is tested on the rest of the samples. The average classification accuracy is recorded over 1000 runs to reduce the effect of randomness.
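As an illustrative reimplementation of this protocol (the thesis uses LIBSVM 3.20; here we substitute scikit-learn's LinearSVC, a plain linear-kernel SVM, for the sketch):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def mean_accuracy(X, y, runs=1000):
    accs = []
    for seed in range(runs):
        # Random half split, stratified by class, as in the protocol above.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.5, stratify=y, random_state=seed)
        clf = LinearSVC().fit(X_tr, y_tr)  # linear-kernel SVM
        accs.append(clf.score(X_te, y_te))
    return float(np.mean(accs))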
Chapter 4
Experimental Results
In this chapter, we discuss the results of the proposed methods on different datasets. The proposed methods are implemented in Matlab. The programs run on a PC with an Intel Core i7 3.40GHz CPU and 24GB RAM running Windows 7 Professional. Three well-known texture datasets are used in this experiment to compare the classification performance of the proposed methods to other state-of-the-art methods.
We also use another three datasets consisting of 2D MR images of ALS pa-
tients and healthy controls for classification. We compare the results of these MRI
datasets with that of the GLCM method that has been used for texture classifi-
cation in ALS. Another multicenter dataset in ALS of different image contrasts is
also used to evaluate the classification performance of the proposed methods on
datasets having various image resolutions and contrasts.
4.1 Texture Datasets
4.1.1 INRIA Person Dataset
The INRIA Person [20] dataset is a widely used pedestrian detection benchmark
dataset. The dataset contains two classes, human and nonhuman, with images of various sizes. We use 200 images per class, and the human and nonhuman images are divided into equal halves for training and testing. Three image samples of each class of
this dataset are shown in Fig. 4.1.
Figure 4.1: Three sample images of (a) human and (b) nonhuman classes from the
INRIA Person dataset.
4.1.2 CUReT Dataset

The second texture dataset is the CUReT dataset. We use 10 texture classes, each containing 55 image samples of materials photographed under a range of viewing and illumination angles. Three sample
images in class 1, 3 and 5 are shown in Fig. 4.2.
Figure 4.2: Three sample images of (a) class 1 (b) class 3 and (c) class 5 from the
CUReT dataset.
4.1.3 UIUC Dataset

The third texture dataset we use is UIUC [44]. In the UIUC dataset, we use 10
different texture classes with each class containing 40 image samples. All the classes
have the same image resolution of 640 × 480. The dataset includes materials imaged under significant viewpoint variations. Three sample images in class 1, 3 and 5 are
shown in Fig. 4.3.
Figure 4.3: Three sample images of (a) class 1 (b) class 3 and (c) class 5 from the
UIUC dataset.
4.2 Human MRI Datasets

4.2.1 Amyotrophic Lateral Sclerosis (ALS)

The diagnosis of ALS is based on clinical evidence of upper motor neuron (UMN) and lower motor neuron (LMN) involvement.
In this experiment, we use specified coronal slices of the MRI scans of the whole brain. A region of interest (ROI) is then selected for texture feature extraction. We also use different downsampled versions of the same image slice in the experiment. The details of ALS, ROI selection and downsampling of the subjects are discussed below.
Objective imaging biomarkers of ALS are needed, in particular to evaluate novel therapies.
4.2.2 ROI Selection
From the MRI scan of the whole brain, coronal slices with an angulation parallel
to the corticospinal tract (CST) (see Fig. 4.4 (a)) are used for texture calculation
(see Fig. 4.4 (c)). The image angulation is performed using Mango [50].
Figure 4.4: (a) Sagittal, (b) Axial and (c) Coronal image slices. Coronal imaging
is used in texture feature extraction.
In particular, an ROI is manually defined that includes the region above the
inferior horn of the lateral ventricles (see Fig. 4.5) and is specified by creating a
mask to segment out the regions of interest. Masks for each subject are created
separately using ITK-SNAP [51].
4.2.3 Downsampling
Figure 4.5: ROI selection from a coronal image slice. The highlighted regions are
selected as ROI.
4.2.4 MR Dataset 1
Twelve patients and nineteen controls are in this dataset. Details of the patients
and controls for this dataset are given in Table 4.1.
4.2.5 MR Dataset 2
Nineteen patients and twenty controls are in this dataset. Details of the patients
and controls are given in Table 4.3.
MRI scans were done on a 1.5 Tesla system (Magnetom Sonata, Siemens Medical Systems). Three sample image slices of
Figure 4.6: Three sample image slices of (a) controls and (b) patients from MR
Dataset 1. Patients and controls are not distinguishable by visual inspection.
both patients and controls are shown in Fig. 4.7. We can see that patients and controls are not distinguishable by visual inspection.
Figure 4.7: Three sample image slices of (a) controls and (b) patients from MR
Dataset 2. Patients and controls are not distinguishable by visual inspection.
MR dataset 2 images are downsampled into four different resolutions. Details of the image resolutions for each scale are given in Table 4.4.
4.2.6 MR Dataset 3
All the subjects are the same for MR dataset 2 and MR dataset 3 (see Table
4.3). However, MR dataset 3 was acquired with a T1-weighted MPRAGE sequence (TR = 1600 ms, TE = 3.8 ms, TI = 1100 ms, pixel size = 1.0 × 1.0 mm², slice thickness = 1.5 mm). MRI scanning was performed on a 1.5 Tesla system (Magnetom Sonata, Siemens Medical Systems). Three sample image slices of both patients and controls are shown in Fig. 4.8. We can see that patients and controls are not distinguishable by visual inspection. The resolutions of the downsampled images are given in Table 4.2.
Figure 4.8: Three sample image slices of (a) controls and (b) patients from MR
Dataset 3. Patients and controls are not distinguishable by visual inspection.
For all the MR datasets, coronal imaging was performed with an angulation
parallel to the CST (see Fig. 4.4). ROI was manually selected for each subject that
covers the region above the inferior horn of the lateral ventricles. One sample ROI
is shown in Fig. 4.5.

4.3 Classification Results of Texture Datasets

4.3.1 Classification of INRIA Person Dataset

The INRIA database contains two classes of images, namely, human and nonhuman. We use the proposed methods for feature extraction, and significant features are selected using ROC analysis with AUC thresholds of 0.95 and 0.99 for neighborhood sizes of 4 and 8,
respectively. These thresholds are chosen to minimize the number of features while ensuring that the selected features produce better classification results. For classification,
we use half of the images from each class for training and the remaining images
from both classes for testing. The classification accuracies with and without feature
selection for two neighborhood sizes are shown in Table 4.5.
Table 4.5: Classification accuracy of the proposed methods using the INRIA Person dataset. I means classification without feature selection and II means classification with feature selection.

                   Neighborhood Size of 4         Neighborhood Size of 8
Proposed Method    Accuracy (I)   Accuracy (II)   Accuracy (I)   Accuracy (II)
S-CoHOG 99.00% 99.30% 98.90% 99.90%
GD-CoHOG 99.40% 99.60% 99.00% 99.90%
LFDG-CoHOG 98.70% 99.30% 99.00% 99.50%
The classification results for this dataset are almost 100% for all the proposed methods. There is a small improvement in most cases in the classification accuracy with feature selection over without it. Using a larger neighborhood size also has little impact on the classification accuracy. The S-CoHOG and GD-CoHOG methods achieve a maximum classification accuracy of 99.90% with a neighborhood size of 8 and feature selection.
Table 4.6: Number of features selected using the AUC threshold for the INRIA Person dataset using two neighborhood sizes.

                   Neighborhood Size of 4              Neighborhood Size of 8
Proposed Method    AUC Threshold   Selected Features   AUC Threshold   Selected Features
S-CoHOG 0.95 193 0.99 1403
GD-CoHOG 0.95 807 0.99 1080
LFDG-CoHOG 0.95 655 0.99 1245
The number of features selected using an AUC threshold for each method is shown in Table 4.6. The number of selected features is much smaller than the total number of features for both neighborhood sizes. The proposed methods achieve better classification accuracy using the selected features than using all the features.
4.3.2 Classification of CUReT Dataset
The classification results of the proposed methods are calculated using 10 classes of the CUReT dataset. Each class contains 55 images; half of the images in each class are used to train the classifier and the rest of the images in each class are used for testing. A two class classification is performed among the 10 classes and the average classification accuracy is recorded. Feature selection is performed using an AUC threshold of 0.70 and 0.80 for a neighborhood size of 4 and 8, respectively.
These thresholds are chosen to minimize the number of features such that selected
features can produce better classification results. The classification accuracies with
and without feature selection for two neighborhood sizes are shown in Table 4.7.
Table 4.7: Classification accuracy of the proposed methods using the CUReT dataset. I means classification without feature selection and II means classification with feature selection.

                   Neighborhood Size of 4         Neighborhood Size of 8
Proposed Method    Accuracy (I)   Accuracy (II)   Accuracy (I)   Accuracy (II)
S-CoHOG 96.70% 96.60% 96.30% 97.80%
GD-CoHOG 96.80% 97.40% 96.50% 98.30%
LFDG-CoHOG 96.80% 97.10% 96.40% 97.60%
From the classification results, we observe that the proposed methods have higher classification accuracy with feature selection than without it when the neighborhood size is 8. For a neighborhood size of 4, S-CoHOG is an exception: it has slightly lower classification accuracy with feature selection. Without feature selection, the classification accuracies are almost the same for all the methods and both neighborhood sizes, but with feature selection the proposed methods have better classification accuracy for a neighborhood size of 8. The GD-CoHOG method achieves a maximum classification accuracy of 98.30% for a neighborhood size of 8 with feature selection.
The number of features selected using an AUC threshold for each of the methods is shown in Table 4.8.
Table 4.8: Number of features selected using the AUC threshold for the CUReT dataset using two neighborhood sizes.

                   Neighborhood Size of 4              Neighborhood Size of 8
Proposed Method    AUC Threshold   Selected Features   AUC Threshold   Selected Features
S-CoHOG 0.70 343 0.80 235
GD-CoHOG 0.70 371 0.80 403
LFDG-CoHOG 0.70 378 0.80 563
The number of selected features is much smaller than the total number of features (1984 for a neighborhood size of 4 and 6976 for a neighborhood size of 8) for both neighborhood sizes. The proposed methods achieve better classification accuracy using the selected features than using all the features.
4.3.3 Classification of the UIUC Dataset

In this dataset, we also use 10 different classes of images, each of which contains 40 images. ROC feature selection is performed with an AUC threshold of
0.80 and 0.90 for a neighborhood size of 4 and 8, respectively. These thresholds are
chosen to minimize the number of features such that selected features can produce
better classification results. Half of the images in each class are used for training
and the remaining images in each class are used for testing. A two class classifi-
cation is performed among the 10 classes and the average classification accuracy is
recorded. The classification accuracies with and without feature selection for two
neighborhood sizes are shown in Table 4.9.
Table 4.9: Classification accuracy of the proposed methods using the UIUC dataset. I means classification without feature selection and II means classification with feature selection.

                   Neighborhood Size of 4         Neighborhood Size of 8
Proposed Method    Accuracy (I)   Accuracy (II)   Accuracy (I)   Accuracy (II)
S-CoHOG 95.30% 95.20% 95.60% 97.00%
GD-CoHOG 93.70% 96.60% 95.00% 98.00%
LFDG-CoHOG 95.00% 95.60% 95.31% 97.50%
For this dataset, we observe that the proposed methods with feature selection have similar or better classification results than without feature selection for both neighborhood sizes. For a neighborhood size of 4, the S-CoHOG method is an exception: it has slightly lower classification accuracy with feature selection than without. Using a larger neighborhood produces better results than using a smaller one. The GD-CoHOG method achieves a maximum classification accuracy of 98.00% for a neighborhood size of 8 with feature selection.
Table 4.10: Number of features selected using the AUC threshold for the UIUC dataset using two neighborhood sizes.

                   Neighborhood Size of 4              Neighborhood Size of 8
Proposed Method    AUC Threshold   Selected Features   AUC Threshold   Selected Features
S-CoHOG 0.80 436 0.90 1028
GD-CoHOG 0.80 188 0.90 743
LFDG-CoHOG 0.80 395 0.90 895
The AUC thresholds used for feature selection and number of selected features
for the proposed methods using this dataset are shown in Table 4.10.
4.3.4 Comparison with other CoHOG methods

We compare the classification results of the proposed methods with those of the original CoHOG method [21] and the Eig(Hess)-CoHOG method [23] using the INRIA Person, CUReT and UIUC texture datasets.
The original CoHOG method uses 6 sub-regions and a neighborhood size of 4; Sobel operators are used for gradient calculations. The Eig(Hess)-CoHOG method uses the Hessian matrix to calculate the eigenvalues of the image surface, which are used for pixel orientation calculation. This method uses 4 sub-regions with a neighborhood size of 4. The comparison results are shown in Table 4.11.
Here we compare the results obtained using feature selection and a neighborhood size of 8 for the proposed methods with those of the original CoHOG and Eig(Hess)-CoHOG methods.
Table 4.11: Comparison of the classification accuracies (CA) of the proposed methods with the original CoHOG and the Eig(Hess)-CoHOG methods.

Method              INRIA Dataset CA   CUReT Dataset CA   UIUC Dataset CA
Original CoHOG 95.5% 94.94% 77.41%
Eig(Hess)-CoHOG - 90.00% 91.66%
S-CoHOG 99.90% 97.80% 97.00%
GD-CoHOG 99.90% 98.30% 98.00%
LFDG-CoHOG 99.50% 97.60% 97.50%
The original CoHOG method uses images normalized to zero mean and unit standard deviation. We do not use any normalization of the image dataset for the proposed methods. It is noteworthy that the results of the original CoHOG method are worse for images without normalization. For the CUReT and UIUC datasets, GD-CoHOG has the best classification accuracies. S-CoHOG and GD-CoHOG have the best results for the INRIA Person dataset. The Eig(Hess)-CoHOG method has better classification results than the original CoHOG method, but worse than those of our proposed methods.
4.4 Classification Results of MRI Datasets

4.4.1 ROC Analysis of MR Dataset 1

Six patients and ten controls are used to train the linear SVM classifier and the
rest of the patients and controls are used for testing in this dataset. The maximum
AUC is calculated for the selected features and then the classification accuracy is
calculated using the selected features. The results are shown in Tables 4.12 and 4.13.
Table 4.12: ROC analysis of MR Dataset 1 using S-CoHOG features extracted for
neighborhood radius of 4.
Image Pixel      Maximum   Optimal       Optimal       Classification
Size (mm²)       AUC       Sensitivity   Specificity   Accuracy
0.5 × 0.5 0.815 54% 67% 63.00%
1×1 0.895 57% 73% 67.30%
2×2 0.886 81% 84% 83.50%
3×3 0.895 81% 91% 87.30%
4×4 0.842 71% 84% 79.00%
Table 4.13: ROC analysis of MR Dataset 1 using S-CoHOG features extracted for
neighborhood radius of 8.
Image Pixel      Maximum   Optimal       Optimal       Classification
Size (mm²)       AUC       Sensitivity   Specificity   Accuracy
0.5 × 0.5 0.831 50% 61% 57.00%
1×1 0.895 57% 74% 67.70%
2×2 0.906 74% 83% 79.30%
3×3 0.917 91% 95% 93.00%
4×4 0.921 90% 89% 90.30%
Four different downsampled images along with the original image are used in this experiment. The results show that features extracted (using both neighborhood sizes of 4 and 8) from downsampled images (image pixel size = 3×3 mm²) have better classification accuracy, with a higher maximum AUC, than those using the original image resolution. In particular, the best classification accuracy (93.00%), the maximum AUC (0.917) and the optimal sensitivity (91%) and specificity (95%) are obtained using features extracted with a neighborhood size of 8.
4.4.2 ROC Analysis of MR Dataset 2

In this dataset, we use 19 patients and 20 controls for classification and ROC analysis. Ten patients and 10 controls are used for training the linear SVM and
the other 9 patients and 10 controls are used for testing. The classification results
for different downsampled images with two neighborhood sizes are shown in Table
4.14 (four neighbors) and Table 4.15 (eight neighbors).
Table 4.14: ROC analysis of MR Dataset 2 using S-CoHOG features extracted for
neighborhood radius of 4.
Image Pixel      Maximum   Optimal       Optimal       Classification
Size (mm²)       AUC       Sensitivity   Specificity   Accuracy
0.86 × 0.86 0.783 48% 57% 53.30%
1×1 0.791 54% 59% 56.70%
2×2 0.810 63% 78% 71.00%
3×3 0.834 75% 78% 76.90%
4×4 0.856 84% 86% 85.30%
Table 4.15: ROC analysis of MR Dataset 2 using S-CoHOG features extracted for
neighborhood radius of 8.
Image Pixel      Maximum   Optimal       Optimal       Classification
Size (mm²)       AUC       Sensitivity   Specificity   Accuracy
0.86 × 0.86 0.834 55% 60% 57.70%
1×1 0.855 60% 61% 61.10%
2×2 0.850 66% 78% 72.70%
3×3 0.834 77% 82% 80.00%
4×4 0.867 92% 88% 90.40%
We observe from the results that downsampling increases the classification accuracy along with sensitivity and specificity. Here we found the best classification accuracy (90.40%), the maximum AUC (0.867) and the best optimal sensitivity (92%) and specificity (88%) in downsampled images (image pixel size = 4×4 mm²) with a neighborhood size of 8.
4.4.3 ROC Analysis of MR Dataset 3
Table 4.16: ROC analysis of MR Dataset 3 using S-CoHOG features extracted for
neighborhood radius of 4.
Image Pixel      Maximum   Optimal       Optimal       Classification
Size (mm²)       AUC       Sensitivity   Specificity   Accuracy
0.5 × 0.5 0.641 40% 35% 37.50%
1×1 0.753 53% 45% 49.30%
2×2 0.811 63% 65% 64.50%
3×3 0.818 68% 75% 72.70%
4×4 0.869 81% 75% 78.10%
Table 4.17: ROC analysis of MR Dataset 3 using S-CoHOG features extracted for
neighborhood radius of 8.
Image Pixel      Maximum   Optimal       Optimal       Classification
Size (mm²)       AUC       Sensitivity   Specificity   Accuracy
0.5 × 0.5 0.659 31% 37% 34.20%
1×1 0.780 43% 39% 41.50%
2×2 0.821 75% 74% 74.20%
3×3 0.845 81% 83% 82.00%
4×4 0.895 94% 92% 93.50%
We found the best results at a lower resolution for this dataset as well. In this case, downsampled images (image pixel size = 4×4 mm²) have the best results. We observed the best classification accuracy (93.50%), the maximum AUC (0.895) and the best optimal sensitivity (94%) and specificity (92%) when the neighborhood size is 8.
For all three of the MR dataset results above, we apply the S-CoHOG method for texture feature extraction. MR Datasets 1 and 2 are both T2-weighted but were collected with different scanner parameters. MR dataset 3 is a T1-weighted dataset, which differs from the other two. We select the features with AUC ≥ 0.8 for all of the above results because we observe that the best classification results are obtained using AUC ≥ 0.8. Fig. 4.9 shows the classification accuracy for the selected features using different AUC thresholds.
MR Dataset 1 and two neighborhood sizes are used in this experiment. For both
neighborhoods, we found the best classification results at AUC ≥ 0.8. Mean feature
values and standard deviations of mean feature values of ten selected features of
Figure 4.9: Classification accuracy for selected features using different AUC thresh-
olds for MR dataset 1.
patients and controls using AUC ≥ 0.8 for MR dataset 1 and MR dataset 2 are shown in Fig. 4.10. There is a significant difference between the patients and the controls in the mean feature values, and their standard deviations do not overlap.
Figure 4.10: Mean feature values with standard deviations of the mean feature
values between patients and controls of ten selected features for (a) MR Dataset 1
and (b) MR Dataset 2.
We observe that images with lower resolution give better classification results. For a fixed neighborhood size, a downsampled image packs the same anatomy into fewer pixels, so each co-occurrence neighborhood covers a larger physical region. CoHOG can therefore capture more spatial information from a downsampled image than from the original image, and the extracted features encode longer-range relationships. For MR dataset 1, the best results are obtained using downsampled images with a pixel size of 3×3 mm², while a pixel size of 4×4 mm² gives the best results for MR datasets 2 and 3. For all of the MR datasets, the best results are observed using a neighborhood size of 8. Therefore, the rest of the experimental analysis uses only downsampled images with a pixel size of 3×3 mm² for MR dataset 1 and of 4×4 mm² for MR datasets 2 and 3, with a neighborhood size of 8.
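As an illustration (assumed, not the thesis code), a 2D slice can be resampled to a coarser physical pixel size with a simple interpolation step:

    # Resample a 2D slice so each output pixel spans target_mm x target_mm.
    from scipy.ndimage import zoom

    def downsample_to_pixel_size(img, orig_mm, target_mm):
        factor = orig_mm / target_mm      # e.g. 0.86 / 4.0 shrinks the image
        return zoom(img, zoom=factor, order=1)  # first-order (linear) interpolation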
Based on three different gradient operators, Sobel, Gaussian Derivative (GD) and Local Frequency Descriptor Gradient (LFDG), our three proposed methods, S-CoHOG, GD-CoHOG and LFDG-CoHOG, are applied separately with the ROC analysis to evaluate the classification accuracy on MR datasets 1 and 2. The classification accuracy along with the optimal sensitivity and specificity for MR dataset 1 are shown in Table 4.18.
Table 4.18: ROC analysis of MR Dataset 1 using the three proposed methods. CoHOG features are extracted using a neighborhood radius of 8.

Method       Maximum AUC   Optimal Sensitivity   Optimal Specificity   Classification Accuracy
S-CoHOG      0.917         91%                   95%                   93.00%
GD-CoHOG     0.954         98%                   96%                   97.30%
LFDG-CoHOG   0.897         88%                   98%                   94.75%
All three proposed methods have a high classification accuracy with excellent optimal sensitivity and specificity. Among them, the GD-CoHOG method has the highest classification accuracy (97.30%), optimal sensitivity (98%) and specificity (96%). GD-CoHOG also has the highest maximum AUC (0.954) among the three operators.
Table 4.19: ROC analysis of MR Dataset 2 using the three proposed methods. CoHOG features are extracted using a neighborhood radius of 8.

Method       Maximum AUC   Optimal Sensitivity   Optimal Specificity   Classification Accuracy
S-CoHOG      0.867         92%                   88%                   90.40%
GD-CoHOG     0.918         91%                   93%                   92.30%
LFDG-CoHOG   0.864         87%                   95%                   91.00%
For MR dataset 2, the results are shown in Table 4.19. In this case, the GD-CoHOG method again performs better than the other two methods, achieving the highest classification accuracy of 92.30% along with the highest optimal sensitivity of 91% and specificity of 93%.
Figure 4.11: Region based analysis of the subjects of MR dataset 1. Significant regions are marked by colored boxes, along with the classification accuracy of the corresponding boxes.
We compare the best results of our proposed methods with those of the well-known GLCM method, which has been used in many medical image analysis applications [25], [26], [27]. We implemented the GLCM method (gray levels = 32, neighbor distance = 1, neighbor direction = 0°) in the same environment for MR datasets 1 and 2. In total, 22 features are calculated using well-known GLCM feature functions (see Table 2.1). Among them, only 3 features, namely angular second moment, entropy and sum entropy, are selected by the ROC feature selection with an AUC threshold. The results are shown in Table 4.20 and Table 4.21 for MR datasets 1 and 2, respectively.
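For illustration, a rough re-creation of these GLCM settings using scikit-image's graycomatrix is sketched below. It is an assumed implementation, not the one used in the thesis, and it computes the three selected features.

    # Assumed sketch: 32 gray levels, neighbor distance 1, direction 0 degrees.
    import numpy as np
    from skimage.feature import graycomatrix

    def glcm_selected_features(img):
        # Quantize the image to 32 gray levels (0..31).
        q = np.floor(img.astype(float) / (img.max() + 1e-9) * 31).astype(np.uint8)
        glcm = graycomatrix(q, distances=[1], angles=[0], levels=32,
                            symmetric=True, normed=True)
        p = glcm[:, :, 0, 0]
        nz = p[p > 0]
        asm = float(np.sum(p ** 2))                 # angular second moment
        entropy = float(-np.sum(nz * np.log2(nz)))  # GLCM entropy
        # Sum entropy over the i+j marginal distribution.
        idx = np.add.outer(np.arange(32), np.arange(32))
        p_sum = np.bincount(idx.ravel(), weights=p.ravel())
        nzs = p_sum[p_sum > 0]
        sum_entropy = float(-np.sum(nzs * np.log2(nzs)))
        return {"ASM": asm, "entropy": entropy, "sum_entropy": sum_entropy}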
Table 4.20: Comparison of ROC analysis between the GD-CoHOG and GLCM methods using MR dataset 1.

Method     Maximum AUC   Optimal Sensitivity   Optimal Specificity   Classification Accuracy
GD-CoHOG   0.954         98%                   96%                   97.30%
GLCM       0.601         4%                    95%                   58.60%
The best results are found in downsampled images with a pixel size of 3×3 mm² for MR dataset 1 and of 4×4 mm² for MR dataset 2. The proposed GD-CoHOG method outperforms the GLCM method for both MR datasets.

Table 4.21: Comparison of ROC analysis between the GD-CoHOG and GLCM methods using MR dataset 2.

Method     Maximum AUC   Optimal Sensitivity   Optimal Specificity   Classification Accuracy
GD-CoHOG   0.918         91%                   93%                   92.30%
GLCM       0.805         68%                   76%                   72.30%

For MR dataset 1, we observe that GLCM performs very poorly: it has a high specificity but a very low sensitivity, and its overall classification accuracy is far below that of GD-CoHOG. On MR dataset 2, GLCM performs better than on MR dataset 1, but is still worse than GD-CoHOG.
MR dataset 1 was acquired using a high resolution 4.7 Tesla MRI system, whereas MR dataset 2 was acquired using a relatively low resolution 1.5 Tesla MRI system. The comparison thus suggests that GLCM performs worse on high resolution images than on low resolution images, because the intensity levels that GLCM uses for its features are sensitive to such acquisition differences. This finding is consistent with the observation that the proposed gradient-based methods have very similar performance on both MR datasets.
Moreover, we compare the results of the proposed methods with those of 3D GLCM, which uses 3D texture analysis in ALS [30]. We compare the sensitivity and specificity in the CST using MR dataset 2. 3D GLCM achieves a sensitivity of 90% and a specificity of 95%, whereas our proposed 2D GD-CoHOG method achieves a comparable sensitivity of 91% and specificity of 93%.
In this section, we analyze the effect of selecting the wrong slice on the classification accuracy, since manual selection may pick the wrong slice. In this experiment, we randomly choose a slice from each subject to see how it affects the results.
MR dataset 1 is used in this experiment. We use five slices of each subject: the manually selected slice along with the two immediately adjacent slices on each side of it. Texture features are calculated for one randomly chosen slice of the five for each subject. The experiment is done using S-CoHOG with a neighborhood size of 8.
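A minimal sketch of this random slice choice, with hypothetical slice indexing, is:

    # Pick uniformly among the manual slice and its two neighbors on each side.
    import random

    def pick_random_slice(manual_index):
        return manual_index + random.randint(-2, 2)  # five possible slices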
Figure 4.12: Classification accuracies for 10 random slice selection experiments and
the mean and the standard deviation of the classification accuracies.
Scanner-related variations, such as field non-uniformity, may lead to tissue classification or image registration differences that could reduce or wholly offset the enhanced statistical power of multicenter data
[53]. Therefore, using multicenter data for classification is still a major challenge, due to the use of different scanning parameters as well as the inherent differences in image characteristics arising from the different machines used in different centers. Several works have reported such differences. A multicenter Voxel-Based Morphometry (VBM) study scanned the same subjects on different scanners and found differences in the spatial patterns of the results between scanners [54]. Another VBM based multicenter MRI analysis studied the reliability of detecting group differences and estimating heritability when MRI scans are acquired on different scanners running different acquisition protocols [53]. A further multicenter study included subjects from three different countries to investigate gray matter changes associated with reading disability; its VBM analysis showed significant group differences [55].
In this experiment, we use multicenter data for classification with our proposed methods. We use data from five different centers; the T1-weighted MRI scans of the subjects are acquired at the different centers with different MRI acquisition parameters. A sample image of a patient and a control for each center is shown in Fig. 4.13, and the details of the parameters are given in Table 4.22.
Data from centers C1, C2 and C3 are acquired using a 3 Tesla GE Medical Systems scanner, and data from centers C4 and C5 are acquired using a 3 Tesla Siemens Medical Systems scanner.

Figure 4.13: Sample image slices of (a) controls and (b) patients of each center from the multicenter dataset. Patients and controls are not distinguishable by visual inspection.

As data from these two groups are acquired using two different scanners, we combine the subjects of C1, C2 and C3 to form multicenter (MC) dataset 1. Similarly, MC dataset 2 is formed using the subjects of centers C4 and C5. These datasets are formed to study scanner specific classification on multicenter data.
The details of the two MC datasets are shown in Table 4.23. MC dataset 1 contains 10 patients and 12 controls, and MC dataset 2 contains 19 patients and 13 controls. The selected 2D MR images of the subjects are downsampled to an image pixel size of 3×3 mm² for both MC dataset 1 and MC dataset 2. We also form another dataset, called the MC Full dataset, containing all the subjects from all the centers (see Table 4.23). This dataset contains 29 patients and 25 controls, and all of its images are likewise downsampled to a 3×3 mm² physical resolution.
Table 4.24: ROC analysis of the MC datasets using the proposed methods with a neighborhood radius of 8, and GLCM.

Dataset           Method       Maximum AUC   Optimal Sensitivity   Optimal Specificity   Classification Accuracy
MC Dataset 1      S-CoHOG      0.950         81%                   99%                   91.00%
MC Dataset 1      LFDG-CoHOG   0.929         87%                   98%                   93.30%
MC Dataset 1      GD-CoHOG     0.912         75%                   90%                   83.70%
MC Dataset 1      GLCM         0.758         70%                   67%                   68.80%
MC Dataset 2      S-CoHOG      0.807         87%                   87%                   87.00%
MC Dataset 2      LFDG-CoHOG   0.802         88%                   72%                   81.80%
MC Dataset 2      GD-CoHOG     0.866         83%                   84%                   83.50%
MC Dataset 2      GLCM         0.729         42%                   75%                   61.70%
MC Full Dataset   S-CoHOG      0.846         86%                   83%                   85.10%
MC Full Dataset   LFDG-CoHOG   0.821         81%                   78%                   80.10%
MC Full Dataset   GD-CoHOG     0.817         80%                   77%                   78.60%
MC Full Dataset   GLCM         0.715         52%                   66%                   59.30%
For all of the datasets, texture features are extracted using the proposed methods with a neighborhood size of 8. The classification accuracy along with the sensitivity and specificity are shown in Table 4.24 for the three datasets. The results for MC dataset 1 are higher than those for MC dataset 2. LFDG-CoHOG achieves the best classification accuracy of 93.30% for MC dataset 1, while the best classification accuracy on the MC Full dataset is achieved by the S-CoHOG method. The classification accuracy, sensitivity and specificity on these datasets are comparable to those obtained on the datasets in Section 4.2. Even though these datasets have variations in intensity and contrast, the proposed methods can still differentiate between the patients and controls. We also compare our results with those of the GLCM method: for all the MC datasets, the proposed methods have much better classification accuracies than GLCM, as shown in Table 4.24.
In this experiment, data from one center is used for training and data from another center is used for testing. Texture features are extracted using a neighborhood radius of 8. We use centers C1 and C3 in one group and centers C4 and C5 in another group to perform a scanner specific, between-center classification. Center C2 is not used in this experiment because it contains only two subjects. Within each group, each center is used in turn for training and the other for testing. The classification results are shown in Table 4.25.
Table 4.25: Classification accuracy using data from one center for training and the other center for testing.

Method    Train Center   Test Center   Classification Accuracy
S-CoHOG   C1             C3            93.00%
S-CoHOG   C3             C1            90.00%
S-CoHOG   C4             C5            91.50%
S-CoHOG   C5             C4            80.00%
Chapter 5
Conclusion
5.1 Summary
In this thesis, based on the original CoHOG method, three novel gradient-based
methods are proposed. Gradient operators Sobel, GD and LFDG are used in the
proposed S-CoHOG, GD-CoHOG and LFDG-CoHOG methods, respectively. For the first time, we apply the proposed methods to the whole image instead of to sub-regions for feature calculation, which avoids the sub-region boundary problem of the original CoHOG. The original CoHOG method uses a maximum neighborhood size of 4; we also use a larger neighborhood size of 8 for the co-occurrence calculation. The extracted feature vector is very large, and using such a large number of similar features makes it difficult for a classifier to construct an optimal separating hyperplane, leading to erroneous classification. For the first time, we apply feature selection to the extracted CoHOG features, selecting significant features using ROC analysis with a significance level of p ≤ 0.01 and an AUC threshold. The selected features are used in a linear support vector machine (SVM) classifier to determine the classification accuracy.
Three well-known texture datasets, INRIA Person, CUReT and UIUC, are used to evaluate the classification accuracy of the proposed methods. The proposed methods achieve their best classification results using a neighborhood size of 8 with feature selection. The proposed S-CoHOG and GD-CoHOG methods achieve a maximum classification accuracy of 99.90% on the INRIA Person dataset, and the GD-CoHOG method achieves maximum classification accuracies of 98.30% and 98.00% on the CUReT and UIUC datasets, respectively. The classification results of the proposed methods are compared with those of the original CoHOG method, and show that the proposed methods outperform the original CoHOG method on all the datasets.
Three different datasets of 2D MR images are used for classification. Each dataset has a different image resolution and contrast. MR images of ALS patients and controls are classified using the proposed methods. To the best of our knowledge, we are the first to use CoHOG-like methods to study cerebral degeneration in ALS. A multicenter ALS dataset with images having the same resolution but different contrasts is also used to demonstrate the classification performance of the proposed methods. The experimental results demonstrate that our methods have promising classification abilities with high sensitivity and specificity. In particular, the GD-CoHOG method achieves the maximum classification accuracy of 97.30% for MR dataset 1. For these datasets, we compare the results of the proposed methods with those of the GLCM method. The classification results show that the proposed methods outperform the GLCM method, and their sensitivity and specificity are also higher than those of the GLCM method. Region based analysis is also performed, and the results show that the areas most responsible for significant differences between the patients and controls are congruent with the spatial distribution of ALS pathology. For the multicenter dataset, classification is done using data from one center for training and data from another center for testing. This experiment addresses whether, when a center lacks sufficient subjects, data from another center can be used for training. The classification accuracy is promising in such a multicenter setting.
In summary, the proposed CoHOG based methods show excellent classification accuracy on different texture datasets. The proposed methods also show excellent classification accuracy on ALS datasets of different contrasts (T1 and T2) and on data collected from different MRI machines. Thus, texture analysis with the proposed methods shows promise as a potential means to identify ALS. Future research using the proposed methods in a multicenter setting is warranted, in addition to determining the ability of the methods to monitor disease progression.
5.2 Contributions
The main contributions of this thesis include:
1. The proposed three methods use the whole image instead of subdividing it into sub-regions. The use of sub-regions limits the accuracy of the co-occurrence matrix for boundary pixels, so some texture information is incomplete for each sub-region; it also increases the size of the feature vector. Thus, using the whole image not only reduces the boundary pixel problem of sub-regions but also reduces the size of the feature vector.
2. For the first time, we adopt two gradient operators GD and LFDG for the
proposed GD-CoHOG and LFDG-CoHOG methods, respectively. The pro-
posed methods are compared to see the impact of the gradient operators on
classification accuracy using the whole image.
3. Texture features are extracted using two different neighborhood sizes. The
original CoHOG method uses a maximum neighborhood size of 4. We use a
larger neighborhood size of 8 to see the effect of using more spatial information
for co-occurrence calculation on classification accuracy. Indeed, the experi-
mental results confirm our expectation that a better classification accuracy is
achieved using a neighborhood size of 8.
4. The extracted feature vector using the CoHOG method is large, with many similar features: changes that occur in a small portion of the images between two classes produce a large number of similar features. Using this large number of similar features not only creates ambiguity in constructing an optimal hyperplane but also leads a classifier to wrong classifications. We are the first to use a feature selection method to extract significant CoHOG texture features using area under the ROC curve (AUC) analysis for classification. Only features that show significant differences between classes are selected using an AUC threshold. Experimental results show that classification with feature selection achieves a better accuracy than without it.
5.3 Future Work
1. The proposed CoHOG based methods can be applied to other areas of image classification and used to study cerebral degeneration in ALS more extensively. Other applicable areas include document processing, remote sensing, automated inspection, fingerprint recognition, etc.
2. The proposed 2D CoHOG based methods can be extended to 3D to extract features from a 3D image, for example from 3D MRI scans of the brain. A 3D method would require a spherical neighborhood rather than the circular neighborhood used in the proposed 2D CoHOG methods, as well as 3D gradient operators for calculating the gradient orientations of the 3D image (see the sketch after this list).
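As an illustrative starting point only (this extension is not implemented in the thesis), 3D gradient orientations could be obtained from separable Sobel filters:

    # Sketch: two orientation angles per voxel that a 3D CoHOG-style method
    # would quantize before building its co-occurrence matrices.
    import numpy as np
    from scipy.ndimage import sobel

    def gradient_orientations_3d(vol):
        g = [sobel(vol.astype(float), axis=a) for a in range(3)]
        theta = np.arctan2(g[1], g[0])                # in-plane angle
        phi = np.arctan2(g[2], np.hypot(g[0], g[1]))  # elevation angle
        return theta, phi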
The above are some of the many interesting problems in which the proposed
methods may be useful.
Bibliography
[1] Ming Zhao, Shutao Li, and James Kwok. Text detection in images using sparse
representation with discriminative dictionaries. Image and Vision Computing,
28(12):1590–1599, 2010.
[3] Wei-Chen Li and Du-Ming Tsai. Wavelet-based defect detection in solar wafer
images with inhomogeneous texture. Pattern Recognition, 45(2):742–756, 2012.
[4] Loris Nanni and Alessandra Lumini. Local binary patterns for a hybrid fin-
gerprint matcher. Pattern recognition, 41(11):3461–3466, 2008.
[6] Parveen Lehana, Swapna Devi, Satnam Singh, Pawanesh Abrol, Saleem Khan,
and Sandeep Arya. Investigations of the mri images using aura transformation.
Signal & Image Processing, 3(1):95, 2012.
[7] Timo Ojala, Matti Pietikäinen, and Topi Mäenpää. Multiresolution gray-
scale and rotation invariant texture classification with local binary patterns.
Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24(7):971–
987, 2002.
[8] Robert M Haralick, Karthikeyan Shanmugam, and Its’ Hak Dinstein. Tex-
tural features for image classification. Systems, Man and Cybernetics, IEEE
Transactions on, (6):610–621, 1973.
[9] Xuejie Qin and Yee-Hong Yang. Similarity measure and learning with gray
level aura matrices (glam) for texture image retrieval. In Computer Vision
and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE
Computer Society Conference on, volume 1, pages I–326. IEEE, 2004.
[10] Mary M Galloway. Texture analysis using gray level run lengths. Computer
graphics and image processing, 4(2):172–179, 1975.
[11] Xinqi Chu and Kap Luk Chan. Rotation and scale invariant texture analysis
with tunable gabor filter banks. In Advances in Image and Video Technology,
pages 83–93. Springer, 2009.
[12] Thomas Leung and Jitendra Malik. Representing and recognizing the visual
appearance of materials using three-dimensional textons. International journal
of computer vision, 43(1):29–44, 2001.
[13] Manik Varma and Andrew Zisserman. A statistical approach to texture clas-
sification from single images. International Journal of Computer Vision, 62
(1-2):61–81, 2005.
[14] Michael Unser and Murray Eden. Multiresolution feature extraction and se-
lection for texture segmentation. Pattern Analysis and Machine Intelligence,
IEEE Transactions on, 11(7):717–728, 1989.
[16] George R Cross and Anil K Jain. Markov random field texture models. Pattern
Analysis and Machine Intelligence, IEEE Transactions on, (1):25–39, 1983.
[17] Fernand S. Cohen, Zhigang Fan, and Maqbool A Patel. Classification of rotated
and scaled textured images using gaussian markov random field models. IEEE
Transactions on Pattern Analysis & Machine Intelligence, (2):192–202, 1991.
[18] Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. A sparse texture repre-
sentation using local affine regions. Pattern Analysis and Machine Intelligence,
IEEE Transactions on, 27(8):1265–1278, 2005.
[19] David G Lowe. Distinctive image features from scale-invariant keypoints. In-
ternational journal of computer vision, 60(2):91–110, 2004.
[20] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human
detection. In Computer Vision and Pattern Recognition, 2005. CVPR 2005.
IEEE Computer Society Conference on, volume 1, pages 886–893. IEEE, 2005.
[21] Tomoki Watanabe, Satoshi Ito, and Kentaro Yokoi. Co-occurrence histograms
of oriented gradients for pedestrian detection. In Advances in Image and Video
Technology, pages 37–47. Springer, 2009.
[22] Hirokatsu Kataoka, Kiyoshi Hashimoto, Kenji Iwata, Yutaka Satoh, Nassir
Navab, Slobodan Ilic, and Yoshimitsu Aoki. Extended co-occurrence hog with
dense trajectories for fine-grained activity recognition. In Computer Vision–
ACCV 2014, pages 336–349. Springer, 2014.
[23] Kazim Hanbay, Nuh Alpaslan, Muhammed Fatih Talu, Davut Hanbay, Ali
Karci, and Adnan Fatih Kocamaz. Continuous rotation invariant features for
gradient-based texture classification. Computer Vision and Image Understand-
ing, 132:87–101, 2015.
[24] Thanh-Toan Do and Ewa Kijak. Face recognition using co-occurrence his-
tograms of oriented gradients. In Acoustics, Speech and Signal Processing
(ICASSP), 2012 IEEE International Conference on, pages 1301–1304. IEEE,
2012.
[25] TR Sivapriya, V Saravanan, and P Ranjit Jeba Thangaiah. Texture analysis
of brain mri and classification with bpn for the diagnosis of dementia. In
Trends in Computer Science, Engineering and Information Technology, pages
553–563. Springer, 2011.
[26] Xin Li, Hong Xia, Zhen Zhou, and Longzheng Tong. 3d texture analysis of
hippocampus based on mr images in patients with alzheimer disease and mild
cognitive impairment. In Biomedical Engineering and Informatics (BMEI),
2010 3rd International Conference on, volume 1, pages 1–4. IEEE, 2010.
[27] Ahmed Kharrat, Nacéra Benamrane, Mohamed Ben Messaoud, and Mohamed
Abid. Detection of brain tumor in medical images. In Signals, Circuits and
Systems (SCS), 2009 3rd International Conference on, pages 1–6. IEEE, 2009.
[30] Rouzbeh Maani, Yee-Hong Yang, Derek Emery, and Sanjay Kalra. Cerebral
degeneration in amyotrophic lateral sclerosis revealed by 3-dimensional texture
analysis. Frontiers in Neuroscience, 10, 2016.
[31] Richard O Duda, Peter E Hart, et al. Pattern classification and scene analysis,
volume 3. Wiley New York, 1973.
[32] Rouzbeh Maani, Sanjay Kalra, and Yee-Hong Yang. Robust volumetric texture
classification of magnetic resonance images of the brain using local frequency
descriptor. Image Processing, IEEE Transactions on, 23(10):4625–4636, 2014.
[33] Rouzbeh Maani, Sanjay Kalra, and Yee-Hong Yang. Rotation invariant lo-
cal frequency descriptors for texture classification. Image Processing, IEEE
Transactions on, 22(6):2409–2419, 2013.
[35] Yang Zhao, Rong-Gang Wang, Wen-Min Wang, and Wen Gao. Local quanti-
zation code histogram for texture classification. Neurocomputing, 2016.
[37] Jun Zhang, Heng Zhao, and Jimin Liang. Continuous rotation invariant local
descriptors for texton dictionary-based texture classification. Computer Vision
and Image Understanding, 117(1):56–75, 2013.
[39] Shuangge Ma and Jian Huang. Regularized roc method for disease classification
and biomarker selection with microarray data. Bioinformatics, 21(24):4356–
4362, 2005.
[41] Malak Alshawabkeh, Javed A Aslam, Jennifer Dy, and David Kaeli. Feature
selection metric using auc margin for small samples and imbalanced data clas-
sification problems. In Machine Learning and Applications and Workshops
(ICMLA), 2011 10th International Conference on, volume 1, pages 145–150.
IEEE, 2011.
[42] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector
machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–
27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[43] Kristin J Dana, Bram Van Ginneken, Shree K Nayar, and Jan J Koen-
derink. Reflectance and texture of real-world surfaces. ACM Transactions
on Graphics (TOG), 18(1):1–34, 1999. The dataset is available at http://www.cs.columbia.edu/CAVE/software/curet/.
[44] Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. A sparse texture rep-
resentation using local affine regions. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 27(8):1265–1278, 2005. The dataset is available at
http://www-cvr.ai.uiuc.edu/ponce_grp/data/.
[45] Benjamin Rix Brooks, Robert G Miller, Michael Swash, and Theodore L Mun-
sat. El escorial revisited: revised criteria for the diagnosis of amyotrophic
lateral sclerosis. Amyotrophic lateral sclerosis and other motor neuron disor-
ders, 1(5):293–299, 2000.
[46] Matthew C Kiernan, Steve Vucic, Benjamin C Cheah, Martin R Turner, An-
drew Eisen, Orla Hardiman, James R Burrell, and Margaret C Zoing. Amy-
otrophic lateral sclerosis. The Lancet, 377(9769):942–955, 2011.
[47] Federica Agosta, Elisabetta Pagani, Maria A Rocca, et al. Voxel-based morphometry study of brain volumetry and diffusivity in amyotrophic lateral sclerosis patients with mild disability. Human brain mapping, 28(12):1430–1438, 2007.
[48] Julian Grosskreutz, Jörn Kaufmann, Julia Frädrich, Reinhard Dengler, Hans-
Jochen Heinze, and Thomas Peschel. Widespread sensorimotor and frontal
cortical atrophy in amyotrophic lateral sclerosis. BMC neurology, 6(1):1, 2006.
[50] Jack L. Lancaster and Michael J. Martinez. Mango Software. URL http://rii.uthscsa.edu/mango/mango.html.
[51] Paul A. Yushkevich, Joseph Piven, Heather Cody Hazlett, Rachel Gim-
pel Smith, Sean Ho, James C. Gee, and Guido Gerig. User-guided 3D ac-
tive contour segmentation of anatomical structures: Significantly improved
efficiency and reliability. Neuroimage, 31(3):1116–1128, 2006.
[52] Caroline A Schneider, Wayne S Rasband, Kevin W Eliceiri, et al. Nih image
to imagej: 25 years of image analysis. Nat methods, 9(7):671–675, 2012.
[53] Hugo G Schnack, Neeltje EM van Haren, Rachel M Brouwer, G Caroline M van
Baal, Marco Picchioni, Matthias Weisbrod, Heinrich Sauer, Tyrone D Cannon,
Matti Huttunen, Claude Lepage, et al. Mapping reliability in multicenter mri:
Voxel-based morphometry and cortical thickness. Human brain mapping, 31
(12):1967–1982, 2010.
[54] Niels K Focke, Gunther Helms, Susanne Kaspar, Christine Diederich, Vera Tóth, Peter Dechent, Alexander Mohr, and Walter Paulus. Multi-site voxel-based morphometry – not quite there yet. Neuroimage, 56(3):1164–1170, 2011.
[55] Katarzyna Jednorog, Artur Marchewka, Irene Altarelli, Ana Karla Monza-
lvo Lopez, Muna van Ermingen-Marbach, Marion Grande, Anna Grabowska,
Stefan Heim, and Franck Ramus. How reliable are gray matter disruptions
in specific reading disability across multiple countries and languages? insights
from a large-scale voxel-based morphometry study. Human brain mapping, 36
(5):1741–1754, 2015.