Face Detection System Based On MLP Neural Network
Abstract—Face detection is the problem of determining whether an image contains human faces and, if so, where they appear. In this paper, we propose a face detector that uses an efficient architecture based on a Multi-Layer Perceptron (MLP) neural network and the Maximal Rejection Classifier (MRC). The proposed approach significantly improves the efficiency and the accuracy of detection in comparison with traditional neural-network techniques. In order to reduce the total computation cost, the detector is organized with a rejection pre-stage that discards the majority of non-face patterns in image backgrounds, thereby significantly improving the overall detection efficiency while maintaining detection accuracy. An important advantage of the new architecture is its homogeneous structure, which makes it suitable for very efficient implementation on programmable devices. Comparisons with other state-of-the-art face detection systems are presented. Our proposed approach achieves one of the best detection accuracies with significantly reduced training and detection cost.
Index Terms— Face Detection, Neural Network, MLP Neural Network, Training, Learning, MRC.
1 INTRODUCTION
The last-stage classifier is the MLP neural network classifier, which has been used extensively for classification and regression. The candidate face window produced by the MRC classifier is passed to the MLP classifier to decide whether the window contains a face. The employed multilayer feed-forward neural network consists of neurons with a sigmoid activation function. It is used in two modes. In classification mode, a pattern is presented at the input layer and propagated forward through the network to compute the activation value of each output neuron. The second mode is the training, or learning, mode. Learning in an ANN involves adjusting the weights in order to achieve the desired response for a set of learning face samples. More specifically, in this mode the network is fed a number of training pairs, and its parameters are then adjusted through a supervised training algorithm.
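As an illustration of the two modes described above, the following NumPy sketch shows forward propagation through a small sigmoid network (classification mode) and a single gradient-descent weight update on one training pair (learning mode). The layer sizes, learning rate, squared-error loss, and 0/1 target coding are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Illustrative sizes; the network actually used is described in Section 2.3.
    n_in, n_hidden, n_out = 400, 300, 1
    rng = np.random.default_rng(0)
    W1 = rng.uniform(-0.1, 0.1, (n_hidden, n_in))   # input -> hidden weights
    W2 = rng.uniform(-0.1, 0.1, (n_out, n_hidden))  # hidden -> output weights

    def forward(x):
        """Classification mode: propagate a pattern to the output activation."""
        h = sigmoid(W1 @ x)
        y = sigmoid(W2 @ h)
        return h, y

    def train_step(x, target, lr=0.1):
        """Learning mode: one supervised weight update on a single training pair."""
        global W1, W2
        h, y = forward(x)
        delta_out = (y - target) * y * (1.0 - y)          # output-layer error term
        delta_hid = (W2.T @ delta_out) * h * (1.0 - h)    # back-propagated hidden error
        W2 -= lr * np.outer(delta_out, h)
        W1 -= lr * np.outer(delta_hid, x)

    # Example: one 20x20 window flattened to 400 values, labelled as a face
    # (target 1.0 here; the paper uses +1/-1 labels, so this coding is an assumption).
    window = rng.random(400)
    train_step(window, np.array([1.0]))
    _, score = forward(window)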
2.2 MRC Face Detection
The Maximal Rejection Classifier (MRC) is a linear classifier that overcomes the two drawbacks: while maintaining the simplicity of a linear classifier, it can also deal with non-linearly separable cases. The only requirement is that the Clutter class and the Target class are disjoint. MRC is an iterative, rejection-based classification algorithm. The main idea is to apply a linear projection followed by a threshold in each iteration. However, as opposed to these two methods, the projection vector and the corresponding thresholds are chosen such that, at each iteration, MRC attempts to maximize the number of rejected Clutter samples. This means that after the first classification iteration many of the Clutter samples are already classified as such and are discarded from further consideration. The process continues with the remaining Clutter samples, again searching for a linear projection vector and thresholds that maximize the rejection of Clutter points from the remaining set. This process is repeated iteratively until it converges to zero or a small number of Clutter points. The samples remaining at the final stage are considered targets.

The first stage in the MRC is to gather two example sets, Faces and Non-Faces. Sufficiently large sets are needed in order to guarantee good generalization for the faces and the non-faces that may be encountered in images.

The target class should be assumed to be convex in order for the MRC to perform well. Here the target class contains frontal faces. One can easily imagine two faces that are not perfectly aligned and that therefore, when averaged, create a new block with possibly four eyes, or a nose and its echo, etc. To our help comes the fact that we are using a low-resolution representation of the faces (20 x 20 pixels). This implies that even for such misaligned faces the convex average appears as a face, and thus our assumption regarding the convexity of the face class is valid. The Non-Face set is required to be much larger, in order to represent the variability of Non-Face patterns in images.
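To make the projection-and-thresholds idea concrete, the sketch below fits one rejection stage in a deliberately simplified way: rather than solving the MRC optimization of Elad et al. [8] exactly, it tries a set of random candidate directions and keeps the one whose two decision levels (the minimum and maximum of the projected face samples) reject the most non-faces. All names and parameters are illustrative.

    import numpy as np

    def fit_rejection_stage(faces, nonfaces, n_candidates=200, seed=0):
        """Pick a projection vector theta and two decision levels [lo, hi] so that
        every face sample falls inside [lo, hi] while as many non-faces as possible
        fall outside (and can therefore be rejected).

        faces, nonfaces: arrays of shape (n_samples, n_features).
        NOTE: random candidate directions are a simplification; the real MRC
        chooses theta by solving a generalized eigenvalue problem.
        """
        rng = np.random.default_rng(seed)
        best = None
        for _ in range(n_candidates):
            theta = rng.normal(size=faces.shape[1])
            theta /= np.linalg.norm(theta)
            projected_faces = faces @ theta
            lo, hi = projected_faces.min(), projected_faces.max()  # keep every face
            projected_nonfaces = nonfaces @ theta
            rejected = np.count_nonzero((projected_nonfaces < lo) | (projected_nonfaces > hi))
            if best is None or rejected > best[3]:
                best = (theta, lo, hi, rejected)
        return best  # (theta, lo, hi, number of rejected non-faces)

    def reject(stage, samples):
        """Boolean mask of the samples that this stage rejects as non-faces."""
        theta, lo, hi, _ = stage
        p = samples @ theta
        return (p < lo) | (p > hi)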
Now that we have segmented every image into several segments and approximated every segment with a small number of representative pixels, we can exhaustively search for the best combination of segments that will reject the largest number of non-face images. We repeat this process until the improvement in rejection is negligible. That is, we must find theta and two decision levels such that the number of rejected non-faces is maximized while the faces are still found, as shown in Fig. 3.

Fig. 3 Maximal Rejection Classifier "1"

After that, we take only the remaining non-face part (shown in dark green) and find theta two; that is, we must find another theta and two decision levels such that the number of rejected non-faces is maximized while the faces are still found, as shown in Fig. 4.

Fig. 4 Maximal Rejection Classifier "2"

2.2.1 Rejection on MRC
The rejection in MRC proceeds as follows:
Build a combination of classifiers: we must find theta and two decision levels such that the number of rejected non-faces is maximized while the faces are still found.
Apply the weak classifiers sequentially while rejecting non-faces: we must find another theta and two decision levels such that the number of rejected non-faces is maximized while the faces are still found, as shown in Fig. 5.

Fig. 5 Maximal Rejection Classifier All Stages.
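Continuing that sketch, the stages can be fitted and applied one after another, so that each new stage only sees the non-faces that survived the previous ones; this mirrors the sequential rejection described above. It reuses the illustrative fit_rejection_stage and reject helpers from the previous sketch.

    import numpy as np

    def fit_cascade(faces, nonfaces, max_stages=10, min_remaining=5):
        """Fit rejection stages iteratively on the non-faces that are still alive."""
        stages, remaining = [], nonfaces
        for _ in range(max_stages):
            stage = fit_rejection_stage(faces, remaining)
            rejected = reject(stage, remaining)
            if not rejected.any():       # no further improvement in rejection
                break
            stages.append(stage)
            remaining = remaining[~rejected]
            if len(remaining) <= min_remaining:
                break                    # convergence to a small number of clutter points
        return stages

    def survives_cascade(window, stages):
        """True if a candidate window passes every rejection stage (a possible face)."""
        x = np.asarray(window).reshape(1, -1)
        return not any(reject(stage, x)[0] for stage in stages)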
2.3 The MLP neural network classifier

Fig. 6 The MLP Network

Fig. 6 shows the MLP neural network implemented in our face detection system. It is composed of three layers: one input, one hidden, and one output layer. The input layer consists of 400 neurons (20*20), which receive the pixel data of a 20x20 window. This size was chosen taking into consideration the average height and width of a face image that can be mapped without introducing any significant pixel noise.

The hidden layer consists of 300 neurons; this number was decided on the basis of optimal results obtained by trial and error. The output layer is a single neuron. To initialize the weights, a random function was used to assign each weight an initial random value lying between two preset integers named bias. The weight bias is selected from trial-and-error observation to correspond to average weights for quick convergence.

After the image data enters the input layer, it is propagated through the weights. Face or non-face is determined by comparing the output with a threshold: for example, if the output is larger than the threshold, the window is considered a face.
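A small sketch of this initialization and decision rule; the preset bound on the initial weights (here 1) and the output threshold (0.5) are assumptions, since the exact values are not given.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def init_network(n_in=400, n_hidden=300, n_out=1, bound=1.0, seed=0):
        """Initialize all weights with random values between two preset bounds (+/- bound)."""
        rng = np.random.default_rng(seed)
        W1 = rng.uniform(-bound, bound, (n_hidden, n_in))
        W2 = rng.uniform(-bound, bound, (n_out, n_hidden))
        return W1, W2

    def classify_window(window, W1, W2, threshold=0.5):
        """Forward a 20x20 window (flattened to 400 values) and threshold the output."""
        x = np.asarray(window).reshape(-1)
        y = sigmoid(W2 @ sigmoid(W1 @ x))
        return "face" if y[0] > threshold else "non-face"

    W1, W2 = init_network()
    label = classify_window(np.random.default_rng(1).random((20, 20)), W1, W2)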
A problem that arises with window-scanning techniques is overlapping detections. The system deals with this problem through two heuristics (a simple illustration of both is given below):
Thresholding: the number of detections in a small region surrounding the current location is counted, and if it is above a certain threshold, a face is considered present at this location.
Overlap elimination: when a region is classified as a face according to thresholding, overlapping detections are likely to be false positives and are therefore rejected.
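The sketch below illustrates these two heuristics; detections are represented simply as (x, y) window positions, and the neighbourhood radius and count threshold are illustrative parameters not specified in the text.

    import numpy as np

    def merge_detections(detections, radius=10.0, min_count=3):
        """Apply the two heuristics to raw sliding-window detections.

        detections: list of (x, y) positions classified as faces.
        Thresholding: a location is kept only if at least min_count detections
        fall within radius pixels of it.
        Overlap elimination: once a location is accepted, the nearby detections
        are treated as duplicates of the same face and removed.
        """
        faces = []
        remaining = np.array(detections, dtype=float)
        while len(remaining) > 0:
            center = remaining[0]
            close = np.linalg.norm(remaining - center, axis=1) <= radius
            if np.count_nonzero(close) >= min_count:          # thresholding
                faces.append(remaining[close].mean(axis=0))   # one face per cluster
            remaining = remaining[~close]                     # overlap elimination
        return faces

    # Example: a tight cluster of detections is reported as a single face,
    # while the isolated detection is discarded.
    print(merge_detections([(50, 60), (51, 61), (49, 59), (200, 10)]))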
Windows thought to contain a face are outlined with a bounding box, and on completion a copy of the image is displayed, indicating the locations of any faces detected. In the next section, a more thorough description of the system is included, detailing the operation of the detector.

3 EXPERIMENTAL RESULTS
The face detection system presented in this paper was developed, trained, and tested using MATLAB 7.0 and Visual Studio .NET 2003 on an Intel Pentium 4 at 3.20 GHz with 1.00 GB of RAM, running the Windows XP operating system. We divide our system into two parts: the first is training and the second is testing.

Training: Each stage in the cascade was trained using a positive set and a negative set. In order to train the face detection system to distinguish face from non-face images, we need to pass a collection of face and non-face training data, with a target output of 1 for a face and -1 for a non-face. Initially, our face training set contains face images collected from the MIT face database. Our face detection system has been applied to several test images (the faces were frontal). All images were scaled to 20x20 pixels, and satisfactory results have been obtained. The test set consists of a total of 2000 face and non-face images taken from the MIT database. Non-face patterns are generated at different locations and scales from images of various subjects, such as rocks, trees, buildings, and flowers, which contain no faces.

For the training images, because of lighting, shadow, and contrast variations, we need to equalize those differences. The linear fit function approximates the overall brightness of each part of the window.

To train the cascade of classifiers of the rejection stage to detect frontal upright faces, the same face and non-face patterns are used for all stages. The neural net is trained for 191 epochs, as shown in Fig. 7.

Fig. 7 Trained Epochs.
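The linear fit mentioned in the training description can be read as fitting a planar brightness function over each 20x20 window and subtracting it; the least-squares sketch below is one reasonable interpretation, not necessarily the authors' exact implementation.

    import numpy as np

    def correct_lighting(window):
        """Fit a plane a*x + b*y + c to the window intensities and subtract it,
        removing slowly varying brightness caused by lighting and shadows."""
        h, w = window.shape
        ys, xs = np.mgrid[0:h, 0:w]
        A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
        coeffs, *_ = np.linalg.lstsq(A, window.ravel(), rcond=None)
        plane = (A @ coeffs).reshape(h, w)
        return window - plane

    # Example: a 20x20 window with a strong left-to-right brightness gradient;
    # after correction the gradient is removed while the texture is preserved.
    rng = np.random.default_rng(0)
    win = rng.random((20, 20)) + np.linspace(0.0, 5.0, 20)[None, :]
    flat = correct_lighting(win)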
System Testing: First, we evaluate the first classifier learned using the MRC algorithm on the test images; second, we evaluate those images on all stages of our system. Finally, we compare our face detection system with others. This experiment assesses the classification performance of a single strong classifier learned using the MRC algorithm. The database consists of two images, as shown in Fig. 8. The first image contains 8 frontal faces and the second image contains 11 frontal faces. The system has been tested using the MRC classifier alone, and then using the MRC-MLP cascaded face detection system. The images have complex backgrounds and occluded faces, with different shapes and different lighting.

Table 1: MRC Testing Results
Image #   Detection Rate (%)   Average Time   False Positives   False Negatives
1         92                   17 sec.        0                 2
2         100                  34 sec.        0                 0

Table 2: MRC-MLP Testing Results
Image #   Detection Rate (%)   Average Time   False Positives   False Negatives
1         100                  19 sec.        0                 0
2         100                  42 sec.        1                 0

4 COMPARISON AND DISCUSSION
Table 3 shows the comparison of the proposed system
with well-known classifiers used in the literature. This
comparison shows that our system gives good results.
The results show that the MRC-MLP classifier achieves a higher detection rate than the SVM and MLP classifiers, and a lower error rate (false positives and false negatives) than the AdaBoost classifier [6].
REFERENCES
[1] Aamer M., Ying W., Jianmin J. and Stan I., "Face Detection based Neural Networks using Robust Skin Color Segmentation," IEEE SSD 2008, 5th International Multi-Conference, Amman, 20-22 July 2008, pp. 1-5.
[2] Rowley, H. A., Baluja, S., and Kanade, T., "Neural network-based face detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, 1998, pp. 23-38.
[3] M.-H. Yang, D. Kriegman and N. Ahuja, "Detecting Faces in Images: A Survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, 2002, pp. 34-58.
[5] Aouatif A., Sanaa G. and Mohammed R., "Face Detection in Still Color Images Using Skin Color Information," IEEE, April 2008, ISBN 978-1-4244-1751-3.
[6] P. Viola and M. J. Jones, "Robust real-time face detection," International Journal of Computer Vision, vol. 57, no. 2, pp. 137-154, 2004.
[8] Elad, M., Hel-Or, Y., and Keshet, R., "Pattern detection using a maximal rejection classifier," Pattern Recognition Letters, vol. 23, 2001, pp. 1459-1471.
[9] Christophe Garcia and Manolis Delakis, "A Neural Architecture for Fast and Robust Face Detection," IEEE.
The cascaded system improves face detection rates and reduces false positives by deploying a two-stage approach in which the Maximal Rejection Classifier (MRC) and the Multi-Layer Perceptron (MLP) neural network work in succession. The MRC first filters out non-face patterns, significantly reducing the search space even at low resolution. This initial reduction limits the number of potential false positives reaching the second stage. The MLP, trained to learn non-linear decision boundaries, then provides a rigorous final classification, ensuring that only true face patterns pass through. Such cascading allows greater specificity and sensitivity than a single-classifier approach, which cannot benefit from early-stage filtering.
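A high-level sketch of how the two stages could be chained over a sliding window; the window step, the stand-in classifier callbacks, and the absence of multi-scale handling are simplifications for illustration only.

    import numpy as np

    def detect_faces(image, cascade_accepts, mlp_accepts, step=4, size=20):
        """Slide a size x size window over a grayscale image.

        cascade_accepts(window) -> bool : stage 1, the MRC rejection cascade,
            expected to discard the bulk of non-face windows cheaply.
        mlp_accepts(window)     -> bool : stage 2, the MLP, giving the final
            decision on the few windows that survive stage 1.
        """
        h, w = image.shape
        detections = []
        for y in range(0, h - size + 1, step):
            for x in range(0, w - size + 1, step):
                window = image[y:y + size, x:x + size].astype(float)
                if not cascade_accepts(window):
                    continue
                if mlp_accepts(window):
                    detections.append((x, y))
        return detections

    # Example with trivial stand-in classifiers (always accept / mean-brightness test).
    rng = np.random.default_rng(0)
    img = rng.random((60, 80))
    boxes = detect_faces(img, lambda win: True, lambda win: win.mean() > 0.5)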
The main limitation of using a 20x20 pixel resolution for facial feature extraction is that it significantly reduces the amount of detail available, which can impair the ability to capture intricate facial features. The proposed system mitigates this by employing the Maximal Rejection Classifier (MRC) and neural network techniques that are robust to low-resolution input: the MRC can reject incorrect patterns effectively even with minimal data, and the MLP is trained to distill the crucial features from the modest resolution, while the assumed convexity of the target class implicitly helps manage misalignment issues, compensating for the loss of detail.
The Maximal Rejection Classifier (MRC) provides significant benefits in face detection by maintaining the simplicity of linear classifiers while effectively handling non-linearly separable cases. This capability is achieved because the MRC iteratively maximizes the rejection of clutter or non-face samples by selecting optimal linear projection vectors and thresholds in each iteration, continuing this process until only a minimal number of clutter points remain. MRC's rejection capability ensures that non-face patterns are filtered early, allowing face detection to focus on more likely candidate regions.
The robustness of the MRC-MLP cascaded system in face detection arises from its comprehensive training process. Training involves several steps: gathering large and diverse datasets of face and non-face images, applying techniques such as histogram equalization to normalize different lighting conditions, and ensuring the training data is sufficiently large and varied to represent possible real-world scenarios. The MRC first performs a coarse classification to discard clear non-face images; the remaining, more challenging samples are passed to the MLP, which refines the detection using what it has learned from the extensive training sets. This combined strategy of using the MRC to handle clutter and the MLP for fine decisions ensures the system's accuracy and reduces false negatives and false positives when deployed.
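A minimal sketch of how the two training phases described here might be wired together, reusing the illustrative fit_cascade and train_step helpers from the earlier sketches (train_step updates the module-level weights W1 and W2 defined there). The 191 epochs follow the number reported in the experiments; whether the MLP is trained on all non-face patterns or only on those surviving the cascade is not stated, so all are used here.

    import numpy as np

    def train_detector(face_windows, nonface_windows, epochs=191, seed=0):
        """Fit the rejection cascade, then train the MLP on the labelled patterns.

        face_windows, nonface_windows: flattened 20x20 patterns, shape (n, 400).
        """
        stages = fit_cascade(face_windows, nonface_windows)
        X = np.vstack([face_windows, nonface_windows])
        # Targets: 1.0 for faces, 0.0 for non-faces (the paper labels them +1 / -1).
        t = np.concatenate([np.ones(len(face_windows)), np.zeros(len(nonface_windows))])
        rng = np.random.default_rng(seed)
        for _ in range(epochs):
            for i in rng.permutation(len(X)):
                train_step(X[i], np.array([t[i]]))
        return stages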
The face detection system addresses the challenges of color images by specifically employing a preprocessing stage that standardizes images in terms of size, contrast, and color level. Initially, histogram equalization is applied to adjust image contrast and compensate for illumination differences. Subsequently, images are converted to grayscale, which simplifies the complexity associated with color data while preserving essential structural information relevant for face detection. These preprocessing steps ensure more uniform input to subsequent classification stages, mitigating issues like lighting variance and color diversity.
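As a concrete illustration of this preprocessing, the sketch below converts an RGB image to grayscale with standard luminance weights and equalizes its histogram through the cumulative distribution; the 8-bit assumption and the coefficients are conventional choices rather than details from the paper.

    import numpy as np

    def to_grayscale(rgb):
        """Convert an H x W x 3 uint8 image to grayscale using standard luminance weights."""
        return (rgb @ np.array([0.299, 0.587, 0.114])).astype(np.uint8)

    def equalize_histogram(gray):
        """Spread the intensities over the full 0-255 range using the CDF."""
        hist = np.bincount(gray.ravel(), minlength=256)
        cdf = hist.cumsum()
        cdf_min = cdf[cdf > 0][0]
        denom = max(int(cdf[-1] - cdf_min), 1)
        lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255).astype(np.uint8)
        return lut[gray]

    # Example on a synthetic image; a real detector would also rescale windows to 20x20.
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
    enhanced = equalize_histogram(to_grayscale(image))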
Histogram equalization is used in the image enhancement stage to adjust the image contrast by spreading the intensities across the available range. This step compensates for variations in illumination across different images, making it easier to distinguish features necessary for face detection. With improved contrast, the subsequent preprocessing and classification stages, including facial feature extraction, can perform more effectively as the features become more discernible against the image background.
The testing procedure for evaluating the proposed face detection system used a dataset from the MIT face database comprising 2000 face and non-face images. Non-face images included various objects, which helped assess the system's ability to discriminate correctly across different forms and backgrounds. The tests evaluated the system's performance in terms of detection rate, time efficiency, and error rate (false positives and negatives). The cascaded MRC-MLP system showed a clear improvement, achieving a 100% detection rate on the test images while maintaining low error rates, compared to the MRC alone, which detected just over 90% of the faces in the first image in a somewhat shorter average time.
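The per-image figures reported in Tables 1 and 2 can be tallied as follows; the function is generic and contains no data from the paper.

    def detection_metrics(n_faces_in_image, n_correct_detections, n_false_positives):
        """Summarize one test image the way Tables 1 and 2 do."""
        return {
            "detection_rate_%": 100.0 * n_correct_detections / n_faces_in_image,
            "false_positives": n_false_positives,
            "false_negatives": n_faces_in_image - n_correct_detections,
        }

    # Example: 8 faces in the image, 8 found, 1 spurious detection.
    print(detection_metrics(8, 8, 1))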
The MLP neural network acts as the final decision-making stage in the face detection system. After the MRC has filtered most non-face patterns, the MLP neural network receives expected face candidates to determine definitively whether they contain a face. This complementary role is efficient because the MRC significantly reduces the search space for the neural network, allowing the MLP to focus on challenging patterns where it performs best. By using a multilayer perceptron with a sigmoid activation function, the MLP can learn complex decision boundaries necessary for accurately classifying face and non-face images.
The iterative process in the MRC method enhances classification performance by continuously optimizing the rejection of non-face samples. At each iteration, MRC identifies a linear projection vector and threshold combination that maximizes the rejection of clutter samples. By discarding these samples early, the process focuses computational resources on potential face regions. This iterative refinement means that as iterations proceed, only the most ambiguous cases are left for further processing, thereby improving efficiency and accuracy by concentrating on fewer but more critical computations.
The MRC-MLP system demonstrated a higher detection rate than Support Vector Machines (SVM) and better error-rate management than AdaBoost in the tests. Specifically, while the SVM and MLP classifiers achieved detection rates of 88.14% and 91.25% respectively, the MRC-MLP achieved 91.6%. Its error rate was also lower than that of AdaBoost, indicating that the MRC-MLP manages an effective balance between detection accuracy and error minimization.