
Document From Shruti Singhal

This document summarizes a research paper on using convolutional neural networks (CNNs) for age and gender classification from facial images. The researchers collected a dataset of 100 male and female facial images ranging from ages 5 to 60. They created a CNN model with 3 convolutional layers and 2 fully connected layers, trained it on the dataset using Keras. The CNN was able to accurately predict age and gender from facial images, outperforming previous methods. This shows that CNNs can obtain good performance on age and gender classification tasks from facial images.

Uploaded by

Shruti Singhal
Copyright
© All Rights Reserved

Age and Gender Classification using Convolutional Neural Network


Muskan Chawla, Anupma Gadhwal and Kunal Jain
Department of Computer Science and Electronics
Bharati Vidyapeeth’s College of Engineering

ABSTRACT

Accurately predicting the age of humans is an extremely challenging task. Automatic age and gender classification has become important to many applications, particularly since the rise of social media. In this paper, we show that by the use of convolutional neural networks (CNN), a significant increase in performance can be obtained on these tasks. First, the dataset is collected from Google and consists of 100 images of males and females aged 5 to 60. We created a CNN followed by a fully connected network using the Keras library. The network has 3 convolutional layers: the first has 64 filters with a 7x7 window, followed by a ReLU activation layer and a 2x2 max pooling layer. It is followed by 2 more convolutional layers with 100 and 64 filters and windows of 5x5 and 3x3 respectively, each with ReLU as the activation function. Because the output of the convolutional layers is in matrix form, we flattened it using a Flatten layer and fed it to fully connected layers with 64 and 1 neurons. The final output layer consists of a single neuron with a sigmoid activation function.

Keywords
CNN (Convolutional Neural Network), Dense layer, Pixels, adam, haarcascade, opencv.

[Link]

The human face holds a large quantity of attributes and information about the person, such as expression, gender and age. Human beings can detect and analyze this information easily. For example, the majority of people are able to recognize human traits like gender, where they can tell if the person is male or female by only seeing his/her face. Similarly, they can determine the age of the person and say whether that person is a child or an adult.

In this paper we attempt to close the gap between automatic face recognition capabilities and those of age and gender estimation methods. To this end, we follow the successful example laid down by recent face recognition systems: face recognition techniques described in the last few years have shown that tremendous progress can be made by the use of deep convolutional neural networks (CNN). We demonstrate similar gains with a simple network architecture, designed by considering the rather limited availability of accurate age and gender labels in existing face data sets. On the other hand, constructing applications to identify people from their faces and extract their age and gender information is a challenging task for computer vision as well as pattern recognition. Computer vision includes various methods and techniques for understanding, analysing, and extracting information from images. In other words, it is a science that works on building systems that enable the computer to see and describe our world. Pattern recognition, on the other hand, is a technology providing identification, description, and interpretation of how machines can recognize and detect a pattern, which could be a shape, speech signal, fingerprint image, a handwritten word, environment, or a human face. The design of a pattern recognition system involves three parts: pre-processing, feature extraction, and classification.

Figure 1. Faces for age and gender classification. These images represent some of the challenges of age and gender estimation from real-world, unconstrained images.

2. RELATED WORK

Before describing the proposed method we briefly review related methods for age and gender classification and provide a cursory overview of deep convolutional networks.

2.1 Age and Gender Classification

2.1.1 Age classification. The problem of automatically extracting age related attributes from facial images has received increasing attention in recent years and many methods have been put forth. A detailed survey of such methods can be found in [11] and in [21]. We note that despite our focus here on age group classification rather than precise age estimation (i.e., age regression), the survey below includes methods designed for either task. Early methods for age estimation are based on calculating ratios between different measurements of facial features [29]. Once facial features (e.g. eyes, nose, mouth, chin, etc.) are localized and their sizes and distances measured, ratios between them are calculated and used for classifying the face into different age categories according to hand-crafted rules. More recently, [6] uses a similar approach to model age progression in subjects under 18 years old. As those methods require accurate localization of facial features, a challenging problem by itself, they are unsuitable for in-the-wild images which one may expect to find on social platforms. On a different line of work are methods that represent the aging process as a subspace [16] or a manifold [19]. A drawback of those methods is that they require input images to be near-frontal and well-aligned. These methods therefore present experimental results only on constrained data-sets of near-frontal images (e.g. UIUC-IFP-Y [12, 19], FG-NET [30] and MORPH [23]). Again, as a consequence, such methods are ill-suited for unconstrained images. Different from those described above are methods that use local features for representing face images. In [25] Gaussian Mixture Models (GMM) [13] were used to represent the distribution of facial patches. In [24] GMM were used again for representing the distribution of local facial measurements, but robust descriptors were used instead of pixel patches. Finally, instead of GMM, Hidden-Markov Model super-vectors [20] were used in [26] for representing face patch distributions. An alternative to the local image intensity patches are robust image descriptors: Gabor image descriptors [22] were used in [15] along with a Fuzzy-LDA classifier which considers a face image as belonging to more than one age class. In [20] a combination of Biologically-Inspired Features (BIF) [44] and various manifold-learning methods were used for age estimation. Gabor [23] and local binary patterns (LBP) [1] features were used in [7] along with a hierarchical age classifier composed of Support Vector Machines (SVM) [9] to classify the input image
to an age-class followed by a support vector regression [10] to estimate a precise age. Finally, [4] proposed improved versions of relevant component analysis [3] and locally preserving projections [26]. Those methods are used for distance learning and dimensionality reduction, respectively, with Active Appearance Models [8] as an image feature. All of these methods have proven effective on small and/or constrained benchmarks for age estimation. To our knowledge, the best performing methods were demonstrated on the Group Photos benchmark [14]. We show our proposed method to outperform the results they report on the more challenging Adience benchmark, designed for the same task.

2.1.2 Gender classification. A detailed survey of gender classification methods can be found in [4] and more recently in [12]. Here we quickly survey relevant methods. One of the early methods for gender classification [17] used a neural network trained on a small set of near-frontal face images. In [27] the combined 3D structure of the head (obtained using a laser scanner) and image intensities were used for classifying gender. SVM classifiers were used by [25], applied directly to image intensities. Rather than using SVM, [2] used AdaBoost for the same purpose, here again, applied to image intensities. Finally, viewpoint-invariant age and gender classification was presented by [29].

2.1.3 Deep Convolutional Neural Network

Convolution and the convolutional layer are the major building blocks used in convolutional neural networks.
A convolution is the simple application of a filter to an input that results in an activation. Repeated application of the same filter to an input results in a map of activations called a feature map, indicating the locations and strength of a detected feature in an input, such as an image.
The innovation of convolutional neural networks is the ability to automatically learn a large number of filters in parallel specific to a training dataset under the constraints of a specific predictive modeling problem, such as image classification. The result is highly specific features that can be detected anywhere on input images.

Figure 2. CNN (Convolutional Neural Network)

3. METHODOLOGY

3.1 DATA COLLECTION

The dataset is collected from Google and consists of 100 images of males and females aged 5 to 60.

3.2 DATASET FORMATION

It is a process to create a labelled dataset which is used for training/testing purposes. In this project, we stored the male and female data in different folders.

We used the os library to traverse the directory and access the data. As an image usually consists of 3 channels, which makes it harder to process, we read all images in grayscale format and resized each one to a matrix of 70x70 pixels. We labelled male as 0 and female as 1. Then, we exported the data using the pickle library.

3.3 CNN FORMATION

We created a CNN followed by a fully connected network using the Keras library. The network has 3 convolutional layers: the first has 64 filters with a 7x7 window, followed by a ReLU activation layer and a 2x2 max pooling layer. It is followed by 2 more convolutional layers with 100 and 64 filters and windows of 5x5 and 3x3 respectively, each with ReLU as the activation function. Because the output of the convolutional layers is in matrix form, we flattened it using a Flatten layer and fed it to fully connected layers with 64 and 1 neurons.

The final output layer consists of a single neuron with a sigmoid activation function.
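The filter application described in Section 2.1.3 can be illustrated with a minimal pure-Python sketch. It slides a small filter over a 2D input with stride 1 and no padding and collects the activations into a feature map. The function name `convolve2d_valid`, the toy image and the edge filter are illustrative assumptions, not taken from the paper. As in most deep learning libraries, the filter is not flipped (strictly speaking this is cross-correlation).

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of rows), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # One activation: weighted sum of the patch under the filter.
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        feature_map.append(row)
    return feature_map


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d_valid(image, kernel)
```

Here `feature_map` is `[[0, 2, 0], [0, 2, 0], [0, 2, 0]]`: the middle column has the only non-zero activations, which marks where the edge sits in the input. A CNN learns the filter values instead of fixing them by hand.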
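The dataset formation step of Section 3.2 might look roughly like the sketch below. The folder names `male` and `female`, the helper names and the output path are assumptions made for illustration, since the paper only states that the two classes were kept in different folders. Image decoding uses OpenCV, imported lazily so that the listing step runs without it.

```python
import os
import pickle

# Assumed folder names; the paper only says that male and female images
# were stored in different folders.
LABELS = {"male": 0, "female": 1}
IMG_SIZE = 70  # images are resized to 70x70 pixels


def list_labelled_files(root):
    """Walk root/male and root/female and return sorted (path, label) pairs."""
    samples = []
    for folder, label in LABELS.items():
        class_dir = os.path.join(root, folder)
        for dirpath, _dirnames, filenames in os.walk(class_dir):
            for name in filenames:
                samples.append((os.path.join(dirpath, name), label))
    return sorted(samples)


def load_gray_70(path):
    """Read one image as grayscale and resize it to 70x70 (needs OpenCV)."""
    import cv2  # imported lazily so the listing step runs without OpenCV

    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        raise ValueError(f"could not read image: {path}")
    return cv2.resize(image, (IMG_SIZE, IMG_SIZE))


def export_dataset(root, out_path):
    """Build (pixels, label) pairs and save them with pickle."""
    data = [(load_gray_70(path), label) for path, label in list_labelled_files(root)]
    with open(out_path, "wb") as handle:
        pickle.dump(data, handle)
    return len(data)
```

A call such as `export_dataset("dataset", "faces.pickle")` (both names hypothetical) would write one grayscale 70x70 array per image together with its 0/1 label.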
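As a hedged sketch of the layer stack in Section 3.3, the code below first traces the feature map size through the three convolutional layers and then builds the same stack in Keras. Several details are assumptions, because the paper does not state them: stride 1 with no padding, pooling only after the first convolutional layer, ReLU on the 64-neuron dense layer, and binary cross-entropy as the loss. `build_model` needs TensorFlow installed; the size trace does not.

```python
def conv_out(size, window):
    """Output side length of a stride-1 convolution without padding."""
    return size - window + 1


def feature_map_sizes(input_size=70):
    """Trace the spatial size through the three convolutional layers."""
    s = conv_out(input_size, 7)   # Conv 7x7, 64 filters
    s = s // 2                    # MaxPooling 2x2
    s = conv_out(s, 5)            # Conv 5x5, 100 filters
    s = conv_out(s, 3)            # Conv 3x3, 64 filters
    return s, s * s * 64          # final side length, flattened length


def build_model():
    """Keras version of the same stack (requires TensorFlow)."""
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(70, 70, 1)),            # 70x70 grayscale input
        layers.Conv2D(64, (7, 7), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(100, (5, 5), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),       # activation is an assumption
        layers.Dense(1, activation="sigmoid"),     # male=0, female=1
    ])
    # Adam is named in the paper; the loss is an assumption for binary labels.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


side, flat = feature_map_sizes()
```

Under these assumptions a 70x70 input shrinks to 64x64 after the first convolution, 32x32 after pooling, then 28x28 and 26x26, so the Flatten layer passes 26 x 26 x 64 = 43,264 values to the dense layers.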

3.4 TRAINING/TESTING

We trained the model on our dataset of 100 pictures, with a validation split of 10% and Adam as the optimizer.
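To make the 10% validation split of Section 3.4 concrete, the sketch below mirrors the documented Keras behaviour of taking the validation data from the last samples, before any shuffling. The helper names and the `epochs` value are hypothetical, as the paper does not report the number of epochs.

```python
def split_for_validation(samples, validation_split=0.1):
    """Hold out the last fraction of the samples for validation."""
    n_val = int(len(samples) * validation_split)
    split_at = len(samples) - n_val
    return samples[:split_at], samples[split_at:]


def train(model, images, labels, epochs=10):
    """Fit a compiled Keras model; `epochs` is a hypothetical setting."""
    return model.fit(images, labels, epochs=epochs, validation_split=0.1)


# With 100 pictures, 90 are used for training and 10 for validation.
train_part, val_part = split_for_validation(list(range(100)))
```

Because the held-out samples come from the end of the arrays, a dataset stored folder by folder would likely need shuffling before training. Otherwise the 10 validation images could all belong to one class.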

4. IMPLEMENTED WORK

We used our model for the prediction of gender. As the prediction of age is more complex and requires more features, we used a pre-trained model for age detection.

We predicted the age and gender on the live feed from the webcam. For this, we used the opencv library. We used haarcascade for face detection. Then, we extracted the face from the image, converted it to grayscale format and resized it to a 70x70 pixel matrix. Then, we fed it to our constructed model and obtained the output.

Finally, we drew the obtained results onto the image frame using the opencv library.
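The webcam pipeline of Section 4 can be sketched as follows for the gender model only. The pre-trained age model is left out because the paper does not identify it. The function names, the 0.5 threshold, the detector parameters and the window title are assumptions. `annotate_webcam` needs the opencv-python package and a trained Keras model.

```python
def gender_from_score(score, threshold=0.5):
    """Map the sigmoid output to a label (male=0, female=1 in the paper)."""
    return "female" if score >= threshold else "male"


def annotate_webcam(model, camera_index=0):
    """Detect faces on a webcam feed and draw the predicted gender."""
    import cv2  # requires opencv-python

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            for (x, y, w, h) in faces:
                face = cv2.resize(gray[y:y + h, x:x + w], (70, 70))
                # Add batch and channel axes: (1, 70, 70, 1). Apply the same
                # pixel scaling here that was used during training, if any.
                batch = face.reshape(1, 70, 70, 1).astype("float32")
                score = float(model.predict(batch, verbose=0)[0][0])
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
                cv2.putText(frame, gender_from_score(score), (x, y - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
            cv2.imshow("Gender", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

The face crop is resized to 70x70 so that it matches the input size the model was trained on.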

[Link]

We successfully predicted the age group and gender of a person from images fed to the code via webcam, with an accuracy score of 80%.

[Link](Gender and Age range): sample outputs labelled with the age ranges (8, 12) and (21, 32).

[Link]

Though many previous methods have addressed the problems of age and gender classification, until recently, much of this work has focused on constrained images taken in lab settings. Such settings do not adequately reflect appearance variations common to the real-world images in social websites and online repositories. Internet images, however, are not
simply more challenging: they are also
abundant. The easy availability of huge image collections provides modern machine learning based systems with effectively endless training data. Taking example from the related problem of face recognition we explore how well deep CNN perform on these tasks using Internet data. We provide results with a lean deep-learning architecture designed to avoid overfitting due to the limitation of limited labeled data. We further inflate the size of the training data by artificially adding cropped versions of the images in our training set. The resulting system was tested on unfiltered images and shown to significantly outperform recent state of the art. First, CNN can be used to provide improved age and gender classification results, even considering the much smaller size of contemporary unconstrained image sets labeled for age and gender. Second, the simplicity of our model implies that more elaborate systems using more training data may well be capable of substantially improving results beyond those reported here.

REFERENCES

[1] T. Ahonen, A. Hadid, and M. Pietikainen. Face description with local binary patterns: Application to face recognition. Trans. Pattern Anal. Mach. Intell., 28(12):2037–2041, 2006.
[2] [Link] H. [Link]. Boosting sex identification performance. Int. J. Comput. Vision, 71(1):111–119, 2007.
[3] [Link], [Link], [Link], [Link] hall. Learning distance functions using equivalence relations. In Int. Conf. Mach. Learning, volume 3, pages 11–18, 2003.
[4] W.-L. Chao, J.-Z. Liu, and J.-J. Ding. Facial age estimation based on label-sensitive learning and age-oriented regression. Pattern Recognition, 46(3):628–641, 2013.
[5] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. arXiv preprint arXiv:1405.3531, 2014.
[6] J. Chen, S. Shan, C. He, G. Zhao, M. Pietikainen, X. Chen, and W. Gao. Wld: A robust local image descriptor. Trans. Pattern Anal. Mach. Intell., 32(9):1705–1720, 2010.
[7] S. E. Choi, Y. J. Lee, S. J. Lee, K. R. Park, and J. Kim. Age estimation using a hierarchical classifier based on global and local facial features. Pattern Recognition, 44(6):1262–1281, 2011.
[8] T. F. Cootes, G. J. Edwards, and C. J. Taylor. Active appearance models. In European Conf. Comput. Vision, pages 484–498. Springer, 1998.
[9] [Link]. Support-vector networks. Machine learning, 20(3):273–297, 1995.
[10] E. Eidinger, R. Enbar, and T. Hassner. Age and gender estimation of unfiltered faces. Trans. on Inform. Forensics and Security, 9(12), 2014.
[11] Y. Fu, G. Guo, and T. S. Huang. Age synthesis and estimation via faces: A survey. Trans. Pattern Anal. Mach. Intell., 32(11):1955–1976, 2010.
[12] Y. Fu and T. S. Huang. Human age estimation with regression on discriminative aging manifold. Int. Conf. Multimedia, 10(4):578–584, 2008.
[13] K. Fukunaga. Introduction to statistical pattern recognition. Academic press, 1991.
[14] A. C. Gallagher and T. Chen. Understanding images of groups of people. In Proc. Conf. Comput. Vision Pattern Recognition, pages 256–263. IEEE, 2009.
[15] F. Gao and H. Ai. Face age classification on consumer images with gabor feature and fuzzy lda method. In Advances in biometrics, pages 132–141. Springer, 2009.
[16] X. Geng, Z.-H. Zhou, and K. Smith-Miles. Automatic age estimation based on facial aging patterns. Trans. Pattern Anal. Mach. Intell., 29(12):2234–2240, 2007.
[17] B. A. Golomb, D. T. Lawrence, and T. J. Sejnowski. Sexnet: A neural network identifies sex from human faces. In Neural Inform. Process. Syst., pages 572–579, 1990.
[18] A. Graves, A.-R. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pages 6645–6649. IEEE, 2013.
[19] G. Guo, Y. Fu, C. R. Dyer, and T. S. Huang. Image-based human age estimation by manifold learning and locally adjusted robust regression. Trans. Image Processing, 17(7):1178–1188, 2008.
[20] G. Guo, G. Mu, Y. Fu, C. Dyer, and T. Huang. A study on automatic age estimation using a large database. [Link]. Conf. Comput. Vision, pages 1986–1991. IEEE, 2009.
[21] H. Han, C. Otto, and A. K. Jain. Age estimation from face images: Human vs. machine performance. In Biometrics (ICB), 2013 International Conference on. IEEE, 2013.
[22] T. Hassner. Viewing real-world faces in 3d. In Proc. Int. Conf. Comput. Vision, pages 3607–3614. IEEE, 2013.
[23] T. Hassner, S. Harel, E. Paz, and R. Enbar. Effective face frontalization in unconstrained images. [Link]. Vision Pattern Recognition, 2015.
[24] [Link], [Link], [Link], [Link] ever, and R. R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
[25] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical report, Technical Report 07-49, University of Massachusetts, Amherst, 2007.
[26] [Link], [Link], [Link], [Link], J. Long, [Link], S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[27] [Link], [Link], [Link], [Link], [Link], and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In Proc. Conf. Comput. Vision Pattern Recognition, pages 1725–1732. IEEE, 2014.
[28] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Neural Inform. Process. Syst., pages 1097–1105, 2012.
[29] Y. H. Kwon and N. da Vitoria Lobo. Age classification from facial images. In Proc. Conf. Comput. Vision Pattern Recognition, pages 762–767. IEEE, 1994.
[30] A. Lanitis. The FG-NET aging database, 2002. Available: www-[Link]/FGnet/html/[Link].
