Handwritten Character Recognition System
Tirapathi Reddy B (1), Viswanath J (2), Elangovan Guruva Reddy (3*), Viswanathan R (4), Monica P (5)
Abstract: Digitizing handwritten documents and enabling efficient information processing and retrieval require systems that can recognize
handwritten characters. This research offers an approach to handwritten character recognition using state-of-the-art machine learning
algorithms. The proposed technique automatically extracts discriminative features from images of handwritten characters using
convolutional neural networks (CNNs). A classifier then uses these features to determine which character each image represents. The dataset
used for training and evaluation is a large collection of handwritten characters covering various writing styles, sizes, and
orientations, chosen to support the robustness and generalization of the model. To improve its quality and diversity, the training
data goes through a preparation procedure that includes image augmentation, noise removal, and normalization. The experimental
results indicate that the proposed system recognizes handwritten characters accurately across a range of languages and writing
styles. The system performs competitively compared to state-of-the-art methods and is robust to variations in
handwriting style and quality. Furthermore, the system shows potential in terms of efficiency and scalability, making it a candidate for real-time
applications such as document digitization, handwritten word recognition in electronic devices, and automatic form processing.
Keywords: Handwriting recognition, Character recognition, Deep learning, Convolutional neural networks (CNN), Pattern
recognition
I. INTRODUCTION
Handwritten character recognition (HCR) is central to the broader domains of artificial intelligence, image
processing, and pattern recognition. As more work moves into digital form, the need to connect the analogue and
digital domains smoothly becomes increasingly clear. HCR plays a pivotal role here: by automating the laborious
process of transcribing handwritten text into a machine-readable format, it unlocks the information stored in
analogue documents, as shown in Figure 1 [1]. This introduction examines the historical development, guiding
principles, techniques, difficulties, and range of applications of HCR, emphasizing the technology's influence
on several industries and its potential to shape human-computer interaction going forward.
A. Historical Evolution
HCR has its origins in the ambitious effort to teach machines to read human handwriting, which began in the
middle of the 20th century. Early efforts were crude because of limited computing capacity and the absence of
advanced algorithms. As technology developed, especially in recent decades, the field of HCR saw a paradigm
change. The introduction of artificial neural networks and machine learning transformed character recognition
techniques and opened the door to more sophisticated and effective systems [2].
1 Professor, Department of Information Technology, P.V.P. Siddhartha Institute of Technology, Kanuru, Vijayawada, AP, 520007,
India. [email protected]
2 Assistant Professor, Department of Artificial Intelligence and Data Science, Madanapalle Institute of Technology and Science,
AP, India. [email protected]
3* Corresponding author: Associate Professor, Department of Computer Science and Engineering, Koneru Lakshmaiah
Education Foundation, Vijayawada, AP, 522302, India. [email protected]
4 Professor, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vijayawada, AP,
522302, India. [email protected]
5 Assistant Professor, School of Electrical and Electronics Engineering, VIT Bhopal University, Bhopal, Madhya Pradesh, India.
[email protected]
Copyright © JES 2024 on-line : journal.esrgroups.org
J. Electrical Systems 20-3 (2024): 1465-1475
Character identification and classification in the initial generation of HCR systems was mostly done using rule-
based techniques, which used templates and predetermined rules [3]. Nevertheless, the applicability of these
systems was limited due to their inability to adjust to the intrinsic variety in human handwriting. With the advent
of machine learning techniques, especially the use of neural networks, the turn of the century saw a dramatic shift
[4,5]. This was a turning point because it allowed HCR systems to learn from different handwriting styles and
generalize from them, which improved their ability to handle real-world situations.
B. Underlying Principles
HCR uses a variety of approaches to interpret the intricacies of handwritten language, operating at the intersection
of computer science, pattern recognition, and artificial intelligence. Fundamentally, HCR is the process of
extracting significant features from raw input data, usually in the form of images [6]. Machine learning models
are trained on these features, which help them identify the patterns and correlations that are essential for
precise character recognition.
Inspired by the architecture and operation of the human brain, neural networks have become a major player in HCR.
In particular, Convolutional Neural Networks (CNNs) have shown great efficacy at recognizing the patterns and spatial
hierarchies seen in handwritten characters. Given the temporal dependencies present in handwriting, recurrent neural
networks (RNNs) are also useful for character sequences [7]. Together with developments in deep learning
architectures, the combination of these networks has enabled HCR to reach high levels of accuracy and
adaptability.
Neural networks became a prominent paradigm in HCR research as machine learning gained traction, especially over
the past two decades. LeCun et al. (1998) applied convolutional neural networks (CNNs) to handwritten digit
recognition and made significant advances in robustness and accuracy [20]. The ability of CNNs to
extract hierarchical features was crucial in identifying the spatial hierarchies and patterns seen in handwritten
characters. Recurrent Neural Networks (RNNs) addressed issues with cursive writing and sequence recognition by
adding a time dimension to HCR. Long Short-Term Memory (LSTM) networks, a kind of RNN, were shown
by Graves et al. (2009) to be effective at managing sequential dependencies in handwriting [21]. Graves (2013)
suggested integrating CNNs and RNNs in hybrid models, which demonstrated synergies that further enhanced
recognition performance, especially in unconstrained handwriting conditions [22].
Feature extraction remains a crucial component of HCR, since it determines how well models can discriminate
between handwriting patterns. With manual feature engineering, researchers define
characteristics such as stroke thickness, slant, and curvature. The usefulness of geometric and statistical features
was investigated by Blumenstein and Verma (2007), demonstrating the significance of feature selection in raising
recognition accuracy [23]. Automatic feature learning gained prominence in the deep learning era. Simard et al.
(2003) described best practices for convolutional neural networks applied to visual document analysis [24].
Capsule networks, proposed later, offer a more dynamic and hierarchical feature extraction method intended to
improve a model's capacity to adjust to different writing styles.
Notwithstanding notable advancements, difficulties in handwriting recognition persist because of the inherent variety
of human handwriting. Challenges with variations in character size, slant, and curvature were noted by Plamondon
and Srihari (2000) [18]. Recognition is further complicated by noise, uneven
strokes, and overlapping letters. The absence of standardized datasets that reflect a range of writing styles is a major
obstacle to training generalizable models. The need for HCR systems to adapt to many languages and scripts adds a
further degree of complexity. Studies of multilingual HCR highlight the necessity for adaptable models that can
support a variety of writing systems [25]. Furthermore,
unconstrained handwriting recognition is still an open research topic that requires models to handle the variety of
writing styles seen in real-world contexts [26].
Figure 2. Sample data from the IAM dataset, panels (i) and (ii)
The creation of standardized datasets and benchmarks is a key component of measuring the effectiveness and
progress of HCR systems. The IAM Handwriting Database, a benchmark dataset that was widely adopted by the HCR
community, was presented in [27]; sample data are shown in Figure 2. This dataset served as a foundation for
evaluating system performance and allowed for fair comparisons between recognition techniques.
Competitions such as those held at the International Conference on Frontiers in Handwriting Recognition
(ICFHR) have been essential in measuring advances in HCR [25, 28]. These contests push the limits of
robustness and accuracy in recognition while also offering standardized datasets and encouraging healthy rivalry.
HCR's adaptability to a wide range of applications in several fields has aided in the development of administrative
procedures, document digitalization, healthcare, and education. According to Fischer et al. (2014), HCR makes it
easier to digitize old handwritten records, conserving historically significant and cultural artefacts that were
previously difficult to access [29].
HCR applications have focused on automating administrative procedures, especially form processing. The
effectiveness of HCR in automating data entry operations, decreasing human labor, and minimizing mistakes related
to processing handwritten forms was proved by Kim et al. (2015) [30]. HCR is useful to the banking industry for
activities like cheque processing, where solutions such as the one suggested by Blumenstein et al. (2011) improve
transaction processing security and speed [31]. HCR aids in the creation of interactive teaching resources and
tutoring programmes in the field of education. Huenerfauth et al. (2009) conducted a study that demonstrated the
ability of handwritten assignments and evaluations to be converted into digital forms using HCR. This might lead
to improved accessibility and cooperation opportunities for instructors and students [32]. HCR is utilized by the
healthcare industry to digitize medical documents and patient records. HCR speeds up the process of converting
handwritten medical notes into electronic health records (EHRs), as shown by Rath et al. (2018), making the process
more efficient and accessible [33].
[Figure: ANN architecture. Block labels: Feature Extraction (CNN, Dense Network), Temporal Dependences (RNN, Dense Network), Fine Tuning / Optimization.]
1469
J. Electrical Systems 20-3 (2024): 1465-1475
B. Feature Extraction
The HCR System has implemented a two-pronged strategy for feature extraction, using both automatic and manual
procedures. Handwriting parameters such as slant, curvature, stroke thickness, and spatial correlations are
meticulously identified by manual extraction, which has given the system a sophisticated grasp of these qualities.
Concurrently, automated feature extraction has been made possible by the use of deep learning methods, particularly
CNNs and RNNs. This puts the system in a position to learn intricate hierarchical representations on its own from
the unprocessed input data, which is a critical step towards developing a more flexible and all-encompassing
recognition framework.
The suggested model is a hybrid neural network that combines convolutional neural networks (CNNs) and recurrent
neural networks (RNNs). This hybrid architecture is preferred over alternative methods because it combines the
complementary advantages of CNNs and RNNs for a thorough representation of handwritten
characters. CNNs are good at extracting visual features from handwritten letters because they
recognize spatial hierarchies and patterns within images. RNNs, especially LSTM networks,
are well suited to the sequential nature of the strokes in cursive handwriting and capture temporal
relationships efficiently. Integrating the two types of network makes the hybrid model adaptable to various
writing styles and variations, and enables a more comprehensive and nuanced interpretation of handwritten
characters. This method improves the accuracy and resilience of the model and provides an improved answer to the
problems caused by the complexities of handwritten character recognition.
CNNs are well suited to processing the visual input of handwritten characters since they are good at recognizing spatial
hierarchies and patterns in images; the CNN architecture is given in Figure 4. Their strength lies in extracting features
from the raw pixel data of images, an essential capability for identifying handwritten characters with diverse
forms and styles.
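As an illustration of how a convolutional layer turns raw pixels into features, the following is a minimal pure-Python sketch, not the paper's implementation. The 5x5 toy glyph, the hand-set `vertical_edge` kernel, and the function names are assumptions made for this example; in a trained CNN the kernel values are learned rather than chosen by hand.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Zero out negative responses."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map: Matrix) -> Matrix:
    """Keep the strongest response in each non-overlapping 2x2 window."""
    h = len(feature_map) // 2 * 2
    w = len(feature_map[0]) // 2 * 2
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


# Toy 5x5 glyph (assumed data): a vertical stroke in the middle column, 1.0 = ink.
glyph = [[0.0, 0.0, 1.0, 0.0, 0.0] for _ in range(5)]
# Hypothetical hand-set kernel that responds to a background-to-ink vertical edge.
vertical_edge = [[-1.0, 1.0], [-1.0, 1.0]]

features = max_pool2x2(relu(conv2d_valid(glyph, vertical_edge)))
print(features)
```

The pooled map responds only in the left half, where the left edge of the stroke lies, which is the kind of localized spatial cue that deeper layers combine into stroke and character shapes.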
RNNs, shown schematically in Figure 5, work well with sequential data, especially Long Short-
Term Memory (LSTM) networks. Their ability to capture temporal relationships in the sequence of handwritten
strokes is needed to recognize cursive handwriting or letters with complicated structures.
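To make the idea of temporal dependence concrete, here is a minimal sketch of a plain (Elman-style) recurrent cell in pure Python. It is deliberately simpler than the LSTM networks mentioned above, which add gating; the weights and the two-point "stroke" sequences are arbitrary values assumed for illustration.

```python
import math
from typing import List, Sequence


def rnn_forward(inputs: Sequence[Sequence[float]],
                w_in: List[List[float]],
                w_rec: List[List[float]],
                bias: List[float]) -> List[float]:
    """Apply h_t = tanh(W_in x_t + W_rec h_{t-1} + b) over the sequence; return the final h."""
    hidden = [0.0] * len(bias)
    for x in inputs:
        # The comprehension reads the previous `hidden` before the name is rebound.
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[k], x))
                + sum(w * hj for w, hj in zip(w_rec[k], hidden))
                + bias[k]
            )
            for k in range(len(bias))
        ]
    return hidden


# Arbitrary illustrative weights (assumptions, not trained values).
w_in = [[0.5, -0.5], [0.3, 0.8]]
w_rec = [[0.1, 0.4], [-0.2, 0.3]]
bias = [0.0, 0.0]

# Two toy pen-movement sequences (dx, dy) containing the same moves in opposite order.
right_then_up = [(1.0, 0.0), (0.0, 1.0)]
up_then_right = [(0.0, 1.0), (1.0, 0.0)]

h_a = rnn_forward(right_then_up, w_in, w_rec, bias)
h_b = rnn_forward(up_then_right, w_in, w_rec, bias)
print(h_a, h_b)
```

The two final states differ even though both sequences contain the same moves, which shows that the hidden state encodes stroke order and not just stroke content.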
During dataset preparation, a wide range of handwritten characters is gathered from established datasets
such as MNIST and the IAM Handwriting Database. Every dataset is annotated to provide ground truth labels for
training. In the training phase, the model measures the difference between the predicted
and true character labels using a loss function such as categorical cross-entropy. Backpropagation, in which
errors are propagated backward through the network to adjust the weights, is used to optimize the
model's parameters. The parameters are updated iteratively with optimization techniques such
as Adam or Stochastic Gradient Descent (SGD), which promotes smooth and consistent convergence.
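The training loop described above can be sketched for the simplest possible case: a single linear layer with a softmax output, trained with categorical cross-entropy and plain SGD. This is an illustrative sketch under stated assumptions (three classes, a two-value feature vector, a learning rate of 0.1), not the paper's network; for a softmax output the gradient of the loss with respect to each logit is the predicted probability minus the one-hot target.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: List[float], target: int) -> float:
    """Categorical cross-entropy for a single example with an integer class label."""
    return -math.log(probs[target])


def sgd_step(weights: List[List[float]], features: List[float],
             target: int, lr: float) -> float:
    """One in-place SGD update of a linear softmax classifier; returns the pre-update loss."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(logits)
    loss = cross_entropy(probs, target)
    for k, row in enumerate(weights):
        grad_logit = probs[k] - (1.0 if k == target else 0.0)  # dL/dz_k
        for j, x in enumerate(features):
            row[j] -= lr * grad_logit * x  # dL/dW_kj = dL/dz_k * x_j
    return loss


# Assumed toy setup: 3 character classes, 2 input features, zero-initialized weights.
weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
features, target = [1.0, 2.0], 2
losses = [sgd_step(weights, features, target, lr=0.1) for _ in range(5)]
print(losses)
```

The first loss equals ln 3 because zero weights give a uniform prediction over three classes, and the loss then falls on each step. Adam would replace the fixed-rate update with per-parameter adaptive steps.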
[Figure: diagram with weights W1, W2, ..., Wn, a Dense Layer, and a Sigmoid block.]
Through the integration of CNNs for spatial feature extraction and RNNs for sequential pattern recognition, the
hybrid model is able to recognize handwritten characters with strong performance, even across a wide range of
handwriting styles and variances. The rigorous training, optimization, and dataset preparation procedures further
add to the model's efficacy and accuracy in practical Handwritten Character Recognition applications.
E. Post Processing
The HCR System's post-processing domain presents advanced decoding techniques, such as using beam search for
sequence-based recognition tasks. In addition, a thorough application of error correction methods has been carried
out to address any mistakes, specifically concentrating on situations when characters are unclear or overlap. This
stage demonstrates the dedication to improving the system's output and provides a greater level of precision and
dependability. The accuracy of the system's character recognition is further improved by adding more neural
network components, rule-based techniques, or language models.
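A minimal sketch of beam search decoding combined with a language-model score is given below. The per-position character probabilities, the bigram scores in `BIGRAM_LOGP`, and the fallback penalty are all made-up values for illustration; the paper does not specify its decoder or language model.

```python
import math
from typing import Callable, Dict, List, Tuple


def beam_search(step_probs: List[Dict[str, float]],
                lm_score: Callable[[str, str], float],
                beam_width: int = 2) -> List[Tuple[str, float]]:
    """Return the `beam_width` best (text, log-score) hypotheses, best first."""
    beams: List[Tuple[str, float]] = [("", 0.0)]
    for probs in step_probs:
        candidates = []
        for text, score in beams:
            for char, p in probs.items():
                if p <= 0.0:
                    continue  # skip impossible characters (log undefined)
                candidates.append(
                    (text + char, score + math.log(p) + lm_score(text, char))
                )
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]  # prune to the best hypotheses
    return beams


# Hypothetical bigram log-scores; unseen pairs get a flat penalty.
BIGRAM_LOGP = {"ca": -0.2, "co": -0.3, "at": -0.1, "ot": -1.5}


def bigram_lm(prefix: str, char: str) -> float:
    if not prefix:
        return 0.0
    return BIGRAM_LOGP.get(prefix[-1] + char, -1.0)


# Assumed recognizer output: the middle character is ambiguous between "o" and "a".
step_probs = [
    {"c": 0.9, "e": 0.1},
    {"o": 0.55, "a": 0.45},
    {"t": 0.8, "l": 0.2},
]

best_wide = beam_search(step_probs, bigram_lm, beam_width=2)[0][0]
best_greedy = beam_search(step_probs, bigram_lm, beam_width=1)[0][0]
print(best_wide, best_greedy)
```

With a beam width of 1 the decoder commits to the locally best "o" and ends at "cot", while a width of 2 keeps the "ca" hypothesis alive long enough for the language-model score to favour "cat". This is how a wider beam can resolve unclear characters.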
Accuracy = (TP+TN)/(TP+FP+TN+FN)
B. Precision
Precision is the ratio of correctly identified positives to all instances predicted as positive:
Precision = TP/(TP+FP)
C. Recall
Recall, or sensitivity, is the proportion of relevant instances retrieved out of all relevant instances:
Recall = TP/(TP+FN)
D. F-Measure/F1-Score
The F-measure takes both precision and recall into consideration. It is the harmonic mean of the two values:
F1 = 2 × (Precision × Recall)/(Precision + Recall)
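The evaluation measures in this section can be computed directly from confusion-matrix counts, as in the short sketch below. The F1 score is taken as the harmonic mean of precision and recall, the standard definition. The counts in the example are invented for illustration and are not results from the paper.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for one character class treated as the positive class.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a multi-class recognizer these values are usually computed per class and then averaged. The averaging scheme is a design choice, and the paper does not state which one it uses.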
From Figure 6 we can infer an accuracy of 93.8% for the hybrid model, and the assessment results demonstrate its
improved performance in handwritten character recognition. With the CNN-only model scoring 91.2% and the
RNN-only model scoring 88.5%, the hybrid model outperforms the separate models. In terms of precision, the hybrid
model scores 93.5%, ahead of the CNN-only model (92.4%) and the RNN-only model (93.1%). With
a recall of 92.4%, the hybrid model outperforms both the CNN-only model (90.1%) and the RNN-only model
(89.5%). The hybrid approach's balanced performance is further shown by its F1 score of 92.7%, which surpasses
that of the CNN-only (91.2%) and RNN-only (89.3%) models. These findings support the synergistic
advantages of combining RNNs and CNNs for improving the overall accuracy and precision of Handwritten
Character Recognition.
[Figure 6: Model Comparison. Bar chart of Accuracy (%), Precision (%), Recall (%), and F1 Score (%) on a vertical axis from 85 to 95.]
The effectiveness of the HCR System depends heavily on post-processing techniques, such as error correction
methods and decoding with beam search for sequence-based recognition tasks. The decoding techniques help refine
the output, particularly for intricate sequences. The error correction algorithms are specifically designed to
address overlapping or unclear characters, which supports precise and dependable results.
The fine-tuning and optimization phase adds to the system's versatility and adaptability. Hyperparameters such as
batch size and learning rate were examined carefully to maximize the model's performance. The use of transfer
learning, particularly where labelled data is sparse, supports the system's adaptability
and efficiency across a variety of datasets and practical uses. This adaptive methodology allows the HCR System
to meet the requirements of dynamic recognition tasks.
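One common way to examine batch sizes and learning rates is an exhaustive grid search, sketched below. The paper does not say which search strategy was used, so this is an assumption. The `toy_validation_accuracy` function is a made-up stand-in for training the model and measuring validation accuracy.

```python
import math
from itertools import product
from typing import Callable, Iterable, Tuple


def grid_search(batch_sizes: Iterable[int],
                learning_rates: Iterable[float],
                evaluate: Callable[[int, float], float]) -> Tuple[float, int, float]:
    """Evaluate every (batch_size, learning_rate) pair; return (best_score, batch_size, lr)."""
    best = None
    for batch_size, lr in product(batch_sizes, learning_rates):
        score = evaluate(batch_size, lr)
        if best is None or score > best[0]:
            best = (score, batch_size, lr)
    if best is None:
        raise ValueError("empty search grid")
    return best


def toy_validation_accuracy(batch_size: int, learning_rate: float) -> float:
    """Hypothetical placeholder score that peaks at batch_size=64, lr=1e-3.

    In practice this would train the model with the given settings and
    return the accuracy measured on a held-out validation set.
    """
    return 1.0 - 0.1 * abs(math.log10(learning_rate) + 3) - abs(batch_size - 64) / 1000


best = grid_search([32, 64, 128], [1e-2, 1e-3, 1e-4], toy_validation_accuracy)
print(best)
```

Grid search is simple but its cost grows with the product of the grid sizes. Random search or Bayesian optimization are common alternatives when each training run is expensive.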
Going forward, there are encouraging opportunities for further application and improvement in the field of hybrid
Handwritten Character Recognition (HCR). Although combining recurrent and convolutional neural networks has
produced strong results in the current work, several areas remain open for improvement
and growth. Integrating attention mechanisms into the hybrid architecture is a noteworthy direction for future
research. In natural language processing tasks, attention mechanisms have proven successful in enabling
models to focus selectively on pertinent segments of input sequences. Including attention mechanisms in the hybrid
model could improve recognition accuracy, especially for complicated or
densely written text, by enhancing the model's capacity to capture fine details and relationships within
handwritten letters.
Furthermore, more sophisticated transfer learning techniques offer potential for future deployment. It may be
possible to accelerate convergence and improve performance by using models pre-trained
on large datasets and adapting them to the characteristics of handwritten character recognition,
particularly when labelled data is scarce. Fine-tuning the hybrid model with transfer learning
could improve the system's ability to generalize across a variety of handwriting
styles and languages. In post-processing, more complex error correction methods,
including sophisticated language models or neural network components, may improve the recognition results even
further. This could be particularly helpful for increasing the system's tolerance of difficult handwriting
variants where characters overlap or are ambiguous. A dynamic learning component might be added to the system by
incorporating reinforcement learning techniques. Through interaction with real-world data and
user input, reinforcement learning may allow the model to evolve and enhance its recognition
capabilities over time. This iterative learning procedure could improve the system's flexibility and responsiveness to
changing patterns in various handwriting styles.
V. CONCLUSION
To sum up, the Handwritten Character Recognition System shows notable improvements in character
recognition accuracy by using a hybrid technique that combines Convolutional Neural Networks (CNNs) and
Recurrent Neural Networks (RNNs). The combination of sequential and spatial information improves flexibility across
a variety of handwriting styles. Effective training on datasets such as MNIST and the IAM Handwriting Database,
along with careful pre- and post-processing, highlights the resilience of the system. Although the findings are
encouraging, there are opportunities to further refine and extend the system's capabilities in tackling
emerging issues in handwritten character recognition through future directions such as reinforcement
learning and attention mechanisms.
REFERENCES
[1] Basilis Gatos, Nikolaos Stamatopoulos, Georgios Louloudis. ICDAR 2009 handwriting segmentation contest. In
Proceedings of the Tenth International Conference on Document Analysis and Recognition (ICDAR) (pp. 25-33).
[2] Kaur, H., Kumar, M, “Signature identification and verification techniques: state-of-the-art work,” J Ambient Intell
Human Comput 14, 2023, 1027–1045.
[3] Wang, K., & Belongie, S, “Word spotting in the wild,” In Proceedings of the 11th European Conference on Computer
Vision: Part I (ECCV), 2010, (pp. 591-604).
[4] Huang, K., Hussain, A., Wang, QF., Zhang, R. (eds), “Deep Learning: Fundamentals, Theory and Applications,”
Cognitive Computation Trends, 2019, vol 2.
[5] Del Bimbo, A., Cucchiara, R., Sclaroff, S., Farinella, G. M., Mei, T., Bertini, M., & Vezzani, R. (Eds.), “Pattern
Recognition,” ICPR International Workshops and Challenges: Virtual Event, January 10–15, 2021, Proceedings, Part III
(Vol. 12663). Springer Nature.
[6] Shao, Y., Wang, C. & Xiao, B, “A character image restoration method for unconstrained handwritten Chinese character
recognition,” IJDAR, 2015, 18, 73–86.
[7] Graves, Alex & Liwicki, Marcus & Fernández, Santiago & Bertolami, Roman & Bunke, Horst & Schmidhuber, Jürgen,
“A Novel Connectionist System for Unconstrained Handwriting Recognition,” IEEE transactions on pattern analysis and
machine intelligence. 2009, 31. 855-68.
[8] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” 2005 IEEE Computer Society
Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, pp. 886-893 vol. 1
[9] Liwicki, Marcus & Bunke, H, “IAM-On DB - An on-line English sentence database acquired from handwritten text on
a whiteboard,” Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. 2005. 956
- 961 Vol. 2.
[10] N. Singh, “An Efficient Approach for Handwritten Devanagari Character Recognition based on Artificial Neural
Network,” 2018 5th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, pp.
894-897
[11] Gatos, B., Pratikakis, I., & Perantonis, S, “Adaptive binarization of historical manuscripts,” In Proceedings of the
International Conference on Document Analysis and Recognition (ICDAR), 2005, Vol. 2, pp. 959-963.
[12] Ciresan, D., Meier, U., Gambardella, L. M., & Schmidhuber, J, “Convolutional neural network committees for
handwritten character classification,” In Proceedings of the International Conference on Document Analysis and
Recognition (ICDAR), 2011, pp. 1135-1139.
[13] Chiu, C. Y., & Wu, J. L, “Recognizing unconstrained handwritten numerals using support vector machines,” In
Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2001, Vol. 2, pp. 890-
894.
[14] Yang, X., & Zhang, Y, “Handwritten digit recognition using deep learning by combining spatial and spectral
information,” In Proceedings of the International Conference on Image Processing (ICIP) 2018, pp. 4093-4097.
[15] Seni, G., Elder, J. F., & Fateman, R, “Ensemble-based recognition,” In Proceedings of the Fourth International
Conference on Document Analysis and Recognition (ICDAR), 1997, Vol. 2, pp. 827-831.
[16] Xuefeng Xiao, Lianwen Jin, Yafeng Yang, Weixin Yang, Jun Sun, Tianhai Chang, “Building fast and compact
convolutional neural networks for offline handwritten Chinese character recognition,” Pattern Recognition, Volume 72,
2017, Pages 72-81.
[17] Pal, U., & Chaudhuri, B. B, “Indian script character recognition: A survey,” Pattern Recognition, 37(9), 2004, 1887-
1899.
[18] Plamondon, R., & Srihari, S. N, “On-line and off-line handwriting recognition: a comprehensive survey,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, 22(1), 2000, 63-84.
[19] Juang, B. H., & Rabiner, L. R, “Hidden Markov Models for Speech Recognition,” Technometrics, 33(3), 1991, 251-272.
[20] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P, “Gradient-based learning applied to document recognition,”
Proceedings of the IEEE, 86(11), 1998, 2278-2324.
[21] Graves, A., Fernández, S., Gomez, F., & Schmidhuber, J, “Connectionist temporal classification: Labelling unsegmented
sequence data with recurrent neural networks,” In Proceedings of the 23rd International Conference on Machine
Learning (ICML), 2006, pp. 369-376.
[22] Graves, A, “Generating sequences with recurrent neural networks,” arXiv preprint arXiv:1308.0850.
[23] Schenk, J., Lenz, J., & Rigoll, G, “Novel script line identification method for script normalization and feature extraction
in on-line handwritten whiteboard note recognition,” Pattern Recognition, 42(12), 2009, 3383-3393.
[24] Simard, P. Y., Steinkraus, D., & Platt, J. C, “Best practices for convolutional neural networks applied to visual document
analysis,” In Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR),
2003, Vol. 2, pp. 958-963.
[25] A. Antonacopoulos, B. Gatos and N. Stamatopoulos, “Handwriting Segmentation Contest, in 2007,” 9th International
Conference on Document Analysis and Recognition, Parana, 2007, pp. 1284-1288.
[26] Inunganbi, S, “A systematic review on handwritten document analysis and recognition,” Multimed Tools Appl 83, 2024,
5387–5413.
[27] Marti, U. V., & Bunke, H, “An English sentence database for offline handwriting recognition,” International Journal on
Document Analysis and Recognition, 2002, 5(1), 39-46.
[28] Ammour N, Bazi Y, Alajlan N, “Multimodal Approach for Enhancing Biometric Authentication,” Journal of Imaging.
2023, 9(9):168.
[29] Nguyen, T. T. H., Jatowt, A., Coustaty, M., & Doucet, A, “Survey of post-OCR processing approaches,” ACM
Computing Surveys (CSUR), 2021, 54(6), 1-37
[30] Handley, John, “Improving OCR accuracy through combination: a survey,” IEEE International Conference on Systems,
Man, and Cybernetics (Cat. No.98CH36218), 1998, 4330 - 4333 vol.5.
[31] Cai, J., & Liu, Z. Q, “Integration of structural and statistical information for unconstrained handwritten numeral
recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(3), 263-270.
[32] Hassan, L, “Accessibility of games and game-based applications: A systematic literature review and mapping of future
directions,” New Media & Society, 2024, 26(4), 2336-2384.
[33] Manogaran, G., & Lopez, D, “A survey of big data architectures and machine learning algorithms in healthcare,”
International Journal of Biomedical Engineering and Technology, 2017, 5(2-4), 182-211.