2021, International Journal of Advanced Computer Science and Applications
In this paper, we show how to classify Arabic document images using a convolutional neural network, one of the most common supervised deep learning algorithms. The main motivation for using deep learning is its ability to automatically extract useful features from images, which eliminates the need for manual feature extraction. Convolutional neural networks extract features from images through a convolution process involving various filters. We collected a variety of Arabic document images from various sources and passed them to a convolutional neural network classifier. We adopt a VGG16 network pre-trained on ImageNet to classify the dataset into four classes: handwritten, historical, printed, and signboard. For document image classification, we ran the dataset through the VGG16 convolutional layers and then trained a classifier on top of them. We extract features by freezing the pre-trained network's convolutional layers, then adding fully connected layers and training them on the dataset. We also add dropout after each max-pooling layer and at the fourteenth and seventeenth layers, which are fully connected layers. The proposed approach achieved a classification accuracy of 92%.
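The convolution and max-pooling steps mentioned in the abstract above can be illustrated with a small, dependency-free sketch. It is not the paper's VGG16 pipeline: the 5x5 toy image and the `vertical_edge` filter are hypothetical values chosen for illustration, and the "convolution" is the unflipped sliding dot product (cross-correlation) that CNN layers typically compute.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Sum of element-wise products between the kernel and the patch under it.
            total = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]


# Toy 5x5 "image": dark left side (0), bright right side (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical vertical-edge filter (illustrative, not a learned VGG16 filter).
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

features = conv2d_valid(image, vertical_edge)  # responds where dark meets bright
pooled = max_pool_2x2(features)                # downsampled summary of the response
```

In a pre-trained network such as VGG16 the filter values are learned rather than hand-written, and many filters are applied per layer; the sketch only shows the arithmetic of a single filter.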
2020
Handwritten Arabic, like other handwritten scripts (such as Latin and Chinese), has received increasing attention from researchers. To preserve and promote wider access to the invaluable cultural and literary heritage held in public and private manuscript collections, researchers have proposed and developed several approaches based on annotation, metadata, and transcription. The need to access manuscript text is growing on a large scale. For this reason, traditional indexing methods such as annotation or transcription will become outdated, as they require considerable and unreliable manual effort. It is therefore necessary to develop new tools for the identification and recognition of handwritten text contained in images. However, despite the progress shown by Convolutional Neural Networks (CNNs) in different computer vision tasks, they have not seen much use in the field of Arabic manuscripts, even though deep learning methods for predicting character classes, such as handwritten digits, have achieved very good results. Hence the idea of using deep learning techniques to classify words and characters in images of Arabic manuscripts. In this paper, we propose two classification methods to predict the class of each word, using the HADARA80P dataset. The first uses a simple neural network and the second uses a convolutional neural network. The experimental results obtained by these two methods are very interesting.
2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR), 2017
Deep learning is a form of hierarchical learning; it consists of multiple layers of representations that gradually transform data into high-level concepts. Deep learning has been providing state-of-the-art results for various computer vision problems. However, a typical deep learning algorithm needs a large amount of data to train a deep model and guarantee the model's ability to generalize. Generating large labeled datasets is not easy, and this is one of the main barriers to applying deep learning to many problems. Data augmentation schemes were introduced to overcome this limitation by extending small available labeled datasets. In this work, we experiment with extending a small labeled dataset of Arabic continuous subwords by orders of magnitude. The labeled dataset, which consists of handwritten Arabic subwords, is used to synthesize a large labeled collection. The synthesized subwords are based on one or multiple writing styles from the original labeled dataset. We also experiment with generating various printed forms of subwords. We include only the Naskh font, as most Arabic historical manuscripts were written in this style. We train several convolutional neural networks using handwritten, printed, and synthesized datasets and obtain encouraging results.
Handwritten Arabic character recognition systems face several challenges, including the unlimited variation in human handwriting and the need for large public databases. In this work, we model a deep learning architecture that can be effectively applied to recognizing Arabic handwritten characters. A Convolutional Neural Network (CNN) is a special type of feed-forward multilayer network trained in supervised mode. The CNN is trained and tested on our database, which contains 16,800 handwritten Arabic characters. In this paper, optimization methods are implemented to increase the performance of the CNN. Common machine learning methods usually apply a combination of a feature extractor and a trainable classifier. The use of a CNN leads to significant improvements across different machine learning classification algorithms. Our proposed CNN achieves an average misclassification error of 5.1% on the test data.
Computational Intelligence and Neuroscience, 2022
Handwritten character recognition is a challenging research topic. Many works have been presented to recognize the letters of different languages. The availability of Arabic handwritten character databases is limited. Motivated by this, we propose a convolutional neural network for the classification of Arabic handwritten letters. In addition, seven optimization algorithms are evaluated, and the best one is reported. Because few Arabic handwritten datasets are available, various data augmentation techniques are implemented to improve the robustness of the convolutional neural network model. The proposed model is improved by using dropout regularization to avoid overfitting. Moreover, suitable changes are made to the choice of optimization algorithms and data augmentation approaches to achieve good performance. The model has been trained on two Arabic handwritten character datasets, AHCD and Hijja. The proposed algorithm achieved high...
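The dropout regularization mentioned in the abstract above can be sketched without any deep learning library. This is a minimal illustration of the common "inverted dropout" variant, which is an assumption here since the abstract does not say which variant or rate was used; the `activations` values and the rate of 0.5 are hypothetical.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest.

    Dividing the surviving values by (1 - rate) keeps the expected activation
    unchanged, so no rescaling is needed at inference time.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)                 # seeded so the sketch is repeatable
activations = [0.5, 1.0, 1.5, 2.0]     # hypothetical layer outputs
train_out = dropout(activations, rate=0.5, rng=rng)  # training-time behaviour
infer_out = dropout(activations, rate=0.0, rng=rng)  # rate 0 leaves values untouched
```

During training a fresh random mask is drawn on every forward pass, which discourages the network from relying on any single unit.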
Neural Computing and Applications, 2020
Automatic handwriting recognition is an important component for many applications in various fields. It is a challenging problem that has received a lot of attention in the past three decades. Research has focused on the recognition of Latin languages’ handwriting. Fewer studies have been done for the Arabic language. In this paper, we present a new dataset of Arabic letters written exclusively by children aged 7–12 which we call Hijja. Our dataset contains 47,434 characters written by 591 participants. In addition, we propose an automatic handwriting recognition model based on convolutional neural networks (CNN). We train our model on Hijja, as well as the Arabic Handwritten Character Dataset (AHCD) dataset. Results show that our model’s performance is promising, achieving accuracies of 97% and 88% on the AHCD dataset and the Hijja dataset, respectively, outperforming other models in the literature.
Indonesian Journal of Electrical Engineering and Computer Science
A new method for automatically recognizing Arabic handwritten words is presented using a convolutional neural network architecture. The proposed method is based on a global approach, which recognizes whole words without segmenting them into characters to be recognized separately. A convolutional neural network (CNN) is a particular supervised type of neural network based on the multilayer principle; our method needs a large dataset of word images to obtain the best results. To optimize our system, a new database was built from the benchmark Arabic handwriting database using pre-processing such as rotation, which is applied to the images of the database to create new images with different features. The convolutional neural network is applied to our database, which contains 40,320 Arabic handwritten word images (26,880 for the training set and 13,440 for the test set). Thus, different configurations on a public benchmark database were evaluated and compared...
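The rotation-based augmentation described in the abstract above can be sketched as follows. This is an illustrative nearest-neighbour rotation about the image centre, not the authors' pre-processing code: the function name `rotate_nearest`, the `fill` value, and the list of angles are assumptions made for the example.

```python
import math


def rotate_nearest(image, degrees, fill=0):
    """Rotate a 2D list of pixels about its centre using nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for y in range(h):
        row = []
        for x in range(w):
            # Inverse mapping: find the source pixel that lands on (x, y).
            dx, dy = x - cx, y - cy
            src_x = round(cos_t * dx + sin_t * dy + cx)
            src_y = round(-sin_t * dx + cos_t * dy + cy)
            inside = 0 <= src_x < w and 0 <= src_y < h
            # Pixels that map outside the source image get the fill value.
            row.append(image[src_y][src_x] if inside else fill)
        rotated.append(row)
    return rotated


# Toy 3x3 "word image" and a hypothetical set of augmentation angles.
word_image = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]
angles = [-10, -5, 5, 10]
augmented = [rotate_nearest(word_image, angle) for angle in angles]
```

Each original image yields one extra sample per angle, which is how a modest benchmark set can be expanded into a larger training set.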
2022 25th International Conference on Computer and Information Technology (ICCIT)
Handwriting recognition has been a field of great interest in the artificial intelligence domain. Because of its broad real-life use cases, it has been researched widely. Prominent work has been done in this field, focusing mainly on Latin characters. However, the domain of Arabic handwritten character recognition is still relatively unexplored. The inherently cursive nature of Arabic characters and variations in writing styles across individuals make the task even more challenging. We identified some probable reasons behind this and proposed a lightweight Convolutional Neural Network-based architecture for recognizing Arabic characters and digits. The proposed pipeline consists of 18 layers in total: four layers each for convolution, pooling, batch normalization, and dropout, followed by one global average pooling layer and one dense layer. Furthermore, we thoroughly investigated different hyperparameter choices, such as the optimizer, kernel initializer, and activation function. Evaluated on the publicly available 'Arabic Handwritten Character Dataset (AHCD)' and 'Modified Arabic handwritten digits Database (MadBase)' datasets, the proposed model achieved accuracies of 96.93% and 99.35%, respectively, which is comparable to the state of the art and makes it a suitable solution for real-life end-level applications.
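The layer count in the abstract above can be checked with a short sketch that lists the layers by name. Only the counts come from the abstract; the order of layers inside each block and the names used in `build_layer_plan` are assumptions made for illustration.

```python
from collections import Counter


def build_layer_plan(num_blocks=4):
    """List layer names for a CNN with `num_blocks` repeated blocks and a small head."""
    plan = []
    for block in range(1, num_blocks + 1):
        # Order inside a block is an assumption; the abstract gives only the counts.
        plan += [
            f"conv_{block}",
            f"batch_norm_{block}",
            f"pool_{block}",
            f"dropout_{block}",
        ]
    # Classification head: one global average pooling layer and one dense layer.
    plan += ["global_average_pooling", "dense"]
    return plan


layer_plan = build_layer_plan()
# Count layers per kind by stripping the trailing block index.
kind_counts = Counter(name.rsplit("_", 1)[0] if name[-1].isdigit() else name
                      for name in layer_plan)
```

Four blocks of four layers plus the two-layer head give 4 x 4 + 2 = 18 layers, matching the total stated in the abstract.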
International Journal of Computer Applications
People are identified mainly through physiological characteristics such as fingerprints, face, iris, retina, and hand geometry, and through behavioral characteristics such as voice, signature, and handwriting. Identifying the author of a handwritten document has been an active field of research over the past few years, and it is used in many applications such as biometrics, forensics, and historical document analysis. This research presents the study and implementation of the stages of writer identification: data acquisition; data augmentation, using an algorithm that generates a large number of texts from the set of texts available in the database; and finally building a convolutional neural network (CNN), which extracts feature information and then classifies the data, so the features do not need to be predefined. The experiments in this study were conducted on images of Arabic handwritten documents from the ICFHR2012 dataset, which covers 202 writers with 3 texts each. The proposed method achieved a classification accuracy of 98.2426%.
Research Square (Research Square), 2023
The Arabic language is one of the six most important languages in the world. Because more than 420 million people worldwide use the Arabic script, research into the recognition of Arabic handwriting is crucial. The demand for software that can automatically read and interpret Arabic handwriting has been expanding rapidly in recent years as the use of digital devices has become increasingly widespread. Characters written by hand in Arabic are more difficult to decipher than those written in English or other languages because of the nature of the words used in Arabic. In this study, we designed a new model, Convolutional Neural Network 14 Layers (CNN-14), to recognise handwritten Arabic characters. The model was trained and tested on the Arabic Handwritten Character Dataset (AHCD) and Hijja datasets. The proposed model achieved good results, with an accuracy of 99.36 per cent on AHCD and 94.35 per cent on the Hijja dataset.
International Journal of Grid and Distributed Computing, 2018
Text classification is the process of grouping documents into classes and categories based on their contents. This process is becoming more important because of the huge amount of textual information available online. The main problem in text classification is how to improve classification accuracy. Many algorithms have been proposed and implemented to solve this problem in general. However, few studies have been carried out on categorizing and classifying Arabic text. Technically, text classification follows two steps. The first step consists of selecting some special features from all the features available in the text by applying feature selection, feature reduction, and feature weighting techniques. The second step applies classification algorithms to the chosen features. In this paper, we present an innovative method for Arabic text classification. We use an Arabic stemming algorithm to extract, select, and reduce the features that we need. After that, we use Term Frequency-Inverse Document Frequency as the feature weighting technique. Finally, for the classification step, we use Convolutional Neural Networks, a deep learning algorithm that is very powerful in other fields such as image processing and pattern recognition but is still rarely used in text mining. With this combination and some hyperparameter tuning of the Convolutional Neural Network, we can achieve excellent results on multiple benchmarks.
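The Term Frequency-Inverse Document Frequency weighting step in the abstract above can be sketched in a few lines. Several TF-IDF variants exist and the abstract does not say which one the authors used, so the plain form below (relative term frequency times the natural log of N/df) is an assumption; the tokens in `docs` are hypothetical placeholders standing in for stemmed Arabic terms.

```python
import math
from collections import Counter


def tf_idf(tokenized_docs):
    """Return one {term: weight} dict per document using tf * ln(N / df)."""
    n_docs = len(tokenized_docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Term frequency is normalised by document length.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-stemmed documents (placeholders for Arabic stems).
docs = [
    ["sport", "match", "sport"],
    ["sport", "economy"],
    ["economy", "market", "market"],
]
weights = tf_idf(docs)
```

Terms that occur in every document receive a weight of zero under this form, which is why rare, topic-specific stems dominate the feature vector passed to the classifier.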