Information Retrieval and Document Classification

Journal of Computer Science IJCSIS

Abstract

A large amount of information is available online in the form of documents. Finding such documents and retaining them according to their category has never been more automated. This paper addresses the problem of classifying the genre of English novels using Natural Language Processing and Machine Learning methods. Novels are collected and divided into a training dataset and a test dataset. Initially, the classification uses three distinct fiction genres: Romantic, Fairy Tales, and Thriller. These are among the most widely read book genres across different age groups. Different linguistic features are used to obtain representative features for each genre. The training module uses the resulting feature datasets as the basis for classification.
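The pipeline outlined in the abstract (split into training and test sets, extract features, train, classify) can be illustrated with a minimal sketch. The abstract does not say which linguistic features or which classifier the paper uses, so this sketch assumes plain word counts as features and a multinomial Naive Bayes classifier with add-one smoothing as a stand-in. The toy sentences and the names `TRAIN`, `TEST`, `tokenize`, `train`, and `predict` are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus: short sentences standing in for novel excerpts.
TRAIN = [
    ("her heart ached with love as he kissed her hand", "Romantic"),
    ("they fell in love and longed for a tender kiss", "Romantic"),
    ("the princess met a fairy and a dragon in the enchanted castle", "Fairy Tales"),
    ("once upon a time a witch cursed the enchanted forest", "Fairy Tales"),
    ("the detective found the killer and the hidden gun", "Thriller"),
    ("a bomb threat and a murder left the detective racing", "Thriller"),
]
TEST = [
    ("a tender kiss and endless love", "Romantic"),
    ("the witch and the dragon guarded the castle", "Fairy Tales"),
    ("the killer hid the gun after the murder", "Thriller"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(samples):
    """Count words per genre (assumed feature: bag of words)."""
    word_counts = defaultdict(Counter)  # genre -> word -> count
    doc_counts = Counter()              # genre -> number of training documents
    vocab = set()
    for text, genre in samples:
        tokens = tokenize(text)
        word_counts[genre].update(tokens)
        doc_counts[genre] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the genre with the highest Naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_genre, best_score = None, float("-inf")
    for genre in doc_counts:
        # Log prior from the share of training documents in this genre.
        score = math.log(doc_counts[genre] / total_docs)
        genre_total = sum(word_counts[genre].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[genre][token] + 1) / (genre_total + len(vocab))
            )
        if score > best_score:
            best_genre, best_score = genre, score
    return best_genre


model = train(TRAIN)
correct = sum(predict(model, text) == genre for text, genre in TEST)
accuracy = correct / len(TEST)
print(f"accuracy on the toy test set: {accuracy:.2f}")
```

A real experiment would replace the toy sentences with full novels and would likely need richer features than raw word counts. The structure stays the same: features are derived from the training set only, and the held-out test set is used only to measure accuracy.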

MANTECH PUBLICATIONS

mantech publications, 2023

Rapid progress in digital data acquisition techniques has led to huge volumes of data. More than 80 percent of today's data is unstructured or semi-structured. Discovering patterns and trends in such large volumes of text data is a major challenge. Text mining is the process of extracting interesting and nontrivial patterns from large collections of text documents. Many techniques and tools exist to mine text documents and discover information for later use in decision making. Choosing an appropriate text mining technique improves speed and reduces the time and effort required to obtain valuable information. This paper briefly discusses and analyzes text mining techniques and their applications. With the advancement of technology, more and more data is available in digital form, and most of it (approx. 85%) is in unstructured textual form. Thus, it has become essential to build better techniques and algorithms to extract useful and interesting data from large amounts of text. Hence, information extraction and text mining have become popular areas of research for obtaining interesting and useful information.
