International Journal of Web Information Systems

Radhouane Guermazi

2008, Information Systems

Abstract

The paper discusses the increasing need for automatic text categorization methods in the context of large-scale digital document collections. It elaborates on the limitations of human categorization and highlights the evolution of text categorization methods, particularly the naïve Bayes approach and its extensions. The research presents various statistical techniques, including SVM and LR models, to evaluate performance across different domains, providing empirical results that demonstrate improvements in accuracy and reductions in false positive rates.

Aidana Darkenova

ACM Computing Surveys, 2002

The automated categorization (or classification) of texts into predefined categories has witnessed booming interest in the last 10 years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
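As a rough illustration of the inductive process described above, the following sketch builds a multinomial naïve Bayes classifier from a handful of preclassified documents. It is not taken from either paper: the function names `train` and `classify`, the whitespace tokenizer, the Laplace smoothing, and the toy documents in `TOY_DOCS` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Learn per-category word counts from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    vocab = set()
    for text, category in labeled_docs:
        words = text.lower().split()    # naive whitespace tokenizer (assumption)
        word_counts[category].update(words)
        doc_counts[category] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the category with the highest log posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, float("-inf")
    for category in doc_counts:
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # skip words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical preclassified documents, invented for this example.
TOY_DOCS = [
    ("the match ended with a late goal", "sports"),
    ("the team won the league title", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell as rates rose", "finance"),
]
model = train(TOY_DOCS)
print(classify("a late goal won the match", model))
print(classify("interest rates rose", model))
```

Working in log space keeps the product of many small word probabilities from underflowing. A realistic system would also need the document representation and evaluation steps that the survey treats separately.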
