2010, Pattern Recognition Letters
Text classification is enhanced using a modified bag-of-words approach that incorporates lexical dependency patterns and a pruning strategy. By adding grammatical relations between words as features and removing less informative ones, the proposed method significantly outperforms traditional text classification techniques on multiple datasets. Experimental results demonstrate the effectiveness of using both word pruning and dependency features, paving the way for more accurate document categorization.
Pattern Analysis and Applications, 2010
In this study, a comprehensive analysis of the lexical dependency and pruning concepts for the text classification problem is presented. Dependencies are included in the feature vector as an extension to the standard bag-of-words approach. The pruning process filters features with low frequencies so that fewer but more informative features remain in the solution vector. The pruning levels for words, dependencies, and dependency combinations for different datasets are analyzed in detail. The main motivation in this work is to make use of dependencies and pruning efficiently in text classification and to achieve more successful results using much smaller feature vector sizes. Three different datasets were used in the experiments and statistically significant improvements for most of the proposed approaches were obtained.
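As a rough, non-authoritative sketch of the idea (the paper's exact feature encoding and pruning thresholds are not specified here), dependency features can be appended to the bag-of-words and pruned with a simple frequency cutoff; spaCy and scikit-learn are assumed:

```python
# Hedged sketch: extend bag-of-words with head_relation_dependent triples,
# then prune rare features via a frequency cutoff (min_df).
# Assumes spaCy with the "en_core_web_sm" model; feature naming is illustrative.
import spacy
from sklearn.feature_extraction.text import CountVectorizer

nlp = spacy.load("en_core_web_sm")

def words_plus_dependencies(text):
    doc = nlp(text)
    tokens = [t.lower_ for t in doc if t.is_alpha]
    deps = [f"{t.head.lower_}_{t.dep_}_{t.lower_}" for t in doc if t.dep_ != "ROOT"]
    return tokens + deps

# min_df acts as the pruning level: features occurring in fewer than
# 3 documents are dropped from the feature vector.
vectorizer = CountVectorizer(analyzer=words_plus_dependencies, min_df=3)
```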
We present new methods for pruning and enhancing itemsets for text classification via association rule mining. Pruning methods are based on dependency syntax, and enhancing methods are based on replacing words with their hypernyms of various orders. We discuss the impact of these methods compared to pruning based on the tf-idf rank of words.
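A minimal sketch of the enhancing step, assuming WordNet via NLTK and always following the first sense and first hypernym link (a simplification of "hypernyms of various orders"):

```python
# Hedged sketch of hypernym enhancement: replace each noun with its k-th order
# WordNet hypernym before mining itemsets. Following only the first sense and
# first hypernym link is a simplification for illustration.
from nltk.corpus import wordnet as wn

def hypernym_of(word, order=1):
    synsets = wn.synsets(word, pos=wn.NOUN)
    if not synsets:
        return word                       # keep words WordNet does not know
    synset = synsets[0]
    for _ in range(order):
        hypernyms = synset.hypernyms()
        if not hypernyms:
            break                         # reached the top of the hierarchy
        synset = hypernyms[0]
    return synset.lemma_names()[0]

print(hypernym_of("dog", order=1))        # e.g. 'canine'
```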
2009
Traditional machine learning methods only consider relationships between feature values within individual data instances while disregarding the dependencies that link features across instances. In this work, we develop a general approach to supervised learning by leveraging higher-order dependencies between features. We introduce a novel Bayesian framework for classification named Higher Order Naive Bayes (HONB). Unlike approaches that assume data instances are independent, HONB leverages co-occurrence relations between feature values across different instances. Additionally, we generalize our framework by developing a novel data-driven space transformation that allows any classifier operating in vector spaces to take advantage of these higher-order co-occurrence relations. Results obtained on several benchmark text corpora demonstrate that higher-order approaches achieve significant improvements in classification accuracy over the baseline (first-order) methods. Key words:...
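As an illustration of the higher-order structure (not HONB's actual statistics), paths of length two in the feature co-occurrence graph can be read off a binary document-term matrix:

```python
# Hedged sketch of higher-order co-occurrence: features linked through shared
# neighbors in the co-occurrence graph built from a binary doc-term matrix X.
import numpy as np

X = np.array([[1, 1, 0],       # doc 1 contains features 0 and 1
              [0, 1, 1],       # doc 2 contains features 1 and 2
              [1, 0, 1]])      # doc 3 contains features 0 and 2

A = (X.T @ X > 0).astype(int)  # first-order: features co-occurring in a doc
np.fill_diagonal(A, 0)

A2 = ((A @ A) > 0).astype(int) # second-order: linked via a shared neighbor
np.fill_diagonal(A2, 0)
print(A2)                      # e.g. features 0 and 1 also connect through 2
```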
arXiv, 2017
A substantial amount of research has been carried out on machine learning algorithms that account for term dependence in text classification. These algorithms offer acceptable performance in most cases, but they come at a substantial cost: they require significantly greater resources to operate. This paper argues that the higher costs of these algorithms are not justified by their performance on text classification problems. To support this conjecture, the performance of one of the best dependence models is compared to several well-established algorithms in text classification. A collection of datasets has been designed specifically to reflect the diversity of text data found in real-world applications. The results show that even one of the best term dependence models performs only decently at best when compared to independence models. Coupled with their substantially greater hardware requirements...
International Journal of Computer Applications, 2015
Text categorization is the task of assigning text or documents to pre-specified classes or categories. To classify documents well, text-based learning needs to understand context: just as humans judge the relevance of a text through the context associated with it, machine learning benefits from incorporating context information with the text for better classification accuracy. This can be achieved by using semantic information, such as part-of-speech tags, associated with the text. The aim of this experimentation is therefore to utilize this semantic information to select features that may provide better classification results. Different datasets are constructed, each with a different collection of features, to gain an understanding of the best representation for text data depending on the type of classifier.
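A minimal sketch of POS-augmented features, assuming NLTK's tokenizer and tagger are installed; the feature naming is illustrative:

```python
# Hedged sketch: augment each token with its part-of-speech tag so the learner
# can separate, e.g., "book" as a noun from "book" as a verb. Assumes the NLTK
# punkt and averaged_perceptron_tagger resources have been downloaded.
import nltk

def pos_tagged_tokens(text):
    tokens = nltk.word_tokenize(text)
    return [f"{w.lower()}_{tag}" for w, tag in nltk.pos_tag(tokens)]

print(pos_tagged_tokens("Book a flight and read a book"))
# e.g. ['book_VB', 'a_DT', 'flight_NN', 'and_CC', 'read_VB', 'a_DT', 'book_NN']
```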
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - ACL-IJCNLP '09, 2009
In text categorization, feature selection (FS) is a strategy that aims at making text classifiers more efficient and accurate. However, when dealing with a new task, it is still difficult to quickly select a suitable method from the many FS methods proposed in previous studies. In this paper, we propose a theoretical framework of FS methods based on two basic measurements: frequency measurement and ratio measurement. Six popular FS methods are then discussed in detail under this framework. Moreover, guided by our theoretical analysis, we propose a novel method called weighed frequency and odds (WFO) that combines the two measurements with trained weights. Experimental results on datasets from both topic-based and sentiment classification tasks show that this new method is robust across different tasks and numbers of selected features.
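One common reading of the WFO combination (with lam the trained weight; the paper's exact probability estimator may differ) can be sketched as:

```python
# Hedged sketch of WFO-style scoring: the frequency measurement P(t|c)
# combined with the odds-style ratio measurement log(P(t|c)/P(t|~c)),
# weighted by lam in [0, 1].
import math

def wfo_score(p_t_c, p_t_not_c, lam=0.5, eps=1e-9):
    if p_t_c <= p_t_not_c:
        return 0.0  # terms more typical of other classes get no credit
    ratio = math.log((p_t_c + eps) / (p_t_not_c + eps))
    return (p_t_c ** lam) * (ratio ** (1.0 - lam))

print(wfo_score(p_t_c=0.30, p_t_not_c=0.05, lam=0.5))
```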
Book Publisher International (a part of SCIENCEDOMAIN International), 2021
In this paper we present automated text classification in text mining, which is gaining greater relevance in various fields every day. Text mining primarily focuses on developing text classification systems able to automatically classify huge volumes of documents comprising unstructured and semi-structured data. The process of retrieval, classification, and summarization simplifies the extraction of information by the user. The search for the ideal text classifier, feature generator, and dominant feature selection technique has received attention from researchers in areas as diverse as information retrieval, machine learning, and the theory of algorithms. To automatically classify and discover patterns in different types of documents [1], techniques like Machine Learning, Natural Language Processing (NLP), and Data Mining are applied together. In this paper we review some effective feature selection research and present the results in table form.
2010
Current text classification methods are mostly based on a supervised approach, which requires a large number of examples to build accurate models. Unfortunately, in several tasks training sets are extremely small and their generation is very expensive. To tackle this problem, in this paper we propose a new text classification method that takes advantage of the information embedded in the test set itself. This method rests on the idea that similar documents should belong to the same category. In particular, it classifies documents by considering not only their own content but also the categories assigned to other similar documents from the same test set. Experimental results on four datasets of different sizes are encouraging. They indicate that the proposed method is appropriate for small training sets, where it can significantly outperform traditional approaches such as Naive Bayes and Support Vector Machines.
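A hedged sketch of the underlying idea, blending a base classifier's class probabilities with the labels currently assigned to the most similar test documents; all names and the weight alpha are illustrative, not the paper's exact procedure:

```python
# Hedged sketch: iteratively mix base probabilities P with the average class
# distribution of each document's k nearest neighbors in the same test set.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def refine_predictions(P, X_test, alpha=0.5, k=5, iterations=3):
    """P: (n_docs, n_classes) base probabilities; X_test: test feature matrix."""
    S = cosine_similarity(X_test)
    np.fill_diagonal(S, 0.0)                       # ignore self-similarity
    neighbors = np.argsort(-S, axis=1)[:, :k]      # k most similar documents
    Q = P.copy()
    for _ in range(iterations):
        neighbor_votes = Q[neighbors].mean(axis=1) # average neighbor distribution
        Q = alpha * P + (1 - alpha) * neighbor_votes
    return Q.argmax(axis=1)
```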
International Journal of Advance Research, Ideas and Innovations in Technology, 2021
Cyberspace has elevated business insights and created a virtual space to store all forms of information online. Due to the rapid development of the online world, the usage of digital documents has increased, because it is convenient for users to share, update, or keep track of records in one place without losing data. However, maintaining massive amounts of data is poorly suited to optimal decision-making and is extremely expensive in storage, processing, and collection. There is a significant chance that human annotators make errors while classifying data because of distraction, monotony, fatigue, or failure to meet the requirements. When text classification uses machine learning approaches, the process executes with fewer mistakes and greater accuracy. The main goal of this review paper is to highlight and explain the role of different machine learning methodologies in text classification. Concurrently, this paper describes the challenges faced by different machine learning techniques and text representations. Furthermore, this review provides an extensive survey of how various machine learning techniques, such as Neural Networks, Naive Bayes, Logistic Regression, Random Forest, Decision Trees, and Support Vector Machines (SVM), are implemented in text classification.
International Journal of Science and Research (IJSR)
The rapid growth of the World Wide Web has led to an explosive growth of information. As most information is stored in the form of text, text mining has gained paramount importance. With the high availability of information from diverse sources, automatic categorization of documents has become a vital method for managing and organizing vast amounts of information and for knowledge discovery. Text classification is the task of assigning predefined categories to documents. Its major challenges are classifier accuracy and the high dimensionality of the feature space. These problems can be addressed using feature selection (FS): the process of identifying a subset of the most useful features from the original set. Feature selection makes text document classifiers more efficient and accurate, and FS methods offer a way of reducing computation time, improving prediction performance, and better understanding the data. This paper surveys text classification, several approaches to it, feature selection methods, and applications of text classification.
In recent years, there has been exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to accurately classify text in many applications. Many machine learning approaches have achieved strong results in natural language processing. The success of these learning algorithms relies on their capacity to capture complex models and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification remains a challenge for researchers. In this paper, a brief overview of text classification algorithms is given. The overview covers different text feature extraction methods, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their application to real-world problems are discussed.
2018
Textual data is high-dimensional: the number of features exceeds the number of samples, which increases the amount of noise and the number of irrelevant features. At this point, dimensionality reduction becomes necessary. Feature selection is one class of dimensionality reduction techniques and has been an indispensable component of classification. In this paper we present the three feature selection approaches: filter, wrapper, and embedded. Their aims, advantages, and disadvantages are briefly explained. This study also reviews several significant studies of each feature selection approach for text classification. Based on these studies, we found that the wrapper approach is less used by researchers, since it is prone to over-fitting and local optima in text classification, while the filter and embedded approaches have attracted a substantial amount of research. However, in the filter approach, classification accuracy cannot be guaranteed because it does not incor...
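In scikit-learn terms, the three approaches can be sketched as follows; the choices of estimators and of k are illustrative, not prescriptions from the paper:

```python
# Hedged sketch contrasting the three feature selection approaches:
# filter (score features independently of any model), wrapper (search guided
# by a model's performance), embedded (selection as a by-product of training).
from sklearn.feature_selection import SelectKBest, chi2, RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

filter_fs   = SelectKBest(chi2, k=1000)                      # filter
wrapper_fs  = RFE(LogisticRegression(max_iter=1000),
                  n_features_to_select=1000)                 # wrapper
embedded_fs = SelectFromModel(LinearSVC(C=0.5, penalty="l1",
                                        dual=False))         # embedded (L1)
```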
KSII Transactions on Internet and Information Systems, 2019
The performance of text classification is highly dependent on the feature selection method. Usually, two tasks are performed when a feature selection method is applied to construct a feature set: 1) assign a score to each feature, and 2) select the top-N features. In existing filter-based feature selection methods, the selection of the top-N features is biased by their discriminative power and by the empirical process used to determine the value of N. In order to improve text classification performance by producing a more informative feature set, we present an approach based on a potent representation learning technique, namely the DBN (Deep Belief Network). This algorithm learns a semantic representation of documents and formulates feature vectors from it. The number of nodes, iterations, and hidden layers are the main DBN parameters, which can be tuned to improve the classifier's performance. The experimental results indicate the effectiveness of the proposed method in increasing classification performance and aiding developers in making effective decisions in certain domains.
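scikit-learn ships no DBN, but the flavor of the pipeline (layer-wise feature learning feeding a classifier) can be approximated with stacked RBMs; layer sizes and learning rates below are illustrative:

```python
# Hedged sketch: stacked RBMs as a stand-in for DBN pretraining, feeding a
# logistic-regression classifier. Real DBN training (with fine-tuning) differs.
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=512, learning_rate=0.05, n_iter=20)),
    ("rbm2", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20)),
    ("clf",  LogisticRegression(max_iter=1000)),
])
# model.fit(X_train, y_train) expects binary or [0, 1]-scaled document vectors.
```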
Web Intelligence, 2020
Text classification (a.k.a. text categorization) is an effective and efficient technology for information organization and management. As the explosion of information resources on the Web and corporate intranets continues, it has become more and more important and has attracted wide attention from many different research fields. In the literature, many feature selection methods and classification algorithms have been proposed, and the technology has important applications in the real world. However, the dramatic increase in the availability of massive text data from various sources creates a number of issues and challenges for text classification, such as scalability. The purpose of this report is to give an overview of existing text classification technologies for building more reliable text classification applications, and to propose a research direction for addressing the challenging problems in text mining.
This paper proposes a method for feature selection in text categorization. The task is performed in two steps: first a relevance analysis, then a redundancy analysis. For this purpose, a range of similarity measures is adopted and converted into symmetric measures using several aggregation operators, which ensures that the similarity between two words is independent of the order in which they are considered. Several experiments over four corpora lead to the conclusion that this method achieves good results.
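A minimal sketch of the symmetrization step; the containment measure below is purely illustrative, not one of the paper's similarity measures:

```python
# Hedged sketch: make an asymmetric word-similarity measure symmetric with
# different aggregation operators, so sim(a, b) == sim(b, a) by construction.
def symmetrize(sim, a, b, op=min):
    return op(sim(a, b), sim(b, a))

def containment(a, b):             # fraction of a's characters found in b
    return sum(ch in b for ch in a) / len(a)

# min, max, and the arithmetic mean are three possible aggregation operators.
for op in (min, max, lambda x, y: (x + y) / 2):
    print(symmetrize(containment, "word", "words", op=op))
```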
International Journal of Control and Automation, 2019
Text classification data preprocessing regularly uses bag-of-words (BoW) representations. In a large dataset, BoW produces many very large, high-dimensional vectors. The authors introduce a new data preprocessing method for feature reduction in short-text classification, namely NDTMD. It reduces the features of the dataset using BoW and word embeddings (WE), and can address the weaknesses of BoW. The experiment consisted of four steps: 1) 5 datasets were selected from the data science community website Kaggle; 2) the new method was compared with 5 commonly used data preprocessing methods, 4 of which used the state of the art as their baseline while the other used BoW; the new preprocessing method used feature reduction of BoW to produce a new document term matrix dataset (NDTMD); 3) the authors generated classification models with 3 classifiers: support vector machine, logistic regression, and convolutional neural network; and 4) the classifiers were applied to each preprocessed dataset and evaluated using feature reduction rate (FRR), accuracy, kappa, and running time. The results showed that classification models performed best when using NDTMD; in particular, the classifiers achieved the highest accuracy and kappa with the lowest running time. The new preprocessing method can be used for short-text classification and can also be applied to real social-media data.
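The abstract does not spell out the NDTMD construction; one plausible sketch of reducing BoW dimensionality with word embeddings is to cluster the vocabulary by embedding similarity and merge BoW columns cluster-wise:

```python
# Hedged sketch (not the paper's method): cluster word embeddings and sum the
# BoW counts of words in each cluster, shrinking the feature dimension.
import numpy as np
from sklearn.cluster import KMeans

def reduce_bow(bow, embeddings, n_clusters=300):
    """bow: (n_docs, vocab) count matrix; embeddings: (vocab, dim) vectors."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    reduced = np.zeros((bow.shape[0], n_clusters))
    for j, c in enumerate(labels):
        reduced[:, c] += bow[:, j]   # merge counts of embedding-similar words
    return reduced
```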
arXiv (Cornell University), 2021
Text classification is a fundamental problem in natural language processing. It mainly focuses on giving more weight to the relevant features that help classify textual data. Beyond these, text can contain redundant or highly correlated features, which increase the complexity of the classification algorithm. Thus, many dimensionality reduction methods have been proposed for traditional machine learning classifiers, and their use has achieved good results. In this paper, we propose a hybrid feature selection method for obtaining relevant features by combining various filter-based feature selection methods with the fastText classifier. We then present three ways of implementing a feature selection and neural network pipeline. We observed a reduction in training time when feature selection methods are used along with neural networks, as well as a slight increase in accuracy on some datasets.
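One plausible reading of "hybrid" (rank intersection across several filter scores; not necessarily the paper's combination rule) can be sketched as:

```python
# Hedged sketch of a hybrid filter: rank features under multiple filter
# scores and keep only those ranked in the top k by every score.
import numpy as np
from sklearn.feature_selection import chi2, mutual_info_classif

def hybrid_select(X, y, k=2000):
    scores = [chi2(X, y)[0], mutual_info_classif(X, y)]
    top_sets = [set(np.argsort(-s)[:k]) for s in scores]
    return sorted(set.intersection(*top_sets))  # features all filters agree on

# selected = hybrid_select(X_train, y_train); then train on X_train[:, selected]
```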
1999
Most research in text classification has used the "bag of words" representation of text. This paper examines some alternative ways to represent text based on syntactic and semantic relationships between words (phrases, synonyms, and hypernyms). We describe the new representations and explain why we expected them to improve the performance of a rule-based learner. The representations are evaluated using the RIPPER rule-based learner on the Reuters-21578 and DigiTrad test corpora, but on their own the new representations are not found to produce a significant performance improvement. Finally, we try combining classifiers based on different representations using a majority voting technique. This step does produce some performance improvement on both test collections. In general, our work supports the emerging consensus in the information retrieval community that more sophisticated natural language processing techniques need to be developed before better text representations can be produced. We conclude that, for now, research into new learning algorithms and methods for combining existing learners holds the most promise.
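A sketch of the combination step only; RIPPER has no scikit-learn port, so the vote is shown over arbitrary per-representation predictions:

```python
# Hedged sketch: majority voting over classifiers trained on different
# representations (word, phrase, hypernym views) of the same documents.
import numpy as np
from scipy.stats import mode

def majority_vote(predictions):
    """predictions: list of (n_docs,) label arrays, one per representation."""
    stacked = np.vstack(predictions)              # (n_views, n_docs)
    return mode(stacked, axis=0, keepdims=False).mode

# preds = [clf.fit(Xv, y).predict(Xv_test) for clf, Xv, Xv_test in views]
# final = majority_vote(preds)
```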
2002
Recently, an original extension of the well-known Rocchio model, the Generalized Rocchio Formula (GRF), has been presented as a feature weighting method for text classification. The assessment of such a model requires a statistically motivated parameter estimation method and wider empirical evidence. In this paper, three different corpora in two languages have been adopted. Results suggest that GRF, integrating linguistic information, is a viable and more efficient alternative to state-of-the-art systems. Text classification assigns documents to a set of categories C = {c_1, ..., c_n} representing topics (e.g. "Politics", "Entertainment"). An extensive collection of texts already classified, often called the training set, induces the classification function. Profile-based (or linear) classifiers are characterized by a function based on a similarity measure between the representation of incoming documents and each class c_i. Both representations are vectors, and similarity is traditionally estimated as the cosine of the angle between the two vectors. The description
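For reference, the classical Rocchio profile weight that the GRF generalizes can be written as follows (beta and gamma are the usual positive/negative weighting parameters; the GRF's parameter estimation goes beyond this):

```latex
% Classical Rocchio profile weight for feature f in class c:
% average weight over positive examples minus average over negatives.
\omega_f^c = \frac{\beta}{|c|} \sum_{d \in c} \omega_f^d
           - \frac{\gamma}{|\bar{c}|} \sum_{d \in \bar{c}} \omega_f^d
```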
A simple and efficient baseline for text classification is to represent sentences as bag-of-words (BoW) and train a linear classifier. The bag-of-words model is simple to implement and offers flexibility through different scoring schemes for user-specific text data. In many problem domains, these linear classifiers are preferred over more complex models like CNNs and LSTMs because of their efficiency, robustness, and interpretability. However, a large vocabulary can cause extremely sparse representations which are harder to model; the challenge is for models to harness very little information in such a large representational space. These classification problems are also characterized by a large number of classes and a highly imbalanced distribution of data across them. In such cases, a traditional linear classifier treats each word separately and assigns it a coefficient based on its frequency in the training set. This lowers test accuracy on instances where a word that occurred rarely in the training set occurs often in the test set. Our thesis aims to solve this problem by constraining the weights of rare features toward those of similar, more frequent ones, using semantic similarity. This enforces similar words to have similar weights, thereby improving model performance. Thus, based on how similar two features are, our proposed model can improve the importance of a sparse word by increasing its regression coefficient, thereby improving test accuracy in the scenario mentioned above.
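A hedged sketch of the regularizer implied here: a graph-smoothness penalty over a word-similarity matrix S that pulls similar words' weights together (lam and all names are illustrative):

```python
# Hedged sketch: penalty pulling weights of semantically similar words
# together, so rare words inherit strength from frequent neighbors.
# S: symmetric (vocab, vocab) word-similarity matrix; w: weight vector.
import numpy as np

def similarity_penalty(w, S, lam=0.1):
    """Graph-smoothness term over the word-similarity graph."""
    D = np.diag(S.sum(axis=1))
    L = D - S                  # graph Laplacian of the similarity matrix
    # For symmetric S, w @ L @ w == 0.5 * sum_ij S[i,j] * (w[i] - w[j])**2
    return lam * (w @ L @ w)

# Total objective: classification loss(w) + similarity_penalty(w, S)
```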