IEEE Access
The volume of adult content on the World Wide Web is increasing rapidly, which makes automatic detection of adult content a more challenging task when blocking access to ill-suited websites. Most pornographic webpage-filtering systems are based on n-gram, naïve Bayes, K-nearest neighbor, and keyword-matching mechanisms, which do not reliably extract useful data from unstructured web content. These systems have no reasoning capability to intelligently filter web content and distinguish medical webpages from adult-content webpages. In addition, it is easy for children to access pornographic webpages because adult content is freely available on the Internet, which creates a problem for parents wishing to protect their children from such unsuitable content. To solve these problems, this paper presents a support vector machine (SVM) and fuzzy-ontology-based semantic knowledge system that systematically filters web content and identifies and blocks access to pornography. The proposed system classifies URLs into adult URLs and medical URLs by using a blacklist of censored webpages to provide accuracy and speed. The proposed fuzzy ontology then extracts web content to determine the website type (adult, normal, or medical) and to block pornographic content. To examine the efficiency of the proposed system, the fuzzy ontology and intelligent tools are developed using Protégé 5.1 and Java, respectively. Experimental analysis shows that the proposed system efficiently and automatically detects and blocks adult content. INDEX TERMS Data mining, semantic knowledge, fuzzy ontology, SVM, adult content identification.
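A minimal sketch of the URL pre-classification stage this abstract describes, in Python: a fast blacklist lookup followed by an SVM fallback for unseen URLs. The character n-gram features, the tiny blacklist, and all example URLs are illustrative assumptions, not the paper's actual configuration.

```python
# Sketch of the URL pre-classification stage: blacklist first (speed),
# SVM fallback (coverage). Feature choice and data are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

blacklist = {"badsite.example", "adult.example"}   # hypothetical censored hosts

def classify_url(url, model):
    """Blacklist lookup first (fast path), then SVM prediction."""
    host = url.split("/")[2] if "//" in url else url.split("/")[0]
    if host in blacklist:
        return "adult"
    return model.predict([url])[0]

# Toy training data; a real system would use thousands of labelled URLs.
urls = ["http://adult.example/free-videos", "http://clinic.example/sexual-health",
        "http://news.example/story", "http://porn.example/gallery"]
labels = ["adult", "medical", "normal", "adult"]

model = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(3, 5)), LinearSVC())
model.fit(urls, labels)
print(classify_url("http://clinic.example/advice", model))
```

The blacklist lookup supplies the speed the abstract mentions, while the learned model handles URLs that are not in the list.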
Currently, the amount of adult (pornographic) content on the Internet is increasing rapidly, which makes automatic detection of adult content a more challenging task when blocking access to ill-suited websites. It is easy for children to access pornographic webpages because adult content is freely available on the Internet, which creates a problem for parents wishing to protect their children from such unsuitable content. In 2005, the European Parliament launched a large program called "Safer Use of the Internet", aimed particularly at young people. Some webpages contain a large amount of combined data related to healthcare (information on diseases, mental health, and physical fitness) and sexual knowledge (medicine for sexual health, birth control, treatment during pregnancy, etc.). In this system, we focus on the recognition of adult web content. A fuzzy-ontology/SVM-based adult content detection system is proposed to automate the classification of pornographic versus medical websites. The proposed mechanism offers an adult content detection system that classifies webpages as normal, pornographic, or medical using extracted web content features. Recognition of an adult webpage bag is carried out using multi-instance learning that combines the classification of texts, images, and videos in webpages.
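The closing sentence describes bag-level, multi-instance classification. Below is a minimal sketch of the standard multi-instance decision rule, assuming a page is a bag of instance feature vectors (text blocks, images, video frames); the per-instance scorer is a stub standing in for the system's actual classifiers.

```python
# Standard multi-instance assumption: a page (bag) is flagged as adult
# if any single instance scores above a threshold. Scorer is a stub.
import numpy as np

def score_instance(features):
    # Stand-in for the per-instance text/image/video classifiers (assumption).
    return float(np.clip(features.mean(), 0.0, 1.0))

def classify_page(instance_features, threshold=0.5):
    scores = [score_instance(f) for f in instance_features]
    return "adult" if max(scores) > threshold else "normal"

page = [np.array([0.1, 0.2]), np.array([0.9, 0.8])]  # two instances in the bag
print(classify_page(page))  # -> "adult"
```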
International Journal of Future Computer and Communication, 2017
The need for information is of such magnitude that it drives advances in information technology. Many things can be done using the Internet, and the main reason people use it is to find information. Unfortunately, not all the information available on the Internet is true and safe to consume. There is plenty of information on the Internet that is a hoax or contains elements contrary to morals and ethics, such as terrorism, racism, and pornography. Thus, strong judgment and knowledge are required to sort incoming information. Because many children also use the Internet, and parents cannot supervise their children's activities continuously, an application embedded in the web browser is required so that bad content can be blocked automatically. This article focuses on handling pornographic content on text-based webpages. A smart application is needed to distinguish text that contains pornography. Thus, we apply artificial intelligence in our research using Naive Bayes and an information retrieval method. As a result, the application is able to block 88.02% of the pornographic content.
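A minimal sketch of a Naive Bayes text filter of the kind this abstract describes, using scikit-learn; the training phrases and the 0.5 blocking threshold are illustrative assumptions, not the paper's data or settings.

```python
# Naive Bayes text filter sketch: block a page when the posterior
# probability of the "pornographic" class exceeds a threshold.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["explicit adult phrase", "cooking recipe for dinner",
         "another explicit phrase", "weather news report"]
labels = [1, 0, 1, 0]  # 1 = pornographic text, 0 = normal

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)

def should_block(page_text, threshold=0.5):
    """Block the page if P(pornographic | text) exceeds the threshold."""
    return clf.predict_proba([page_text])[0][1] > threshold

print(should_block("explicit phrase on a page"))
```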
IEEE Transactions on Knowledge and Data Engineering, 2000
This paper describes a Web filtering system, "WebGuard," which aims to automatically detect and filter adult content on the Web. WebGuard uses data mining techniques to classify URLs into two classes: suspect URLs and normal URLs. The suspect URLs are stored in a database, which is constantly and automatically updated to reflect the highly dynamic evolution of the Web. When working, WebGuard simply captures a user's URL, matches it against the suspect URLs stored in the database, and takes an appropriate action (filtering or blocking) according to the result of the analysis. We started with a study of most existing software to learn the possibilities and functionalities currently available on the market. This phase enabled us to better evaluate the performance of our product as it was being developed. The second phase of our work was devoted to research into the usual algorithms, weighing their advantages and drawbacks. Having gathered this knowledge, we are currently implementing a system that combines several algorithms to increase the software's performance. Our preliminary results show that it can detect and filter adult content effectively.
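A minimal sketch of WebGuard's capture-and-match loop as described: intercept the user's URL, match its host (and parent domains) against the suspect list, and block or allow. Storing the suspect URLs in an in-memory set is a simplifying assumption; the paper keeps them in an automatically updated database.

```python
# Capture a URL, match it against the suspect list, filter or pass.
from urllib.parse import urlparse

suspect_urls = {"badsite.example", "porn.example"}  # hypothetical entries

def handle_request(url):
    host = urlparse(url).hostname or ""
    # Match the host and each of its parent domains against the list,
    # so subdomains of a suspect site are also caught.
    parts = host.split(".")
    for i in range(len(parts) - 1):
        if ".".join(parts[i:]) in suspect_urls:
            return "BLOCK"
    return "ALLOW"

print(handle_request("http://www.badsite.example/page"))  # -> BLOCK
```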
International Journal of Internet Protocol Technology, 2017
The paper outlines a framework for automated categorisation of web pages to protect against inappropriate content. The paper contains the framework overview, an analysis of the state of the art, a description of the developed prototype, and its evaluation based on a series of experiments. Several sources are used for the categorisation, namely text, HTML tags, and URL addresses. During the categorisation, these data and other information are analysed using machine learning and data mining methods. Finally, the quality of the categorisation is evaluated. The categorisation system developed as a result of this work is planned to be partially implemented in F-Secure Corporation's mass-production systems performing analysis of web content.
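A minimal sketch of the multi-source idea above: separate vectorisers for page text, HTML tag sequence, and URL, combined into a single feature space for one classifier. The column layout, feature choices, and toy data are assumptions, not the prototype's design.

```python
# Combine text, HTML-tag, and URL features into one model input.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pages = pd.DataFrame({
    "text": ["buy pills now", "school homework help"],
    "tags": ["img img a", "p p h1"],
    "url":  ["pharma.example/x", "school.example/y"],
})
labels = [1, 0]  # 1 = inappropriate, 0 = acceptable

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "text"),
    ("tags", TfidfVectorizer(), "tags"),
    ("url",  TfidfVectorizer(analyzer="char", ngram_range=(3, 4)), "url"),
])
model = make_pipeline(features, LogisticRegression())
model.fit(pages, labels)
print(model.predict(pages))
```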
Lecture Notes in Computer Science, 2011
In this work we present InFeRno, an intelligent web pornography elimination system that classifies web pages based solely on their visual content. The main characteristics of our system include: (i) a powerful vector space with a small but sufficient number of features that improve the discriminative ability of the SVM classifier; (ii) an extra class (bikini) that strengthens the performance of the classifier; (iii) an overall classification scheme that achieves high accuracy at considerably lower runtime costs compared to current state-of-the-art systems; and (iv) a full-fledged implementation of the proposed system capable of being integrated with ICAP-aware web proxy cache servers.
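A minimal sketch of visual-content classification with an extra class, as the abstract outlines: a compact feature vector per image fed to an SVM trained on three classes (benign, bikini, porn). The channel-statistics features and random toy images are assumptions and are far simpler than InFeRno's actual descriptor.

```python
# Compact per-image features -> three-class SVM (benign / bikini / porn).
import numpy as np
from sklearn.svm import SVC

def image_features(img):
    """img: HxWx3 uint8 array -> per-channel means and stds (6 values)."""
    img = img.astype(float) / 255.0
    return np.concatenate([img.mean(axis=(0, 1)), img.std(axis=(0, 1))])

rng = np.random.default_rng(0)
images = [rng.integers(0, 256, (32, 32, 3), dtype=np.uint8) for _ in range(9)]
labels = ["benign", "bikini", "porn"] * 3   # three-class scheme, as in the paper

X = np.stack([image_features(im) for im in images])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:1]))
```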
Visión electrónica, 2017
The incursion of the Internet has created new forms of information and communication, but it can also carry great dangers when its use involves inappropriate content, such as access to harmful material and the rise of new kinds of crime. In this situation, automatic filtering systems identify improper Internet content. This paper describes the use of an algorithm to automatically filter out inappropriate web pages. To accomplish this automatic filtering task, the TAN (Tree Augmented Naive Bayes) method is implemented. Data mining and computational learning algorithms for the extraction, representation, and classification of web pages are implemented.
International Journal of Computer Vision and Image Processing, 2(1), 75-90, January-March 2012
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed, so it is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, object contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, Bayes' theorem is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those of either individual classifier, and our framework can be adapted to different categories of Web pages.
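A minimal sketch of a Bayes-style fusion step like the one described: combining the text and image classifiers' posteriors under a conditional-independence assumption. The formula and the example probabilities are our formulation, not necessarily the paper's exact rule.

```python
# Fuse two per-cue posteriors P(porn|text) and P(porn|image) assuming
# the cues are conditionally independent given the class:
#   P(C|t,i)  ∝  P(C|t) * P(C|i) / P(C)
def fuse(p_text, p_image, prior=0.5):
    num = p_text * p_image / prior
    den = num + (1 - p_text) * (1 - p_image) / (1 - prior)
    return num / den

print(fuse(p_text=0.7, p_image=0.8))  # -> ~0.903, stronger than either cue alone
```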
Journal of Advances in Computer Research, 2015
Filtering web pages with inappropriate content is one of the major issues in the field of intelligent network security. A good intelligent filtering method with high accuracy and speed is needed by any country in order to control users' access to the web, so it has been considered by many researchers. Representing web pages in a way understandable by machines is one of the most important preprocessing steps. Thus, a way to describe web pages with lower dimensionality would be very effective, especially in determining whether web pages should be filtered out or not. In this paper, we propose an automatic method to detect forbidden keywords in web pages. Next, we define a new vector representation of web pages, named RWSF, which consists of the weighted sum and frequency of forbidden keywords in different parts of web pages. For this, a ranking dictionary of keywords including forbidden keywords is used. To evaluate the proposed method, 2643 pages consisting of 1311 normal pages and 1332 forbidden pages were used. Among these, 1851 pages were used to train the system and 792 pages were used for evaluation. The system was assessed using various classifiers: k-Nearest Neighbor, Support Vector Machines, Decision Tree, and Artificial Neural Networks. Evaluation results indicate the high efficiency and accuracy of the proposed method across all classifiers.
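A minimal sketch of an RWSF-style page vector as described: for each region of a page, the rank-weighted sum and the raw frequency of forbidden keywords. The rank dictionary and region names are hypothetical placeholders, not the paper's actual dictionary.

```python
# Build a low-dimensional page vector from forbidden-keyword statistics:
# per region, [rank-weighted sum of hits, number of hits].
forbidden = {"badword1": 3, "badword2": 1}  # hypothetical rank dictionary

def rwsf_vector(regions):
    """regions: dict of region name -> text. Returns two values per region."""
    vec = []
    for name, text in regions.items():
        tokens = text.lower().split()
        hits = [forbidden[t] for t in tokens if t in forbidden]
        vec.extend([sum(hits), len(hits)])
    return vec

page = {"title": "badword1 site", "body": "text badword2 text badword1"}
print(rwsf_vector(page))  # -> [3, 1, 4, 2]; feed this to any classifier
```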
2012 International Symposium on Innovations in Intelligent Systems and Applications, 2012
The Internet is an infinite information repository that also contains harmful content such as pornography, violence, and hate messages. It is very important to keep such content from underage children so that it does not adversely affect their development. Today, there are many commercial software products developed for this purpose, but their filtering capabilities are limited to text-based and image-based content. Different techniques must be used to filter video-based content. This article describes an agent-based system developed for the detection of videos containing pornographic content. Videos on the Internet can be divided into six groups: anime, commercial, music, sitcom, sports, and porn-related. The proposed system uses Hidden Markov Model (HMM) based classification to assign videos to these pre-defined categories with intelligent agents. Color features are extracted from each video frame and used as observation sequences in the HMM for classification. According to the classification results, videos closely related to the category of sex and pornography are filtered for underage users. The test results obtained indicate that the classification is satisfactory.
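A minimal sketch of the per-category HMM scheme described above, using the hmmlearn library: one Gaussian HMM per video category is trained on per-frame colour features, and a new video is assigned to the category whose model scores it highest. The random "features" stand in for real colour descriptors, and the model sizes are assumptions.

```python
# One HMM per category; classify by maximum log-likelihood.
import numpy as np
from hmmlearn import hmm  # pip install hmmlearn

rng = np.random.default_rng(1)
categories = ["sports", "porn"]
models = {}
for shift, cat in enumerate(categories):
    # Toy training data: 5 videos of 20 frames, 3 colour features per frame.
    seqs = [rng.normal(loc=shift, size=(20, 3)) for _ in range(5)]
    X = np.vstack(seqs)
    m = hmm.GaussianHMM(n_components=3, covariance_type="diag", random_state=0)
    m.fit(X, lengths=[len(s) for s in seqs])
    models[cat] = m

video = rng.normal(loc=1, size=(20, 3))  # unseen frame-feature sequence
best = max(categories, key=lambda c: models[c].score(video))
print(best)  # expected: "porn" (its features sit near that model's mean)
```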
ACM SIGKDD Explorations Newsletter, 2006
The World Wide Web has now become a humongous archive of various contents. The inordinate amount of information found on the web presents a challenge in delivering the right information to the right users. On one hand, the abundant information is freely accessible to all web denizens; on the other hand, much of it may be irrelevant or even deleterious to some users. For example, control and filtering mechanisms are desired to prevent inappropriate or offensive material such as pornographic websites from reaching children. Ways of accessing websites are termed Access Scenarios. An Access Scenario can include using search engines (e.g., image search, which has very little textual content), URL redirection to some websites, or directly typing (porn) website URLs. In this paper we propose a framework to analyze a website from several different aspects or information sources, and generate a classification model aiming to accurately classify such content irrespective of acc...
2003
As the Internet grows quickly, pornography, which in the past was often printed in small-run publications, has become one of the most widely distributed kinds of information on the Internet. However, pornography may be harmful to children and may affect the efficiency of workers.
IJIIS: International Journal of Informatics and Information Systems
According to Pornography Statistics, more than 34 percent of Internet users are exposed to pornography. Pornographic sites account for 12 percent of the total number of websites and 72 million monthly visitors. Internet pornography (Internet porn) is addictive to teenagers and kids around the world. The normal practice is to block those websites or filter pornography away from kids. In order to do so, researchers first have to find a way to detect and classify it. Pixel features, including the YCbCr range and the area of human skin, are chosen as pornography features because of their easy acquisition. C4.5 (a data mining technique) is applied to construct a decision tree. The purpose of this paper is to classify pornographic images with a simple if-then rule. The accuracy of the experimental result is 85.2%.
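A minimal sketch of the pixel-level pipeline this abstract describes: convert RGB to YCbCr, mark pixels inside a skin range, and apply a simple if-then rule on the skin-area ratio, of the kind a trained C4.5 tree might yield. The Cb/Cr bounds and area threshold are common literature values, not necessarily the paper's exact thresholds.

```python
# YCbCr skin detection + if-then rule on the skin-area ratio.
import numpy as np

def rgb_to_ycbcr(img):
    """img: HxWx3 float RGB in [0,255] -> Y, Cb, Cr planes (ITU-R BT.601)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def skin_ratio(img):
    _, cb, cr = rgb_to_ycbcr(img.astype(float))
    skin = (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
    return skin.mean()

def classify(img, area_threshold=0.30):
    # Simple if-then rule; a trained C4.5 tree would learn the cut-off.
    return "pornographic" if skin_ratio(img) > area_threshold else "normal"

img = np.full((64, 64, 3), [224, 170, 140])  # uniformly skin-toned test image
print(classify(img))
```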
Visual Communications and Image Processing 2003, 2003
The paper addresses the problem of distinguishing between pornographic and non-pornographic photographs for the design of semantic filters for the web. Both decision forests of trees built according to the CART (Classification And Regression Trees) methodology and Support Vector Machines (SVM) have been used to perform the classification. The photographs are described by a set of low-level features that can be computed automatically from gray-level and color representations of the image. The database used in our experiments contained 1500 photographs, 750 of which were labeled as pornographic on the basis of the independent judgement of several viewers.
The 11th IEEE International Conference on Networks, 2003. ICON2003., 2003
Web filtering is used to prevent access to undesirable Web pages. In this paper we review a number of existing approaches and point out the shortcomings of each. We then propose a Web filtering system that uses a text classification approach to classify Web pages into desirable and undesirable ones, and propose a text classification algorithm that is well suited to this application.
Lecture Notes in Computer Science, 2000
We present a method to automatically detect pornographic content on the Web. Our method combines techniques from language engineering and image analysis within a machine-learning framework. Experimental results show that it achieves nearly perfect performance on a set of hard cases.
IEEE Access, 2021
Nowadays, fraudulent and malicious websites are emerging as a harmful and very common problem on the Internet. They cause huge monetary losses and irreparable damage for both companies and individuals. To face this situation, governments have approved multiple law projects. This way, legality on the Internet is being enforced, and sanctions are being imposed on offenders who carry out illegal or malicious activities. However, governments still need a way to simplify the classification of websites as risky or non-risky, since most of this work is manual. This paper presents the DOmains Classifier based on RIsky Websites (DOCRIW) framework to detect domains that contain possible fraud or malicious content. It is based on two main components. The first component is a previously built knowledge base containing information from risky websites. The second complements the system with a binary classifier able to label a website (as risky or not) considering just its domain. The system makes use of web information sources and includes host-based variables. It also applies similarity measures, supervised learning algorithms, and optimization methods to enhance its performance. The presented work is experimental, rendering promising outcomes. INDEX TERMS Risky website detection, malware alerts, knowledge-based systems, similarity metrics, combination of information.
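A minimal sketch of DOCRIW's two-component idea: first consult a knowledge base of known risky domains via a string-similarity measure, then fall back to a classifier over domain-derived (host-based) features. All domain names, thresholds, and the trivial stand-in rule are illustrative assumptions, not the framework's actual components.

```python
# Component 1: similarity against a risky-domain knowledge base.
# Component 2: fallback decision on domain-derived features.
from difflib import SequenceMatcher

risky_kb = {"paypa1-login.example", "free-m0ney.example"}  # hypothetical KB

def kb_similarity(domain):
    return max(SequenceMatcher(None, domain, d).ratio() for d in risky_kb)

def domain_features(domain):
    digits = sum(c.isdigit() for c in domain)
    return [len(domain), digits, domain.count("-"), domain.count(".")]

def is_risky(domain, sim_threshold=0.8):
    if kb_similarity(domain) >= sim_threshold:
        return True
    # A trained binary classifier would consume domain_features(domain);
    # this hand-written rule is only a stand-in for illustration.
    length, digits, hyphens, _ = domain_features(domain)
    return digits >= 2 and hyphens >= 1

print(is_risky("paypa1-logi.example"))  # similar to a KB entry -> True
```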
… Journal of Computer …, 2012
In the era of the Internet, users are keen to discover more on the web. As the number of web pages increases day by day, malicious web pages are increasing proportionally. This paper focuses on detecting maliciousness in a web page using genetically evolved fuzzy rules. The rules thus formed are filtered by a Support Vector Machine, and the result is finally stored in a symbolic knowledge base with an appropriate weight for each rule. This provides insight into symbolic and non-symbolic intelligence in malicious web page detection.
Int. J. Comb. Optim. Probl. Informatics, 2016
In this paper, the problem to be solved is the viewing of undesirable content on the Internet by both children and young people. The a...
Proceedings of the 9th Joint Conference on Information Sciences (JCIS), 2006
Due to the flood of pornographic web sites on the Internet, content-based web filtering has become an important technique for detecting and filtering inappropriate information on the web, since pornographic web sites contain many sexually oriented texts, images, and other information that can help identify them. In this paper, we build and examine a system to filter web pornography based on image content. Our system consists of three main processes: (i) normalized R/G ratio, (ii) histogram analysis, and (iii) a human composition matrix (HCM) based on skin detection. The first process uses pixel ratios (red and green color channels) for image filtering. The second process, histogram analysis, estimates the frequency intensities of an image; if an image falls within the range of the training-set results, it is likely to be pornographic. The last process is HCM based on human skin detection. The experimental results show effective accuracy after testing, demonstrating that our hierarchical image filtering techniques can achieve substantial improvements.
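A minimal sketch of the first of the three stages described, the normalized R/G ratio test: images whose mean red-to-green ratio falls in a skin-like band are flagged for the later histogram and HCM stages. The acceptance band is an illustrative assumption, not the paper's trained range.

```python
# Stage 1 of the hierarchical filter: normalized R/G ratio test.
import numpy as np

def rg_ratio(img):
    """img: HxWx3 RGB array; mean red-to-green ratio over all pixels."""
    r = img[..., 0].astype(float) + 1e-6
    g = img[..., 1].astype(float) + 1e-6
    return float(np.mean(r / g))

def stage_one_flags(img, low=1.1, high=1.6):
    # Skin-dominated images tend to have R moderately above G; flag those
    # for the later histogram and skin-detection (HCM) stages.
    return low <= rg_ratio(img) <= high

img = np.full((32, 32, 3), [210, 150, 120])
print(stage_one_flags(img))  # 210/150 = 1.4 -> True, passed to stage 2
```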