Semester Project Report by Qaiser

The document discusses hate speech detection using natural language processing techniques. It outlines different features that can be extracted from text like simple surface features, word generalization, sentiment analysis, lexical resources, linguistic features, knowledge-based features, meta-information, and multimodal information to aid in detecting hate speech.


Semester Project Report

Hate Speech Detection Using Natural Language Processing

By Qaiser Hassan and Rafiqa Zainab

The idea for this hate speech detector is taken from the article "A Survey on Hate Speech Detection using Natural Language Processing" by Anna Schmidt and Michael Wiegand (both Spoken Language Systems, Saarland University, D-66123 Saarbrücken, Germany).

1 Introduction
Hate speech is commonly defined as any communication that disparages a person or a group on the
basis of some characteristic such as race, color, ethnicity, gender, sexual orientation, nationality,
religion, or other characteristic (Nockleby, 2000). Examples are (1)-(3).
(1) Go fucking kill yourself and die already useless ugly pile of shit scumbag.
(2) The Jew Faggot behind the Financial Collapse
(3) Hope one of those bitches falls over and breaks her leg
Due to the massive rise of user-generated web
content, in particular on social media networks, the amount of hate speech is also steadily increasing.
Over the past years, interest in online hate speech detection and particularly the automatization of this
task has continuously grown, along with the societal impact of the phenomenon. Natural language
processing focusing specifically on this phenomenon is required since basic word filters do not provide a
sufficient remedy: What is considered a hate speech message might be influenced by aspects such as
the domain of an utterance, its discourse context, as well as context consisting of co-occurring media
objects (e.g. images, videos, audio), the exact time of posting and world events at that moment, and the identity
of the author and the targeted recipient. The survey this report is based on provides a short, comprehensive and structured overview
of automatic hate speech detection and outlines the existing approaches in a systematic manner,
focusing on feature extraction in particular. It is mainly aimed at NLP researchers who are new to the
field of hate speech detection and want to inform themselves about the state of the art.

2 Terminology
In this report we follow the survey authors in using the term hate speech, since it can be considered a broad umbrella term for
numerous kinds of insulting user-created content addressed in the individual works that the survey
summarizes. Hate speech is also the most frequently used expression for this phenomenon, and is even a
legal term in several countries.

3 Features for Hate Speech Detection

3.1 Simple Surface Features

For any text classification task, the most obvious information to utilize is surface-level features, such as
bag of words. Indeed, unigrams and larger n-grams are included in the feature sets of most approaches.
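
As an illustration of the bag-of-words idea, the following is a minimal Python sketch that counts unigrams and bigrams in a message. The function names and the whitespace tokenizer are our own assumptions for the example; a real system would use a proper tokenizer and a much larger training corpus.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def ngram_counts(text, max_n=2):
    """Count every word n-gram of length 1..max_n that occurs in the text."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Each distinct n-gram becomes one feature; its count is the feature value.
features = ngram_counts("you are not welcome here", max_n=2)
print(features)
```

The resulting counts can be passed to any standard classifier. Bigrams such as "not welcome" capture a little local context that single words miss.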

3.2 Word Generalization

Since hate speech detection is usually applied on small pieces of text (e.g. passages or even individual
sentences), one may face a data sparsity problem. This is why several works address this issue by
applying some form of word generalization. This can be achieved by carrying out word clustering and
then using induced cluster IDs representing sets of words as additional (generalized) features. A
standard algorithm for this is Brown clustering.
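
The following Python sketch shows how induced cluster IDs can be used as generalized features. It does not implement Brown clustering itself. The word-to-cluster table is a hypothetical, hand-written stand-in for the output of a clustering run, and the bit-string IDs only imitate the style of IDs that Brown clustering produces.

```python
from collections import Counter

# Hypothetical cluster assignments; in practice these would be induced
# from a large unlabeled corpus by a clustering algorithm.
WORD_TO_CLUSTER = {
    "stupid": "0110",
    "dumb": "0110",
    "idiotic": "0110",
    "people": "1010",
    "folks": "1010",
}


def cluster_features(text, word_to_cluster):
    """Count how often each cluster ID occurs among the words of the text."""
    counts = Counter()
    for token in text.lower().split():
        cluster_id = word_to_cluster.get(token)
        if cluster_id is not None:
            counts["cluster=" + cluster_id] += 1
    return counts


# Two messages with no words in common receive identical cluster features.
a = cluster_features("stupid people", WORD_TO_CLUSTER)
b = cluster_features("dumb folks", WORD_TO_CLUSTER)
print(a == b)
```

This is how generalization reduces sparsity: a word that never appeared in the labeled training data can still activate a feature that did.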

3.3 Sentiment Analysis

Hate speech and sentiment analysis are closely related, and it is safe to assume that usually negative
sentiment pertains to a hate speech message. Because of this, several approaches acknowledge the
relatedness of hate speech and sentiment analysis by incorporating the latter as an auxiliary
classification.
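
One way to read "auxiliary classification" is as a first step that filters out messages without negative sentiment before the actual hate speech classifier runs. The Python sketch below assumes this two-step reading. The tiny polarity word lists and the stand-in classifier are hypothetical placeholders, not resources from the survey.

```python
# Hypothetical polarity word lists; a real system would use a full
# sentiment lexicon or a trained sentiment classifier.
NEGATIVE_WORDS = {"hate", "ugly", "useless", "disgusting"}
POSITIVE_WORDS = {"love", "great", "kind", "wonderful"}


def polarity(text):
    """Return (#positive words - #negative words) for the text."""
    tokens = text.lower().split()
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return positive - negative


def classify(text, hate_classifier):
    """Step 1: skip non-negative messages. Step 2: ask the hate speech classifier."""
    if polarity(text) >= 0:
        return False
    return hate_classifier(text)


def always_flag(text):
    """Stand-in for a trained hate speech classifier (assumption for the demo)."""
    return True


print(classify("what a wonderful day", always_flag))
print(classify("you are useless", always_flag))
```

The polarity score could also be added directly to the feature vector instead of being used as a hard filter.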

3.4 Lexical Resources

Trying to make use of the general assumption that hateful messages contain specific negative words
(such as slurs, insults, etc.), many authors utilize the presence of such words as a feature. To obtain this
type of information lexical resources are required that contain such predictive expressions.
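
A lexicon lookup of this kind can be sketched in a few lines of Python. The lexicon below is a hypothetical three-word placeholder. The feature names are our own choice.

```python
import re

# Hypothetical placeholder lexicon; real lexical resources contain
# hundreds or thousands of entries.
HATE_LEXICON = {"scumbag", "idiot", "moron"}


def lexicon_features(text, lexicon):
    """Return count, ratio and presence of lexicon words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [token for token in tokens if token in lexicon]
    return {
        "lexicon_hits": len(hits),
        "lexicon_ratio": len(hits) / len(tokens) if tokens else 0.0,
        "has_lexicon_word": bool(hits),
    }


f = lexicon_features("You idiot, you absolute moron!", HATE_LEXICON)
print(f)
```

The ratio normalizes for message length, so a short message consisting mostly of insults scores higher than a long message that contains a single one.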

3.5 Linguistic Features

Linguistic aspects also play an important role for hate speech detection. Linguistic features are either
employed in a more generic fashion or are specifically tailored to the task.

3.6 Knowledge-Based Features

Hate speech detection is a task that cannot be solved by simply looking at keywords. Even if one tries to
model larger textual units, as researchers attempt to do by means of linguistic features, it remains
difficult to decide whether some utterance represents hate speech or not. For instance, (5) may not be
regarded as some form of hate speech when only read in isolation.
(5) Put on a wig and lipstick and be who you really are.
However, when the context information is given that this utterance has been
directed towards a boy on a social media site for adolescents, one could infer that this is a remark to
malign the sexuality or gender identity of the boy being addressed.

3.7 Meta-Information
Meta-information (i.e. information about an utterance) is also a valuable source for hate speech
detection. Since the text commonly used as data for this task almost exclusively comes from social
media platforms, a variety of such meta-information is usually offered and can be easily accessed via the
APIs those platforms provide.

3.8 Multimodal Information

Modern social media do not only consist of text but also include images, video and audio content. Such
non-textual content is also regularly commented on, and therefore becomes part of the discourse of a
hate speech utterance. This context outside a written user comment can be used as a predictive feature.

4 Anticipating Alarming Societal Changes

Apart from detecting individual, isolated hateful comments and classifying the types of users involved,
the overall proportion of extreme negative posts over a certain time-span also allows for interesting
avenues of research. Insights into changes in public or personal mood can be gained. Information on
notable increases in the number of hateful posts within a short time span might indicate suspicious
developments in a community. Such information could be utilized to circumvent incidents such as racial
violence, terrorist attacks, or other crimes before they happen, thus providing steps in the direction of
anticipatory governance.
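
The idea of watching for notable increases within a short time span can be sketched as follows in Python. This is only one possible heuristic: the window size, the threshold factor and the example counts are all assumptions made for the illustration, and the per-post labels are assumed to come from an upstream hate speech classifier.

```python
def hateful_proportions(windows):
    """windows: list of (hateful_count, total_count) pairs, one per time window."""
    return [hateful / total if total else 0.0 for hateful, total in windows]


def flag_spikes(proportions, history=3, factor=2.0):
    """Flag window i if its proportion exceeds `factor` times the
    mean proportion of the previous `history` windows."""
    flagged = []
    for i in range(history, len(proportions)):
        baseline = sum(proportions[i - history:i]) / history
        if baseline > 0 and proportions[i] > factor * baseline:
            flagged.append(i)
    return flagged


# Hypothetical daily counts: (posts classified as hateful, all posts).
daily = [(2, 100), (3, 100), (2, 100), (9, 100), (3, 100)]
props = hateful_proportions(daily)
spikes = flag_spikes(props)
print(spikes)
```

A flagged window is only a signal that a community deserves a closer look. It says nothing about the cause of the increase.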

We did all of the coding in MATLAB. Here are some screenshots of the main program.

This code is taken from the "train file", which is used to train the system on the data. It stores the trained data as
input.

This is part of the file above. We train on our data using the MFCC algorithm, which is applied to audio recordings.
This code contains the trained data.

// Load the OpenCV bindings for Node.js.
var cv = require('opencv');

// Rectangle color and line thickness used to mark detections.
var color = [0, 255, 0];
var thickness = 2;

// Trained cascade classifier file.
var cascadeFile = './my_cascade.xml';

// Images to run the detector on.
var inputFiles = [
  './recognize_this_1.jpg', './recognize_this_2.jpg', './recognize_this_3.jpg',
  './recognize_this_4.jpg', './recognize_this_5.jpg'
];

inputFiles.forEach(function(fileName) {
  cv.readImage(fileName, function(err, im) {
    if (err) throw err;
    im.detectObject(cascadeFile, {neighbors: 2, scale: 2}, function(err, objects) {
      if (err) throw err;
      console.log(objects);
      // Draw a rectangle around every detected object.
      for (var k = 0; k < objects.length; k++) {
        var object = objects[k];
        im.rectangle(
          [object.x, object.y],
          [object.x + object.width, object.y + object.height],
          color,
          thickness
        );
      }
      // Save the annotated image under a new name.
      im.save(fileName.replace(/\.jpg$/, 'processed.jpg'));
    });
  });
});
