FAKE NEWS DETECTION USING MACHINE LEARNING
Kamini. A, Dhanush Kumar. K
Department of Artificial Intelligence and Data Science
Agni College of Technology, Thalambur, Chennai.
ABSTRACT
The internet is one of the most important inventions, and many people use it for a wide range
of purposes. Social media platforms are open to these users, and anyone can post or share
news on them. Unfortunately, these platforms do not verify users or their posts, so some
people use them to spread fake news. Fake news can be propaganda against a person, a
society, an organization, or a political party. Because no human can detect all fake news
manually, there is a need for machines that can detect it automatically. This systematic
literature review discusses the use of machine learning classification systems to detect
fake news.
Keywords: Online fake news, Machine learning, Fake news, Text Classification, Social
media.
I. INTRODUCTION
The world is evolving rapidly, and while the digital world has many benefits, it also has
drawbacks. One of these is fake news, which can be spread to damage the reputation of a
person or an organization through online platforms such as Facebook and Twitter. Machine
learning, a branch of artificial intelligence, is the process of making systems capable of
learning and performing various tasks. Machine learning algorithms range from supervised to
unsupervised and reinforcement-based. These algorithms must first be trained on a data set,
known as the training data set, before they can be used to carry out a task. Machine learning
is applied in many sectors, most often for prediction or for detecting hidden information.
Many researchers are working on the detection of fake news, and machine learning is proving
very useful in this field. According to Wang (2017), fake news detection is very difficult,
and machine learning was used to address it. According to Zhou (2019), the amount of fake
news is increasing over time, so there is a need to detect it. Machine learning algorithms
are trained specifically for this task and, once trained, detect fake news automatically.
This literature review provides answers to several research questions. It also discusses the
importance of machine learning for fake news detection and the machine learning algorithms
used for it.
II. METHODOLOGY
The purpose of this literature review is to answer certain research questions. The
methodology used is therefore a systematic literature review. The papers included in this
review were collected from different databases.
III. EXISTING SYSTEM
There is a large body of research on machine learning methods for deception detection, most
of it focused on classifying online reviews and publicly available social media posts. Since
late 2016, during the American presidential election, the question of identifying 'fake news'
has also received particular attention in the literature. Conroy, Rubin, and Chen outline
several approaches that seem promising for accurately classifying misleading articles. They
note that simple content-related n-grams and shallow part-of-speech tagging have proven
insufficient for the classification task, often failing to account for important context
information. Rather, these methods have been shown to be useful only in tandem with more
complex methods of analysis. Deep syntax analysis using Probabilistic Context-Free Grammars
has been shown to be especially valuable in combination with n-gram methods. Feng, Banerjee,
and Choi achieve 85%-91% accuracy in deception-related classification tasks using online
review corpora.
IV. PROPOSED SYSTEM
In this article, the model is built on a count vector, that is, a matrix of word counts.
Since this problem is a type of text classification, Naive Bayes classification is a natural
choice because it is a standard method for text-based processing. The real goal in developing
the model was to decide how to convert the text into features and which type of text to use.
The next step is to extract the optimal features from the count vector: using the n most
frequent words and/or phrases, with or without lowercasing, removing stop words (common
words such as "the", "when", and "there"), and keeping only words that appear at least a
certain number of times in the given text data.
V. ALGORITHMS
Naive Bayes
A supervised learning algorithm based on a probabilistic classification technique.
It is a fast and effective algorithm for predictive modelling.
In this project, the Multinomial Naive Bayes classifier is used.
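As a rough sketch of how a Multinomial Naive Bayes classifier can be trained on word counts, the following uses scikit-learn. The six labelled sentences and the "FAKE"/"REAL" label names are invented for illustration and are not taken from the project's data set.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labelled examples (assumed, for illustration only).
train_texts = [
    "aliens secretly control the government",
    "miracle cure doctors hate revealed",
    "shocking secret celebrity clone exposed",
    "parliament passes annual budget bill",
    "central bank holds interest rates steady",
    "city council approves new transit plan",
]
train_labels = ["FAKE", "FAKE", "FAKE", "REAL", "REAL", "REAL"]

# Word counts feed directly into the multinomial model.
nb_model = make_pipeline(CountVectorizer(), MultinomialNB())
nb_model.fit(train_texts, train_labels)

new_text = ["shocking miracle cure exposed"]
prediction = nb_model.predict(new_text)[0]
probabilities = nb_model.predict_proba(new_text)[0]
classes = nb_model.named_steps["multinomialnb"].classes_

print(prediction)
print(dict(zip(classes, probabilities.round(3))))
```

The classifier combines the class prior with per-word likelihoods estimated from the counts, so words seen only in one class push the prediction toward that class.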
Support Vector Machine (SVM)
SVMs are a set of supervised learning methods used for classification and regression.
They are effective in high-dimensional spaces.
They use only a subset of the training points (the support vectors) in the decision function,
so they are also memory efficient.
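The memory-efficiency point can be illustrated with a small, assumed example: after fitting a linear SVM with scikit-learn's SVC, only some of the training points are retained as support vectors. The two-dimensional feature values below are hypothetical stand-ins for real text features.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical 2-D feature vectors for two well-separated classes.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [4.0, 4.0], [4.0, 5.0], [5.0, 4.0], [5.0, 5.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

svm_model = SVC(kernel="linear", C=1.0)
svm_model.fit(X, y)

# support_ holds the indices of the training points kept as support vectors.
n_support = len(svm_model.support_)
predictions = svm_model.predict([[0.5, 0.5], [4.5, 4.5]])

print(n_support, "of", len(X), "training points are support vectors")
print(predictions)
```

Only the points closest to the class boundary define the decision function; the remaining training points could be discarded without changing the predictions.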
Logistic Regression
A linear model for classification rather than regression.
The expected value of the response variable is modelled as a function of a linear combination
of the values taken by the predictors.
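To make the description of logistic regression concrete, the sketch below fits scikit-learn's LogisticRegression on assumed toy data. It then recomputes the predicted probability by hand as the logistic (sigmoid) function of a linear combination of the predictors. The feature values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical two-feature data: class 1 has a large first feature.
X = np.array([[0.0, 3.0], [1.0, 2.0], [0.0, 2.0],
              [3.0, 0.0], [2.0, 1.0], [3.0, 1.0]])
y = np.array([0, 0, 0, 1, 1, 1])

lr_model = LogisticRegression()
lr_model.fit(X, y)

x_new = np.array([[2.0, 0.0]])

# Linear combination of the predictors, then the logistic function.
linear_score = x_new @ lr_model.coef_.T + lr_model.intercept_
manual_prob = (1.0 / (1.0 + np.exp(-linear_score))).ravel()

# Probability of class 1 as reported by the fitted model.
model_prob = lr_model.predict_proba(x_new)[:, 1]
predicted_class = lr_model.predict(x_new)[0]

print(manual_prob, model_prob, predicted_class)
```

The two probabilities agree, which shows that the model is linear in the predictors and only the final squashing step turns the score into a class probability.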
VI. LITERATURE SURVEY
VII. SOURCE CODE
# Standard library
import csv
import json
import pprint
import random

# Third-party libraries
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from tensorflow.keras import regularizers
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical
from tensorflow.python.framework import ops

# Run TensorFlow in graph (TF1-style) mode
tf.disable_eager_execution()

# Read the data and preview the first rows
data = pd.read_csv("news.csv")
data.head()
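A possible next step, sketched here under stated assumptions, is to split the data into a training set and a held-out test set with train_test_split. The column names "text" and "label" are hypothetical, since the layout of news.csv is not shown, and a small in-memory DataFrame stands in for the file so that the example runs on its own.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for news.csv; the "text" and "label" columns are assumed names.
data = pd.DataFrame({
    "text": [f"article {i}" for i in range(10)],
    "label": ["FAKE", "REAL"] * 5,
})

# Hold out 20% of the rows for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    data["text"],
    data["label"],
    test_size=0.2,
    random_state=42,
    stratify=data["label"],
)

print(len(X_train), len(X_test))
```

Stratifying on the label keeps the proportion of fake and real items the same in both parts, which matters when one class is much rarer than the other.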
VIII. OUTPUT
IX. CONCLUSION
Many people consume news from social media instead of traditional news media. However,
social media has also been used to spread fake news, which has negative effects on
individuals and society. This paper presents an innovative model for detecting fake news
using machine learning algorithms. The model takes news events as input and, using Twitter
insights and classification algorithms, predicts the extent to which the news is fake or
real. In this phase, the feasibility of the project is analysed and a business proposal is
presented with a general project plan and some cost estimates. During system analysis, a
feasibility study of the planned system must be conducted to ensure that the proposed system
is not a burden on the company. For the feasibility analysis, it is necessary to understand
the most important requirements of the system. The purpose of this study is to check the
financial impact of the system on the organization. There is a limit to the amount of money
a company can spend on system research and development, and expenses must be justified. The
system was therefore developed within budget, which was possible because most of the
technologies used are freely available. Only customized products had to be purchased.
X. REFERENCES
1. Hadeer Ahmed, Issa Traore and Sherif Saad, "Detection of online fake news using n-
gram analysis and machine learning techniques", International Conference on
Intelligent Secure and Dependable Systems in Distributed and Cloud Environments,
pp. 127-138, 2017.
2. Chih-Chung Chang and Chih-Jen Lin, LIBSVM - A Library for Support Vector
Machines, July 2018.
3. Niall J Conroy, Victoria L Rubin and Yimin Chen, "Automatic deception detection:
Methods for finding fake news", Proceedings of the Association for Information
Science and Technology, vol. 52, no. 1, pp. 1-4, 2015.
4. Chris Faloutsos, "Access methods for text", ACM Computing Surveys (CSUR), vol.
17, no. 1, pp. 49-74, 1985.
5. Mykhailo Granik and Volodymyr Mesyura, "Fake news detection using naive bayes
classifier", 2017 IEEE First Ukraine Conference on Electrical and Computer
Engineering (UKRCON), pp. 900-903, 2017.
6. Getting Real about Fake News, 2016.
7. Juan Ramos et al., "Using tf-idf to determine word relevance in document
queries", Proceedings of the first instructional conference on machine learning, vol.
242, pp. 133-142, 2003.
8. D S K R Vivek Singh and Rupanjal Dasgupta, Automated fake news detection using
linguistic analysis and machine learning.
9. William Yang Wang, "Liar, liar pants on fire": A new benchmark dataset for fake news
detection, 2017.