2012, Sixth International Aaai Conference on Weblogs and Social Media
News articles are extremely time sensitive by nature. There is also intense competition among news items to propagate as widely as possible. Hence, the task of predicting the popularity of news items on the social web is both interesting and challenging. Prior research has dealt with predicting eventual online popularity based on early popularity. It is most desirable, however, to predict the popularity of items prior to their release, which makes it possible to modify an article and the manner of its publication in advance. In this paper, we construct a multi-dimensional feature space derived from properties of an article and evaluate the efficacy of these features as predictors of online popularity. We examine both regression and classification algorithms and demonstrate that, despite randomness in human behavior, it is possible to predict ranges of popularity on Twitter with an overall 84% accuracy. Our study also serves to illustrate the differences between traditionally prominent sources and those immensely popular on the social web.
With the expansion of the Internet, more and more people enjoy reading and sharing online news articles. The number of shares under a news article indicates how popular the news is. In this project, we intend to find the best model and set of features to predict the popularity of online news, using machine learning techniques. Our data comes from Mashable, a well-known online news website. We implemented 10 different learning algorithms on the dataset, ranging from various regressions to SVM and Random Forest. Their performances are recorded and compared. Feature selection methods are used to improve performance and reduce the number of features. Random Forest turns out to be the best model for prediction, and it can achieve an accuracy of 70% with optimal parameters. Our work can help online news companies to predict news popularity before publication.
Online news has been the most active and widely used source of news consumption in the information era. The popularity of a news article can be readily gauged by the number of shares it receives. In this paper, we present several supervised learning techniques to predict the number of shares (using regression) and the class label of discretized share buckets (using classification). The dataset contains a curated collection of numeric and categorical features gathered by scraping ~39,000 news articles shared on the well-known online news website mashable.com, and was obtained from the UC Irvine Machine Learning Repository. We implemented regression, binary classification and 3-class multi-label classification. To address the prediction task, we used 6 learning algorithms including Linear Regression, Logistic Regression, Decision Tree classifiers, Random Forests, Artificial Neural Networks, etc., with multiple flavours of each built by hyper-parameter tuning at various stages. Their performances have been recorded, compared and presented. Feature selection and dimensionality reduction were performed to reduce the time and computational complexity of the learners. In the binary classification problem, an extreme gradient boosting classifier gave the best results, with an accuracy or PPV of 65% and a precision score of 0.62. In the 3-class multi-label classification problem, we achieved an accuracy of 58% and a precision score of 0.52 using artificial neural networks. In the remainder of the paper, we explain the process of data cleaning, feature selection, dimensionality reduction, model selection, evaluation and summary with respect to the dataset and the stages followed.
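As an illustration of what "discretized share buckets" can mean, the following is a minimal sketch rather than the authors' procedure. The abstract does not state how the buckets were chosen, so the median split for the binary case, the cut points 500 and 3000 for the 3-class case, and the share counts themselves are all assumptions made up for this example.

```python
from bisect import bisect_right
from statistics import median


def bucketize(value, cut_points):
    """Return the index of the bucket that `value` falls into.

    `cut_points` must be sorted in ascending order. A value equal to a
    cut point is placed in the higher bucket.
    """
    return bisect_right(cut_points, value)


# Made-up share counts, for illustration only.
shares = [120, 450, 900, 1400, 2300, 5600, 15000]

# Binary labels: assumed split at the median share count.
binary_cut = [median(shares)]
binary_labels = [bucketize(s, binary_cut) for s in shares]

# Three-class labels: hypothetical cut points, not taken from the paper.
three_cut = [500, 3000]
three_labels = [bucketize(s, three_cut) for s in shares]

print(binary_labels)
print(three_labels)
```

The resulting integer labels can then be fed to any classifier, while the raw share counts remain the target for the regression variant.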
Computación y Sistemas, 2017
In this study, we identify the features of an article that encourage people to leave a comment on it. The volume of comments received by a news article shows its importance. It also indirectly indicates the amount of influence a news article has on the public. Leaving a comment on a news article indicates not only that the visitor has read the article but also that the article has been important to him/her. We propose a machine learning approach to predict the volume of comments using information extracted about users' activities on the web pages of news agencies. In order to evaluate the proposed method, several experiments were performed. The results reveal salient improvement in comparison with the baseline methods.
International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2022
In this Internet era, everything is available online, and so is news. People now use online platforms like Facebook, Twitter and Instagram to read and share the news. Predicting the popularity of news accurately holds immense value for news providers, including online stakeholders and advertisers. In this research paper, we aim to find the best model for predicting the popularity of online news articles. The data for our project has been taken from a website named Mashable. We used various models on the dataset and found that SVR with PCA yielded the best results.
IRJET, 2021
With the expansion of the Internet, more and more people enjoy reading and sharing online news articles. Most people nowadays have switched to online mode for getting their daily news. As news articles are shared over the internet through social media platforms and brought to the attention of many people, they are likely to gain popularity. Predicting news popularity prior to publication has proved to be useful for online news publishers. Various machine learning methods have been applied to predicting online news popularity. This research work presents a comparative analysis of such machine learning techniques and chooses the most suitable technique for the problem of predicting the popularity of online news articles.
Proceedings of the 25th International Conference Companion on World Wide Web - WWW '16 Companion
The paper presents a framework for the prediction of several news story popularity indicators, such as comment count, number of users, vote score and a measure of controversiality. The framework employs a feature engineering approach, focusing on features from two sources of social interactions inherent in online discussions: the comment tree and the user graph. We show that the proposed graph-based features capture the complexities of both these social interaction graphs and lead to improvements on the prediction of all popularity indicators in three online news post datasets and to significant improvement on the task of identifying controversial stories. Specifically, we noted a 5% relative improvement in mean square error for controversiality prediction on a news-focused Reddit dataset compared to a method employing only rudimentary comment tree features that were used by past studies.
WIMS'11, 2011
Understanding user participation is fundamental in anticipating the popularity of online content. In this paper, we explore how the number of users' comments during a short observation period after publication can be used to predict the expected popularity of articles published by a countrywide online newspaper. We evaluate a simple linear prediction model on a real dataset of hundreds of thousands of articles and several millions of comments collected over a period of four years. By analyzing the accuracy of our proposed model for different values of its basic parameters, we provide valuable insights into the potential and limitations of predicting content popularity based on early user activity.
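The idea of a simple linear prediction model based on early comments can be sketched as follows. This is an ordinary least-squares fit under assumed conditions, not the model evaluated in the paper: the abstract does not give the model's exact form or parameters, and the comment counts below are synthetic values invented for the example.

```python
from statistics import mean


def fit_linear(early, final):
    """Fit final ≈ a * early + b by ordinary least squares."""
    x_bar, y_bar = mean(early), mean(final)
    sxx = sum((x - x_bar) ** 2 for x in early)
    if sxx == 0:
        raise ValueError("early counts must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(early, final))
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b


def predict(a, b, early_count):
    """Predict the final comment count from an early comment count."""
    return a * early_count + b


# Synthetic data: comments seen in a short observation window (early)
# and the eventual total (final). These numbers are invented.
early_comments = [2, 5, 9, 14, 20]
final_comments = [7, 16, 28, 43, 61]

slope, intercept = fit_linear(early_comments, final_comments)
print(slope, intercept)
print(predict(slope, intercept, 10))
```

In practice the length of the observation window would be one of the parameters to vary, since a longer window gives more information at the cost of a later prediction.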
Social Network Analysis and Mining, 2014
News articles are an engaging type of online content that captures the attention of a significant number of Internet users. They are particularly enjoyed by mobile users and massively spread through online social platforms. As a result, there is an increased interest in discovering the articles that will become popular among users. This objective falls under the broad scope of content popularity prediction and has direct implications in the development of new services for online advertisement and content distribution. In this paper, we address the problem of predicting the popularity of news articles based on user comments. We formulate the prediction task as a ranking problem, where the goal is not to infer the precise attention that a piece of content will receive but to accurately rank articles based on their predicted popularity. Using data obtained from two important news sites in France and the Netherlands, we analyze the ranking effectiveness of two prediction models. Our results indicate that popularity prediction methods are adequate solutions for this ranking task.
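To make the ranking formulation concrete, here is a minimal sketch of one way to score a ranking: precision at k, the fraction of the predicted top-k articles that are also in the actual top-k. The abstract does not say which ranking measure the authors used, so this metric is an assumption, and the article ids and scores are invented.

```python
def top_k(scores, k):
    """Return the ids of the k highest-scoring articles."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def precision_at_k(predicted, actual, k):
    """Fraction of the predicted top-k that is also in the actual top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(top_k(predicted, k) & top_k(actual, k)) / k


# Invented example: predicted popularity scores and observed comment counts.
predicted = {"a": 120.0, "b": 40.0, "c": 95.0, "d": 10.0, "e": 60.0}
actual = {"a": 300, "b": 80, "c": 50, "d": 5, "e": 210}

print(precision_at_k(predicted, actual, 2))
```

A measure of this kind rewards getting the order right even when the predicted counts themselves are far from the observed ones, which is the point of treating the task as ranking.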
SMART MOVES JOURNAL IJOSCIENCE, 2018
News popularity is the maximum growth of attention given to a particular news article. The popularity of online news depends on various factors such as the number of social media, the number of visitor comments, the number of Likes, etc. It is therefore necessary to build an automatic decision support system to predict the popularity of news, as it will help in business intelligence too. The work presented in this study aims to find the best model to predict the popularity of online news using machine learning methods. In this work, the result analysis is performed by applying a correlation algorithm, particle swarm optimization and principal component analysis. For performance evaluation, support vector machine, naïve Bayes, k-nearest neighbor and neural network classifiers are used to classify the popular and unpopular data. From the experimental results, it is observed that support vector machine and naïve Bayes perform better with the correlation algorithm as well as k-NN and neu...