Papers by Mahmoudreza Babaei

As Internet users increasingly rely on social media sites like Facebook and Twitter to receive news, they are faced with a bewildering number of news media choices. For example, thousands of Facebook pages today are registered and categorized as some form of news media outlet. Inferring the bias (or slant) of these media pages poses a difficult challenge for media watchdog organizations that traditionally rely on content analysis. In this paper, we explore a novel scalable methodology to accurately infer the biases of thousands of news sources on social media sites like Facebook and Twitter. Our key idea is to utilize their advertiser interfaces, which offer detailed insights into the demographics of the news source’s audience on the social media site. We show that the ideological (liberal or conservative) leaning of a news source can be accurately estimated by the extent to which liberals or conservatives are over-/under-represented among its audience. Additionally, we show how bia...
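The over-/under-representation idea in the abstract above can be illustrated with a small sketch. The function name, the input shares, and the log-ratio scoring are assumptions made for illustration; the abstract does not specify the exact estimator the authors use.

```python
import math

def leaning_score(lib_audience, con_audience, lib_baseline, con_baseline):
    """Hypothetical leaning score for one news source.

    Each argument is a fraction in (0, 1]: the share of liberals or
    conservatives in the source's audience and in the platform-wide
    baseline population. A positive score means conservatives are
    over-represented relative to liberals; a negative score means the
    opposite; zero means the audience mirrors the baseline.
    """
    lib_ratio = lib_audience / lib_baseline   # > 1: liberals over-represented
    con_ratio = con_audience / con_baseline   # > 1: conservatives over-represented
    return math.log(con_ratio / lib_ratio)

# Illustrative numbers only, not data from the paper.
balanced = leaning_score(0.2, 0.2, 0.2, 0.2)
liberal_leaning = leaning_score(0.4, 0.1, 0.2, 0.2)
conservative_leaning = leaning_score(0.1, 0.4, 0.2, 0.2)
```

The log-ratio form is chosen here only because it treats the two directions symmetrically; any monotone comparison of the two representation ratios would convey the same idea.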

ArXiv, 2021
Influence maximization has found applications in a wide range of real-world problems, for instance, viral marketing of products in an online social network, and information propagation of valuable information such as job vacancy advertisements and health-related information. While existing algorithmic techniques usually aim at maximizing the total number of people influenced, the population often comprises several socially salient groups, e.g., based on gender or race. As a result, these techniques could lead to disparity across different groups in receiving important information. Furthermore, in many of these applications, the spread of influence is time-critical, i.e., it is only beneficial to be influenced before a time deadline. As we show in this paper, the time-criticality of the information could further exacerbate the disparity of influence across groups. This disparity, introduced by algorithms aimed at maximizing total influence, could have far-reaching consequences, impac...
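To make the interaction between a deadline and group disparity concrete, here is a minimal simulation sketch. It assumes an independent cascade diffusion model in which each hop takes one time step; the abstract does not name the diffusion model, so this choice, along with all function and parameter names, is hypothetical.

```python
import random
from collections import defaultdict

def influenced_by_deadline(graph, seeds, p, deadline, rng):
    """One independent-cascade run, cut off after `deadline` time steps.

    graph: dict mapping a node to a list of its out-neighbors.
    p: probability that a newly active node activates each neighbor.
    """
    active = set(seeds)
    frontier = set(seeds)
    for _ in range(deadline):
        newly_active = set()
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in active and v not in newly_active and rng.random() < p:
                    newly_active.add(v)
        if not newly_active:
            break
        active |= newly_active
        frontier = newly_active
    return active

def group_reach(graph, seeds, groups, p, deadline, runs=100, seed=0):
    """Average fraction of each group influenced before the deadline."""
    rng = random.Random(seed)
    sizes = defaultdict(int)
    for g in groups.values():
        sizes[g] += 1
    totals = defaultdict(float)
    for _ in range(runs):
        for v in influenced_by_deadline(graph, seeds, p, deadline, rng):
            totals[groups[v]] += 1
    return {g: totals[g] / (runs * sizes[g]) for g in sizes}

# Toy path graph: group B sits farther from the seed than group A.
toy_graph = {0: [1], 1: [2], 2: [3]}
toy_groups = {0: "A", 1: "A", 2: "B", 3: "B"}
tight = group_reach(toy_graph, [0], toy_groups, p=1.0, deadline=1)
loose = group_reach(toy_graph, [0], toy_groups, p=1.0, deadline=3)
```

In this toy setting, a tight deadline reaches only the group closest to the seed, while a loose deadline reaches both groups, which is the kind of deadline-induced disparity the abstract describes.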
The top 5 ranked news stories prioritized according to the three objectives of social media platforms for selecting stories for fact checking. The low overlap between the three ranked lists highlights the complementary nature of the objectives.

On social media platforms like Twitter, users are often interested in gaining more influence and popularity by growing their set of followers, also known as their audience. Several studies have described the properties of users on Twitter based on static snapshots of their follower network. Other studies have analyzed the general process of link formation. Here, rather than investigating the dynamics of this process itself, we study how the characteristics of the audience and follower links change as the audience of a user grows in size on the road to the user's popularity. To begin with, we find that the early followers tend to be more elite users than the late followers, i.e., they are more likely to have verified and expert accounts. Moreover, the early followers are significantly more similar to the person that they follow than the late followers. Namely, they are more likely to share time zone, language, and topics of interest with the followed user. To some extent, these phenomena are...

IEEE Transactions on Computational Social Systems, 2021
Misinformation on social media has become a critical problem, particularly during a public health pandemic. Most social platforms today rely on users' voluntary reports to determine which news stories to fact-check first. Despite the importance, no prior work has explored the potential biases in such a reporting process. This work proposes a novel methodology to assess how users perceive truth or misinformation in online news stories. By conducting a large-scale survey (N = 15,000), we identify the possible biases in news perceptions and explore how partisan leanings influence the news selection algorithm for fact-checking. Our survey reveals several perception biases or inaccuracies in estimating the truth level of stories. The first kind, called the total perception bias (TPB), is the aggregate difference between the ground truth and the perceived truth level. The next two are the false-positive bias (FPB) and false-negative bias (FNB), which measure users' gullibility and cynical...

The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. Much recent work has focused on developing algorithmic tools to assess and mitigate such unfairness. However, there is little work on enhancing fairness in graph algorithms. Here, we develop a simple, effective and general method, CrossWalk, that enhances fairness of various graph algorithms, including influence maximization, link prediction and node classification, applied to node embeddings. CrossWalk is applicable to any random walk based node representation learning algorithm, such as DeepWalk and Node2Vec. The key idea is to bias random walks to cross group boundaries, by upweighting edges which (1) are closer to the groups’ peripheries or (2) connect different groups in the network. CrossWalk pulls nodes that are near groups’ peripheries towards their neighbors from other groups in the embedding space, while preserving the necessary str...
A growing number of people rely on social media platforms, such as Twitter and Facebook, for their news and information needs [15, 24], where users themselves play a role in selecting the sources from which they consume information, overthrowing traditional journalistic gatekeeping [23]. Since users can only select their information sources, they do not have full control over the content they receive. Moreover, it is very hard to ascertain the quality, relevance, and credibility of information produced by social media users [1, 6, 7], raising interesting questions such as: (i) how efficient are users at selecting their information sources? (ii) how do users perceive the truthfulness of information? and (iii) how does the consumed information impact users and society?

Social media systems have increasingly become digital information marketplaces, where users produce, consume and share information and ideas, often of public interest. In this context, social media users are their own curators of information; however, they can only select their information sources (whom they follow) and cannot choose the information they are exposed to (which content they receive). A natural question is thus how efficient users are at selecting their information sources. In this work, we model social media users as information processing systems whose goal is acquiring a set of (unique) pieces of information. We then define a computational framework, based on minimal set covers, that allows us to evaluate every user's performance as an information curator within the system. Our framework is general and applicable to any social media system where every user follows others within the system to receive the information they produce. We leverage our frame...

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Influence maximization is a widely studied topic in network science, where the aim is to reach the maximum possible number of nodes, while only targeting a small initial set of individuals. It has critical applications in many fields, including viral marketing, information propagation, news dissemination, and vaccinations. However, the objective does not usually take into account whether the final set of influenced nodes is fair with respect to sensitive attributes, such as race or gender. Here we address fair influence maximization, aiming to reach minorities more equitably. We introduce Adversarial Graph Embeddings: we co-train an auto-encoder for graph embedding and a discriminator to discern sensitive attributes. This leads to embeddings which are similarly distributed across sensitive attributes. We then find a good initial set by clustering the embeddings. We believe we are the first to use embeddings for the task of fair influence maximization. While there are typically trade...

Proceedings of the Conference on Fairness, Accountability, and Transparency
Misinformation on social media has become a critical problem, particularly during a public health pandemic. Most social platforms today rely on users' voluntary reports to determine which news stories to fact-check first. Despite the importance, no prior work has explored the potential biases in such a reporting process. This work proposes a novel methodology to assess how users perceive truth or misinformation in online news stories. By conducting a large-scale survey (N = 15,000), we identify the possible biases in news perceptions and explore how partisan leanings influence the news selection algorithm for fact-checking. Our survey reveals several perception biases or inaccuracies in estimating the truth level of stories. The first kind, called the total perception bias (TPB), is the aggregate difference between the ground truth and the perceived truth level. The next two are the false-positive bias (FPB) and false-negative bias (FNB), which measure users' gullibility and cynicality toward a given claim. We also propose the ideological mean perception bias (IMPB), which quantifies a news story's ideological disputability. Collectively, these biases indicate that user perceptions are not correlated with the ground truth of news stories; users believe some stories to be more false than they are, and others to be more true. This calls for fact-checking the news stories that exhibit the largest perception biases first, which the current voluntary reporting does not offer. Based on these observations, we propose a new framework that can best leverage users' truth perceptions to (1) remove false stories, (2) correct misperceptions of users, or (3) decrease ideological disagreements. We discuss how this new prioritizing scheme can help platforms significantly reduce the impact of fake news on user beliefs.
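The bias measures named in this abstract can be sketched for a single story as follows. The abstract gives only verbal definitions, so the concrete formulas, the [0, 1] truth scale, the 0.5 threshold, and all function names below are assumptions for illustration rather than the paper's definitions.

```python
from statistics import mean

def total_perception_bias(perceived, truth):
    """Assumed TPB: mean absolute gap between perceived and ground truth."""
    return mean(abs(p - truth) for p in perceived)

def false_positive_bias(perceived, truth, threshold=0.5):
    """Assumed FPB (gullibility): share of users rating a false story as true."""
    if truth >= threshold:
        return 0.0  # not applicable to a true story
    return mean(1.0 if p >= threshold else 0.0 for p in perceived)

def false_negative_bias(perceived, truth, threshold=0.5):
    """Assumed FNB (cynicality): share of users rating a true story as false."""
    if truth < threshold:
        return 0.0  # not applicable to a false story
    return mean(1.0 if p < threshold else 0.0 for p in perceived)

def ideological_mean_perception_bias(perceived_a, perceived_b):
    """Assumed IMPB: gap between the mean perceptions of two ideological groups."""
    return abs(mean(perceived_a) - mean(perceived_b))

# Hypothetical ratings of one false story (truth level 0.0) by two groups.
group_a = [0.9, 0.8]
group_b = [0.2, 0.1]
ratings = group_a + group_b
tpb = total_perception_bias(ratings, truth=0.0)
fpb = false_positive_bias(ratings, truth=0.0)
fnb = false_negative_bias(ratings, truth=0.0)
impb = ideological_mean_perception_bias(group_a, group_b)
```

Under these assumed definitions, a platform could rank stories by any one of the scores, which mirrors the three prioritization objectives listed at the end of the abstract.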

Proceedings of the Conference on Fairness, Accountability, and Transparency - FAT* '19, 2019
Targeted advertising is meant to improve the efficiency of matching advertisers to their customers. However, targeted advertising can also be abused by malicious advertisers to efficiently reach people susceptible to false stories, stoke grievances, and incite social conflict. Since targeted ads are not seen by non-targeted and non-vulnerable people, malicious ads are likely to go unreported and their effects undetected. This work examines a specific case of malicious advertising, exploring the extent to which political ads from the Russian Intelligence Research Agency (IRA) run prior to the 2016 U.S. elections exploited Facebook's targeted advertising infrastructure to efficiently target ads on divisive or polarizing topics (e.g., immigration, race-based policing) at vulnerable subpopulations. In particular, we do the following: (a) We conduct U.S. census-representative surveys to characterize how users with different political ideologies report, approve, and perceive truth in the content of the IRA ads. Our surveys show that many ads are "divisive": they elicit very different reactions from people belonging to different socially salient groups. (b) We characterize how these divisive ads are targeted to sub-populations that feel particularly aggrieved by the status quo. Our findings support existing calls for greater transparency of content and targeting of political ads. (c) We particularly focus on how the Facebook ad API facilitates such targeting. We show how the enormous amount of personal data Facebook aggregates about users and makes available to advertisers enables such malicious targeting.

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, 2016
Social media sites are information marketplaces, where users produce and consume a wide variety of information and ideas. In these sites, users typically choose their information sources, which in turn determine what specific information they receive, how much information they receive and how quickly this information is shown to them. In this context, a natural question that arises is how efficient social media users are at selecting their information sources. In this work, we propose a computational framework to quantify users' efficiency at selecting information sources. Our framework is based on the assumption that the goal of users is to acquire a set of unique pieces of information. To quantify a user's efficiency, we ask if the user could have acquired the same pieces of information from another set of sources more efficiently. We define three different notions of efficiency (link, inflow, and delay), corresponding to the number of sources the user follows, the amount of (redundant) information she acquires and the delay with which she receives the information. Our definitions of efficiency are general and applicable to any social media system with an underlying information network, in which every user follows others to receive the information they produce. In our experiments, we measure the efficiency of Twitter users at acquiring different types of information. We find that Twitter users exhibit sub-optimal efficiency across the three notions of efficiency, although they tend to be more efficient at acquiring non-popular pieces of information than they are at acquiring popular pieces of information. We then show that this lack of efficiency is a consequence of the triadic closure mechanism by which users typically discover and follow other users in social media. Thus, our study reveals a tradeoff between the efficiency and discoverability of information sources. Finally, we develop a heuristic algorithm that enables users to be significantly more efficient at acquiring the same unique pieces of information.
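The question of whether a user could have acquired the same information from fewer sources can be sketched as a set cover problem. The sketch below uses the standard greedy approximation, which is not necessarily the heuristic the authors developed; the function names, the toy data, and the ratio used as a link-efficiency score are assumptions for illustration.

```python
def greedy_cover(universe, sources):
    """Greedy set cover: repeatedly pick the source covering the most
    still-uncovered items. `sources` maps a source name to the set of
    information pieces it produces."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sources, key=lambda s: len(sources[s] & uncovered))
        gained = sources[best] & uncovered
        if not gained:
            raise ValueError("some items are not produced by any source")
        chosen.append(best)
        uncovered -= gained
    return chosen

def link_efficiency(followed, all_sources):
    """Assumed link efficiency: size of a small alternative cover divided
    by the number of sources the user actually follows."""
    if not followed:
        raise ValueError("the user follows no sources")
    received = set().union(*(all_sources[s] for s in followed))
    cover = greedy_cover(received, all_sources)
    return len(cover) / len(followed)

# Toy network: the user follows a, b, and d, but c and d alone would suffice.
toy_sources = {"a": {1, 2}, "b": {2, 3}, "c": {1, 2, 3}, "d": {4}}
toy_cover = greedy_cover({1, 2, 3, 4}, toy_sources)
toy_efficiency = link_efficiency(["a", "b", "d"], toy_sources)
```

Because the greedy cover can be larger than the true minimum cover, a ratio computed this way can make a user look somewhat more efficient than an exact minimum set cover would.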
In this letter, we studied epidemic spreading on scale-free networks assuming a limited budget for immunization. We proposed a general model in which the immunity of an individual against the disease depends on its immunized friends in the network. Furthermore, we considered the possibility that each individual might be eager to pay a price to buy the vaccine and become immune against the disease. Under these assumptions we proposed an algorithm for improving the performance of all previous immunization algorithms. We also introduced a heuristic extension of the algorithm, which works well in scale-free networks.
Social Network Analysis and Mining, Oct 10, 2011
Social networking has become a part of daily life for many individuals across the world. Widespread adoption of various strategies in such networks can be utilized by business corporations as a powerful means for advertising. In this study, we investigated viral marketing strategies in which buyers are influenced by other buyers who already own an item. Since finding an optimal marketing strategy is NP-hard, a simple strategy has been proposed in which giving the item for free to a subset of influential buyers in a network ...
Physical Review E, Oct 25, 2011
Network science has attracted much attention in recent years, primarily due to its application in many areas ranging from biology to medicine, engineering, and social sciences [1, 2]. Research in network science starts by observing a phenomenon in real data and then tries to construct models to mimic its behavior. Many real-world networks share some common structural properties such as scale-free degree distribution, small-worldness, and modularity. The dynamic behavior of networks largely depends on their structural properties [3, 4]. ...

Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
Although diverse news stories are actively posted on social media, readers often focus on news which reinforces their pre-existing views, leading to 'filter bubble' effects. To combat this, some recent systems expose and nudge readers toward stories with different points of view. One example is the Wall Street Journal's 'Blue Feed, Red Feed' system, which presents posts from biased publishers on each side of a topic. However, these systems have had limited success. In this work, we present a complementary approach which identifies high-consensus 'purple' posts that generate similar reactions from both 'blue' and 'red' readers. We define and operationalize consensus for news posts on Twitter in the context of US politics. We identify several high-consensus posts and discuss their empirical properties. We present a highly scalable method for automatically identifying high- and low-consensus news posts on Twitter, by utilizing a novel category of features based on audience leaning, which we show are well suited to this task. Finally, we build and publicly deploy our 'Purple Feed' system (twitter-app.mpi-sws.org/purple-feed), which highlights high-consensus posts from publishers on both sides of the political spectrum.