Papers by Grant R Y Schoenebeck

arXiv (Cornell University), Jun 22, 2022
In the literature of data privacy, differential privacy is the most popular model. An algorithm is differentially private if its outputs with and without any individual's data are indistinguishable. In this paper, we focus on data generated from a Markov chain and argue that Bayesian differential privacy (BDP) offers more meaningful guarantees in this context. Our main theoretical contribution is providing a mechanism for achieving BDP when data is drawn from a binary Markov chain. We improve on the state-of-the-art BDP mechanism and show that our mechanism provides the optimal noise-privacy tradeoffs for any local mechanism up to negligible factors. We also briefly discuss a non-local mechanism which adds correlated noise. Lastly, we perform experiments on synthetic data that detail when DP is insufficient, and experiments on real data to show that our privacy guarantees are robust to underlying distributions that are not simple Markov chains.
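As a point of reference for the local mechanisms discussed above, here is a minimal sketch of a per-symbol randomized-response scheme applied to binary Markov chain data. This is only a baseline local-DP mechanism, not the paper's optimized BDP mechanism; the function names and parameter values are illustrative.

```python
import math
import random

def randomized_response(bits, epsilon):
    """Flip each bit independently with probability 1/(1 + e^epsilon).

    This per-symbol flip gives each output bit a local-DP guarantee on
    its own; it does NOT by itself account for the Markov correlations
    that motivate BDP.
    """
    p_flip = 1.0 / (1.0 + math.exp(epsilon))
    return [b ^ (random.random() < p_flip) for b in bits]

def sample_binary_markov_chain(n, p_stay):
    """Sample a length-n {0,1} chain that keeps its current state
    with probability p_stay at each step."""
    chain = [random.randrange(2)]
    for _ in range(n - 1):
        chain.append(chain[-1] if random.random() < p_stay else 1 - chain[-1])
    return chain

random.seed(0)
chain = sample_binary_markov_chain(1000, p_stay=0.9)
noisy = randomized_response(chain, epsilon=1.0)
```

With epsilon = 1, each bit flips with probability about 0.27, so the released chain preserves aggregate statistics while masking individual transitions.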

Proceedings of the ... AAAI Conference on Artificial Intelligence, Apr 3, 2020
We study learning statistical properties from strategic agents with private information. In this problem, agents must be incentivized to truthfully reveal their information even when it cannot be directly verified. Moreover, the information reported by the agents must be aggregated into a statistical estimate. We study two fundamental statistical properties: estimating the mean of an unknown Gaussian, and linear regression with Gaussian error. The information of each agent is one point in a Euclidean space. Our main results are two mechanisms for each of these problems which optimally aggregate the information of agents in the truth-telling equilibrium: • A minimal (non-revelation) mechanism for large populations: agents only need to report one value, but that value need not be their point. • A mechanism for small populations that is non-minimal: agents need to answer more than one question. These mechanisms are "informed truthful" mechanisms where reporting unaltered data (truth-telling) 1) forms a strict Bayesian Nash equilibrium and 2) has strictly higher welfare than any oblivious equilibrium where agents' strategies are independent of their private signals. We also show a minimal revelation mechanism (each agent only reports her signal) for a restricted setting and use an impossibility result to prove the necessity of this restriction. We build upon the peer prediction literature in the single-question setting; however, most previous work in this area focuses on discrete signals, whereas our setting is inherently continuous, and we further simplify the agents' reports.


Springer eBooks, 2011
We study a model of learning on social networks in dynamic environments, describing a group of agents who are each trying to estimate an underlying state that varies over time, given access to weak signals and the estimates of their social network neighbors. We study three models of agent behavior. In the fixed response model, agents use a fixed linear combination to incorporate information from their peers into their own estimate. This can be thought of as an extension of the DeGroot model to a dynamic setting. In the best response model, players calculate minimum variance linear estimators of the underlying state. We show that regardless of the initial configuration, fixed response dynamics converge to a steady state, and that the same holds for best response on the complete graph. We show that best response dynamics can, in the long term, lead to estimators with higher variance than is achievable using well chosen fixed responses. The penultimate prediction model is an elaboration of the best response model. While this model only slightly complicates the computations required of the agents, we show that in some cases it greatly increases the efficiency of learning, and on complete graphs is in fact optimal, in a strong sense.
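A minimal simulation sketch of the fixed response model, under the assumption that each agent mixes a fresh noisy signal about the state with the average of its neighbors' current estimates using a fixed weight; the weight, noise level, and network are hypothetical toy choices.

```python
import random

def fixed_response_step(estimates, neighbors, weight, state, noise_sd):
    """One synchronous round: each agent forms a convex combination of a
    fresh weak signal about the state and the average of its neighbors'
    current estimates, using a fixed weight (a DeGroot-style update)."""
    new = []
    for i in range(len(estimates)):
        signal = state + random.gauss(0.0, noise_sd)
        peer_avg = sum(estimates[j] for j in neighbors[i]) / len(neighbors[i])
        new.append(weight * signal + (1 - weight) * peer_avg)
    return new

random.seed(0)
n = 5
neighbors = [[j for j in range(n) if j != i] for i in range(n)]  # complete graph
estimates = [0.0] * n
for _ in range(50):
    estimates = fixed_response_step(estimates, neighbors,
                                    weight=0.2, state=1.0, noise_sd=0.1)
```

After a few dozen rounds the estimates settle near the true state, with a steady-state variance determined by the fixed weight, which is exactly the tradeoff the best response model tries to optimize.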
Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms

Proceedings of the ACM Web Conference 2023
Peer prediction aims to incentivize truthful reports from agents whose reports cannot be assessed against any objective ground truth. In the multi-task setting where each agent is asked multiple questions, a sequence of mechanisms have been proposed which are truthful (truth-telling is guaranteed to be an equilibrium) or, even better, informed truthful (truth-telling is guaranteed to be one of the best-paid equilibria). However, these guarantees assume agents' strategies are restricted to be task-independent: an agent's report on a task is not affected by her information about other tasks. We provide the first discussion of how to design (informed) truthful mechanisms for task-dependent strategies, which allow an agent to report based on all her information about the assigned tasks. We call such stronger mechanisms (informed) omni-truthful. In particular, we propose the joint-disjoint task framework, a new paradigm which builds upon the previous penalty-bonus task framework. First, we show a natural reduction from mechanisms in the penalty-bonus task framework to mechanisms in the joint-disjoint task framework that maps every truthful mechanism to an omni-truthful mechanism. Such a reduction is non-trivial, as we show that current penalty-bonus task mechanisms are not, in general, omni-truthful. Second, for a stronger truthful guarantee, we design the matching agreement (MA) mechanism, which is informed omni-truthful. Finally, for the MA mechanism in the detail-free setting where no prior knowledge is assumed, we show how many tasks are required to (approximately) retain the truthful guarantees. CCS CONCEPTS • Theory of computation → Algorithmic mechanism design; • Information systems → Incentive schemes; • Mathematics of computing → Probability and statistics.
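For context, the penalty-bonus task framework that the joint-disjoint framework builds on can be sketched roughly as follows. This is a generic agreement-based payment, not the paper's matching agreement mechanism, and the task-splitting details here are assumptions for illustration.

```python
def pb_payment(alice, bob, bonus_tasks, penalty_pairs):
    """Generic penalty-bonus style payment for agent `alice` scored
    against peer `bob` (a sketch of the framework, not the paper's
    matching agreement mechanism).

    alice, bob: dicts mapping task id -> report.
    bonus_tasks: shared tasks where agreement is rewarded.
    penalty_pairs: (alice_task, bob_task) pairs of distinct tasks where
    agreement is penalized, cancelling the payoff of blind copying.
    """
    bonus = sum(alice[t] == bob[t] for t in bonus_tasks) / len(bonus_tasks)
    penalty = sum(alice[ta] == bob[tb]
                  for ta, tb in penalty_pairs) / len(penalty_pairs)
    return bonus - penalty

# agents who report one fixed label everywhere earn nothing on net
alice = {t: 1 for t in range(4)}
bob = {t: 1 for t in range(4)}
flat = pb_payment(alice, bob, bonus_tasks=[0, 1], penalty_pairs=[(2, 3), (3, 2)])

# correlated, task-varying reports beat the penalty baseline
alice2 = {0: 1, 1: 0, 2: 1, 3: 0}
bob2 = {0: 1, 1: 0, 2: 1, 3: 0}
informative = pb_payment(alice2, bob2, bonus_tasks=[0, 1],
                         penalty_pairs=[(2, 3), (3, 2)])
```

The abstract's point is that such payments assume task-independent strategies: an agent who conditions her report on which tasks she was assigned can game schemes of this shape, which is what the joint-disjoint framework repairs.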

Proceedings of the 2018 ACM Conference on Economics and Computation
A central question of crowdsourcing is how to elicit expertise from agents. This is even more difficult when answers cannot be directly verified. A key challenge is that sophisticated agents may strategically withhold effort or information when they believe their payoff will be based upon comparison with other agents whose reports will likely omit this information due to lack of effort or expertise. Our work defines a natural model for this setting based on the assumption that more sophisticated agents know the beliefs of less sophisticated agents. We then provide a mechanism design framework for this setting. From this framework, we design several novel mechanisms, for both the single- and multiple-task settings, that (1) encourage agents to invest effort and provide their information honestly; and (2) output a correct "hierarchy" of the information when agents are rational.
Proceedings of the ACM Web Conference 2022

arXiv (Cornell University), Aug 8, 2021
We consider two-alternative elections where voters' preferences depend on a state variable that is not directly observable. Each voter receives a private signal that is correlated to the state variable. Voters may be "contingent," with different preferences in different states, or "predetermined," with the same preference in every state. In this setting, even if every voter is a contingent voter, agents voting according to their private information need not result in the adoption of the universally preferred alternative, because the signals can be systematically biased. We present an easy-to-deploy mechanism that elicits and aggregates the private signals from the voters, and outputs the alternative that is favored by the majority. In particular, voters truthfully reporting their signals forms a strong Bayes Nash equilibrium (where no coalition of voters can deviate and receive a better outcome).
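A toy simulation of the aggregation step, assuming binary signals that match the hidden state with probability 0.6; the accuracy value and population size are illustrative, and this omits the incentive layer that makes truthful reporting an equilibrium.

```python
import random
from collections import Counter

def majority_aggregate(signals):
    """Output the alternative favored by the majority of reported signals."""
    counts = Counter(signals)
    return max(counts, key=counts.get)

random.seed(1)
state = 1  # hidden state variable
# each voter's private signal matches the state with probability 0.6
signals = [state if random.random() < 0.6 else 1 - state
           for _ in range(1001)]
winner = majority_aggregate(signals)
```

With a large electorate and even mildly informative signals, the majority of signals identifies the hidden state with high probability, even in cases where naive voting over biased preferences would not.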

We consider a Bayesian persuasion problem where the sender tries to persuade the receiver to take a particular action via a sequence of signals. This we model by considering multi-phase trials with different experiments conducted based on the outcomes of prior experiments. In contrast to most of the literature, we consider the problem with constraints on signals imposed on the sender. This we achieve by fixing some of the experiments in an exogenous manner; these are called determined experiments. This modeling helps us understand real-world situations where this occurs: e.g., multi-phase drug trials where the FDA determines some of the experiments, start-up acquisition by big firms where late-stage assessments are determined by the potential acquirer, multiround job interviews where the candidates signal initially by presenting their qualifications but the rest of the screening procedures are determined by the interviewer. The non-determined experiments (signals) in the multi-phase...

ACM Transactions on Economics and Computation, 2020
We study the influence maximization problem in undirected networks, specifically focusing on the independent cascade and linear threshold models. We prove APX-hardness (NP-hardness of approximation within factor (1-τ) for some constant τ > 0) for both models, which improves the previous NP-hardness lower bound for the linear threshold model. No previous hardness result was known for the independent cascade model. As part of the hardness proof, we show some natural properties of these cascades on undirected graphs. For example, we show that the expected number of infections of a seed set S is upper bounded by the size of the edge cut of S in the linear threshold model and a special case of the independent cascade model, the weighted independent cascade model. Motivated by our upper bounds, we present a suite of highly scalable local greedy heuristics for the influence maximization problem on both the linear threshold model and the weighted independent cascade model on undirected graphs.
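For comparison with the scalable local heuristics mentioned above, a plain Monte-Carlo greedy baseline for the independent cascade model might look like the following sketch; the function names, trial count, and toy graph are illustrative, not taken from the paper.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """One run of the independent cascade: each newly infected node gets
    one chance to infect each uninfected neighbor, with probability p."""
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph[u]:
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(infected)

def greedy_seeds(graph, k, p, trials=200, seed=0):
    """Monte-Carlo greedy baseline: repeatedly add the node with the
    largest estimated marginal spread. Much slower than the local
    heuristics the abstract describes."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        best, best_val = None, -1.0
        for v in graph:
            if v in chosen:
                continue
            val = sum(simulate_ic(graph, chosen + [v], p, rng)
                      for _ in range(trials)) / trials
            if val > best_val:
                best, best_val = v, val
        chosen.append(best)
    return chosen

# on a star, the center is the obvious first seed
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
picked = greedy_seeds(star, k=1, p=0.5)
```

The cost of re-simulating the cascade for every candidate seed is what motivates the cheap cut-based local heuristics in the paper.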
Web and Internet Economics, 2019
We consider a setting where a verifier with limited computation power delegates a resource-intensive computation task, which requires a T × S computation tableau, to two provers, where the provers are rational in that each prover maximizes their own payoff, taking into account losses incurred by the cost of computation. We design a mechanism called the Minimal Refereed Mechanism (MRM) such that if the verifier has O(log S + log T) time and O(log S + log T) space computation power, then both provers will provide an honest result without the verifier putting any effort into verifying the results. The amount of computation required of the provers (and thus the cost) is a multiplicative log S-factor more than the computation itself, making this scheme efficient especially for low-space computations.

Peer-prediction is a mechanism which elicits privately-held, non-verifiable information from self-interested agents---formally, truth-telling is a strict Bayes Nash equilibrium of the mechanism. The original peer-prediction mechanism suffers from two main limitations: (1) the mechanism must know the "common prior" of agents' signals; (2) additional undesirable and non-truthful equilibria exist which often have a greater expected payoff than the truth-telling equilibrium. A series of results has successfully weakened the known common prior assumption. However, the equilibrium multiplicity issue remains a challenge. In this paper, we address the above two problems. In the setting where a common prior exists but is not known to the mechanism, we show (1) a general negative result applying to a large class of mechanisms showing truth-telling can never pay strictly more in expectation than a particular set of equilibria where agents collude to "relabel" the signals a...

In this work we look at opinion formation and the effects of two phenomena, both of which promote consensus between agents connected by ties: influence, agents changing their opinions to match their neighbors'; and selection, agents re-wiring to connect to new agents when an existing neighbor has a different opinion. In our agent-based model, we assume that only weak ties can be rewired and strong ties do not change. The network structure as well as the opinion landscape thus co-evolve with two important parameters: the probability of influence versus selection; and the fraction of strong ties versus weak ties. Using empirical and theoretical methodologies, we discovered that on a two-dimensional spatial network: • With no/low selection, the presence of weak ties enables fast consensus. This conforms with the classical theory that weak ties are helpful for quickly mixing and spreading information, and strong ties alone act much more slowly. • With high selection, too many weak ties inhi...

ACM Transactions on Economics and Computation, 2019
In the setting where information cannot be verified, we propose a simple yet powerful information-theoretical framework—the Mutual Information Paradigm—for information elicitation mechanisms. Our framework pays every agent a measure of mutual information between her signal and a peer's signal. We require that the mutual information measurement has the key property that any "data processing" on the two random variables will decrease the mutual information between them. We identify such information measures that generalize Shannon mutual information. Our Mutual Information Paradigm overcomes the two main challenges in information elicitation without verification: (1) how to incentivize high-quality reports and avoid agents colluding to report random or identical responses; (2) how to motivate agents who believe they are in the minority to report truthfully. Aided by the information measures, (1) we use the paradigm to design a family of novel mechanisms where truth-telling is...
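A plug-in estimate of Shannon mutual information, the simplest instance of the measures the paradigm allows, can be sketched as follows; the estimator is a standard empirical one, not taken from the paper.

```python
import math
from collections import Counter

def empirical_mutual_information(xs, ys):
    """Plug-in estimate of Shannon mutual information I(X;Y), in bits,
    from paired samples. Data processing on either stream can only
    decrease this quantity, which is the property the paradigm needs."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) )
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

# a payment could be the estimated MI between two agents' report streams
identical = empirical_mutual_information([0, 1, 0, 1], [0, 1, 0, 1])    # 1 bit
independent = empirical_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])  # 0 bits
```

Paying agents this quantity rewards reports that genuinely co-vary with a peer's signal, while random or constant reports carry zero mutual information and earn nothing.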

Lecture Notes in Computer Science, 2016
Peer-prediction [18] is a (meta-)mechanism which, given any proper scoring rule, produces a mechanism to elicit privately-held, non-verifiable information from self-interested agents. Formally, truth-telling is a strict Nash equilibrium of the mechanism. Unfortunately, there may be other equilibria as well (including uninformative equilibria where all players simply report the same fixed signal, regardless of their true signal) and, typically, the truth-telling equilibrium does not have the highest expected payoff. The main result of this paper is to show that, in the symmetric binary setting, by tweaking peer-prediction, in part by carefully selecting the proper scoring rule it is based on, we can make the truth-telling equilibrium focal; that is, truth-telling has higher expected payoff than any other equilibrium. Along the way, we prove the following: in the setting where agents receive binary signals, we 1) classify all equilibria of the peer-prediction mechanism; 2) introduce a new technical tool for understanding scoring rules, which allows us to make truth-telling pay better than any other informative equilibrium; 3) leverage this tool to provide an optimal version of the previous result; that is, we optimize the gap between the expected payoff of truth-telling and other informative equilibria; and 4) show that with a slight modification to the peer-prediction framework, we can, in general, make the truth-telling equilibrium focal; that is, truth-telling pays more than any other equilibrium (including the uninformative equilibria).
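For readers unfamiliar with the building blocks, here is a sketch of a proper scoring rule and the classic peer-prediction payment it induces. The quadratic (Brier-style) rule and the posterior table are illustrative choices, not the carefully selected rule the paper constructs.

```python
def quadratic_score(p, outcome):
    """Brier-style proper score for predicting P(outcome = 1) = p:
    the expected score is uniquely maximized by reporting the true
    belief, which is what makes the rule 'proper'."""
    q = p if outcome == 1 else 1 - p
    return 2 * q - (p ** 2 + (1 - p) ** 2)

def peer_prediction_payment(my_report, peer_report, posterior):
    """Classic peer-prediction: score the posterior belief implied by
    my reported signal against my peer's reported signal.

    posterior[s]: assumed common-prior probability that the peer holds
    signal 1 given that I hold signal s (a hypothetical prior)."""
    return quadratic_score(posterior[my_report], peer_report)

posterior = {0: 0.3, 1: 0.8}  # hypothetical common prior
pay = peer_prediction_payment(1, 1, posterior)
```

Properness makes truth-telling an equilibrium, but, as the abstract notes, other equilibria (e.g., everyone reporting 1) can pay more; selecting the scoring rule to close that gap is the paper's contribution.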

Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, 2007
We study linear programming relaxations of Vertex Cover and Max Cut arising from repeated applications of the "lift-and-project" method of Lovasz and Schrijver, starting from the standard linear programming relaxation. For Vertex Cover, Arora, Bollobas, Lovasz and Tourlakis prove that the integrality gap remains at least 2 − ε after Ω_ε(log n) rounds, where n is the number of vertices, and Tourlakis proves that the integrality gap remains at least 1.5 − ε after Ω((log n)^2) rounds. Fernandez de la Vega and Kenyon prove that the integrality gap of Max Cut is at most 1/2 + ε after any constant number of rounds. (Their result also applies to the more powerful Sherali-Adams method.) We prove that the integrality gap of Vertex Cover remains at least 2 − ε after Ω_ε(n) rounds, and that the integrality gap of Max Cut remains at most 1/2 + ε after Ω_ε(n) rounds.
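The 2 − ε gap for the starting LP can be seen concretely on complete graphs, where assigning every vertex the fractional value 1/2 is feasible while any integral cover must take nearly all vertices; a small brute-force sketch (only for tiny graphs):

```python
from itertools import combinations

def min_vertex_cover_size(n_vertices, edges):
    """Brute-force integral optimum (fine only for tiny graphs)."""
    for k in range(n_vertices + 1):
        for cover in combinations(range(n_vertices), k):
            s = set(cover)
            if all(u in s or v in s for u, v in edges):
                return k

n = 6
edges = [(u, v) for u in range(n) for v in range(u + 1, n)]  # complete graph K_6
integral_opt = min_vertex_cover_size(n, edges)  # a cover can omit at most one vertex
lp_value = n / 2        # x_v = 1/2 for all v satisfies every edge constraint
gap = integral_opt / lp_value
```

Here the gap is (n − 1)/(n/2), which tends to 2 as n grows; the paper's result is that even Ω_ε(n) rounds of lift-and-project cannot certify anything better than this on suitable instances.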
Twenty-Second Annual IEEE Conference on Computational Complexity (CCC'07), 2007
We study semidefinite programming relaxations of Vertex Cover arising from repeated applications of the LS+ "lift-and-project" method of Lovasz and Schrijver, starting from the standard linear programming relaxation. Goemans and Kleinberg prove that after one round of LS+ the integrality gap remains arbitrarily close to 2. Charikar proves an integrality gap of 2 for a stronger relaxation that is, however, incomparable with two rounds of LS+ and is strictly weaker than the relaxation resulting from a constant number of rounds. We prove that the integrality gap remains at least 7/6 − ε after c_ε n rounds, where n is the number of vertices and c_ε > 0 is a constant that depends only on ε.
Lecture Notes in Computer Science
We present Chora, a P2P web search engine which complements, not replaces, traditional web search by using peers' web viewing history to recommend useful web sites to queriers. Chora is designed around a two-step paradigm. First, Chora determines which peers to query and then it executes a query across these peers. Each peer uses a desktop search engine to query their local web history and retrieve results ordered by relevance. To determine which peers to query, a small sketch of the information available from each peer is stored in a DHT. Peers with sketches indicating that they may have relevant information are queried. The query is dispersed through an ad hoc network connecting only those machines in the query and is optimized for getting good results as quickly as possible.
We prove a dichotomy theorem about the interaction of the two parameters: 1) the ``majority-like'' update function, and 2) the level of intercommunity connectivity. For each setting of parameters, we show that either: the system quickly converges to consensus with high probability in time $\Theta(n \log(n))$; or, the system can get ``stuck'' and take time $2^{\Theta(n)}$ to reach consensus.
We note that $O(n \log(n))$ is optimal because it takes this long for each node to even update its opinion.
Technically, we achieve this fast convergence result by exploiting the connection between a family of reinforced random walks and the dynamical systems literature. Our main result shows that if the system is a reinforced random walk with a gradient-like function, it converges to an arbitrary neighborhood of a local attracting point in $O(n\log n)$ time with high probability. This result adds to the recent literature on saddle-point analysis and shows that a large family of stochastic gradient descent algorithms converges to a local minimum in $O(n\log n)$ steps when the step size is $O(1/n)$.
Our opinion dynamics model captures a broad range of systems, sometimes called interacting particle systems, exemplified by the voter model, iterative majority, and iterative $k$-majority processes---which have found use in many disciplines including distributed systems, statistical physics, social networks, and Markov chain theory.
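A toy simulation of one of the processes this model captures, iterative $k$-majority on a complete graph, can be sketched as follows; the population size, sample size $k = 3$, and initial split are illustrative choices, not parameters from the paper.

```python
import random

def iterative_k_majority_step(opinions, graph, k, rng):
    """Synchronous round: each node samples k neighbors with replacement
    and adopts the majority opinion of the sample (odd k avoids ties)."""
    new = []
    for v in range(len(opinions)):
        ones = sum(opinions[rng.choice(graph[v])] for _ in range(k))
        new.append(1 if 2 * ones > k else 0)
    return new

rng = random.Random(0)
n = 50
graph = [[u for u in range(n) if u != v] for v in range(n)]  # complete graph
opinions = [1] * 35 + [0] * 15   # 70/30 initial split
for _ in range(20):
    opinions = iterative_k_majority_step(opinions, graph, 3, rng)
```

On a well-connected graph the majority opinion amplifies round after round, matching the fast-consensus side of the dichotomy; the slow, exponential-time side arises when intercommunity connectivity is low enough that two communities each lock into different opinions.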
In this paper we study the general threshold model of cascades, which is parameterized by a distribution over the natural numbers: the collective influence from infected neighbors, once beyond the threshold of an individual u, will trigger the infection of u. By varying the choice of the distribution, the general threshold model can model cascades with and without the submodular property. In fact, the general threshold model captures many previously studied cascade models as special cases, including the independent cascade model, the linear threshold model, and k-complex contagions.
We provide both analytical and experimental results for how cascades from a general threshold model spread in a general growing network model, which contains preferential attachment models as special cases. We show that if we choose the initial seeds as the early arriving nodes, the contagion can spread to a good fraction of the network and this fraction crucially depends on the fixed points of a function derived only from the specified distribution. We also show, using a coauthorship network derived from DBLP databases and the Stanford web network, that our theoretical results can be used to predict the infection rate up to a decent degree of accuracy, while the configuration model does the job poorly.
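A rough simulation sketch of a general threshold cascade seeded at the early-arriving nodes of a toy preferential attachment graph; the growth process and threshold distribution here are simplified stand-ins for the general models the abstract analyzes.

```python
import random

def preferential_attachment(n, m, rng):
    """Toy growth model: each arriving node links to m endpoints sampled
    proportionally to degree (a crude preferential attachment)."""
    graph = {v: set() for v in range(n)}
    graph[0].add(1)
    graph[1].add(0)
    pool = [0, 1]                     # endpoints repeated by degree
    for v in range(2, n):
        for _ in range(m):
            u = rng.choice(pool)
            if u != v:
                graph[v].add(u)
                graph[u].add(v)
                pool += [u, v]
    return {v: sorted(ns) for v, ns in graph.items()}

def general_threshold_cascade(graph, seeds, threshold_sampler, rng):
    """Node u becomes infected once its number of infected neighbors
    reaches a threshold drawn i.i.d. from threshold_sampler."""
    thresholds = {v: threshold_sampler(rng) for v in graph}
    infected = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in graph:
            if v not in infected and \
                    sum(u in infected for u in graph[v]) >= thresholds[v]:
                infected.add(v)
                changed = True
    return infected

rng = random.Random(0)
g = preferential_attachment(200, 2, rng)
# seed the early-arriving (high-degree) nodes, as in the analysis above
infected = general_threshold_cascade(g, {0, 1, 2},
                                     lambda r: r.choice([1, 2]), rng)
```

Seeding the early arrivals exploits their high degree: threshold-1 nodes attached to them ignite immediately, and the cascade then reaches a large fraction of the network, as the theoretical results predict.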
Previous work along this dimension typically a) assumes that it is difficult/costly for an adversary to create edges to honest nodes in the network; and b) limits the amount of damage done per such edge, using conductance-based methods. However, these methods fail to detect a simple class of sybil attacks which have been identified in online systems. Indeed, conductance-based methods seem inherently unable to do so, as they are based on the assumption that creating many edges to honest nodes is difficult, which seems to fail in real-world settings.
We create a sybil defense system that accounts for the adversary's ability to launch such attacks yet provably withstands them by:
1. Not assuming any restriction on the number of edges an adversary can form, but instead making the much weaker assumption that creating edges from sybils to most honest nodes is difficult, while allowing the remaining nodes to be freely connected to.
2. Relaxing the goal from classifying all nodes as honest or sybil to the goal of classifying the "core" nodes of the network as honest, and classifying no sybil nodes as honest.
3. Exploiting a social network property that is new to sybil detection, namely, that nodes can be embedded in low-dimensional spaces.