
The Cambridge Handbook of Consumer Privacy

Businesses are rushing to collect personal data to fuel surging demand. Data enthusiasts claim that
personal information obtained from the commercial Internet, including mobile platforms, social
networks, cloud computing, and connected devices, will unlock pathbreaking innovation, including
advanced data security. By contrast, regulators and activists contend that corporate data practices too
often disempower consumers by creating privacy harms and related problems.
As the Internet of Things matures and facial recognition, predictive analytics, big data, and
wearable tracking grow in power, scale, and scope, a controversial ecosystem exacerbates the
acrimony over commercial data capture and analysis. The only productive way forward is to get a
grip on key problems right now and change the conversation, which is exactly what Jules Polonetsky,
Omer Tene, and Evan Selinger do. They bring together diverse views from leading academics,
business leaders, and policymakers to discuss the opportunities and challenges of the new data
economy.

Evan Selinger is Professor of Philosophy and Head of Research Communications, Community, and Ethics at the Center for Media, Arts, Games, Interaction, and Creativity at the Rochester Institute
of Technology. Selinger is also Senior Fellow at the Future of Privacy Forum. His most recent book,
co-written with Brett Frischmann, is Re-Engineering Humanity (Cambridge University Press, 2018).
Selinger’s primary research is on the ethical and privacy dimensions of emerging technology. A strong
advocate of public philosophy, he writes regularly for magazines, newspapers, and blogs, including
The Guardian, The Atlantic, Slate, and Wired.
Jules Polonetsky is the CEO of the Future of Privacy Forum (FPF), a non-profit organization that
serves as a catalyst for privacy leadership and scholarship, advancing principled data practices in
support of emerging technologies. FPF is supported by the chief privacy officers of more than
130 leading companies and several foundations, as well as by an advisory board comprised of the
country’s leading academics and advocates. Polonetsky’s previous roles have included serving as Chief
Privacy Officer at AOL and before that at DoubleClick, as Consumer Affairs Commissioner for New
York City, as an elected New York State legislator, and as an attorney.
Omer Tene is Vice President of Research and Education at the International Association of Privacy
Professionals. He is a consultant to governments, regulatory agencies, and businesses on privacy,
cybersecurity, and data management. He is an affiliate scholar at the Stanford Center for Internet and
Society and Senior Fellow at the Future of Privacy Forum. He comes from Israel, where he was a
professor at the College of Management School of Law.
The Cambridge Handbook of
Consumer Privacy

Edited by
EVAN SELINGER
Rochester Institute of Technology

JULES POLONETSKY
Future of Privacy Forum

OMER TENE
International Association of Privacy Professionals
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.


It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.

[Link]
Information on this title: [Link]/9781107181106
doi: 10.1017/9781316831960
© Cambridge University Press 2018
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2018
Printed in the United States of America by Sheridan Books, Inc.
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Selinger, Evan, 1974- editor. | Polonetsky, Jules, 1965- editor. | Tene, Omer, editor.
Title: The Cambridge handbook of consumer privacy / edited by Evan Selinger, Jules Polonetsky, Omer Tene.
Description: Cambridge, United Kingdom ; New York, NY : Cambridge University Press, 2018. | Includes bibliographical references and index.
Identifiers: LCCN 2017054702 | ISBN 9781107181106 (hardback : alk. paper)
Subjects: LCSH: Consumer protection. | Consumer protection–Law and legislation. | Consumer profiling. | Privacy, Right of.
Classification: LCC HC79.C63 C36 2018 | DDC 381.3/4–dc23
LC record available at [Link]

ISBN 978-1-107-18110-6 Hardback


Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
Contents

List of Contributors page ix

Part I Introduction 1

1 Consumer Privacy and the Future of Society 3
Jules Polonetsky, Omer Tene, and Evan Selinger

Part II The Pervasiveness and Value of Tracking Technologies 23

2 Data Brokers: Should They Be Reviled or Revered? 25
Jennifer Barrett Glasgow

3 In Defense of Big Data Analytics 47
Mark MacCarthy

4 Education Technology and Student Privacy 70
Elana Zeide

5 Mobile Privacy Expectations: How Privacy Is Respected in Mobile Devices 85
Kirsten Martin and Katie Shilton

6 Face Recognition, Real-Time Identification, and Beyond 102
Yana Welinder and Aeryn Palmer

7 Smart Cities: Privacy, Transparency, and Community 125
Kelsey Finch and Omer Tene

Part III Ethical and Legal Reservations about Tracking Technologies 149

8 Americans and Marketplace Privacy: Seven Annenberg National Surveys in Perspective 151
Joseph Turow

9 The Federal Trade Commission's Inner Privacy Struggle 168
Chris Jay Hoofnagle

10 Privacy and Human Behavior in the Information Age 184
Alessandro Acquisti, Laura Brandimarte, and George Loewenstein

11 Privacy, Vulnerability, and Affordance 198
Ryan Calo

12 Ethical Considerations When Companies Study and Fail to Study Their Customers 207
Michelle N. Meyer

13 Algorithmic Discrimination vs. Privacy Law 232
Alvaro M. Bedoya

14 Children, Privacy, and the New Online Realities 241
Stephen Balkam

15 Stakeholders and High Stakes: Divergent Standards for Do Not Track 251
Aleecia M. McDonald

16 Applying Ethics When Using Data beyond Individuals' Understanding 270
Martin E. Abrams and Lynn A. Goldstein

Part IV International Perspectives 283

17 Profiling and the Essence of the Right to Data Protection 285
Bilyana Petkova and Franziska Boehm

18 Privacy, Freedom of Expression, and the Right to Be Forgotten in Europe 301
Stefan Kulk and Frederik Zuiderveen Borgesius

19 Understanding the Balancing Act behind the Legitimate Interest of the Controller Ground: A Pragmatic Approach 321
Irene Kamara and Paul De Hert

Part V New Approaches to Improve the Status Quo 353

20 The Intersection of Privacy and Consumer Protection 355
Julie Brill

21 A Design Space for Effective Privacy Notices 365
Florian Schaub, Rebecca Balebako, Adam L. Durity, and Lorrie Faith Cranor

22 Enter the Professionals: Organizational Privacy in a Digital Age 394
J. Trevor Hughes and Cobun Keegan

23 Privacy Statements: Purposes, Requirements, and Best Practices 413
Mike Hintze

24 Privacy versus Research in Big Data 433
Jane R. Bambauer

25 A Marketplace for Privacy: Incentives for Privacy Engineering and Innovation 446
Courtney Bowman and John Grant

26 The Missing Role of Economics in FTC Privacy Policy 465
James C. Cooper and Joshua Wright

27 Big Data by Design: Establishing Privacy Governance by Analytics 489
Dale Skivington, Lisa Zolidis, and Brian P. O'Connor

28 The Future of Self-Regulation Is Co-Regulation 503
Ira S. Rubinstein

29 Privacy Notices: Limitations, Challenges, and Opportunities 524
Mary J. Culnan and Paula J. Bruening

30 It Takes Data to Protect Data 546
David A. Hoffman and Patricia A. Rimo

31 Are Benefit-Cost Analysis and Privacy Protection Efforts Incompatible? 561
Adam Thierer

32 Privacy after the Agile Turn 579
Seda Gürses and Joris van Hoboken
Contributors

Martin E. Abrams is Executive Director of the Information Accountability Foundation.


Alessandro Acquisti is Professor of Information Technology and Public Policy at Heinz
College, Carnegie Mellon University.
Rebecca Balebako is an information scientist at the RAND Corporation.
Stephen Balkam is Founder and CEO of the Family Online Safety Institute.
Jane R. Bambauer is Professor of Law at the University of Arizona James E. Rogers College
of Law.
Alvaro Bedoya is Executive Director of the Center on Privacy and Technology at Georgetown
University Law Center.
Franziska Boehm is a law professor at FIZ Karlsruhe, the Leibniz Institute of Information
Infrastructure and the Karlsruhe Institute of Technology.
Frederik Zuiderveen Borgesius is a researcher at the Law, Science, Technology & Society
group at the Vrije Universiteit Brussels, and a fellow of the Institute for Information Law at the
University of Amsterdam.
Courtney Bowman is a civil liberties engineer at Palantir Technologies.
Laura Brandimarte is assistant professor of Management Information Systems at the University
of Arizona.
Julie Brill is Corporate Vice President and Deputy General Counsel for Privacy and Regulatory
Affairs at Microsoft.
Paula J. Bruening is Founder and Principal of Casentino Strategies LLC. She is the former
Director of Global Privacy Policy at Intel Corporation.
Ryan Calo is Lane Powell and D. Wayne Gittinger Associate Professor at the University of
Washington School of Law.
James C. Cooper is associate professor of Law and Director of the Program on Economics and
Privacy at the Antonin Scalia Law School at George Mason University.

Lorrie Faith Cranor is FORE Systems Professor of Computer Science and of Engineering &
Public Policy at Carnegie Mellon University.
Mary J. Culnan is Professor Emeritus of Information and Process Management at Bentley
University.
Adam L. Durity is a privacy engineer at Google.
Kelsey Finch is Policy Counsel at the Future of Privacy Forum.
Jennifer Barrett Glasgow is Chief Privacy Officer Emeritus at Acxiom Corporation.
Lynn A. Goldstein is a senior strategist at the Information Accountability Foundation.
John Grant is a civil liberties engineer at Palantir Technologies.
Seda Gürses is a Flanders Research Foundation (FWO) Postdoctoral Fellow at the Computer
Security and Industrial Cryptography Group (COSIC) at the University of Leuven and an
affiliate at the Center for Information Technology and Policy (CITP) at Princeton University.
Paul De Hert is a professor at the Vrije Universiteit Brussel, Associated Professor at the Tilburg Institute for Law, Technology, and Society at Tilburg University, and Co-Director of the Research Group on Law, Science, Technology, and Society (LSTS) at the Vrije Universiteit Brussel.
Mike Hintze is a partner with Hintze Law and a part-time lecturer at the University of
Washington School of Law. He is the former Chief Privacy Counsel at Microsoft, where he
worked on privacy compliance, policy, and strategy from 1998 to 2016.
Joris van Hoboken is Chair of Fundamental Rights and the Digital Transformation at the Vrije
Universiteit Brussels (VUB) and a Senior Researcher at the Institute for Information Law (IViR)
at the University of Amsterdam.
David A. Hoffman is Associate General Counsel and Global Privacy Officer at Intel Corporation.
Chris Jay Hoofnagle is Adjunct Professor of Information and Law at the University of California, Berkeley.
J. Trevor Hughes is President and CEO of the International Association of Privacy Professionals.
Irene Kamara is an attorney at law and a doctoral researcher at the Tilburg Institute for Law,
Technology, and Society at Tilburg University and the Research Group on Law, Science,
Technology and Society at the Vrije Universiteit Brussel.
Cobun Keegan is a Westin fellow at the International Association of Privacy Professionals.
Stefan Kulk is a doctoral researcher at Utrecht University School of Law.
George Loewenstein is Herbert A. Simon University Professor of Economics and Psychology at
Carnegie Mellon University.
Mark MacCarthy is Senior Vice President for Public Policy at the Software and Information
Industry Association and Adjunct Professor in the Communication, Culture, and Technology
Program at Georgetown University.
Kirsten Martin is associate professor of Strategic Management and Public Policy at the George
Washington University School of Business.

Aleecia M. McDonald is a privacy researcher and nonresident fellow at the Center for Internet
and Society at Stanford University.
Michelle N. Meyer is assistant professor and Associate Director of Research Ethics at the
Center for Translational Bioethics and Health Care Policy at Geisinger Health System.
Brian P. O’Connor is Senior Privacy Manager at Dell.
Aeryn Palmer is Senior Legal Counsel at the Wikimedia Foundation.
Bilyana Petkova is a postdoctoral researcher at the Information Law Institute at New York
University.
Jules Polonetsky is CEO of the Future of Privacy Forum.
Patricia A. Rimo is Vice President of Public Affairs at RH Strategic Communications.
Ira Rubinstein is Adjunct Professor of Law and Senior Fellow at the Information Law Institute
at the New York University School of Law.
Florian Schaub is assistant professor of Information and Electrical Engineering and Computer
Science at the University of Michigan.
Evan Selinger is Professor of Philosophy at the Rochester Institute of Technology, where he also
is Head of Research Communications, Community, and Ethics at the Media, Arts, Games,
Interaction, Creativity Center (MAGIC). Selinger is also a Senior Fellow at the Future of
Privacy Forum.
Katie Shilton is associate professor at the College of Information Studies at the University of
Maryland.
Dale Skivington is Vice President for Global Compliance and Chief Privacy Officer at Dell.
Omer Tene is Senior Fellow at the Future of Privacy Forum. Tene is also Vice President of
Research and Education at the International Association of Privacy Professionals and associate
professor at the College of Management School of Law, Rishon Lezion, Israel.
Adam Thierer is a senior research fellow in the Technology Policy Program at the Mercatus
Center at George Mason University.
Joseph Turow is Robert Lewis Shayon Professor of Communication at the Annenberg School
for Communication at the University of Pennsylvania.
Yana Welinder is a nonresidential fellow at the Stanford Center for Internet and Society and
affiliate at the Berkman Klein Center for Internet and Society at Harvard University.
Joshua D. Wright is University Professor and Executive Director of the Global Antitrust
Institute at the Antonin Scalia Law School at George Mason University.
Elana Zeide is a visiting assistant professor at Seton Hall University School of Law.
Lisa Zolidis is Privacy Counsel for the Americas at Dell.
Part I

Introduction

1

Consumer Privacy and the Future of Society

Jules Polonetsky, Omer Tene, and Evan Selinger

In the course of a single day, hundreds of companies collect massive amounts of information
from individuals. Sometimes they obtain meaningful consent. Often, they use less-than-transparent means. By surfing the web, using a cell phone and apps, entering a store that provides
Wi-Fi, driving a car, passing cameras on public streets, wearing a fitness device, watching a show
on a smart TV or ordering a product from a connected home device, people share a steady
stream of information with layers upon layers of hardware devices, software applications, and
service providers. Almost every human activity, whether it is attending school or a workplace,
seeking healthcare or shopping in a mall, driving on a highway or watching TV in the living
room, leaves behind data trails that build up incrementally to create a virtual record of our daily
lives. How companies, governments, and experts should use this data is among the most pressing
global public policy concerns.
Privacy issues, which are at the heart of many of the debates over data collection, analysis, and
distribution, range extensively in both theory and practice. In some cases, conversations about
privacy policy focus on marketing issues and the minutiae of a website’s privacy notices or an
app’s settings. In other cases, the battle cry for privacy extends to diverse endeavors, such as the
following: calls to impose accountability on the NSA’s counterterrorism mission;1 proposals for
designing safe smart toys;2 plans for enabling individuals to scrub or modify digital records of
their pasts;3 pleas to require database holders to inject noise into researchers’ queries to protect
against leaks that disclose an individuals’ identity;4 plans to use crypto currencies5 or to prevent
criminals and terrorists from abusing encryption tools;6 proposals for advancing medical research and improving public health without sacrificing patients' control over their data;7 and ideas for how scientists can make their data more publicly available to facilitate replication of studies without, at the same time, inadvertently subjecting entire populations to prejudicial treatment, including discrimination.8

1 Richard Clarke, Michael Morell, Geoffrey Stone, Cass Sunstein & Peter Swire, The NSA Report: Liberty and Security in a Changing World (The President's Review Group on Intelligence and Communications Technologies, Princeton University Press, 2014).
2 Kids and the Connected Home: Privacy in the Age of Connected Dolls, Talking Dinosaurs, and Battling Robots (Future of Privacy Forum and Family Online Safety Institute, Dec. 2016), [Link] [Link].
3 Case C-131/12 Google Spain v. Agencia Española de Protección de Datos (AEPD) and Mario Costeja González, ECLI:EU:C:2014:317.
4 Cynthia Dwork, Frank McSherry, Kobbi Nissim & Adam Smith, Calibrating Noise to Sensitivity in Private Data Analysis, in Proceedings of the 3rd Theory of Cryptography Conference, 265–284 (2006).
5 Arvind Narayanan, Joseph Bonneau, Edward Felten, Andrew Miller & Steven Goldfeder, Bitcoin and Cryptocurrency Technologies (Princeton University Press, 2016).
6 In re Order Requiring Apple, Inc. to Assist in the Execution of a Search Warrant Issued by This Court, No. 15-mc-1902 (JO) (E.D.N.Y. Feb. 29, 2016).
At a time when fake news influences political elections, new and contentious forms of
machine-to-machine communications are emerging, algorithmic decision making is calling
more of the shots in civic, corporate, and private affairs, and ruinous data breaches and
ransomware attacks endanger everything from financial stability to patient care in hospitals,
“privacy” has become a potent shorthand. Privacy is a boundary, a limiting principle, and a
litmus test for identifying and adjudicating the delicate balance between the tremendous
benefits and dizzying assortment of risks that insight-filled data offers.

Diverse Privacy Perspectives


The wide scope of perspectives found in this collection reflects the very diversity of privacy
discourse.
Since privacy is front page news, politicians regularly weigh in on it. Some politicians make
privacy their signature issue by submitting legislative proposals, convening committee hearings,
and sending letters to technology companies as they launch and test new tools. Interestingly, in
the United States, privacy can be a bipartisan issue that brings together coalitions from opposite
sides of the aisle. For example, on questions of national security surveillance, right wing
libertarians side with left wing civil rights activists in opposing government powers and advocat
ing for robust oversight mechanisms. However, in the consumer privacy space, traditional roles
are often on display as supporters of regulation spar with free-market activists on issues ranging from telecom regulation to the legitimacy of the data broker industry. In Europe, left-wing
parties, such as the German Greens or the Scandinavian Pirate Party, have played important
roles in privacy advocacy by embracing an expansive reading of data protection principles.
Conservatives, by contrast, have sought to balance data protection against economic interests
and free trade. This political tension manifests itself in the twin, often conflicting objectives of
the European data protection regime, which instructs Member States to "protect the fundamental rights and freedoms of natural persons, and in particular their right to privacy with respect to the processing of personal data," while, at the same time, "neither restrict[ing] nor prohibit[ing] the free flow of personal data between Member States."
Industry interest in privacy often aligns with businesses uniformly vying for more data use and
less regulation. Even so, opinions still splinter across a broad spectrum. Some publishers believe
that stronger limits on ad tracking will give them an advantage in collecting ad revenue that is earned today by advertising technology companies or large platforms. Other companies believe that new data
portability rules will enable them to leverage data now held by platforms to better compete or to
launch new services. Nevertheless, incumbents in many sectors worry that new regulations and
more extensive liability will impede their digital strategies.

7 Salil Vadhan, David Abrams, Micah Altman, Cynthia Dwork, Paul Kominers, Scott Duke Kominers, Harry Lewis, Tal Moran & Guy Rothblum, Comments on Advance Notice of Proposed Rulemaking: Human Subjects Research Protections: Enhancing Protections for Research Subjects and Reducing Burden, Delay, and Ambiguity for Investigators, Docket ID No. HHS-OPHS-2011–0005 (2011), [Link] advance-notice-proposed-rulemaking-human-subjects-research.
8 Daniel Goroff, Jules Polonetsky & Omer Tene, Privacy Protective Research: Facilitating Ethically Responsible Access to Administrative Data, 65 Ann. Am. Acad. Pol. & Soc. Sci. 46–66 (2018).

Regulators chase the flurry of market developments with carrots and sticks. Approaches vary,
with some regulators, such as the UK Information Commissioner’s Office, offering advice, best
practices, and compliance tools. Others, such as the Canadian Federal Privacy Commissioner,
enhance limited enforcement powers by actively engaging with the media to “name and shame”
alleged violations of privacy laws. Some European data protection regulators are known to levy
stiff fines and penalties even for technical violations of local statutes. The compliance risks for
businesses will escalate sharply with the imposition of formidable sanctions under the General
Data Protection Regulation. The Federal Trade Commission (FTC), the main federal privacy
regulator in the United States, has developed a complex privacy and security regulatory
approach that is built on two pillars. On the one hand, it includes a string of settlements referred
to by Daniel Solove and Woodrow Hartzog as a “common law” of privacy.9 On the other hand,
the FTC issues a line of policy guidelines through workshops and reports on cutting-edge issues
ranging from connected vehicles and consumer genetics to the sharing economy.
Privacy academics are a heterogeneous group who occupy a central place in policy debates.
Some are data optimists. They see a bright future in data intensive technologies and seek to
facilitate their adoption while respecting individuals’ rights. Others are data pessimists. They
warn against the disruptive risk of data technologies and in extreme cases even see an inevitable
decline toward a “database of ruin.”10 More traditionally, academics can be loosely categorized
according to their disciplines. Law and policy scholars explore issues such as the Fourth
Amendment, privacy legislation such as the Health Insurance Portability and Accountability Act, the Family Educational Rights and Privacy Act, the Fair Credit Reporting Act, the Children's Online
Privacy Protection Act, and the FTC’s body of privacy law. Computer scientists deal with issues
such as security and privacy in online, mobile operating systems and software, network security,
anonymity, human-machine interaction, and differential privacy. Engineers work on network security, values in design, privacy by design, blockchain, and privacy-enhancing technologies.
Economists assess the value and markets for data, as well as such issues as the value of privacy,
privacy incentives and nudges, data based price discrimination, privacy in credit and health
markets, the behavioral economics of privacy, and more. Design schools innovate privacy
messaging, information schools explore the role of privacy in media and culture, psychologists
experiment on individuals’ responses to incentives in cyber and real world spaces, and ethicists
weigh in on all of this.

Consumer Privacy
This book brings together academics, policy makers, and industry leaders to critically address the
subset of issues that are raised in the context of consumer privacy. It purposefully sets aside the
fateful dilemmas raised by government surveillance. This includes the continuing fallout from
Edward Snowden’s revelations about the prevalence of government access to private communi
cations data. And it extends to newly emerging challenges, such as deploying military drones to
assassinate suspected terrorists, using data driven software for criminal sentencing, and monitor
ing people awaiting trial and serving court mandated sentences in the seclusion of their homes.
Yet, even narrowed to consumer privacy, this book still addresses a rich spectrum of issues
triggered by an exceedingly broad swath of activities. While consumer privacy once was limited

to the realm of online tracking for targeted advertising,11 the topic now extends to wearable technologies and implantable medical devices, smart homes and autonomous vehicles, facial recognition and behavioral biometrics, and algorithmic decision making and the Internet of Things.12 As companies collect massive amounts of data through the Internet, mobile communications, and a vast infrastructure of devices and sensors embedded in healthcare facilities, retail outlets, public transportation, social networks, workplaces, and homes, they use the information to test new products and services, improve existing offerings, and conduct research.

9 Daniel Solove & Woodrow Hartzog, The FTC and the New Common Law of Privacy, 114 Colum. L. Rev. 583 (2014).
10 Paul Ohm, Don't Build a Database of Ruin, Harv. Bus. Rev., Aug. 23, 2012, [Link] database-of-ruin.
Given the wide scale and scope of consumer privacy, the topic can’t be easily distinguished
from government surveillance. With companies amassing huge warehouses of personal information, governments can swoop in when necessary to access the data through procurement,
legal process, or technological capabilities. As Chris Hoofnagle observed more than a decade
ago, “Accumulations of information about individuals tend to enhance authority by making it
easier for authority to reach individuals directly. Thus, growth in society’s record keeping
capability poses the risk that existing power balances will be upset.”13
Since each new space and field of activity raises weighty policy, legal, ethical, economic, and
technological questions and challenges, input on privacy is needed from experts across the
disciplines. Philosophers, social scientists, legal theorists, geneticists, mathematicians, computer
scientists, and engineers all have important roles to play. The pressing debates require a careful
balancing of diverse values, interests, rights, and considerations. In many cases, individual
benefits are pitted against the public good, and this tension tests the contours of autonomy
and fundamental human rights in a constantly shifting techno-social environment.
The impact of technology on the economy and global markets cannot be overstated. Several
of the most highly valued companies are data-driven innovators. That is why companies such as Apple, Google, Microsoft, Amazon, and Facebook, alongside traditional technology powerhouses, such as Intel, IBM, and AT&T, and new upstarts, including Uber and Snap, are the focus of heated consumer discussion and regulatory debate.14 This trend goes beyond the United States and, more broadly, the Western world. Chinese tech giants, such as Baidu, Alibaba, and JD.com, and surging new entrants, notably Didi Chuxing and [Link], are shaking up the Asian economy and gaining a global footprint.15 These companies have profound impacts on our lives. Every day, they confront a host of complex value-laden choices when designing products that
collect, analyze, process, and store information about every aspect of our behavior. Realizing the
magnitude of these decisions, companies have begun to create ethical review processes, employ
data ethicists and philosophers, and seek guidance from academics, think tanks, policymakers,
and regulators.16 The role of the chief privacy officer, once the domain of only a handful of

technology leaders, has emerged as a strategic C-suite position.17 Within a decade, privacy has matured into a full-fledged profession with a body of knowledge, professional certifications, and formal legal status.18

11 Omer Tene & Jules Polonetsky, To Track or "Do Not Track": Advancing Transparency and Individual Control in Online Behavioral Advertising, 13 Minn. J. L. Sci. & Tech. 281 (2012).
12 Woodrow Hartzog & Evan Selinger, The Internet of Heirlooms and Disposable Things, 17 N. C. J. L. & Tech. 581 (2016).
13 Chris Jay Hoofnagle, Big Brother's Little Helpers: How ChoicePoint and Other Commercial Data Brokers Collect, Process, and Package Your Data for Law Enforcement, 29 N. C. J. Int'l L. & Com. Reg. 595 (2004).
14 Farhad Manjoo, Tech's "Frightful 5" Will Dominate Digital Life for Foreseeable Future, N.Y. Times, Jan. 20, 2016, [Link]
15 Brendon Kochkodin, Chinese Big Five Tech Companies Gain on U.S. Counterparts, Bloomberg Businessweek, June 22, 2017, [Link] counterparts.
16 Jules Polonetsky, Omer Tene & Joseph Jerome, Beyond the Common Rule: Ethical Structures for Data Research in Non-Academic Settings, 13 Colo. Tech. L. J. 333 (2015); also see Ryan Calo, Consumer Subject Review Boards: A Thought Experiment, 66 Stan. L. Rev. Online 97, 102 (2013); Evan Selinger & Woodrow Hartzog, Facebook's Emotional Contagion Study and the Ethical Problem of Co-Opted Identity in Mediated Environments Where Users Lack Control, 12 Research Ethics 35 (2016).
Increasingly, not only companies but also government entities are transforming into data
service providers for consumers. Consider smart cities, where local governments have become
hubs of data that is collected through growing networks of sensors and connected technologies
to generate actionable, often real-time information.19 By relying on ubiquitous telecommunications technologies to provide connectivity to sensor networks and set actuation devices into operation, smart cities are increasingly collecting information on cities' air quality, temperature, noise, street and pedestrian traffic, parking capacity, distribution of government services, emergency situations, and crowd sentiments, among other data points. This information can now be
cheaply aggregated, stored, and analyzed to draw conclusions about the intimate affairs of city
dwellers. The more connected a city becomes, the more it will generate steady streams of data
from and about its citizens and the environment they live in.20
The urban data revolution enables cities to better manage traffic congestion, improve energy
efficiency, expand connectivity, reduce crime, and regulate utility flow. By analyzing data trends
and auditing the performance of schools, public transportation, waste management, social
services, and law enforcement, smart cities can better identify and respond to discriminatory
practices and biased decision making, empowering weakened populations and holding institutions to account. At the same time, the specter of constant monitoring threatens to upset the
balance of power between city governments and city residents. At the extreme, it might destroy
the sense of anonymity that has defined urban life over the past century. As Kelsey Finch and
Omer Tene observe, “There is a real risk that, rather than standing as ‘paragons of democracy,’
[smart cities] could turn into electronic panopticons in which everybody is constantly
watched.”21
Smart community policy also highlights the tension between the push for open data mandates
and public records acts and the desire citizens have for privacy. On the one hand, the transparency goals of the open data movement serve important social, economic, and democratic
functions. Open and accessible public data can benefit individuals, companies, communities,
and government by fueling new social, economic, and civic innovations, and improving
government accountability and transparency. On the other hand, because the city collects
and shares information about its citizens, public backlash over intrusive surveillance remains
an ever-present possibility.22 Due to these competing concerns, the consumer privacy discussion
requires aligning potentially conflicting interests: maximizing transparency and accountability
without forsaking individual rights.

17 Andrew Clearwater & J. Trevor Hughes, In the Beginning . . . An Early History of the Privacy Profession, 74 Ohio St. L. J. 897 (2013).
18 J. Trevor Hughes & Cobun Keegan, Enter the Professionals: Organizational Privacy in a Digital Age (see Chapter 22).
19 Kelsey Finch & Omer Tene, Welcome to Metropticon: Protecting Privacy in a Hyperconnected Town, 41 Fordham Urban L. J. 1581 (2015).
20 Kelsey Finch & Omer Tene, The City as a Platform: Enhancing Privacy and Transparency in Smart Communities (see Chapter 7).
21 Finch & Tene, supra note 19, at 1583.
22 Ben Green, Gabe Cunningham, Ariel Ekblaw, Paul Kominers, Andrew Linzer & Susan Crawford, Open Data Privacy: A Risk-Benefit, Process-Oriented Approach to Sharing and Protecting Municipal Data (Berkman Klein Center for Internet & Society Research Publication, 2017), [Link] [Link].

Beyond Privacy
As we have been suggesting, arguments about privacy have become proxy debates for broader
societal choices about fairness, equity, and power. Since data is central to economic activity
across every sector (government, non-profit, and corporate), the privacy debate has spilled over
to adjacent areas. Educational technology is a prime example.
Long confined to using textbooks, blackboards, and pencil and paper testing, schools now use
new applications, hardware, and services. This includes online curricula and tools, social media
and cloud applications for file sharing and storage, note taking, and collaboration platforms, and
a variety of connected tablets and workstations. Student performance data is driving next-generation models of learning and measurements for teacher effectiveness. And connected
learning is fast becoming a path for access to knowledge and academic achievement.
New educational technology offers many advantages for educators, teachers, parents, and
students. Education has become more interactive, adaptive, responsive, and even fun. Parents
can stay apprised of their child’s performance, accomplishments, and difficulties without
weighing down teachers’ limited time resource. Teachers can connect to sophisticated learning
management systems, while school administrations can obtain rich, measurable inputs to better
calibrate resources to needs.23
However, from a privacy perspective, enhanced data collection that contains highly sensitive information about children and teens makes for a combustible mix. New data flows raise questions about who should have access to students' data and what are
the legitimate uses of the information. Should a developer of a math app be authorized to offer
high performing students a version that covers more advanced material, or would that be
considered undesirable marketing to children? Should an educational social network be permitted to feature a third-party app store for kids? Or, if an education service detects a security
vulnerability on a website that is available for schools to use, should it be able to leverage its
knowledge to protect schools as well as clients outside of the educational sector? And what about
education technology developers who want to use the data they extract from students to develop
software for the general market?
It is clear that when it comes to education, privacy means different things to different people
and traditional privacy problems are only the tip of the policy iceberg. Activists have challenged
data collection and use to debate school reform, common core curricula, standardized testing,
personalized learning, teacher assessments, and more. Some critics even consider efforts to ramp
up education technology misguided altogether, labeling them as the work of “corporate educa
tion reformers” who seek profit at the expense of public education. Ultimately, then, the
challenge for educational technology entails differentiating problems that can be remedied with
privacy solutions from problems that require other resolutions because they are, at bottom,
proxies for conflicts about education policy.
Complex conversations also surround smart cars and autonomous vehicles. On the one hand,
collecting data in cars is old hat. Vehicles have had computerized data systems since the 1960s.
On the other hand, things are profoundly changing now that vehicles are becoming data hubs
that collect, process, and broadcast information about drivers’ performance, geolocation, tele
matics, biometrics, and even media consumption. Furthermore, vehicle to vehicle (V2V)

23
Jules Polonetsky & Omer Tene, Who is Reading Whom Now: Privacy in Education from Books to MOOCs, 17 Vand.
J. Ent. & Tech. L. 927 (2015); also see Jules Polonetsky & Omer Tene, The Ethics of Student Privacy: Building Trust
for Ed Tech, 21 Int’l Rev. Info. Ethics 25 (2014).
Consumer Privacy and the Future of Society 9

technology introduces a new way for smart cars to seamlessly receive and analyze information
about other vehicles. This capability is essentially transforming public thoroughfares into a
seamless network of information about each vehicle’s position, direction of travel, speed,
braking, and other variables that telematics studies.24
Smart car data collection raises all kinds of issues. Consumers and advocates are concerned
about cars extracting personal data that can be shared with government and law enforcement.
Security experts are anxious about self-driving cars being vulnerable to hacking. At the same
time, under the banner of privacy concerns, critics also discuss ethics, labor markets, insurance
premiums, and tradeoffs between safety and autonomy. For example, while smart cars and
autonomous vehicles can reduce traffic accidents, they will also need to make decisions with
moral implications, such as choosing to prioritize the safety of passengers or pedestrians. Coding
algorithms to make momentous moral choices is a formidable challenge that transcends the
guidance traditional privacy frameworks offer.
Insurance companies are vigorously embracing the growth in vehicle-generated data by developing usage-based applications to harness information emanating from onboard diagnostic systems. These applications provide insurers with information on how a vehicle is driven, and they factor in this information when making decisions about safe-driver programs and personalized insurance rates. While the Fair Credit Reporting Act applies to the process of using data to
make insurance decisions, its standards cannot address all of the questions that are starting to
arise. Concern is being expressed over allocations of risk and the process of creating categories of
drivers who are uninsurable due to traits and tendencies that potentially can be correlated with
health, genetics, race, and ethnicity. Also, within a generation, autonomous vehicles will
fundamentally upend labor markets. Ostensibly consumers will benefit from increased fleet
efficiency and huge savings in labor costs. At the same time, the economic changes seem poised
to dramatically affect employment prospects, especially for the millions of taxi and truck drivers
in the United States and beyond.25 These policy issues clearly extend digital and cyber privacy
debates into new realms and possibly transform them as well.

The Future of Society


The upshot of the dynamics and processes highlighted here is that the chapters in this book are
about much more than consumer privacy, which is to say, they go far beyond consumer privacy
construed as a niche topic. Contributors fundamentally advance conversations about what paths
should be paved in order to create flourishing societies in the future. With every aspect of
human behavior being observed, logged, analyzed, categorized, and stored, technology is forcing
legislatures, regulators, and courts to deal with an incessant flow of weighty policy choices.
These debates have long spilled over from the contours of privacy, narrowly defined as a right to anonymity, seclusion, and intimacy (a right to be let alone26), to a discussion about power and democracy, social organization, and the role humans should occupy in technologically mediated spaces. These tough discussions are about matters such as exposure, profiling and discrimination, self-expression, individual autonomy, and the relative roles of humans and machines.

24 Lauren Smith & John Verdi, Comments from the Future of Privacy Forum to the Federal Trade Commission and U.S. Department of Transportation (National Highway Traffic Safety Administration, May 1, 2017), [Link] content/uploads/2017/05/[Link].
25 See, e.g., The Future of Jobs: Employment, Skills and Workforce Strategy for the Fourth Industrial Revolution (World Economic Forum, Jan. 2016), [Link] Future of [Link].
26 Samuel Warren & Louis Brandeis, The Right to Privacy, 4 Harv. L. Rev. 193 (1890).

Consider what happened when a teacher was fired after a picture was posted on Facebook of
her dressed as a drunk pirate. It was hard to know if the ensuing public debate was about privacy
settings on the social network or the limits of assessing behavior in a world where every action is
documented, tagged, and presented to the public to judge.27 Similarly, it is hard to pinpoint
what parents and teachers are concerned about when they recoil against ephemeral cyberbullying messages on apps such as Snapchat. Is it dismay about the software's privacy settings? Or
might it be sadness over the cruel experiences of childhood being exposed and augmented
through a new medium?28 And what about autonomous vehicle engineers who design a real-life response to the longstanding trolley problem? Are they dealing with fair information practice
principles or ethical challenges that have occupied philosophers from Aristotle to Immanuel
Kant and John Stuart Mill?29
Advances in artificial intelligence and machine learning keep raising the stakes. Developers
deploy artificial intelligence to improve organizations’ performance and derive predictions in
almost every area of the economy. This happens in domains ranging from social networks, autonomous vehicles, drones, and precision medicine to the criminal justice system. And it
includes such processes as speech and image recognition, universal translators, and ad targeting,
to name a few. Organizations leverage algorithms to make data-based determinations that impact
individuals’ rights as citizens, employees, seekers of credit or insurance, and so much more. For
example, employers use algorithms to assess prospective employees by offering neuroscience-based games that are said to measure inherent traits. Even judges turn to algorithms for
sentencing and parole decisions. They use data to predict a person’s risk of recidivism, violence,
or failure to appear in court based on a complicated mix of behavioral and demographic
characteristics.30
Danielle Citron has written about the importance of creating appropriate standards of algorithmic due process that include transparency, a right to correct inaccurate information, and a
right to appeal adverse decisions.31 Unfortunately, this goal might be incredibly difficult to meet.
Thanks to machine learning, sophisticated algorithmic decision making processes arguably have
become inscrutable, even to their programmers. The emergent gap between what humans and
machines know has led some critics, such as Frank Pasquale, to warn against the risks of a Black
Box Society32 driven by what Cathy O’Neil dubs Weapons of Math Destruction.33
At the same time, breakthroughs in artificial intelligence have enabled disenfranchised groups
to speak truth to power by identifying biases and inequities that were previously hidden in opaque databases or behind faceless human bureaucracies.34 New uses of data can also save lives. For example, the United Nations Global Pulse project uses data from cell phones in developing countries to detect pandemics, relieve famine, and fight human trafficking.35 New policy initiatives have been started that recognize the mixed blessings of artificial intelligence and the inevitable trade-offs created by using it. The Partnership on Artificial Intelligence, for example, is set "to address such areas as fairness and inclusivity, explanation and transparency, security and privacy, values and ethics, collaboration between people and artificial intelligence systems, interoperability of systems, and the trustworthiness, reliability, containment, safety, and robustness of the technology."36

27 Jeffrey Rosen, The Web Means the End of Forgetting, N.Y. Times, July 21, 2010, [Link] magazine/[Link].
28 J. Mitchell Vaterlaus, Kathryn Barnett, Cesia Roche & Jimmy Young, "Snapchat is more personal": An Exploratory Study on Snapchat Behaviors and Young Adult Interpersonal Relationships, 62 Computers Hum. Behav. 594 (2016); also see Evan Selinger, Brenda Leong & Bill Fitzgerald, Schools Fail to Recognize Privacy Consequences of Social Media, Christian Sci. Monitor, Jan. 20, 2016, [Link] 2016/0120/Opinion-Schools-fail-to-recognize-privacy-consequences-of-social-media.
29 Why Self-Driving Cars Must Be Programmed to Kill, MIT Tech. Rev., Oct. 22, 2015, [Link] .com/s/542626/why-self-driving-cars-must-be-programmed-to-kill/.
30 Omer Tene & Jules Polonetsky, Taming the Golem: Challenges of Ethical Algorithmic Decision Making, 19 N. C. J. L. & Tech. (forthcoming 2019).
31 Danielle Keats Citron, Technological Due Process, 85 Wash. U. L. Rev. 1249 (2008).
32 Frank Pasquale, The Black Box Society (Harvard University Press, 2015).
33 Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Crown, 2016).
In an ideal world, due process would be secured and every company would follow all of the
technical privacy rules all of the time. But even in such a utopia, consumers and commentators
probably still would be unsettled by “creepy” technologically mediated behavior. Attributions of
“creepy” revolve around activities where people believe that harm is occurring even though
privacy settings are not circumvented and data use technically remains within the scope of its
intended purposes. These are instances where new technology further erodes cherished values,
such as obscurity, or new uses of existing technologies produce novel outcomes, such as
unexpected data use or customization.37 People are rattled by such threats to traditional social
norms and the prospect that unsettling new practices will be normalized. In these moments,
they wonder why engineers and marketers fail to anticipate problems. Sometimes, they hold
these groups accountable.
All of this suggests that, far from what a first glance at the title of this volume might lead
readers to expect, the Cambridge Handbook of Consumer Privacy critically explores core issues
that will determine how the future is shaped. To do justice to the magnitude and complexity of
these topics, we have asked contributors to address as many parts and perspectives of the
consumer privacy debate as possible. How we, all of us, collectively grapple with these issues
will determine the fate of technology and the course of humanity.38

Chapter Summaries39
The Pervasiveness and Value of Tracking Technologies
In Chapter 2, “Data Brokers Should They Be Reviled or Revered,” Jennifer Barrett Glasgow
defines the various types of data brokers as they exist today in the United States. She discusses
where they get their data and how much of it is aggregated from multiple sources. Glasgow also
describes how data brokers deliver data to the marketplace and who buys data from a data broker.
She covers how data brokers are regulated by law or self-regulation and how they interact with consumers. Finally, Glasgow outlines the risks that data brokers pose and briefly offers some thoughts about their future.
34 Big Data: A Tool for Fighting Discrimination and Empowering Groups (Future of Privacy Forum and Anti-Defamation League Report, 2014), [Link] [Link].
35 The State of Mobile Data for Social Good Report (United Nations Global Pulse, June 2017), [Link] sites/default/files/MobileDataforSocialGoodReport [Link].
36 Goals statement, Partnership on AI, [Link]
37 Woodrow Hartzog & Evan Selinger, Surveillance as Loss of Obscurity, 72 Wash. & Lee L. Rev. 1343 (2015).
38 See, e.g., Evan Selinger & Brett Frischmann, Utopia?: A Technologically Determined World of Frictionless Transactions, Optimized Production, and Maximal Happiness, 64 UCLA L. Rev. Disc. 372 (2016).
39 To ensure all of the chapters are fairly summarized, we asked contributors to provide their own. What follows are versions of their summaries, in some cases verbatim.

In Chapter 3, “In Defense of Big Data Analytics,” Mark MacCarthy argues that big data
analytics, including machine learning and artificial intelligence, are natural outgrowths of
recent developments in computer technology such as the availability of massive data sets, vast
increases in computing power, and breakthroughs in analytical techniques. These techniques
promise unprecedented benefits for consumers, workers, and society at large, but they also pose
challenges for privacy and fairness. MacCarthy’s chapter contains a short summary of the range
of potential benefits made possible by these new analytic techniques and then discusses privacy
and fairness challenges. Principles of privacy policy requiring data minimization and restricting
secondary data use need to be reformulated to allow for both the successful delivery of big data
benefits and effective privacy protection. Ubiquitous re-identification risks and information
externalities reduce the ability of individuals to control the disclosure of information and suggest
less reliance on notice and choice mechanisms. Big data analytics can pose fairness challenges,
but these techniques are not exempt from existing antidiscrimination and consumer protection
laws. Regulatory agencies and courts need to enforce these laws against any abuses accomplished through big data analysis. Disclosure of source code is not an effective way to respond to
the challenges of designing and using unbiased algorithms. Instead, enterprises should develop
and implement a framework for responsible use of data analytics that will provide for fairness by design and after-the-fact audits of algorithms in use. Such a framework will need to adopt
standards of fairness and appropriate remedies for findings of disparate impact. This will require
moving beyond technical matters to address sensitive normative issues where the interests of
different groups collide and moral intuitions diverge. A collaborative effort of businesses,
governments, academics, and civil rights and public interest groups might sharpen the issues
and allow sharing of information and best practices in a way that would benefit all.
In Chapter 4, “Education Technology and Student Privacy,” Elana Zeide argues that new
education technology (ed tech) creates new ways to manage, deliver, and measure education
that generate a previously unimaginable array and detail of information about students’ actions
both within and outside classrooms. She claims that data-driven education tools have the potential to revolutionize the education system and, in doing so, provide more access to better-quality, lower-cost education and broader socioeconomic opportunity. The information
generated by such tools also provides fodder for more informed teacher, school, and policy
decision making. At the same time, Zeide maintains, these data practices go against traditional
expectations about student privacy. The education context requires a tailored approach to data
protection. Few students can opt out of school information practices, making consent-based protections potentially problematic. The maturing of data subjects, Zeide cautions, raises concerns about creating modern-day "permanent records" with outdated information that unfairly foreclose opportunities. Many fear for-profit providers will prioritize generating revenue over
students’ educational interests. Traditional student privacy regulations aren’t designed for an era
of the tremendous volume, variety, and velocity of big data, because they rely on privacy self-management and institutional oversight. Many newer state laws restrict commercial educational
technology services to using student data only for “school purposes,” but don’t cover the
potential unintended consequences and nuanced ethical considerations surrounding educational use of data. As a result, Zeide concludes, the responsibility rests on entities generating,
collecting, and using student data to adopt best practices to meet the specific expectations and
considerations of education environments.
In Chapter 5, “Mobile Privacy Expectations: How Privacy Is Respected in Mobile Devices,”
Kirsten Martin and Katie Shilton describe privacy challenges raised by mobile devices, explore
user privacy expectations for mobile devices, and discuss developer responses to privacy
concerns. Martin and Shilton argue that mobile technologies change social practices and
introduce new surveillance concerns into consumers’ everyday lives. Yet, consumers, regulators,
and even the developers who build mobile applications struggle to define their expectations for
the privacy of this data, and consumers and developers express different privacy expectations.
The authors argue that firms and regulators can help mitigate the gap between developers and consumers by making privacy by design part of corporate governance and establishing privacy as a first-order concern for protecting consumer trust.
In Chapter 6, “Face Recognition, Real Time Identification, and Beyond,” Yana Welinder
and Aeryn Palmer provide an overview of face recognition technology and recent efforts to
regulate its use. They first explain the process by which face recognition technology operates,
including recent advancements in its capabilities through the use of neural networks. They then
discuss various consumer applications of the technology, such as mobile apps and social network
features that can identify people in photos. Next, they survey regulatory responses to face
recognition technology across the globe, highlighting new developments and previewing possible trends to come in the United States, the European Union, Canada, China, and other
jurisdictions. The discussion demonstrates the lack of regulation in some areas and reveals global
uncertainty about how best to control face recognition technology under the law. The chapter
concludes with recommendations to two types of stakeholders. First, it addresses policy makers,
encouraging them to balance support for innovation with protection of individual privacy rights.
It stresses the importance of obtaining consent from all relevant parties, and of giving special
consideration to government access to privately held face recognition data. Finally, Welinder
and Palmer suggest that developers leverage User Experience Design as a notice tool, collect and
retain a minimal amount of data, and keep the principles of security by design at the forefront of
their minds.
In Chapter 7, “Smart Cities: Privacy, Transparency, Community,” Kelsey Finch and Omer
Tene argue that today's cities are pervaded by growing networks of connected technologies that generate actionable, often real-time data about the city and its citizens. The more connected a
city becomes, the more it will generate a steady stream of data from and about its citizens. As
smart city technologies are being rapidly adopted around the globe, we must determine how communities can leverage the benefits of a data-rich society while minimizing threats to individuals' privacy and civil liberties. Just as there are many methods and metrics to assess a smart city's livability, sustainability, or efficiency, so too there are different lenses through which cities can evaluate their privacy preparedness. This chapter lays out three such perspectives, considering a smart city's privacy responsibilities in the context of its roles as a data steward, data platform, and government authority. By considering the deployment of smart city technologies in these three lights, communities will be better prepared to reassure residents of smart cities
that their rights will be respected and their data protected.
Ethical and Legal Reservations about Tracking Technologies
In Chapter 8, “Americans and Marketplace Privacy: Seven Annenberg National Surveys in
Perspective,” Joseph Turow sketches the growing surveillance and personalized targeting of
Americans carried out by marketers as well as the public arguments used to defend these
activities. At the core of their justifications is the notion that despite professed concerns over
privacy, people are rationally willing to trade information for the relevant benefit that marketers
provide. Drawing on seven nationally representative telephone surveys from 1999 through 2015,
Turow presents findings that tend to refute marketers’ justifications for increased personalized
surveillance and targeting of individuals. Contrary to the claim that a majority of Americans consent to data collection because the commercial benefits are worth the costs, he shows that the 2015 survey supports a different explanation: a large pool of Americans feel resigned to the inevitability of surveillance and the power of marketers to harvest data. When they give up information as they shop, it merely appears that they are interested in tradeoffs. The overall message
of the surveys is that legislators, regulators, and courts ought to rethink the traditional regulatory
understanding of harm in the face of a developing American marketplace that ignores the
majority of Americans' views and is making overarching tracking and surreptitious profiling a taken-for-granted aspect of society.
In Chapter 9, "The Federal Trade Commission's Inner Privacy Struggle," Chris Jay Hoofnagle discusses the cultural and ideological conflicts over privacy internal to the FTC and explains why the lawyers at the Commission are leading the privacy charge: the Bureau of Economics is constitutionally skeptical of information privacy. That skepticism reflects the economists' academic methods and ideological commitments. While information privacy is a deeply multidisciplinary field, the Bureau of Economics adheres to a disciplinarity that bounds its inquiry and causes it to follow a laissez-faire literature. Commitments to "consumer welfare," concerns about innovation policy, lingering effects of Reagan-era leadership, the lack of a clearly defined market for privacy, and the return of rule-of-reason analysis in antitrust also contribute to the Bureau of Economics' skepticism toward rights-based privacy regimes. Hoofnagle concludes with a roadmap for expanding the BE's disciplinary borders, for enriching its
understanding of the market for privacy, and for a reinvigoration of the FTC’s civil penalty
factors as a lodestar for privacy remedies.
In Chapter 10, “Privacy and Human Behavior in the Information Age,” Alessandro Acquisti,
Laura Brandimarte, and George Loewenstein provide a review that summarizes and draws connections between diverse streams of empirical research on privacy behavior. They use three themes to connect insights from the social and behavioral sciences: people's uncertainty about the consequences of privacy-related behaviors and their own preferences over those consequences; the context dependence of people's concern, or lack thereof, about privacy; and the degree to which privacy concerns are malleable, that is, manipulable by commercial and governmental interests. Organizing their discussion by these themes, the authors offer observations concerning the
role of public policy in the protection of privacy in the information age.
In Chapter 11, "Privacy, Vulnerability, and Affordance," Ryan Calo argues that the relationship between privacy and vulnerability is complex. Privacy can be both a shield against vulnerability and a sword in its service. What is needed to capture this nuanced interaction is a theoretical lens rooted in the physical and social environments as they exist, but also sensitive to the differing ways people perceive and experience those environments. Calo further contends
that James Gibson’s theory of affordance is an interesting candidate to capture this complexity,
including in the context of consumer privacy. Affordance theory, Calo demonstrates, helps
generate and unify some of consumer privacy’s most important questions and will perhaps one
day lead to better answers.
In Chapter 12, “Ethical Considerations When Companies Study and Fail to Study Their
Customers,” Michelle N. Meyer provides an overview of the different ways in which businesses
increasingly study their customers, users, employees, and other stakeholders, and the different
reasons why they do so. Meyer argues, however, that a complete ethical analysis of business
research requires consideration not only of the purpose, nature, and effects of such research but
also of a business’s choice not to study the effects of its products, services, and practices on
stakeholders. Depending on a variety of criteria she discusses, a particular business study, even
one conducted without study-specific informed consent, can fall on a spectrum from unethical to ethically permissible to ethically laudable or even obligatory. Although business research is now ubiquitous (in many ways, happily so), the fact that individual, study-specific informed consent is usually infeasible in this context means that a careful consideration of a study's risks and expected benefits is called for. For reasons that Meyer explains, the content of the federal regulations that govern risk-benefit analyses of most academic and some industry research, the so-called Common Rule, is not easily translated to the business setting. But she argues that companies should consider adopting something like the process used by institutional review boards (IRBs) to prospectively review and oversee research, and she provides recommendations
about how such company “research review boards” might operate.
In Chapter 13, “Algorithmic Discrimination vs. Privacy Law,” Alvaro Bedoya addresses the
intersection of two pressing debates: the desire to eliminate bias in automated decision-making systems, and the recent industry-led push to enforce privacy protections at the point of data use rather than the point of data collection. Bedoya highlights that most proposed solutions to the problem of algorithmic bias have tended to focus on post-collection remedies. Homing in on a specific technology, face recognition, Bedoya argues that correcting for algorithmic bias in this way will prove difficult, if not impossible. Instead, he says, the most effective means to counter algorithmic discrimination may come at the beginning of the data life cycle: at the point of collection. In making this argument, he emphasizes the importance of collection
controls in any comprehensive privacy protection regime.
In Chapter 14, “Children, Privacy, and the New Online Realities,” Stephen Balkam discusses
the extraordinary challenges we all face in staying private in our hyperconnected lives. He
emphasizes the difficulties parents, platforms, and policy makers face in keeping children’s data
private in an age of connected toys, devices, and always-on connectivity. Balkam looks at the
history and evolution of the Children’s Online Privacy Protection Act (COPPA) and addresses
its benefits and shortcomings. He looks at how major social media platforms, such as Facebook, have responded to COPPA, as well as at some of the companies that have fallen foul of the law. In
addition to considering the likes of Hello Barbie and Amazon’s Echo, Balkam also considers the
range of potential privacy issues brought by innovations in virtual, augmented, and mixed reality devices, apps, and games. He concludes with a look at the future of children's privacy in an AI-infused, constantly monitored world. Balkam suggests that solutions will have to be
found across the public, private, and non-profit sectors and then communicated clearly and
consistently to parents and their digitally savvy children.
In Chapter 15, “Stakeholders and High Stakes: Divergent Standards for Do Not Track,”
Aleecia M. McDonald provides an in-depth look at the history of Do Not Track, informed by McDonald's personal experience as an original cochair of the World Wide Web Consortium
standards group. In the United States, the Do Not Call list is considered one of the big successes
in consumer privacy. In contrast, Do Not Track was dubbed “worse than a miserable failure”
before it even got out of the standards committee trying to define it. At this time, Do Not Track is a soon-to-be-published standard from the World Wide Web Consortium (W3C), where standards emerge for web technologies such as HTML, the language of web pages. Meanwhile, the Electronic Frontier Foundation (EFF), an online rights group, has devised its own privacy-enhanced version of Do Not Track, with multiple companies pledging to use it. Several ad blockers will permit ads from companies that honor EFF's Do Not Track, providing a carrot-and-stick approach to user privacy and control. In yet a third approach, Do Not Track has been suggested as a way to signal compliance with European Union privacy laws, both in a recent international Privacy Bridges project and in publications by McDonald and leading European privacy
scholars. The best thing about standards, as the saying goes, is that there are so many to choose
from. Yet from a user's perspective, how can the multiplicity of Do Not Track approaches be anything but confusing?
In Chapter 16, “Applying Ethics When Using Data Beyond Individuals’ Understanding,”
Martin E. Abrams and Lynn A. Goldstein contend that with the expanding use of observational
data for advanced analytics, organizations are increasingly looking to move beyond technical
compliance with the law to the ethical use of data. Organizations need to understand the fair
processing risks and benefits they create for individuals, whether they are ethically appropriate,
and how they might be demonstrated to others. Their chapter explores the evolution of data-driven research and analytics, discusses how ethics might be applied in an assessment process,
and sets forth one process for assessing whether big data projects are appropriate.
International Perspectives
In Chapter 17, “Profiling and the Essence of the Right to Data Protection,” Bilyana Petkova and
Franziska Boehm begin by reviewing the legislative history of the provision on automated
decision making in the 1995 EU Data Protection Directive (the 1995 Directive), as it was
amended in the process of adopting a new EU General Data Protection Regulation that would
enter into force in 2018. Next, they discuss profiling in the context of the case law of the Court of
Justice of the European Union (CJEU) in the Google Spain, Digital Rights Ireland, and
Schrems cases. Petkova and Boehm argue that the CJEU might be making a subtle move in
its interpretation of the EU Charter of Fundamental Rights toward protecting against undesirable profiling measures instead of merely protecting against the identification of an individual.
Finally, drawing on the employment context, they discuss a few hypotheticals of algorithmic decision-making that illustrate how the relevant legislative framework might be applied.
In Chapter 18, “Privacy, Freedom of Expression, and the Right to be Forgotten in Europe,”
Stefan Kulk and Frederik Zuiderveen Borgesius discuss the relation between privacy and
freedom of expression in Europe. In principle, the two rights have equal weight in Europe; which right prevails depends on the circumstances of a case. To illustrate the difficulties of
balancing privacy and freedom of expression, Kulk and Borgesius discuss the Google Spain
judgment of the Court of Justice of the European Union, sometimes called the “right to be
forgotten” judgment. The court decided in Google Spain that, under certain conditions, people
have the right to have search results for their name delisted. The authors discuss how Google
and Data Protection Authorities deal with such delisting requests in practice. Delisting requests
illustrate that balancing the interests of privacy and freedom of expression will always remain
difficult.
In Chapter 19, “Understanding the Balancing Act Behind the Legitimate Interest of
the Controller Ground: A Pragmatic Approach,” Irene Kamara and Paul De Hert analyse the
provision of the legitimate interest ground in the new EU data protection framework, the
General Data Protection Regulation. The authors explain that the rationale of the legitimate
interest ground is that, under certain conditions, controllers' or third parties' interests may justifiably prevail over the interests, rights, and freedoms of the data subject. When and how such prevailing may take place under the GDPR provisions is not a one-dimensional assessment. De Hert and Kamara suggest a formalisation of the legitimate interest ground: steps toward the controller's decision on whether to base his or her processing on the legitimate interest ground. They argue that the legitimate interest ground should not be seen in isolation, but
through the lens of the data protection principles of Article 5 GDPR and Article 8 of the EU Charter of Fundamental Rights. The authors further analyse the relevant case law of the Court of
Justice EU, as well as the cases of Network and Information Security and Big Data and Profiling.
Kamara and De Hert conclude that the legitimate interest of the controller is not a loophole in
the data protection legislation, as has often been alleged, but an equivalent basis for lawful
processing, which can distinguish controllers in bad faith from controllers processing data in
good faith.
New Approaches to Improve the Status Quo
In Chapter 20, “The Intersection of Privacy and Consumer Protection,” Julie Brill explores
the intersection between privacy and consumer protection in the United States. She surveys the
consumer protection laws that simultaneously address privacy harms, and also examines how the
Federal Trade Commission’s consumer protection mission has allowed the Commission to
become a lead privacy regulator. Along the way, Brill delves into the challenges posed by data
brokers, lead generators, and alternative credit scoring as well as potential avenues for the
United States to strengthen privacy protections.
In Chapter 21, “A Design Space for Effective Privacy Notices,” Florian Schaub, Rebecca
Balebako, Adam L. Durity, and Lorrie Faith Cranor argue that notifying users about a system’s
data practices is supposed to enable users to make informed privacy decisions. Yet, current
notice and choice mechanisms, such as privacy policies, are often ineffective because they are
neither usable nor useful, and are therefore ignored by users. Constrained interfaces on mobile
devices, wearables, and smart home devices connected in an Internet of Things exacerbate the
issue. Much research has studied the usability issues of privacy notices and many proposals for
more usable privacy notices exist. Yet, there is little guidance for designers and developers on the
design aspects that can impact the effectiveness of privacy notices. In this chapter, Schaub,
Balebako, Durity, and Cranor make multiple contributions to remedy this issue. They survey the
existing literature on privacy notices and identify challenges, requirements, and best practices for
privacy notice design. Further, they map out the design space for privacy notices by identifying
relevant dimensions. This provides a taxonomy and consistent terminology of notice approaches
to foster understanding and reasoning about notice options available in the context of specific
systems. This systematization of knowledge and the resulting design space can help designers, developers, and researchers identify notice and choice requirements and develop a comprehensive notice concept for their system that addresses the needs of different audiences and considers
the system’s limitations and opportunities for providing notice.
In Chapter 22, “Enter the Professionals: Organizational Privacy in a Digital Age,” J. Trevor
Hughes and Cobun Keegan observe that contemporary privacy professionals apply legal,
technological, and management knowledge to balance the important concerns of citizens and
consumers with the interests of companies and governments worldwide. They further note that
the field of information privacy has rapidly matured into an organized, interdisciplinary profession with international reach. Their chapter compares the burgeoning privacy profession with
other modern professions, describing its history and similar growth curve while highlighting the
unique characteristics of a profession that combines law, policy, technology, business, and ethics
against a rapidly shifting technological landscape. As it has grown into a profession, Hughes and
Keegan argue that privacy has developed a broad body of knowledge with multiple specialties,
gained recognition as a vital component of organizational management, and become formally
organized through professional associations and credentialing programs. Government recognition and enforcement actions have legitimized the role of privacy professionals even as these
professionals work collectively to synthesize comprehensive and lasting ethical norms. In an era
increasingly fueled and defined by data, significant changes in the shape of our economy and
professional workforce are inevitable. Hughes and Keegan argue that, by guiding the governance and dissemination of personal information, the privacy profession is well situated to grow and mature in these rapidly changing times.
In Chapter 23, “Privacy Statements: Purposes, Requirements, and Best Practices,” Mike
Hintze addresses common criticisms of privacy statements and argues that many criticisms
misunderstand the most important purposes of privacy statements, while others can be addressed
through careful and informed drafting. Hintze suggests that while drafting a privacy statement
may be considered by some to be one of the most basic tasks of a privacy professional, doing it
well is no simple matter. One must understand and reconcile a host of statutory and self-regulatory obligations. One must consider different audiences who may read the statement from
different perspectives. One must balance pressures to make the statement simple and readable
against pressures to make it comprehensive and detailed. A mistake can form the basis for an
FTC deception claim. And individual pieces can be taken out of context and spun into public
relations debacles. Hintze’s chapter explores the art of crafting a privacy statement. It explains
the multiple purposes of a privacy statement. It lists and discusses the many elements included in
a privacy statement, some required by law and others based on an organization's objectives.
Finally, it describes different approaches to drafting privacy statements and suggests best practices based on a more complete understanding of a privacy statement's purposes and audiences.
In Chapter 24, “Privacy Versus Research in Big Data,” Jane R. Bambauer analyzes how
traditional notions of privacy threaten the unprecedented opportunity to study humans in the
Big Data era. After briefly describing the set of laws currently constraining research, the chapter
identifies puzzles and potential flaws in three popular forms of privacy protection. First, data
protection laws typically forbid companies from repurposing data that was collected for a
different, unrelated use. Second, there is a growing appreciation that anonymized data can be
reidentified, so regulators are increasingly skeptical about using anonymization to facilitate the
sharing of research data. And third, research law generally prohibits researchers from performing
secret interventions on human subjects. Together, these restrictions will interfere with a great
amount of Big Data research potential, and society may not get much in return for the
opportunity costs.
In Chapter 25, "A Marketplace for Privacy: Incentives for Privacy Engineering and Innovation," Courtney Bowman and John Grant inquire into what drives businesses to offer technologies and policies designed to protect consumer privacy. The authors argue that in capitalist
systems, the primary levers would be market demand supplemented by government regulation
where the market fails. But when it comes to privacy, consumers' demand can appear inconsistent with their expressed preferences, as they ignore high-profile data breaches and gleefully download trivial smartphone apps in exchange for mountains of their own personal data. Yet,
even in places where government regulation is light (such as the United States), many companies increasingly appear to be pursuing high-profile and sometimes costly positions, practices, and offerings in the name of protecting privacy. Ultimately, Bowman and Grant
suggest that in order to understand the true market for privacy, beyond consumer-driven demand, it is necessary also to consider the ethos of the highly skilled engineers who build these technologies and their level of influence over the high-tech companies that have created
the data economy.
In Chapter 26, “The Missing Role of Economics in FTC Privacy Policy,” James Cooper and
Joshua Wright note that the FTC has been in the privacy game for almost twenty years. In that
time span, the digital economy has exploded. As a consequence, the importance to the economy
of privacy regulation has grown as well. Unfortunately, Cooper and Wright insist, its sophistication has yet to keep pace with its stature. As they see it, privacy stands today where antitrust stood
in the 1970s. Antitrust's embrace of economics then helped transform it into a coherent body of law that, despite some quibbles at the margin, almost all agree has been a boon for consumers.
Cooper and Wright thus argue that privacy at the FTC is ripe for a similar revolution. The
chapter examines the history of FTC privacy enforcement and policy making, with special
attention paid to the lack of economic analysis. It shows the unique ability of economic analysis
to ferret out conduct that is likely to threaten consumer welfare, and provide a framework for
FTC privacy analysis going forward. Specifically, Cooper and Wright argue that the FTC needs
to be more precise in identifying privacy harms and to develop an empirical footing for both its
enforcement posture and such concepts as “privacy by design” and “data minimization.” The
sooner that the FTC begins to incorporate serious economic analysis and rigorous empirical
evidence into its privacy policy, the authors maintain, the sooner consumers will begin to reap
the rewards.
In Chapter 27, “Big Data by Design: Establishing Privacy Governance by Analytics,” Dale
Skivington, Lisa Zolidis, and Brian P. O’Connor argue that a significant challenge for corporate
big data analytics programs is deciding how to build an effective structure for addressing privacy
risks. They further contend that privacy protections, including thoughtful Privacy Impact
Assessments, add essential value to the design of such programs in the modern marketplace
where customers demand adequate protection of personal data. The chapter thus provides a
practical approach to help corporations weigh risks and benefits for data analytics projects as they
are developed to make the best choices for the products and services they offer.
In Chapter 28, "The Future of Self-Regulation Is Co-Regulation," Ira Rubinstein contends that privacy self-regulation, and especially voluntary codes of conduct, suffers from an overall lack of transparency, weak or incomplete realization of the Fair Information Practice Principles, inadequate incentives to ensure wide-scale industry participation, and ineffective compliance
and enforcement mechanisms. He argues that the US experiment with voluntary codes has gone on long enough and that it is time to try a new, more co-regulatory approach. In co-regulation, firms still enjoy considerable flexibility in shaping self-regulatory guidelines, but consumer
advocacy groups have a seat at the table, and the government retains general oversight authority
to approve and enforce statutory requirements. Rubinstein examines three recent co-regulatory efforts: (1) privacy management programs designed by multinational firms to demonstrate
accountability under both European and US privacy laws; (2) the NTIA multistakeholder
process, under which industry and privacy advocates have sought to develop voluntary but
enforceable privacy codes without any explicit legal mandate; and (3) Dutch codes of conduct
under national data protection law, which allows industry sectors to draw up privacy codes
specifying how statutory requirements apply to their specific sector. He concludes by identifying
lessons learned and offering specific policy recommendations that might help shape any future
consumer privacy legislation in the United States or abroad.
In Chapter 29, “Privacy Notices: Limitations, Challenges, and Opportunities,” Mary
J. Culnan and Paula J. Bruening contend that openness is the first principle of fair information
practices. While in practice “notice” has been used to create openness, notices have been widely
criticized as being too complex, legalistic, lengthy, and opaque. Culnan and Bruening argue
that to achieve openness, data protection should move from a “notice” model to a model that
requires organizations to create an environment of “transparency.” They assert that while often
used interchangeably, the terms “notice” and “transparency” are not synonymous. In their
chapter, Culnan and Bruening review the history of notice in the United States, its traditional
roles in data protection, the challenges and limitations of notice, the efforts to address them, and
the lessons learned from these efforts. They examine the challenges emerging technologies pose
for traditional notice and propose a move away from a reliance on notice to the creation of an
environment of transparency that includes improved notices, attention to contextual norms,
integrating notice design into system development, ongoing public education, and new technological solutions. Finally, Culnan and Bruening present arguments for business buy-in and
regulatory guidance.
In Chapter 30, “It Takes Data to Protect Data,” David A. Hoffman and Patricia A. Rimo note
that we live in a world of constant data flow, and safeguarding data has never been more
important. Be it medical records, financial information or simple online passwords, the amount
of private data that needs to be protected continues to grow. Alongside this growing need to secure data, Hoffman and Rimo insist, come the privacy concerns people have about their data. While some would pit security and privacy against each other, arguing that individuals must choose one over the other, the two actually can and should reinforce each other. It's
this model that forms the basis of the chapter: Privacy and security should be pursued hand in
hand as we move toward an increasingly connected, digital world. To fully realize the benefits of
information technology, big data, and the Internet of Things, Hoffman and Rimo argue, individuals
must be confident that their devices are designed in a way that protects their data and that any
data being collected and processed from those devices is used responsibly. Using internationally
recognized mechanisms such as the Fair Information Practice Principles, public and private
organizations can enable both the innovative and ethical use of data. The key is not avoiding
data but using it mindfully. It takes data to protect data.
In Chapter 31, "Are Benefit-Cost Analysis and Privacy Protection Efforts Incompatible?" Adam Thierer argues that benefit-cost analysis (BCA) helps inform the regulatory process by
estimating the benefits and costs associated with proposed rules. At least in the United States,
BCA has become a more widely accepted part of the regulatory policymaking process and is
formally required before many rules can take effect. The BCA process becomes far more
contentious, however, when the variables or values being considered are highly subjective in
character. This is clearly the case as it pertains to debates over online data collection and digital
privacy. The nature and extent of privacy rights and privacy harms remain open to widely
different conceptions and interpretations. This makes BCA more challenging, some would say
impossible. In reality, however, this same problem exists in many different fields and does not
prevent BCA from remaining an important part of the rulemaking process. Even when some
variables are highly subjective, others are more easily quantifiable. Thierer thus contends that
policymakers should conduct BCA for any proposed rules related to data collection and privacy
protection to better understand the trade-offs associated with those regulatory proposals.
In Chapter 32, “Privacy After the Agile Turn,” Seda Gürses and Joris van Hoboken explore
how recent paradigmatic transformations in the production of everyday digital systems are
changing the conditions for privacy governance. Both in popular media and in scholarly work,
great attention is paid to the privacy concerns that surface once digital technologies reach
consumers. As a result, the strategies proposed to mitigate these concerns, be it through
technical, social, regulatory or economic interventions, are concentrated at the interface of
technology consumption. The authors propose to look beyond technology consumption, inviting readers to explore the ways in which consumer software is produced today. By better
understanding recent shifts in software production, they argue, it is possible to get a better grasp
of how and why software has come to be so data-intensive and algorithmically driven, raising a
plethora of privacy concerns. Specifically, the authors highlight three shifts: from waterfall to
agile development methodologies; from shrink-wrap software to services; and from software running on personal computers to functionality carried out in the cloud. Their shorthand
for the culmination of these shifts is the “agile turn.” With the agile turn, the complexity,
distribution, and infrastructure of software have changed. What were originally intended to be techniques to improve software development, e.g., modularity and agility, have also come to reconfigure the ways businesses in the sector are organized. In fact, the agile turn is so
tectonic, it unravels the authors’ original distinction: The production and consumption of
software are collapsed. Services bind users into a long term transaction with software companies,
a relationship constantly monitored and improved through user analytics. Data flows, algorithms, and user profiling have become the bread and butter of software production, not only
because of business models based on advertisements, but because of the centrality of these
features to a successful disruptive software product. Understanding these shifts has great implications for any intervention that aims to address, and mitigate, consumer privacy concerns.
Part II

The Pervasiveness and Value of Tracking Technologies

2

Data Brokers: Should They Be Reviled or Revered?

Jennifer Barrett Glasgow
Data brokers are largely unknown to the average individual and are often accused by the press and privacy advocates of doing all kinds of unsavory things because they make money collecting and sharing personal data with others, often without the knowledge of the consumer.
In 2012, the Federal Trade Commission began investigating the industry, and in 2014 it published a report, "Data Brokers: A Call for Transparency and Accountability."1 The report said, "In today's
economy, Big Data is big business. Data brokers, companies that collect consumers' personal information and resell or share that information with others, are important participants in this
Big Data economy.” However, the report went on to say,
Many of these findings point to a fundamental lack of transparency about data broker industry
practices. Data brokers acquire a vast array of detailed and specific information about consumers; analyze it to make inferences about consumers, some of which may be considered
sensitive; and share the information with their clients in a range of industries. All of this activity
takes place behind the scenes, without consumers’ knowledge.
The FTC also recommended that Congress consider legislation requiring data brokers to
provide consumers access to their data, including sensitive data, and the ability to opt out from
marketing uses.
The marketing community is an aggressive user of data about people. A study published in
2013 by John Deighton of the Harvard Business School and Peter Johnson of mLightenment Economic Impact Research, until recently a Columbia University professor, "The Value of Data 2013: Consequences for Insight, Innovation, and Efficiency in the U.S. Economy," found that the
data-driven marketing economy added $156 billion in revenue to the US economy and fueled
more than 675,000 jobs in 2012. These issues were revisited in 2015 in the follow-up study, "The Value of Data 2015: Consequences for Insight, Innovation and Efficiency in the U.S. Economy," which found that in two years, revenue had grown to $202 billion, a 35 percent increase,
and jobs had grown to 966,000, a 49 percent increase.
So, are data brokers doing great things for our economy or are they operating a personal data
exchange behind the backs of consumers?
1 FTC report, Data Brokers: A Call for Transparency and Accountability, [Link] reports/data-brokers-call-transparency-accountability-report-federal-trade-commission-may-2014/140527databrokerreport.pdf. (Data Brokers)
An Introduction to Data Brokers
Data brokers have been around since the 1960s. They are not homogeneous entities nor are they
easily defined. Some are regulated entities, and some are not. Some companies that broker data
are consumer-facing; others are not. The key differences relate to where the data comes from,
the types of data the brokers bring to the marketplace, and the various uses of the data by
the buyer.
The definition of “data broker” has been debated for some time, and many companies work
hard to avoid the label because it is easy to make data brokers sound scary or evil. For purposes of
this handbook, we will use an expansive definition. Data brokers are companies that collect personal and non-personal information about individuals and license, sell, share, or allow use of that information by another entity for the other entity's benefit or for their mutual benefit. One area of debate is whether a company that allows other entities to "use" consumer data it possesses to advertise on the company's websites should be considered a data broker. Such entities argue they are not data brokers since they don't actually "give" the data to the advertiser.
However, for purposes of discussion, we will include them in the definition.
Historically, data brokers in the offline space in the 1960s dealt primarily with personally
identifiable information (PII),2 such as names, addresses, and telephone numbers. In the early
2000s, we began to see data brokers in the digital space dealing mainly with data collected
anonymously through cookies and other digital markers, which is considered non-personally identifiable information (Non-PII).3 Today, data brokers often combine offline and online data
for their clients to use to market and advertise across all digital channels.
Some data brokers collect data directly through some type of relationship with the individual.
These data brokers are considered "first-party" data brokers. Some data brokers are second parties that collaboratively share their customer data with each other for some mutual benefit. Third-party data brokers have no relationship with the individual, but instead buy or license data from public sources and from both first-party and third-party data brokers. It's worth noting that these definitions are not always mutually exclusive; the same data broker can act as a first-, second-, and third-party source in different cases. Clearly, data can pass through a number of hands in the chain of custody of the data broker ecosystem. Also, it is common for data brokers to offer their clients multiple data products and ancillary data-related services, such as data hygiene.
Commercial data brokers generally fall into four categories.
• Providers of public information, including government agencies, professional organizations, and research, look-up, and locate services (e.g., people-search services).
• Consumer reporting agencies (CRAs), including background screening services.
• Risk mitigation services (e.g., identity verification and anti-fraud services).
• Marketing data brokers, including general marketing data brokers, lead generation services, and large advertising-supported websites (e.g., social media and search engine sites).
2 Personally Identifiable Information (PII), as used in US privacy law and information security, is "information that can be used on its own or with other information to identify, contact, or locate a single person, or to identify an individual in context," [Link] identifiable information.
3 "Non-Personal Information (Non-PII) is data that is linked or reasonably linkable to a particular computer or device. Non-PII includes, but is not limited to, unique identifiers associated with users' computers or devices and IP addresses, where such identifiers or IP addresses are not linked to PII. Non-PII does not include De-Identified Data." [Link].[Link]/understanding-online-advertising/glossary.
There are a variety of data sources for data brokers, ranging from public records and other
publicly available information to digital behavior data. Different data brokers are interested in
different combinations of data. However, one common thread for most data brokers, especially
third-party brokers, is that they aggregate data from multiple sources to make it easier for the buyer, offering a kind of one-stop shopping.
The media have been known to focus on one of these categories of data brokers, often marketing data brokers, but will use the generic "data broker" moniker as if there were one homogeneous industry. However, due to various regulations on some data and on certain uses of data, these categories differ quite significantly based on the data they broker and the use of the brokered data by the client. A discussion of each of these broad categories of data brokers follows.

Providers of Public Information, Including Government Agencies, Professional Organizations, and Third-Party Research, Look-Up, and Locate Services (e.g., People-Search Services)
Many federal and state government agencies and most professional organizations are first-party data brokers. Government agencies provide access to and broker data that is considered public record. They may license this data to other agencies, companies, non-profit organizations, and third-party data brokers for commercial and non-commercial purposes. A few examples of first-party government agencies that broker their data include state motor vehicle and driver's license
divisions, state voter registration divisions, property deed agencies, tax agencies, courts, the
Federal Aviation Administration, and even the Social Security Administration.
In addition, many professional organizations and licensing agencies broker their membership and
licensing data. Examples include physician, nursing, dental, and realtor professional associations.
A number of websites also provide directory information to look up and/or locate individuals. These are often referred to as people-search sites. They include many free sites, usually supported by advertising, such as [Link], as well as sites such as [Link] and peoplefi[Link], where some basic data is free but the user must pay for additional information related to the search.
Ancestry sites offering genealogical information would also fall into this category.
Since this is a very broad category of data brokers, the type of data they bring to the market can
vary drastically. Property ownership records contain identifying information about the owner
along with characteristics of the property (number of bedrooms, baths, etc.), as well as information about any mortgages on the property. Voter records would contain name, address, date of birth, and party affiliation. Court proceedings would include such identifying information on the defendant as name, address, and possibly Social Security number (SSN), along with information about the charges and sentence.
Many consumers do not fully understand all the types of government records that are public.
Some are dismayed at the level of personal detail that is available in these records. However, they
also lack an understanding of the value public records and open access bring.
These agencies and organizations provide critical information that has been the bedrock of
our democracy for decades. According to the Coalition for Sensible Public Record Access
(CSPRA),4 “The debate over access to public records focuses primarily on concerns about
privacy and identity theft and fraud. Technology advances and the growing trend of providing
electronic access to public records have helped advance this debate.”
CSPRA further reports,
4 Coalition for Sensible Public Record Access (CSPRA), [Link].
Information and data compiled by private companies from public records, including Social
Security numbers, addresses, dates of birth and phone numbers, are used every day to help
combat identity theft. Social Security numbers have proven to be the most reliable tool in
verifying an individual’s identity. Certain public and court records contain this vital information,
and provide a reliable source for data matching, which helps prevent the rapid increase in
identity fraud victims. Further, commercial databases compiled using public records for identity
verification are routinely used online and offline to detect credit card application fraud, and
insurance application and claims fraud.
CSPRA points to additional benefits: “The use of public records improves the speed and
accuracy of check acceptances, combats identity theft, and reduces check fraud, which has
the combined effect of lowering costs for all consumers.”
Public information providers give access to information that is also very valuable for research
purposes. The National Public Records Research Association (NPRRA)5 is a trade association of
businesses engaged in an industry for which public records serve as the backbone. Its membership includes document filers, researchers, retrievers, corporate service providers, entity formation agents, and registered agents.
LexisNexis and Experian are two of the largest aggregators of public records; however,
numerous smaller brokers focus on aggregating specific types of public records, such as property
records and marriage and divorce records.
Consumer Reporting Agencies (CRAs), Including Background Screening Services
The Fair Credit Reporting Act (FCRA)6 was enacted in 1970 due to a lack of transparency in the
credit industry in the 1960s, and has been amended numerous times since.7 The law places a
number of obligations on the category of data broker known in the law as a consumer reporting
agency (CRA). CRAs provide data in the form of consumer reports for certain permissible purposes: decisions of eligibility for credit, employment, insurance, and housing, and similar determinations.
Credit bureaus are a type of CRA and provide consumer reports in the form of credit reports.
Background screening brokers are another type of CRA that provides consumer reports in the
form of background checks, when the screening is done for employment purposes.
CRAs have a number of obligations under the law. These include requirements to maintain
reasonable procedures for maximum accuracy of consumer reports, provide access to the information they hold, handle disputes raised by consumers, and remove negative information such as bankruptcies after ten years and other negative information after seven years.
Credit bureaus maintain information about the status of a consumer’s credit accounts and
some bill payment information. They maintain information about how often payments are
made, how much credit is available, how much credit is currently in use, and any debts that are
past due. They also maintain rental information and public records such as liens, judgments,
and bankruptcies that are helpful in assessing a consumer’s financial status. CRAs also use all
this information to create a credit score, an easy way to summarize a consumer’s credit history in
one rating.
5 National Public Records Research Association (NPRRA), [Link].
6 Fair Credit Reporting Act (FCRA), [Link].
7 Fair and Accurate Credit Transactions Act of 2003 (FACTA), Public Law 108-159, Dec. 4, 2003.
Background screening brokers verify the information provided by a job applicant for the
prospective employer. The verification can range from past employment to education and
criminal history. The applicant must authorize the screening and has the right to challenge
any errors in the report prior to the employer taking any adverse action.
The passage of the FCRA led to significant consolidation in the credit bureau industry from
thousands of small local CRAs to a much smaller number of large credit bureaus and specialty CRAs.
In addition to the three major credit bureaus, Experian, Equifax, and TransUnion, there are
dozens of other specialty CRAs. These include companies that maintain medical records and
payments, residential or tenant histories, and other publicly available information for permissible
purposes. Some of the companies considered specialty CRAs include: First Data Telecheck,
Innovis, MIB Group, and Milliman.
Although the major CRAs are required by law to provide a central source website for
consumers to request a copy of the consumer report about them, the nationwide specialty
consumer reporting agencies are not required to provide a centralized online source. Instead, they must establish a streamlined process for consumers to request a report about them, which must include, at a minimum, a toll-free telephone number.
Because the decisions made by users of consumer reports have a significant impact on the consumer, the accuracy of the data is of paramount importance. The FCRA requires, "Whenever a consumer reporting agency prepares a consumer report it shall follow reasonable procedures to assure maximum possible accuracy of the information concerning the individual about whom the report relates."8
It should be mentioned that smaller companies that offer look-up and locate or background screening services must be careful to understand when they are subject to the FCRA. This occurs when they provide services for credit, employment, insurance, housing, and similar eligibility determinations. One such instance occurred in 2012, when Spokeo paid $800,000 to settle FTC charges that it had marketed its consumer profiles to companies in the human resources, background screening, and recruiting industries without complying with the FCRA.9
Risk Mitigation Services (e.g., Identity Verification and Anti-Fraud)
There are third-party data brokers that provide identity verification services and, in some instances, identity information for use in detecting and preventing fraud. In order to guard against fraudulent transactions, a number of laws have been passed that require a company to "know your customer." In other instances, it is simply a smart business practice to screen customers and monitor suspicious transactions. Such legal obligations are common in the financial services sector. The Gramm-Leach-Bliley Act (GLBA),10 regulating financial institutions, is one such law.
According to the FTC report on Data Brokers, “Risk mitigation products provide significant
benefit to consumers, by, for example, helping prevent fraudsters from impersonating unsuspecting consumers."11
8 Fair Credit Reporting Act (FCRA), 15 U.S.C. 1681, p. 34, Accuracy of Report, [Link] [Link].
9 Spokeo to Pay $800,000 to Settle FTC Charges, [Link] 800000-settle-ftc-charges-company-allegedly-marketed.
10 Gramm-Leach-Bliley Act (GLBA), [Link].
11 FTC report, Data Brokers, p. v, [Link] accountability-report-federal-trade-commission-may-2014/[Link].
Most often, these services provide verification of data the user already has. They can verify that
a name, address, SSN, and phone all belong to the same person. They may also identify whether
the individual is on a money-laundering list or a terrorist watch list. In some cases, they assist the fraud department of a financial institution in investigating suspected fraudulent transactions by providing a variety of identifying information about the individual, such as multiple SSNs
associated with the person, or past addresses or phone numbers.
While these risk mitigation services are able to identify and respond to most requests, they may
not satisfy 100 percent of inquiries. Thus, the user of the service has to have alternative verification processes, usually manual ones, to handle inquiries that cannot be serviced.
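As a rough illustration of the kind of check such a service performs, the sketch below (in Python) matches an inquiry against a hypothetical broker file and falls back to manual review when no confident match is found; the records, watch list, and field names are invented for illustration and greatly simplify real identity-resolution logic.

# Minimal sketch of an identity-verification lookup against a broker file.
# All broker records, watch-list entries, and field names are hypothetical.
BROKER_RECORDS = [
    {"name": "jane doe", "address": "12 oak st", "ssn": "123-45-6789",
     "phone": "555-0100"},
]
WATCH_LIST = {"999-99-9999"}  # e.g., money-laundering or terrorist watch-list SSNs

def verify_identity(name, address, ssn, phone):
    """Return 'verified', 'flagged', or 'manual_review' for an inquiry."""
    inquiry = {"name": name.lower(), "address": address.lower(),
               "ssn": ssn, "phone": phone}
    for record in BROKER_RECORDS:
        if all(record[field] == inquiry[field] for field in record):
            # Every supplied element belongs to the same person on file.
            return "flagged" if ssn in WATCH_LIST else "verified"
    # No confident match: route the inquiry to a manual verification process.
    return "manual_review"

print(verify_identity("Jane Doe", "12 Oak St", "123-45-6789", "555-0100"))  # verified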
The consumer may or may not be aware that third-party verification is taking place unless verification is denied; then they are usually referred to the data broker providing the service to determine if there are errors in their files.
The credit bureaus, along with LexisNexis, Acxiom, and Thomson Reuters, are a few of the largest third-party providers of these types of services.
Marketing Data Brokers, Including General Marketing Data Brokers, Lead Generation Services, and Large Advertising-Supported Websites (Including Social Media and Search Engines)
There are tens of thousands of first-, second-, and third-party entities that provide data for marketing purposes. The number of players and the scope of the data they collect and bring to market make this the largest, most aggressive, and most sophisticated category of data brokers.
Many in the retail, catalog, and publishing industries rent some of their first-party customer data for marketing and advertising purposes. Financial institutions do a lot of second-party joint marketing with airlines, retailers, and other brands. Acxiom, Experian, and IMS Health are some of the largest third-party marketing data brokers, but there are thousands of smaller, more specialized ones, such as real estate multiple listing services.
As the world becomes more digitally connected, a growing number of digital data brokers focus
on providing data for online, mobile, and addressable TV advertising, and as the Internet of Things
expands, advertising will support many other devices, including wearables,12 smart devices,13 and
even smart cars.14 This is the fastest-growing segment of the data broker community.
12 Wearables are the general category of "wearable devices, tech togs, or fashion electronics," i.e., clothing and accessories incorporating computer and advanced electronic technologies. The designs often incorporate practical functions and features. "Wearable devices such as activity trackers are a good example of the Internet of Things," since they are part of the network of physical objects or "things" embedded with "electronics, software, sensors, and connectivity" to "enable objects to exchange data . . . with a manufacturer, operator and/or other connected devices, without requiring human intervention," [Link] technology.
13 Smart devices are electronic devices, "generally connected to other devices or networks via different wireless protocols such as Bluetooth, NFC, Wi-Fi, 3G, etc., that can operate to some extent interactively and autonomously. Several notable types of smart devices are smartphones, phablets, tablets, smart watches, smart bands, and smart key chains . . . Smart devices can be designed to support a variety of form factors, a range of properties pertaining to ubiquitous computing, and use in three main system environments: the physical world, human-centered environments, and distributed computing environments," [Link] device.
14 Smart cars are automobiles with advanced electronics. Microprocessors have been used in car engines since the late 1960s and have steadily increased in usage throughout the engine and drivetrain to improve stability, braking, and general comfort. The 1990s brought enhancements such as GPS navigation, reverse sensing systems, and night vision (able to visualize animals and people beyond the normal human range). The 2000s added assisted parking, Web and e-mail access, voice control, smart card activation instead of keys, and systems that keep the vehicle a safe distance from cars and objects in its path. Of course, the ultimate smart car is the one that drives itself (see autonomous vehicle and connected cars), [Link].
General marketing data brokers provide PII and Non-PII data for both offline and digital marketing15 purposes. They license a list of individuals who meet certain criteria to a marketer, a
process known as “list rental,” or they append specified data elements to a marketer’s customer
list, a process known as “enhancement.”
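As a rough illustration of what "enhancement" involves, the sketch below (in Python) appends attributes from a hypothetical broker file to a marketer's customer list by matching on a crudely normalized name-and-address key; the records, field names, and match logic are invented for illustration and omit the fuzzy identity resolution real brokers rely on.

# Minimal sketch of "enhancement": appending broker-held attributes to a
# marketer's customer list. All names, fields, and records are hypothetical.
def match_key(name, address):
    # Normalize to a crude match key; real identity resolution is far fuzzier.
    return f"{name.strip().lower()}|{address.strip().lower()}"

# Hypothetical third-party broker file keyed by normalized name and address.
BROKER_FILE = {
    match_key("Jane Doe", "12 Oak St, Springfield"): {
        "age_range": "35-44",
        "est_household_income": "$75k-$100k",
        "interests": ["travel", "cooking"],
    },
}

# Marketer's first-party customer list to be enhanced.
CUSTOMER_LIST = [
    {"name": "Jane Doe", "address": "12 Oak St, Springfield"},
    {"name": "John Roe", "address": "9 Elm Ave, Shelbyville"},
]

def enhance(customers, broker_file):
    """Append broker attributes to each customer record when a match exists."""
    return [
        {**record, **broker_file.get(match_key(record["name"], record["address"]), {})}
        for record in customers
    ]

for row in enhance(CUSTOMER_LIST, BROKER_FILE):
    print(row)  # Jane Doe gains the appended attributes; John Roe is unmatched.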
These data brokers aggregate and provide data on individuals and households: identifying information (e.g., name, address, phone, and email), demographic information (e.g., age or
date of birth, education, and ethnicity), household characteristics (e.g., identifying and
demographic information on the spouse, how many children there are and their approximate
ages, and how many people live in the house), general financial information (e.g., modeled
ranges of estimated household income and modeled estimated net worth), interests (e.g., like
to cook, read, water-ski, travel abroad, or redecorate the house), lifestyle (e.g., types of cars, and
information about property owned, such as price, value, size, age, features, and mortgage
company), and major life events (e.g., recently got married, divorced, had a child, or bought a
new house).
There are many first-party websites, known as lead generation services, that sell varying levels
of qualified leads, meaning individuals inquiring about or shopping for certain products or
services. These include individuals shopping for automobiles, insurance policies, hotel rooms,
and much more. These websites typically provide competitive quotes from various providers and
then sell the inquiry back to the provider and sometimes to others for marketing purposes. These
data brokers provide identifying and contact information and some information related to the
products of interest. For example, if the lead is for a used car, the type of car the individual is
interested in would be included, or if the lead is for life insurance, the age of the individual may
be provided.
Some websites, typically those with large user bases, monetize their site by allowing advertisers
to market to their users based on data about their customers. As discussed, some of these first
party companies argue that they are not true data brokers, because the data does not actually end
up in the hands of the advertiser. However, because they allow the use of consumer data for the
advertiser's marketing purpose, we are including them in our definition of a data broker. This
type of data broker includes many social media sites (such as Facebook), search engines (such as Google), news and magazine sites, and many others.
Because data from data brokers has become a part of almost every marketing campaign, it is
difficult to measure the precise value brought by exchanging marketing data. As cited previously,
the DDMI’s 2015 study, “The Value of Data,” quantifies the value of the Data Driven Marketing
Economy (DDME), both in terms of revenues generated for the US economy and jobs fueled
across the nation. The DDME contributed 966,000 jobs and $202 billion to the US economy in
2014. The report went on to point out that 70 percent of the value is in the exchange of data
across the DDME, and if markets had to operate without the ability to broker data, our economy
would be significantly less efficient. Because the study used a very conservative methodology, the
actual impact could be even larger.
According to the FTC “Data Brokers” report, “Marketing products benefit consumers by
allowing them to more easily find and enjoy goods and services they need and prefer. In
addition, consumers benefit from increased and innovative product offerings fueled by increased

15
Digital marketing is an umbrella term for “the marketing of products or services using digital technologies, mainly on
the Internet, but also including mobile phones, display advertising, and any other digital medium,” [Link]
.[Link]/wiki/Digital marketing.

competition from small businesses that are able to connect with consumers they may not have
otherwise been able to reach.”16
While acknowledging the benefits, the FTC and others have criticized the marketing data
broker category for not being more transparent. The category has also been criticized on the ground that the data brokers aggregate is not always accurate.
All data has errors. Unavoidable errors can be introduced during the data aggregation process.
However, marketing data from brokers is accurate enough to provide significant benefit, even
with a reasonable percentage of errors. Furthermore, the negative impact from inaccurate
marketing data is that someone may not see the ad they would consider relevant, or they may
see an ad they would not consider relevant, a situation that is exacerbated when data is less
available. Consumers often assume marketing data from brokers is completely accurate, but this
is neither a reasonable expectation nor a marketplace need.
The 2014 FTC report on data brokers was critical of the extent to which marketing data
brokers offered choices to consumers. While some data brokers offered consumers choices about
how data was used, because the brokers are third parties, and not consumer facing, consumers
did not always know about the choice. The report went on to say that the opt-outs were sometimes unclear about the scope of the choice.
Even so, the report acknowledged, as quoted above, that marketing products benefit consumers by making it easier to find goods and services they need and prefer and by fueling competition from small businesses.

where do data brokers get their data?


Data brokers are often reported to have detailed dossiers on individuals. Certain types of data
brokers, such as CRAs, do have extensive profiles. But because of laws, industry codes of
conduct, and the basic economics of collecting or purchasing data, aggregating it from multiple
sources, storing it, and making it available in the market, different types of data brokers collect or
acquire different data.
The kind of data available in the offline space is quite different from the data available in the
digital space. Offline data is almost entirely PII. However, in the digital space, much of the data
is Non PII, relating to a device rather than to an individual.
Finally, data brokers provide data that is legal, of interest, and relevant to their clients’ ultimate
uses. Furthermore, there must be a sufficient quantity of a data element for it to be of value to a
data broker. An identity verification service that can verify only forty percent of the requests it receives is not financially viable. Marketers do not market to a few individuals; they market to tens of thousands or millions of individuals. With the exception of lead generation services, which do not need such high volumes to be viable, this means data has to be available on a
substantial portion of the total audience before it is commercially feasible to collect, store,
package, and sell it for marketing purposes.
The offline data sources of interest and available to a data broker vary with the type of data
broker, often based on laws and industry codes of conduct as well as the type of data most
valuable for the ultimate use.

16
FTC report, Data Brokers, p. v, [Link]
accountability-report-federal-trade-commission-may-2014/[Link].

In general, offline data comes from the following sources:

• Public Records and Publicly Available Information: Public records are a common source of
offline data for all types of third party data brokers. In the United States, we believe that
certain government actions, licenses, and registrations should be available for all to view,
both individuals and organizations. These include court proceedings, property ownership,
census data, and much more, as described previously. Some believe such access should be
curtailed, but today, this data supports our democracy and provides significant benefits
ranging from locating missing individuals to assisting with business, legal, and personal
affairs. According to CSPRA, public records are actually one of the best protections against
criminal activity.
The benefits of public records17 are summarized in a CSPRA whitepaper. They
include a reliable system of recording property owners and assessing the creditworthiness of buyers and sellers, as well as the ability to recover a debt, enforce a law, collect child support, find witnesses and bail jumpers, identify sex offenders, find safe drivers, and hire responsible, trustworthy employees. As indicated in the chart that follows this section, public records are
used by every category of data broker.
A variety of telephone directories, professional directories and listings, motor vehicle
records, driver's license records, real property records, assessor information, court proceedings, voter registration records, birth records, and death records are a few of the common
sources of public records. Many state level public records are governed by state laws that
restrict some uses of such data. Some of the best examples are motor vehicle records that are
restricted by both federal and state Driver’s Privacy Protection acts. For example, these
records can be used for identity verification and anti fraud purposes, but not for marketing.
Another example would be voter registration records. In some states voter records generally
can be used for election related purposes by candidates and parties, but again, not for general
marketing purposes. A third example is real property records that are very helpful to the real
estate industry, but in some states are prohibited from use for other types of marketing.
Data that is not a government public record, but is generally publicly accessible or
viewable, can sometimes be used for many other purposes unless restrictions are put on it
by the source. This includes directories on websites for professional organizations and
publicly viewable data on social media sites. However, many such sites, especially social
media sites, have terms of use that restrict commercialization without specific authorization. Anyone collecting this data should carefully review the site's terms of access and use
before collecting and brokering this type of data. Just because you can view it doesn’t mean
you can legally use it.
• Surveys: Consumer surveys are conducted by survey companies and by consumer facing
entities such as manufacturers who provide a short survey when the buyer registers the
product, often for warranty purposes.
Responsible companies conducting surveys inform the individual at the time the
survey is conducted about whether the answers will be used only by them or brokered to
others and for what purposes. They also offer an opt out from such brokering.
Surveys are less accurate than other sources of data, so they are of value for marketing
purposes, but not for such other purposes as risk mitigation.

17
The Benefits of Commercial and Personal Access to Public Records, [Link] site admin/assets/docs/
The Benefits of Commercial and Personal Access to Public Records [Link].

• First and Second Parties: Consumer facing brands in many industries including finance,
media, publishing, catalog, retail, travel, entertainment, automotive, and, of course, lead
generation entities, license some or all of their customer information for various purposes
including consumer reports, risk mitigation, and marketing.
For marketing purposes this information can be general, such as that a consumer
subscribed to a certain magazine, or it can be fairly detailed, such as information about auto
sales, service, and repairs. While the data would not contain a credit card number, it may
contain a type of credit card, such as VISA Gold Card, and may include the amount of the
purchase or flag the transaction in a range of amounts, along with some indication about
the product(s) bought.
For risk mitigation purposes the data may be limited to identifying information, such
as name and address, phone number, email address, and SSN.
Companies that offer credit furnish information to credit bureaus as governed by the
GLBA and FCRA. Examples of these companies include credit card companies, auto
finance companies, and mortgage banking institutions. Other examples include collection
agencies, state and municipal courts, and employers. Under the FCRA, entities that furnish
information to a credit bureau must provide complete and accurate information, inform
consumers about negative information, and investigate consumer disputes.
• Other Third Party Data Brokers: It is typical for third party data brokers to aggregate
information from many sources that often include information from other data brokers,
providing one stop shopping for the data buyer. The FTC data brokers report said, “seven of
the nine data brokers buy from or sell information to each other. Accordingly, it may be
virtually impossible for a consumer to determine the originator of a particular data
element.”18
• Modeled (or Derived) Data: Data is also created by the data broker from statistical
modeling activities using some or all of these data sources. Data is modeled when the
desired data elements are not otherwise available. These analytical processes range from
fairly simple derivations to highly sophisticated predictive analytics. There are primarily two forms of modeled data: directly modeled data and look-alike models. Both types of modeled data are rapidly growing in variety and popularity.
Directly modeled data is created by statistically analyzing a large number of data
elements and determining which of them can predict the desired outcome, such as predicting whether a transaction is likely to be fraudulent or estimating the financial capacity of a
household for marketing purposes. In general, the most common elements that are
statistically modeled for marketing are estimated income and estimated net worth, since
precise information is not available. Such elements may be modeled from data such as zip
code, education/profession, price of the home, cars owned, and age. For risk mitigation
purposes, fraud scores are an example of modeled data.
Look-alike models are used to create data when a small target audience is known (say, people who traveled abroad more than twice a year for pleasure) but information on the total audience is unavailable. The known audience is enhanced with other available data, such as demographic, household, interest, and financial data, and then statistically studied to determine which of the enhancement elements best predict membership in the target audience. Once the study is complete, the original audience data is discarded and the

18
FTC report, Data Brokers, p. 14, [Link]
accountability-report-federal-trade-commission-may-2014/[Link].

model is applied to a file with only the enhancement data to identify the target audience.
While less precise than directly modeled data, look-alike data can be an excellent way to derive data that is otherwise unavailable or not available in sufficient quantities to be useful; a simplified sketch of this process appears after this list.
Scores are growing in popularity as more data is used to make predictions. A report by
the World Privacy Forum, “The Scoring of America: How Secret Consumer Scores
Threaten Your Privacy and Your Future,”19 points out unexpected problems that arise
from new types of predictive consumer scoring. These activities usually fall outside the FCRA and other legal regulation, but they use thousands of pieces of information to predict
how consumers will behave. The report raises issues of fairness, discrimination, accuracy,
and transparency.
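The look-alike process described above can be sketched in a few lines of Python. This is a toy illustration using a generic scikit-learn classifier and invented column names, under the assumption that the enhancement elements have already been encoded numerically; it is not any broker's actual methodology.

```python
# Toy look-alike modeling sketch (hypothetical column names; not any broker's actual pipeline).
import pandas as pd
from sklearn.linear_model import LogisticRegression

def build_lookalike_model(known_audience: pd.DataFrame, general_population: pd.DataFrame,
                          enhancement_cols: list[str]) -> LogisticRegression:
    """Learn which enhancement elements predict membership in the known target audience."""
    known = known_audience[enhancement_cols].assign(target=1)
    background = general_population[enhancement_cols].assign(target=0)
    training = pd.concat([known, background], ignore_index=True)
    model = LogisticRegression(max_iter=1000)
    model.fit(training[enhancement_cols], training["target"])
    return model  # the original audience data can now be discarded

def score_prospects(model: LogisticRegression, prospects: pd.DataFrame,
                    enhancement_cols: list[str]) -> pd.Series:
    """Apply the model to a file containing only enhancement data to find look-alikes."""
    return pd.Series(model.predict_proba(prospects[enhancement_cols])[:, 1],
                     index=prospects.index)
```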
The following chart depicts where different data brokers commonly get their offline data.

Where do offline data brokers get their information?

                                            Public Records &                 First     Other Data   Modeled
Type of Data Broker            Individual   Publicly Available    Surveys    Parties   Brokers      Data
Government/Professional/
People Search                      X               X
Consumer Reporting Agencies                        X                            X         X
Risk Mitigation                                    X                            X         X            X
Marketing                          X               X                 X          X         X            X

The digital segment of the data broker community typically operates with data that is non-personally identifiable information (Non-PII). In addition to anonymizing all the offline data and using it in the digital ecosystem, digital data brokers also collect usage data (known as behavioral data) that relates to a device (a browser, an app, a mobile ID, IDs that identify TV set-top boxes, etc.) rather than to a person.
As we further embrace the Internet of Things (IoT), we will see an even wider variety of device-related data being brokered.

• Behavioral Data: Generally digital behavioral data is information collected about the sites/
apps/TV shows with which a user of a device interacts over time. Companies that collect this type of data across multiple devices for advertising purposes are called network advertisers.20 This type of data is referred to as “interest-based advertising” (IBA) data21 and is usually Non-PII. Because the individual is often unaware that their actions are being observed and recorded, robust self-regulation has been in place since the early 2000s to

19
Pam Dixon & Robert Gellman, The Scoring of America: How Secret Consumer Scores Threaten Your Privacy and Your
Future, [Link] Scoring of America April2014 [Link].
20
Network advertisers are third-party online advertising technology companies, including networks, exchanges, DMPs,
SSPs, RTB platforms, analytics companies, and service providers. The advertising technologies deployed by
100 member companies in NAI provide considerable economic benefits across the online and mobile ecosystems,
including for publishers, advertisers, and consumers, [Link]
21
Interest Based Advertising (IBA) means the collection of data across web domains owned or operated by different
entities for the purpose of delivering advertising based on preferences or interests known or inferred from the data
collected, p. 5, [Link] [Link].

provide individuals some level of awareness and control over these practices. Self regulation
for IBA is discussed in greater detail later in this chapter.
Behavioral data may also be collected for anti-fraud purposes. Monitoring behavioral activity on a website helps detect and prevent fraud by recognizing unusual and risky patterns of user activity. All activity is monitored, and both typical (normal) and atypical behavior is characterized, so that future activity correlating with high-risk behavioral patterns, or inconsistent with normal patterns, can be flagged.
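As a rough illustration of the anti-fraud idea just described, the sketch below flags sessions whose behavior deviates sharply from a baseline built from normal activity. The feature names and the three-standard-deviation threshold are invented for illustration and do not describe any vendor's fraud system.

```python
# Simplified behavioral anomaly flagging for anti-fraud (invented features; illustrative only).
import statistics

def build_baseline(normal_sessions: list[dict]) -> dict:
    """Compute mean and standard deviation for each behavioral feature from known-normal sessions."""
    baseline = {}
    for feature in normal_sessions[0]:
        values = [s[feature] for s in normal_sessions]
        baseline[feature] = (statistics.mean(values), statistics.stdev(values))
    return baseline

def is_high_risk(session: dict, baseline: dict, threshold: float = 3.0) -> bool:
    """Flag a session if any feature is more than `threshold` standard deviations from normal."""
    for feature, (mean, stdev) in baseline.items():
        if stdev > 0 and abs(session[feature] - mean) / stdev > threshold:
            return True
    return False

# Example: features might be pages viewed per minute or failed login attempts.
normal = [{"pages_per_min": 3, "failed_logins": 0},
          {"pages_per_min": 4, "failed_logins": 1},
          {"pages_per_min": 2, "failed_logins": 0}]
print(is_high_risk({"pages_per_min": 40, "failed_logins": 7}, build_baseline(normal)))
```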

how do data brokers aggregate data from multiple sources?


Since one of the key aspects of third-party data brokers is the aggregation of data from many, sometimes thousands, of sources, the processes used to aggregate data accurately are very important. What the data broker needs is a common data element (a name and address, a phone number, an email address, a device ID, etc.) across all of its data sources. When the data element has an exact, fixed format, such as a phone number or email address, the matching is straightforward. But when the elements are more free-form, such as a name and address with various anomalies, such as initials or nicknames, the matching becomes much more complicated and introduces more chance for errors.
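The matching problem can be illustrated with a short sketch: exact, fixed-format keys such as an email address match directly, while free-form names require normalization and fuzzy comparison. The nickname table, normalization rules, and similarity threshold below are all invented for illustration.

```python
# Illustrative record matching: exact keys match directly, free-form names need fuzzy comparison.
from difflib import SequenceMatcher

NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}  # toy nickname table

def normalize_name(name: str) -> str:
    parts = name.lower().replace(".", "").split()
    return " ".join(NICKNAMES.get(p, p) for p in parts)

def records_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    # Exact, fixed-format elements such as email or phone are straightforward.
    if a.get("email") and a.get("email") == b.get("email"):
        return True
    # Free-form name + address comparison is fuzzier and more error-prone.
    score = SequenceMatcher(None,
                            normalize_name(a["name"]) + " " + a["address"].lower(),
                            normalize_name(b["name"]) + " " + b["address"].lower()).ratio()
    return score >= threshold

print(records_match({"name": "Bill Smith", "address": "123 Main St", "email": None},
                    {"name": "William Smith", "address": "123 Main St.", "email": None}))
```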
In the digital space, one method of aggregation is ID syncing. Any type of digital ID can be
synced. For example, when party A gives party B permission to read its cookie on a browser where party B also has a cookie, the IDs in each cookie can be synced, creating the opportunity to aggregate information associated with both cookies. In the online world, this is called cookie syncing.22 Across the digital ecosystem, the result of syncing IDs across different media, such as browsers, apps, and other addressable devices, is referred to as a cross-device graph.23
While all the offline data is considered PII, this data may be stripped of its personal identifying
characteristics and tied to a Non-PII device through a process known as “onboarding.” Once
onboarded, offline data can be linked to digital data associated with the same device.
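A highly simplified sketch of onboarding and ID syncing follows. The hashing step and the match-table structure are illustrative assumptions rather than a description of how any particular onboarding provider actually operates.

```python
# Illustrative onboarding and ID-sync sketch (not how any specific onboarding provider works).
import hashlib

def onboard_key(email: str) -> str:
    """Strip direct identifiers by reducing a PII element to a one-way hashed match key."""
    return hashlib.sha256(email.strip().lower().encode()).hexdigest()

# Hypothetical match table built when a user logs in on a site that also sets a cookie:
# hashed email -> known device/browser IDs.
match_table = {onboard_key("jane@example.com"): {"cookie_A": "abc123", "mobile_ad_id": "xyz789"}}

# Cookie syncing: partner B records that its own cookie ID corresponds to partner A's cookie ID,
# so segments attached to one ID can be used against the other.
cookie_sync = {("partnerA", "abc123"): ("partnerB", "ff99ee")}

# Offline data keyed by the same hashed email can now be linked to those digital IDs.
offline_segments = {onboard_key("jane@example.com"): ["likes_golf", "income_75k_100k"]}
device_ids = match_table[onboard_key("jane@example.com")]
print(device_ids["cookie_A"], offline_segments[onboard_key("jane@example.com")])
```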
In the digital space, third party IBA data relates to a device instead of to an individual. The
IBA data is analyzed to create audience segments. A segment usually has multiple characteristics, such as women over fifty who like to play golf, or people making more than $100,000/year
who have a college degree and like to travel abroad. The characteristics of a segment are based
on what kind of IBA activity is available and what the marketplace wants to buy.
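Conceptually, a segment is just a predicate evaluated over the attributes attached to a device, as in this small illustrative sketch (the attribute names are invented).

```python
# A segment as a simple predicate over device-level attributes (attribute names are invented).
def in_segment(device_profile: dict) -> bool:
    """'Women over fifty who like to play golf' expressed as a filter over inferred attributes."""
    return (device_profile.get("inferred_gender") == "female"
            and device_profile.get("inferred_age", 0) > 50
            and "golf" in device_profile.get("interests", []))

devices = [{"inferred_gender": "female", "inferred_age": 57, "interests": ["golf"]},
           {"inferred_gender": "male", "inferred_age": 60, "interests": ["golf"]}]
audience = [d for d in devices if in_segment(d)]
print(len(audience))  # 1
```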
It is important to note that some types of data are typically not available or useful to marketers.
Marketing data brokers do not typically sell specific details about a purchase. Instead they
summarize purchase data into demographic, interest, and lifestyle data or ranges of financial
data. They also shy away from more sensitive information such as credit information, medical
information or Social Security numbers, most of which are regulated by law. They also typically
focus on personal data about adults, individuals over the age of seventeen. While they may be interested in the family make-up (say, how many kids there are in the household and of what ages), adults usually control the purse strings, and marketing messages are directed to them, not
the children.
With laws such as the Children's Online Privacy Protection Act (COPPA), there has been interest in being able to verify whether a registered user is a child. However, since such data is unavailable on children, such verification services are not currently possible.

22
Cookie syncing and how it works, [Link]
23
DMA’s Cross-Device Identity Solutions RFI Template, [Link]

how do data brokers deliver their data to the marketplace?


There are various ways a data broker delivers its data, or services supported by its data, into the marketplace.

• User Search and View: Some data brokers allow clients to do real-time searches and view the response online or through a subscription service or app. A search can return data on a single individual, or a few individuals may meet the search criteria. Batch searches are also usually available for high-volume inquiries that are not time-sensitive.
• Lists: Requests for lists can be placed with the data broker, which gives a count of how many individuals/households meet the criteria. The request can be refined so the result is within
the usage parameters of the buyer. Once the specifications are set, the data broker pulls a
list of relevant records and sends it to the buyer. Such lists are typically delivered for
one time use or for use over a specific period of time. Once the usage limits have been met,
the list should be destroyed.
• Enhancement: The process known as enhancement takes place when the buyer sends the
data broker the contact information for the consumers on whom they want information,
such as name and address, email or phone number. The buyer also identifies the data
elements they want to license from the data broker. The data broker matches the buyer’s
data to its master database. Where there is a match, the data broker appends the desired data
elements from its database to the buyer's file, or in other words, enhances the buyer's data
with the specified data elements requested, and returns the information to the marketer.
Enhancement data is usually provided under a rental contract with an expiration date,
typically a six or twelve month period, after which the buyer needs to come back to the
data broker and append fresh data.
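The enhancement process just described amounts to a keyed join between the buyer's file and the broker's master database. The sketch below, with hypothetical field names, shows the basic append-and-return step.

```python
# Minimal sketch of "enhancement": append requested elements from a master database
# to the buyer's file wherever the contact information matches (field names hypothetical).
def enhance(buyer_file: list[dict], master_db: dict, requested_elements: list[str]) -> list[dict]:
    enhanced = []
    for record in buyer_file:
        key = record.get("email", "").strip().lower()
        match = master_db.get(key, {})
        appended = {elem: match.get(elem) for elem in requested_elements}
        enhanced.append({**record, **appended})
    return enhanced

master = {"jane@example.com": {"modeled_income_range": "$75k-$100k", "interests": ["travel"]}}
buyers = [{"name": "Jane Doe", "email": "jane@example.com"}]
print(enhance(buyers, master, ["modeled_income_range"]))
```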
In the digital space, data is delivered through digital connections. Such companies as Acxiom/
LiveRamp and Datalogix facilitate connectivity between the advertiser and the site, app, or other
digital device where the ad is displayed.

• Use on Site Only: Large search engines and social media sites often offer the ability for
brands to advertise on their sites to users based on the information they have about the user.
This information may be only the usage history of the site by the user, or it may be
enhanced with other offline marketing data. While the advertiser does not actually take
possession of the data, they do get the benefit of using it to target their messages on that site.
• Onboarding: This is the process of taking PII from offline sources and stripping away the identifying information, rendering it Non-PII before matching it to
a cookie or other digital ID. It can then be used in the digital space and synced with other
data for advertising purposes.
• Cookie and/or Device Syncing: Syncing digital IDs is done in several different ways, but
results in one digital ID (e.g., cookie ID from company A) being connected to another
digital ID (e.g., cookie ID from company B), thus allowing an exchange of data.

who buys data from data brokers?


The vast majority of data brokers sell only to qualified businesses, political organizations or
government agencies. Virtually every consumer facing company in the financial services, health
care, retail, publishing, cataloging, travel and entertainment, telecommunications, and

technology industries, regardless of size, is a user of lists and enhancement data as well as
onboarded data from data brokers. Many of these organizations use people search and risk
mitigation services, as well.
However, a few data brokers, such as research, look-up, and locate brokers, sell directly to both individuals and organizations. The individual may be searching for an old friend or family
member, or the organization may be looking to track down a bad debt.
Ultimately, size does not matter. Almost every marketer, from mom and pop operations to the
largest, most sophisticated financial institutions, buys lists and enhancement data from data
brokers. For startups, data reduces barriers to market entry. The 2015 DDMI study, “The Value
of Data,” found that small businesses benefit significantly from third party marketing data
because it enables them to compete effectively with big players and enter markets with lower
start-up costs. Prospect lists are a cost-effective way to build a customer base over time. Well-known brands, such as Montgomery Ward, built their businesses by buying prospect lists and
sending them catalogs in the mail. For decades, the non profit sector has also been a big user of
lists and enhancement data to acquire new donors and raise funds for charitable causes.
For years, political candidates have used lists for fundraising purposes, but recently, they have
become very sophisticated in their use of digital data from data brokers. Many states allow voter
registration lists to be used by candidates for fundraising and to get out the vote. The Obama
campaign was one of the most sophisticated users of data to motivate people to vote and raise
support. Since then, almost all candidates and political parties have used these techniques.
Federal, state, and local government agencies use identity verification and risk mitigation
services on a regular basis.

how do consumers learn about and interact with data brokers?


How to make data brokers more transparent to consumers is a question and concern that has
been hotly debated for many years. Each category of data broker has different levels of visibility
and challenges for greater transparency.
Consumers learn over time about public records and publicly available information, as they
encounter them in their daily lives. However, few consumers understand the full scope of this
information or the benefits and risks associated with publicly available information. Furthermore,
it is a difficult process for consumers either to correct or remove inaccurate information from
public sources. They must know where the data came from and deal directly with the source. If the
data a broker has is accurate with respect to the source, there is no obligation for the data broker to
correct or delete it. Public records and publicly available information represent foundational data
used by all categories of data brokers; therefore, the impact of inaccurate information can be quite
widespread. Fortunately, public records are known for a pretty high degree of accuracy.
Most research, look up, and locate services are on the Internet, so it is pretty easy for
consumers to search and learn about these kinds of services. This is one of the few categories
of data brokers that serve consumers as customers. However, consumers are unlikely to fully
understand the scope of commercial use by this category of data broker. Some, but not all, of
these data brokers offer consumers access, correction, and opt out choices.
Consumer reporting agencies offer access and correction rights as prescribed by the FCRA.
This is an important provision of the law because inaccurate information can have big negative
effects on an individual. Since the United States has determined that there is societal good in
having all of one’s financial history part of a consumer report, individuals generally cannot opt
out or have their data deleted, if it is accurate and up to date. Consumers can only correct or

remove information in their consumer report if it is inaccurate or obsolete, or if the CRA is unable to verify the disputed information.
The consumer usually does not know there is a risk mitigation data broker involved in a
transaction, unless their identity is not verified. In such instances, the company using the
verification service will employ an alternate means to verify an individual’s identity. The lack
of information may never be reported back to the data broker providing the service. Fortunately,
conscientious users of risk mitigation services will refer the consumer to the data broker for access
and correction, if the data is wrong or out of date.
Responsible marketing data brokers offer a choice to consumers to opt out of having information
about them shared with others for marketing purposes. This is required by marketing self regulatory
codes of conduct. However, not all marketing data brokers are members of the trade associations
endorsing these codes and some do not follow the guidance as carefully as they should.
In 2013, Acxiom launched the website [Link], where individuals can provide
identifying information, and once authenticated, view, correct or delete all the data Acxiom has
about them that it sells for marketing purposes, both offline and in the digital space. Even after
extensive news coverage of the site when it was first launched in 2013, and sporadic media reference
ever since, fewer than a million consumers have actually visited the site, with about one third
actually logging in. Of those, just over 60 percent make some changes and about 5 percent opt out.
Fewer than 20 percent return to the site to see what data may have changed or been added over time.
No other marketing data broker has followed Acxiom’s lead with full disclosure about the data
they have on consumers. While large companies offer the individual the ability to opt out from
data about them being licensed for future marketing purposes, as called for by the codes of
conduct put out by the Direct Marketing Association,24 the Digital Advertising Alliance,25 and
the Network Advertisers Initiative,26 they don’t offer the ability to view the actual data.
In addition to offering an individual the ability to opt out from one broker, reputable
marketing data brokers also use the Direct Marketing Association's industry-wide Mail Preference and E-mail Preference suppression files,27 the Direct Marketing Association's Commitment to Consumer Choice opt-out service,28 as well as state and federal do-not-call lists in the
development of their marketing and advertising products. These are all industry wide registries
where the individual signs up requesting the opt out.
Marketers are under one additional obligation when they use data from a data broker. If a
consumer asks where they got the data, they are required under the Direct Marketing Association's code of conduct to tell the individual what data broker provided the data.

how are data brokers regulated?


Some say the data broker community is virtually unregulated, but this claim does not recognize the
different categories of data brokers that are highly regulated, such as CRAs. While no one overarching federal law regulates all categories of data brokers, there are a host of sector- or data-specific

24
Direct Marketing Association (DMA) Guidelines for Ethical Business Practice, [Link]
DMA Guidelines January [Link].
25
Digital Advertising Alliance (DAA) Self-Regulatory Program, [Link]
26
Network Advertiser’s Initiative (NAI) Code of Conduct, [Link]
27
DMA Mail Preference and Email Preference Suppression, [Link]
28
DMA Commitment to Consumer Choice, [Link] other DMA require
[Link].

federal and state laws as well as a number of codes of conduct with which various data brokers must
comply.
Generally, laws and self regulations focus on either the data itself or its intended use. Many of
these laws and codes apply to both data brokers and their clients. Some of the more important
laws and codes of conduct that impact data brokers are briefly described below.
A number of federal and state laws focus on limiting the use of certain types of data.

• Gramm-Leach-Bliley Act (GLBA) and Similar State Laws:29 Among other things, the Gramm-Leach-Bliley Act requires financial institutions (companies that offer consumers financial products or services such as loans, financial or investment advice, or insurance) to explain their information-sharing practices to their customers and to safeguard sensitive data. Customers are given a notice of the institution's practices and their rights at the time the relationship is established and whenever the policy changes. The GLBA Safeguards Rule requires financial institutions to provide reasonable security for personal financial information.
Personal financial information cannot be shared with third parties for marketing
purposes (e.g., marketing data brokers) without notice and the offer of an opt out choice
to customers. The law allows personal financial information to be shared with credit
bureaus and identity verification and anti fraud services.
GLBA allows states to pass more restrictive laws, and since the passage of the California Financial Information Privacy Act,30 which calls for opt-in consent for sharing financial information with third parties for marketing purposes, this has become the de facto choice for
the whole country.
• Health Insurance Portability and Accountability Act (HIPAA):31 The HIPAA Privacy Rule establishes national standards to protect individuals' medical records and other protected
health information. The law applies to health plans, health care clearinghouses, and those
health care providers that conduct certain health care transactions electronically. The law
does not apply to the growing number of health device manufacturers unless the device is
prescribed by a physician. The rule requires appropriate safeguards be in place and sets
limits and conditions on the uses and disclosures that may be made of such information
without patient authorization (e.g., to third party marketing data brokers). The rule also
gives patients rights over their health information, including rights to examine and obtain a
copy of their health records, and to request corrections.
• Children's Online Privacy Protection Act (COPPA):32 COPPA imposes requirements to obtain verifiable parental consent, with limited exceptions, when collecting personal information from a child under thirteen years of age. The requirements apply to websites, online
services, and apps directed to children under thirteen and when they have actual knowledge
that they are dealing with a child under thirteen. Parents must consent to the collection of
personal information and must separately consent to that data being shared with a third party.
Parents can also access and request deletion of their child's information. These websites, online
services, and apps must also maintain the security of the information.

29
FTC advice on GLBA, [Link]
30
California Financial Information Privacy Act, Financial Code Section 4050–4060, [Link]
displaycode?section fin&group 04001-05000&file 4050-4060.
31
Health Insurance Portability and Accountability Act (HIPAA), [Link]
32
Children’s Online Privacy Protection Rule (COPPA), [Link]
reform-proceedings/childrens-online-privacy-protection-rule.

• Social Security Death Master File (DMF):33 As a result of a court case under the Freedom of Information Act, the Social Security Administration (SSA) is required to release its death
information to the public. SSA’s Death Master File contains all information in the official
SSA database, as well as updates to the file of other persons reported to SSA as being
deceased. SSA authorizes the use of this database in identity verification solutions. A recent
law limits access to Death Master File records that are less than three years old.
• Driver's Privacy Protection Acts (DPPAs):34 The federal DPPA and numerous state DPPAs protect the confidentiality and privacy of records from state departments of motor vehicles.
These laws prohibit disclosure of personal information obtained by the department in
connection with a motor vehicle record, except as expressly permitted in the law. The law
allows disclosures to verify the accuracy of information provided by the consumer, thus
allowing such records to be used by CRAs and other risk mitigation services.
• Voter Registration Files (Voter Files):35 The use of state voter registration files is governed by laws that vary widely from state to state. Some states restrict use to political purposes or to political candidates and parties and prohibit use for marketing purposes, while voter records in other states are not restricted at all.
• Real Property Files (Property Files):36 The use of property files is governed by laws that vary widely from state to state. For example, in Arkansas anyone may look up a property record online
and find current property value, previous sale price, and characteristics of the property,
such as number of bedrooms and bathrooms, square feet, year built, land value, and nearby
schools. Property files are widely used by marketing data brokers, especially in the real
estate sector, and are also valuable for identity verification and anti fraud services. In other
states, property records cannot be used for general marketing purposes.
A number of federal and state laws focus on limiting data for specific uses.

• Fair Credit Reporting Act (FCRA):37 This federal law requires third-party entities that provide information related primarily to consumer credit, employment, insurance, and other eligibility decisions, known as consumer reporting agencies (CRAs), to adopt reasonable procedures with regard to the confidentiality, accuracy, relevancy, and use of such information.
This includes specific obligations related to which companies can access the data in a consumer report and for what purposes. Any access for purposes of credit, employment,
insurance, and other eligibility decisions is covered by the law, regardless of the data. The
law also includes specific obligations to respond to consumer inquiries about the data in a
consumer report about them and rights to correct inaccurate data.
• FTC Do Not Call Registry (DNC):38 The National Do Not Call Registry gives consumers a choice about whether to receive telemarketing calls on a landline or mobile phone. An individual can register their landline or mobile phone number for free.

33
Social Security Administration Death Master File, [Link]
34
Federal Driver’s Privacy Protection Act (DPPA), [Link]
35
Florida Voter Registration website, [Link]
place/.
36
Arkansas Real Property website, [Link] source bing&utm medium
cpc&utm campaign [Link]%20-%20Tax&utm term property%20tax%20much&utm content how%
20much%20is%20property%20tax.
37
Fair Credit Reporting Act (FCRA), [Link]
[Link].
38
FTC Do-Not-Call Registry, [Link]

Telemarketers should not call a number on the registry, unless there is a preexisting
business relationship with the consumer. If they do, an individual can file a complaint at
the FTC website. This law applies to information provided by data brokers for
telemarketing.
• Federal Trade Commission Act Section 5 (FTC Unfair or Deceptive Powers):39 Section 5 of the Federal Trade Commission Act (FTC Act) prohibits “unfair or deceptive acts or practices in or affecting commerce.” The FTC can deem certain practices of a data broker
to be unfair and/or deceptive. This is a broad authority and has been used to obtain consent
agreements with a number of data brokers, such as ChoicePoint and Spokeo.
• State Unfair and Deceptive Practices Acts (UDAP Laws):40 Every state has one or more consumer protection laws that are generally referred to as UDAP laws. For example, in
Texas the attorney general files civil lawsuits under the Deceptive Trade Practices Act and
other consumer protection statutes. The decision to investigate or file a lawsuit is based on a
number of factors. Consumer complaints filed with the attorney general may form the basis
for an investigation into a company’s business practices. In some cases, significant numbers
of complaints about a business may give rise to legal action, not on behalf of the individual
complainants, but to enforce state law.
• California Online Privacy Protection Act (CalOPPA): This law requires operators of
commercial websites and mobile apps to conspicuously post a privacy policy if they collect personally identifiable information from Californians. The policy must describe the information collection, use, and sharing practices of the company, how the site responds to Do
Not Track signals, and whether third parties may collect personal information about
consumers who use the site. The website should also describe the choices offered to the
consumer regarding sharing of this information.
Note: This short list only represents a sample of the best-known and oft-cited federal and state
laws applicable to certain data broker practices.
Where no laws govern either the data or certain uses of data, especially in the marketing
space, but where regulators and the public have expressed concerns about certain commercial
practices, industry has put forth self regulatory codes of conduct that have been widely adopted.
Some of the important codes of conduct that affect marketing data brokers are discussed here.
Direct Marketing Association (DMA):41 The DMA’s Guidelines for Ethical Business
Practice are intended to provide individuals and organizations involved in direct marketing
across all media with generally accepted principles of conduct. The guidelines are used by
DMA’s Ethics Operating Committee, a peer review committee, as the standard by which
DMA investigates consumer complaints against members and nonmembers. The guidelines include requirements to provide information on an organization's policies about the transfer of personally identifiable information for marketing purposes, to respond to inquiries and complaints in a constructive, timely way, to maintain appropriate security policies and practices to safeguard information, to honor requests not to have personally identifiable information transferred for marketing purposes, and to honor requests not to receive future solicitations from the organization.

39
Federal Trade Commission Act Section 5, [Link]
40
State of Texas Deceptive Trade Practices Act, [Link]
41
Direct Marketing Association (DMA) Guidelines for Ethical Business Practice, [Link]
DMA Guidelines January [Link].

Digital Advertising Alliance (DAA):42 The Digital Advertising Alliance establishes and
enforces responsible privacy practices for certain types of digital advertising, providing
consumers with enhanced transparency and control. DAA principles apply to data gathered
from a particular device in either the desktop or mobile environments that involves multi-site data collection and use. The DAA is an independent non-profit organization led by the
leading advertising and marketing trade associations.
Network Advertiser's Initiative (NAI):43 NAI is a non-profit organization that is the leading self-regulatory association dedicated to responsible data collection and its use for digital advertising. Since 2000, it has worked with the leaders in online advertising to craft policies that help ensure responsible data collection and use practices. The result is the development of high standards that are practical and scalable to benefit everyone. The NAI Code of Conduct is a set of self-regulatory principles that require NAI member companies to provide notice and choice with respect to interest-based advertising and ad delivery and reporting activities.
In recent years, there have been calls for data brokers to be more heavily regulated. These initiatives usually focus on one type of data broker, often marketing data brokers, due to the sheer quantity of information they collect. Regulators focused on shortcomings in the FCRA would like the FCRA to cover practices that sit on the fringe of the law. They also have concerns about medical devices that fall outside HIPAA. However, as of this writing, none of these initiatives has progressed.
The marketing self-regulatory bodies listed here continue to expand their codes as new media and new types of data enter the marketplace. Self-regulatory initiatives in sectors such as smart cars are also starting to emerge, and we expect these types of codes to continue to evolve as consumers embrace the IoT.

are there data brokers outside the united states?


For a number of reasons, there are far more data brokers within than outside the United States.
Other countries have fewer public records and publicly available information. Also, many
countries, such as those in Europe, have broad data protection laws, which limit data sharing. Both attitudes and laws governing credit are quite different outside the United States. Laws requiring identity verification and background checks typically do not exist in other countries. Consequently, other countries have fewer first- and third-party data brokers.
In the credit bureau space, a research group known as PERC44 reports, “approximately
3.5 billion of the world’s adults do not use formal financial services. Even when access to a
formal lending institution is possible (and many lack access, especially in rural areas), most of these persons are ‘Credit Invisibles.’ Credit Invisibles have no credit data, and mainstream lenders use automated underwriting systems requiring this data, such as a credit report. When
none is available, lenders automatically reject an applicant.”
The DDMI “Value of Data” study reports, “The DDME is a uniquely American creation.
Just as the U.S. created digital market-making media by commercializing the Internet browser in the 1990s, so it created postal market-making media when Montgomery Ward developed the

42
Digital Advertising Alliance (DAA) Self-Regulatory Program, [Link]
43
Network Advertiser’s Initiative (NAI) Code of Conduct, [Link]
44
PERC drives financial inclusion through innovative information solutions and original research that serve unmet needs in the market, [Link]

mail-order catalog in 1872. Today, data-driven marketing is a major export industry. The study's employment analysis confirms that the DDME is a Net (export) contributor to US economic well-being. DDME firms derive a considerable portion of their revenue abroad (sometimes
upwards of 15%) while employing nearly all their workers in the U.S. The study confirms that the
U.S. leads the world in data science applied to the marketplace. Ideas developed in the U.S. by
American statisticians and econometricians, running on U.S. designed hardware, and coded in
algorithms developed and tested in the research offices of U.S. firms, are used to generate
revenues throughout the world.”
While this study focused on marketing, and no comparable studies exist for other types of data brokers, relative financial fraud rates are dropping, which suggests that value is being created in that sector as well. Thus we can extrapolate that most types of data brokers generate at least some value.
There are a few self regulatory efforts by direct marketing associations in Europe and other
developed countries in Asia, and the DAA has expanded into Europe and Canada.
The US economy has enjoyed the innovation and positive economic benefits that come from
the robust use of data provided by data brokers. As technology rapidly moves forward, continued
advancements in self regulation are needed to keep pace. Such guidance can respond to
changes in the marketplace faster than can legislation and should be aggressively supported by
industry.

what risks do data brokers pose?


While the benefits of robust uses of information are significant, there are a number of very real risks involved with data brokering. Some risks are common across the data broker community, while others are unique to certain categories of data brokers.

Risks Common to All Data Brokers:

• Security: Probably the biggest risk for data brokers is poor security, where fraudsters can
compile enough of a profile on an individual to steal their identity or successfully pose as a
trusted party. Reasonable and appropriate security is a requirement for every data broker.
• Potential Discrimination: As analytics gets more sophisticated, the second common risk is
that we can no longer simply rely on not using certain defined data points, such as age, race
or marital status, to avoid discriminatory consequences. Analytics can predict these characteristics with a fairly high degree of accuracy. This is actually a problem with big data in general, and not limited to data brokers, but data brokers contribute to the issue by bringing more data into the analytics process. As mentioned earlier, the World Privacy Forum report on the growth of consumer scores raises issues of discrimination. All companies taking advantage of big data, including data brokers, must look for ways to discover when their practices have adverse effects on certain at-risk classes. The 2016 FTC study, “Big Data: A Tool for Inclusion or Exclusion?,”45 provides an in-depth analysis of this issue.
• Non-Compliance: Large players in the community usually do a better job of following the
rules than smaller players do. They have more at stake and usually better understand what is

45
FTC report, Big Data: A Tool for Inclusion or Exclusion?, [Link]
data-tool-inclusion-or-exclusion-understanding-issues/[Link].

expected of them. As a greater number of smaller players enter the marketplace, the risk of
more data brokers acting out of compliance with laws and self regulation may grow.
A summary of risks to specific categories of data brokers follows.

Providers of Public Information, Including Government Agencies, Professional Organizations, and Research, Look-Up, and Locate Services:


These organizations rely primarily on public records and publicly available information. The overall benefits of this information being available to the public, including its role in helping locate the perpetrators of identity theft and fraud, outweigh the risks.

• Opt-Out for High-Risk Individuals: While some avenues exist, there must be more ways for individuals who are at high risk (e.g., public officials, battered wives, and individuals in
witness protection) to block public access to their personal information.

Consumer Reporting Agencies (CRAs):


Most of the risks associated with CRAs are addressed in the federal and state laws governing these
practices.

• Accuracy: The biggest risk is the potentially devastating effects inaccurate information can
have on individuals, financially and otherwise. The community has been under criticism for the accuracy of its records. The FTC's 2015 study, “Report to Congress Under Section 319 of the Fair and Accurate Credit Transactions Act of 2003,”46 a follow-up to its 2012 study, highlights the findings.

Risk Mitigation Services:

• Accuracy: Obviously, the need for a high degree of information accuracy is also critical in
this category of data broker. However, risks are low. Due to the inherent latency of
information used for these purposes, the services that such data brokers provide is under
stood not to be 100 percent effective, so alternate methods of verification are always
provided by the user of the service.

Marketing Data Brokers:


While accurate information is good, the consequences of inaccuracies are not nearly as
important for this, the largest category of data broker.

• Transparency: As reported in the FTC report on data brokers, concerns relative to marketing data brokers relate primarily to transparency. Consumers do not generally read privacy policies, so they do not know or understand that first-party data brokers, survey companies, and ad-supported websites and apps are selling their information to marketers and advertisers, and that third-party data brokers are aggregating it with public records. While

46
FTC, Report to Congress Under Section 319 of the Fair and Accurate Credit Transactions Act of 2003 (2015), https://
[Link]/system/files/documents/reports/section-319-fair-accurate-credit-transactions-act-2003-sixth-interim-final-
report-federal-trade/[Link].

self regulation promotes more robust transparency, marketing data brokers need to consider
even more creative ways to engage with consumers about both the benefits and risks of data
sharing for advertising and marketing purposes, so consumers can make informed decisions
about what they are comfortable allowing and where the line is for acceptable use.
• Education: While many self regulatory codes call for the community to better educate
consumers about marketing data brokers, privacy policies are not a good way to explain the
ecosystem and how data actually is shared and impacts consumers. The Better Business
Bureau Institute for Marketplace Trust recently launched Digital IQ47 to help consumers
easily access the information they want on the Internet, express their preferences, exercise
informed choices, and shop smart. It provides a digital quiz and short, easily digestible
education modules to help consumers be more savvy shoppers.

what does the future hold for data brokers?


So, are data brokers doing great things for our economy or are they operating a personal data
exchange behind the backs of consumers? The answer to both questions, to some degree, is yes.
Responsible use of data does provide great benefits to our economy, to innovation, and to
consumer convenience. However, most individuals do not understand the data broker community, the risks it poses, and the benefits they derive from it.
With big data and the Internet of Things accelerating data collection at an increasingly rapid
pace, more and more companies are going to become first-, second-, and third-party data
brokers. This means it is getting harder and harder for individuals even to know about, much
less control, how data about them is collected, created, used, and shared.
This raises the question: What, if anything, should be done to make the practices of the data broker community more transparent and less risky while preserving the benefits?
Data brokers must take an ethical, not just a compliance-oriented, approach to their practices
and look for innovative ways to create a more transparent environment for regulators and provide
more informed engagements that explain when and how consumers can participate in the use of
data about them.
Fortunately, experience tells us that, if the practices of data brokers actually result in real
harms, either tangible or reputational, or other risks to individuals, over time this will damage
consumer confidence and is likely to lead to restrictive legislation and ultimately limit access to
data. This will have a negative impact on the data broker community itself, and in turn, will have
negative economic implications on society.
While new US federal legislation regulating data brokers is unlikely in the next few years,
this moment represents a great opportunity for the data broker community to expand its self-regulatory practices. As appropriately protecting consumers against harm and other
risks becomes more and more contextual, the data broker community, in all its various forms,
working with regulators and advocates, has the best chance of writing workable guidelines that
benefit everyone. Time will tell whether it seizes this window of opportunity or not.

47
BBB Digital IQ, [Link]
3

In Defense of Big Data Analytics

Mark MacCarthy*

the rise of big data analytics


Changes in Computer Technology
Today analytics firms, data scientists and technology companies have valuable new tools at their
disposal, derived from three interlocking developments in computer technology. Data sets have
dramatically increased in volume, variety and velocity. Processing capacity and storage capacity
have increased, accommodating and reinforcing these changes. And new analytic techniques
that were ineffective at lower scale and slower processing speeds have had spectacular successes.
Big data consists of data sets with increased volume, variety of formats and velocity.1 Data sets,
especially those derived from Internet activity containing video and images, are massive. The
Internet of Things adds data from sensors embedded in everyday objects connected online.
In addition, big data sets come in a variety of unstructured and semi structured formats such as
text, images, audio, video streams, and logs of web activity. Finally, big data sets change with
astonishing rapidity in real time. A major driver of the increased availability of data is the growth of computer communications networks, which have expanded at astonishing rates since the
Internet went commercial in the early 1990s.
According to IBM, increases in the amount of available data are staggering: “Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone.”2 According to Cisco, the Internet of Things will generate more than 400 zettabytes of data per year by 2018.3
For more than fifty years, processing speeds and computer memory have doubled every eighteen to twenty-four months. The 1985 Nintendo Entertainment System had half the
processing power of the computer that brought Apollo to the moon in 1969. The Apple iPhone

* The views expressed in this chapter are those of the author and not necessarily those of the Software & Information
Industry Association (SIIA) or any of its member companies.
1
The standard definition of big data includes volume (i.e., the size of the dataset); variety (i.e., data from multiple
repositories, domains, or types); velocity (i.e., rate of flow); and variability (i.e., the change in other characteristics).
National Institute of Standards and Technology (NIST) Big Data Public Working Group Definitions and Taxonomies
Subgroup, NIST Big Data Interoperability Framework, Volume 1: Definitions, NIST Special Publication (SP) 1500–1,
September 2015, p. 4, available at [Link]
2
“2.5 Quintillion Bytes of Data Created Every Day,” IBM, April 24, 2013, available at [Link]
insights-on-business/consumer-products/2-5-quintillion-bytes-of-data-created-every-day-how-does-cpg-retail-manage-it/
3
“The Zettabyte Era: Trends and Analysis,” Cisco, available at [Link]
provider/visual-networking-index-vni/[Link].


5 has 2.7 times the processing power of the 1985 Cray-2 supercomputer. Today’s Samsung Galaxy S6 phone has five times the power of Sony’s 2000 PS2. It would take 18,400 of today’s PS4s to match the processing power of one of today’s supercomputers, the Tianhe-2.4 This rate of
change has driven spectacular improvements in value for consumers, but most importantly it
has allowed analysis to move in directions that had previously been thought to be unproductive.
New analytic techniques can discover connections and patterns in data that were often invisible with smaller data sets and older techniques. Earlier researchers would approach a defined data set with a well-formulated hypothesis and proceed to test it using standard statistical techniques such as multivariate regression analysis. Researchers brought background knowledge, theoretical understanding and intuitions into the process of hypothesis creation and hoped to find a pattern in the data that would verify this hypothesis. But the data themselves were silent and would tell them nothing. In contrast, new analytic techniques based on machine
learning discover connections in the data that the researcher had not even dreamed of. The data
speak for themselves, leading to completely novel and unexpected connections between factors
that had previously been thought of as unrelated.

Machine Learning and Artificial Intelligence


Artificial intelligence and machine learning are examples of big data analytics. Machine
learning is a programming technique that teaches machines to learn by examples and precedents. Artificial intelligence is a generic name for a variety of computational techniques that
allow machines to exhibit cognitive capacities.5 Its current success in pattern recognition tasks
such as speech or object recognition is a natural outgrowth of the developments in computer
technology that we have just described.
The increase in the availability of data and computing power has enabled a previous version
of AI research to move forward dramatically. The initial approaches to AI fit the model of
structured programming prevalent in the 1950s. Computers could only do what they had been programmed to do. So the field focused on finding simplifying rules obtained from subject-matter experts such as doctors, lawyers or chess masters. However, the resulting “expert
systems” were not very effective.
A different approach to programming was initially known as neural networks and came to be
called machine learning. It sought to “create a program that extracts signals from noise in large
bodies of data so those signals can serve as abstractions for understanding the domain or for
classifying additional data.”6 This alternative approach required programmers simply to present

4
“Processing Power Compared,” Experts Exchange, available at [Link]
compared/.
5
Related definitions focus on the ability of machines to function intelligently in their environment, where “intelli-
gently” refers to elements of appropriate behavior and foresight. See Report of the 2015 Study Panel, Artificial
Intelligence and Life in 2030: One Hundred Year Study on Artificial Intelligence, Stanford University, September
2016, p. 7, available at [Link] 100 report [Link] (Study Group Report).
Artificial intelligence is not necessarily aimed at mimicking or reproducing human intelligence. One of the founders
of AI, John McCarthy, said that the idea behind AI was to “get away from studying human behavior and to consider the
computer as a tool for solving certain classes of problems.” AI researchers weren’t “considering human behavior except
as a clue to possible effective ways of doing a task . . . AI was created as a branch of computer science and as a branch of
psychology.” John McCarthy, “Book Review of B. P. Bloomfield, The Question of Artificial Intelligence: Philosophical
and Sociological Perspectives,” Annals of the History of Computing, vol. 10, no. 3 (1998), available at [Link]
.[Link]/jmc/reviews/bloomfi[Link].
6
Jerry Kaplan, Humans Need Not Apply: A Guide to Wealth and Work in the Age of Artificial Intelligence, Yale
University Press, 2015, p. 212 (Kaplan).

sufficient examples of the task they wanted the computer to solve. However, this alternative
approach faced its own theoretical and practical limitations, and was largely abandoned in the
1980s. In the 1990s and 2000s, however, it reappeared and rapidly made progress in pattern
recognition, displacing its rival approach as the dominant approach in the field.7
The structured approach to programming was suited to the size and scale of the computer capacity of its time: limited processing speeds, memory and data. The machine learning approach could not demonstrate results with this computer architecture. Computer memory and processing speeds were so limited that machine learning programs could recognize only very simple patterns. Data sets did not contain a sufficient number of examples to generate accurate
pattern recognition.
The new computer infrastructure, however, allowed more flexible programming techniques.
Faster computers with larger memory could begin to recognize complex patterns if they had
sufficient data to be trained on. The Internet provided just such a treasure trove of training data.
In this new environment, machine learning rapidly developed.
Machine learning programs get better as they are exposed to more data, which the
spectacular growth of the Internet has been able to provide in ever increasing amounts.
The programs adjust themselves as they are exposed to new data, evolving not only from the
original design of the program but also from the weights developed from their exposure to
earlier training data.
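A minimal sketch can make the idea of weights shaped by training data concrete. The toy Python learner below (with invented feature values and labels, not drawn from any real system) nudges a small set of weights every time it misclassifies an example, so the finished model reflects what the data taught it rather than rules written by a programmer.

# A minimal, hypothetical sketch: a perceptron-style learner whose weights are
# shaped entirely by the labeled examples it sees, not by hand-coded rules.

def train(examples, epochs=20, lr=0.1):
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:          # label is 0 or 1
            score = bias + sum(w * x for w, x in zip(weights, features))
            prediction = 1 if score > 0 else 0
            error = label - prediction            # -1, 0, or +1
            # Each misclassified example adjusts the weights a little.
            bias += lr * error
            weights = [w + lr * error * x for w, x in zip(weights, features)]
    return weights, bias

# Invented training data: two numeric features per example.
training_data = [([1.0, 0.2], 1), ([0.9, 0.1], 1), ([0.2, 0.8], 0), ([0.1, 0.9], 0)]
weights, bias = train(training_data)
print(weights, bias)   # the learned weights encode what the data "taught"

Scaled up to millions of examples and parameters, this same basic mechanism of data-driven weight adjustment is what allows modern systems to improve as more data becomes available.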
Because these new machine learning techniques are not pre-programmed with humanly created rules, their operation can sometimes resist human comprehension. Often, it is “impossible for the creators of machine learning programs to peer into their intricate, evolving structure to understand or explain what they know or how they solve a problem.”8 In addition, they rely on
correlations found in data, rather than on empirically or theoretically comprehensible causal
connections: “In a big data world . . . we won’t have to be fixated on causality; instead we can
discover patterns and correlations in the data that offer us novel and invaluable insights.”9
Despite the difficulty in discerning the logical or causal structure uncovered by these machine learning algorithms, they are increasingly moving out of computer science departments and providing substantial benefits in real-world applications.

Applications of Big Data Analytics


A common view is that big data analytics has a natural home in science and research institutes,
information technology companies, analytics firms and Internet companies that use it for online
behavioral advertising or recommendation engines for news, music and books. According to this
view, big data analytics lives in computers or smart phones or communications networks such as
the Internet, and we should look for its benefits and challenges there.
But this is to misunderstand the reality and potential of this disruptive technology. Big data
analytics is not confined to separate devices called computers or smart phones, or used only in the
information and communications technology industries. We are just at the beginning of the
application of these techniques in all domains of economic, political and social life. It will
transform everyday life for everyone, creating enormous opportunities and challenges for all of us.

7
Kaplan, p. 25.
8
Kaplan, p. 30.
9
Viktor Mayer-Schonberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work,
and Think, Houghton Mifflin Harcourt, 2013, p. 14 (Mayer-Schonberger and Cukier).

Recent studies document the domains in which it is being used and where its impact is likely
to be greatest in the coming years; these include a 2014 big data report from the Obama administration10 and a 2016 report by the AI Study Group, a panel of industry and academic experts.11

Transportation
The AI Study Group report predicts that in North American cities by 2030 autonomous transportation will be commonplace, including cars, trucks, flying vehicles and personal robots. Cars will be better drivers than people, who will own fewer cars and live farther from
work.12 McKinsey reports that all major car manufacturers as well as technology companies such
as Google are gearing up to provide autonomous vehicles, and that sometime between 2030 and
2050 autonomous vehicles will become the primary means of transportation, reducing accidents
by 90 percent, thereby saving billions of dollars and billions of square meters in parking space.13
This development is especially striking because it illustrates the surprising capacity of
AI systems to overcome problems of context and perception that were thought just a few years
ago to be well beyond the capacity of computer programs. Computers could not handle driving,
it was thought, because they could not process such a large amount of unstructured visual data
and because the rules to mimic how humans assessed and reacted to traffic would be impossibly
complex. A computer could never understand the context of driving well enough to know that
“a ball rolling into the street is often followed by a young child chasing the ball so you step on
the brakes.”14 But this is exactly what autonomous vehicles are able to do successfully by following a model of computer programming that is not limited to predefined rules.

Speech Recognition
Several years ago, people noticed that an early version of Siri could not answer the question,
“Can a dog jump over a house?” They explained this failure by saying that “engineers don’t yet
know how to put enough common sense into software.”15 But that is not how advanced versions
of speech recognition work.
Older attempts to computerize speech recognition assumed that computers would have to
understand the context of an utterance and absorb large amounts of cultural material in order to
recognize speech. Newer versions look instead for how often words appear in certain combinations, taking advantage of the vast increase in available examples of language use. The
distinctions between “to,” “two” and “too” can be identified statistically, rather than through
mimicking a human process of understanding contexts. Even rare combinations can be detected
if the amount of training data is large enough.16
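The statistical approach can be illustrated with a toy sketch in Python. The corpus below is an invented stand-in for the web-scale text on which a real recognizer would be trained; the code simply counts how often each candidate word precedes the next word and picks the most frequent pairing.

# Toy sketch of frequency-based disambiguation: choose whichever of
# "to", "two", "too" most often precedes the following word in a
# (tiny, invented) training corpus.

from collections import Counter

corpus = ("i walked to the office . the meeting took two hours . "
          "it ran too long . she drove to the office too .").split()

# Count bigrams (word, next_word) seen in the corpus.
bigrams = Counter(zip(corpus, corpus[1:]))

def pick(candidates, next_word):
    # Pick the candidate most frequently observed right before next_word.
    return max(candidates, key=lambda w: bigrams[(w, next_word)])

print(pick(["to", "two", "too"], "the"))     # "to"  (as in "to the office")
print(pick(["to", "two", "too"], "hours"))   # "two" (as in "two hours")

Real systems use vastly larger corpora and more sophisticated statistical models, but the principle is the same: frequency, not human-style understanding of context, does the disambiguating.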
10
United States, Executive Office of the President, Big Data: Seizing Opportunities, Preserving Values, The White
House, May 2014, available at [Link] data privacy report may 1
[Link] (White House Big Data Report).
11
Study Group Report.
12
Study Group Report, p. 4.
13
Michele Bertoncello and Dominik Wee, “Ten Ways Autonomous Driving Could Redefine the Automotive World,”
McKinsey, June 2016, available at [Link]
ways-autonomous-driving-could-redefine-the-automotive-world.
14
Frank Levy and Richard J. Murnane, “Dancing with Robots: Humans Skills for Computerized Work,” Third Way, 2013,
p. 9, available at [Link] (Levy and Murnane).
15
Levy and Murnane, p. 10.
16
“A good speech recognition system that ‘hears’ the sentence ‘my last visit to the office took two hours too long’ can
correctly spell the ‘to,’ ‘two,’ and ‘too.’ It can do this not because it understands the context of the usage of these words

Complete language fluency might still elude computerization.17 But speech recognition is increasingly accurate, with Baidu’s Deep Speech 2 at 96 percent accuracy and Apple’s Siri at 95 percent, and it is on the way to being used in search applications.18 Speech recognition is
ubiquitous today in our smart phones and increasingly in consumer devices such as Amazon’s
Echo that allow people to talk to their houses.19

Health Care
The AI Study Group report finds that the last several years have seen an immense leap forward in
collecting useful data from personal monitoring devices and mobile apps and from electronic
health records in clinical settings. As a result, AI based applications could improve health
outcomes and the quality of life for millions of people in the coming years.20
Research hospitals are already using IBM’s Watson as an oncology diagnosis and treatment
advisor and to select patients for clinical trials. The system synthesizes vast amounts of data from
textbooks, guidelines, journal articles, and clinical trials to help physicians make diagnoses and
identify treatment options for cancer patients. Medical decision making will become “ever more
scientific” while remaining a creative activity for the doctors and health care professionals
involved.21 The promise of intelligent systems that can aid diagnosis and treatment of disease
is better quality care and lower cost for all patients, but especially for those who currently face
health care barriers.22
One particularly striking example involves the use of data analytics to save premature babies at
risk. Medical researchers used pattern recognition to analyze data generated from premature babies, such as heart rate, respiration rate, temperature, blood pressure and blood oxygen level, with startling results. The simultaneous stabilization of vital signs as much as twenty-four hours in advance was a warning of an infection to come, thereby allowing medical intervention well before a crisis had developed. AI had discovered a useful fact about the onset of fevers and infections in premature babies, one that can inform standard practice for early intervention.23
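The underlying idea can be sketched in a few lines of Python: watch for a window in which several vital signs all become unusually stable at the same time. The readings and threshold below are invented; a real clinical system is trained and validated on far richer data than this.

# Illustrative sketch only: flag a window in which several vital signs all
# show unusually low variability at once. Thresholds and readings are invented.

from statistics import pstdev

def stability_alert(windows, threshold=0.5):
    # windows: dict mapping each vital sign to its most recent readings
    stable = [name for name, values in windows.items() if pstdev(values) < threshold]
    return len(stable) == len(windows)   # alert only if ALL signs flatten together

recent = {
    "heart_rate":  [142, 141, 142, 141, 142],
    "respiration": [41, 41, 40, 41, 41],
    "temperature": [36.8, 36.8, 36.8, 36.8, 36.8],
}
print(stability_alert(recent))   # True: simultaneous stabilization, worth a closer look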

Consumer Credit
Credit scoring models have been used for decades to increase the accuracy and efficiency of
credit granting. They help as many people as possible to receive offers of credit on affordable

as human beings do, but because it can determine, statistically, that ‘to’ is much more likely immediately to precede
‘the office’ than ‘two’ or ‘too’. And this probability is established, effectively, by very fast searching and sorting across a
huge database of documents.” Richard Susskind and Daniel Susskind, The Future of the Professions: How Technology
Will Transform the Work of Human Experts, Oxford University Press, 2015, pp. 186–187 (Susskind and Susskind).
Similarly, good speech recognition programs use frequency not context to distinguish between “abominable” and
“a bomb in a bull” (p. 275).
17
Will Knight, “AI’s Language Problem,” MIT Technology Review, August 9, 2016, available at [Link]
[Link]/s/602094/ais-language-problem/.
18
Kevin J. Ryan, “Who’s Smartest: Alexa, Siri, and or Google Now?” Inc., June 3, 2016, available at [Link]
kevin-j-ryan/[Link].
19
Ry Crist and David Carnoy, “Amazon Echo Review: The Smart Speaker That Can Control Your Whole House,”
C|Net, February 15, 2016, available at [Link]
20
Study Panel Report, p. 4.
21
John Kelly III and Steve Hamm, Smart Machines: IBM’s Watson and the Era of Cognitive Computing, Columbia
Business School Publishing, Columbia University Press, 2013, p. 138.
22
Laura Lorenzetti, “Here’s How IBM Watson Health Is Transforming the Health Care Industry,” Fortune, April 5,
2016 available at [Link]
23
Mayer-Schonberger and Cukier, p. 60.

terms, and they allow lenders to efficiently manage credit risk. The models improve upon the older judgmental systems that relied excessively on subjective assessments by loan officers.
Traditional credit scores are built from information in credit bureau reports and typically use
variables relating to credit history. But these traditional credit scores are not able to score
approximately 70 million individuals who lack credit reports or have “thin” credit reports
without enough data to generate a credit score.
This inability to score no-file or thin-file individuals differentially affects historically disadvantaged minorities. A recent Lexis Nexis study found that 41 percent of historically underserved minority populations of Hispanics and African Americans could not be scored using traditional methods, while the unscorable rate for the general population was only 24 percent. Minorities thus face an unscorable rate 1.7 times, or nearly double, the rate for the general population.24
To remedy this limitation, companies are looking beyond the information contained in credit
reports to alternative data sources and building credit scores based on this additional data. For
instance, RiskView, an alternative credit score built by Lexis Nexis, relies on public and institutional data such as educational history and professional licensing, property asset and ownership data such as home ownership, and court-sourced items such as foreclosures, evictions, bankruptcies and tax liens.
The Lexis Nexis report demonstrated the extent to which credit risk scores built from
alternative data can help to extend credit to unscorable consumers, finding that fully 81 percent
of unscorable minorities received a RiskView score. A major benefit of alternative credit scores is
the improvement in the availability of credit for historically underserved minority groups.
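As a purely illustrative sketch of how a score can be computed from alternative data, the toy Python model below combines a handful of hypothetical features with invented weights in a logistic function. It bears no relation to RiskView or to any actual scoring model; real models are estimated from large historical data sets and validated for predictive accuracy.

# Illustrative sketch only: a toy score built from the kinds of alternative
# data mentioned above. Features, weights, and intercept are invented.

import math

WEIGHTS = {
    "has_professional_license": 0.9,
    "years_at_current_address": 0.05,
    "owns_home": 0.7,
    "evictions_last_5y": -1.2,
    "tax_liens": -1.0,
}
INTERCEPT = -0.5

def score(applicant):
    # Logistic-style score: higher means lower predicted credit risk.
    z = INTERCEPT + sum(WEIGHTS[k] * applicant.get(k, 0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

thin_file_applicant = {
    "has_professional_license": 1,
    "years_at_current_address": 6,
    "owns_home": 0,
    "evictions_last_5y": 0,
    "tax_liens": 0,
}
print(round(score(thin_file_applicant), 2))  # a score despite no credit history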
Not every new model based on alternative data will be as predictive as the standard credit scoring
models.25 But a number of checks are in place to prevent abuse. New scoring models based on
alternative data are subject to the same regulatory scrutiny as traditional scores. Moreover, the
market will not support inaccurate models. For instance, some rethinking is already taking place on
the appropriate role of social media information in determining creditworthiness.26

Education
Big data analytics is improving education through personalizing learning and identifying
students at risk of failing. New computer based educational resources record student activity
during learning and create user models and groupings that improve student learning.27
Advanced online learning systems recommend the next learning activity and also predict how
the student will perform on examinations.28

24
Jeffrey Feinstein, “Alternative Data and Fair Lending,” Lexis-Nexis, August 2013, available at [Link]
.com/risk/downloads/whitepaper/fair [Link].
25
A 2014 study by Robinson + Yu discusses these alternative data scores and their limitations. See Robinson + Yu,
Knowing the Score: New Data, Underwriting, and Marketing in the Consumer Credit Marketplace: A Guide for
Financial Inclusion Stakeholders, October 2014, available at [Link] the
Score Oct 2014 v1 [Link].
26
See Telis Demos and Deepa Seetharaman, “Facebook Isn’t So Good at Judging Your Credit After All,” Wall Street
Journal, February 24, 2016, available at [Link]
friends-1456309801.
27
Marie Bienkowski, Mingyu Feng and Barbara Means, Enhancing Teaching and Learning Through Educational Data
Mining and Learning Analytics: An Issue Brief, US Department of Education Office of Educational Technology,
October 2012, available at [Link]
28
Paul Fain, “Intel on Adaptive Learning,” Inside Higher Ed, April 4, 2013, available at [Link]
news/2013/04/04/gates-foundation-helps-colleges-keep-tabs-adaptive-learning-technology#disqus thread.

These developments allow for various forms of individualized learning that can replace one-size-fits-all models of learning.29 In addition, biometric information can be used to assess various psychological characteristics linked to effective learning, such as grit, tenacity and perseverance.30
The US Department of Education concluded that these new data driven learning methods
are effective, saying, “students taught by carefully designed systems used in combination with
classroom teaching can learn faster and translate their learning into improved performance
relative to students receiving conventional classroom instruction.”31
Predictive analytics can also be used to find students at risk of failing a class or dropping out.
Simple early warning indicator systems can identify most students who eventually drop out of
high school as early as the sixth grade by their attendance, behavior and course performance.
Even more can be identified by the middle of ninth grade.32 Many schools throughout the country use these systems to identify at-risk students and improve their chances of graduation.33
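A stripped-down sketch shows how such an early-warning flag over attendance, behavior and course performance might look in code. The thresholds and student records below are invented for illustration; real districts calibrate their indicators against local historical data.

# Sketch of a simple "ABC" early-warning flag (Attendance, Behavior,
# Course performance). Thresholds and student records are invented.

def at_risk(student):
    flags = []
    if student["attendance_rate"] < 0.90:      # missing more than 10% of school days
        flags.append("attendance")
    if student["behavior_incidents"] >= 2:     # repeated behavior referrals
        flags.append("behavior")
    if student["failed_courses"] >= 1:         # failing a core course
        flags.append("course performance")
    return flags

students = [
    {"name": "A", "attendance_rate": 0.95, "behavior_incidents": 0, "failed_courses": 0},
    {"name": "B", "attendance_rate": 0.82, "behavior_incidents": 3, "failed_courses": 1},
]
for s in students:
    print(s["name"], at_risk(s))   # student B is flagged on all three indicators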
Other systems use a broader range of factors and more advanced analytics to identify at-risk students so that schools can intervene early and provide the right support. Using one of these systems developed by IBM, for instance, the Hamilton County,
Tennessee Board of Education increased graduation rates by more than eight percentage points
and increased standardized test scores in math and reading by more than 10 percent.34

Detecting and Remedying Discrimination


Human biases are notorious and often unconscious. Classical music orchestras were almost
entirely male for generations, despite the denials of bias by conductors who apparently exhibited
no gender bias in any other aspect of their lives. But arranging auditions to be held behind a
screen that hid the gender of the aspiring musician produced a dramatic change toward gender
neutrality. Eliminating information that biased human judgment led to fairer outcomes.35
More elaborate data analysis can also detect totally unconscious biases. Judges are trained to conscientiously make good-faith efforts to be impartial. Still, one study in Israel found that at the beginning of the workday, judges granted around two-thirds of parole requests, but that approvals fell steadily until food breaks, after which the judges again granted most parole requests.36
Moreover, statistical techniques can be used to assess whether employment hiring and promotion practices are fair and provide the basis for taking remedial steps. Google publishes

29
US Department of Education, Expanding Evidence: Approaches for Measuring Learning in a Digital World, chapter 2,
2013, available at [Link]
30
US Department of Education, Promoting Grit, Tenacity and Perseverance: Critical Factors for Success in the 21st
Century, February 2013, p. 41, at [Link]
31
US Department of Education, Expanding Evidence: Approaches for Measuring Learning in a Digital World, chapter 2,
2013, p. 28, available at [Link]
32
Robert Balfanz, “Stop Holding Us Back,” New York Times, June 4, 2014, available at [Link]
opinionator/2014/06/07/stop-holding-us-back/? php true& type blogs&emc edit tnt 20140608&nlid 50637717&tnte
mail0 y& r 0.
33
Mary Bruce and John M. Bridgeland, “The Use of Early Warning Indicator and Intervention Systems to Build a Grad
Nation,” Johns Hopkins University, November 2011, available at [Link]
on track for [Link].
34
“IBM Predictive Analytics Solution for Schools and Educational Systems,” IBM, available at
[Link] YTS03068USEN&appname wwwsearch.
35
Malcolm Gladwell, Blink: The Power of Thinking without Thinking, Little, Brown and Company, 2005, pp. 245 ff.
36
“‘I think it’s time we broke for lunch . . .’: Court Rulings Depend Partly on When the Judge Last Had a Snack,” The
Economist, Apr. 14, 2011, available at [Link]

its diversity report regularly37 and has pioneered efforts to diversify its workplace through
workshops for employees on detecting and dealing with unconscious bias. Software recruiting
tools can also be used to help employers correct the underrepresentation of certain groups in
their workforces.38
Data analysis can detect whether a statistical model has disproportionate adverse effects on
protected classes. For instance, non-mortgage financial institutions do not have information about the race and ethnicity of their applicants and customers. To assess whether their statistical models comply with fair lending rules, they can use publicly available information on surnames and geolocation as reliable predictors of these characteristics, and advanced statistical techniques can improve the predictive accuracy of these factors.39
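The CFPB's published methodology combines surname-based and geography-based probabilities in a Bayesian fashion. The Python sketch below shows that general idea only; the group labels and probabilities are invented placeholders, not census or surname-list figures, and the actual methodology involves additional refinements.

# Sketch of the Bayesian surname-and-geography proxy idea. The probabilities
# below are invented placeholders, not actual census or surname-list figures.

def combine(p_given_surname, p_of_group_in_tract):
    # Bayes-style update: reweight the surname-based probabilities by the
    # group shares in the applicant's census tract, then normalize.
    joint = {g: p_given_surname[g] * p_of_group_in_tract[g] for g in p_given_surname}
    total = sum(joint.values())
    return {g: joint[g] / total for g in joint}

surname_probs = {"group_a": 0.70, "group_b": 0.20, "group_c": 0.10}  # hypothetical
tract_shares  = {"group_a": 0.10, "group_b": 0.60, "group_c": 0.30}  # hypothetical

print(combine(surname_probs, tract_shares))
# The combined estimate is generally sharper than either source alone.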

Future of Work
Computer-based systems today can outperform people in more and more tasks once considered within the exclusive competence of humans. Automation has historically produced long-term growth and full employment, despite initial job losses. But the next generation of really smart AI-based machines could create the sustained technological unemployment that John Maynard Keynes warned against in the 1930s.40 This time it could be different: people could go the way of horses and lose their economic role entirely.41 One study summed up the issue this way:
“if new technologies do not create many additional tasks, or if the tasks that they do create are
of a type in which machines, rather than people, have the advantage, then technological
(un)employment, to a greater or lesser extent, will follow.”42
Carl Frey and Michael Osborne estimated that 47 percent of occupations are susceptible to
automation, including the more advanced cognitive work of lawyers and writers.43 A study from
the Organization for Economic Cooperation and Development (OECD) found that “9% of
jobs are automatable.”44 McKinsey estimates that currently demonstrated technologies could
automate 45 percent of work activities, and that in about 60 percent of all occupations currently

37
Google’s January 2016 report showed that its workforce was 59% white and 69% male. See Google Diversity Index,
available at [Link]
38
Jules Polonetsky and Chris Wolf, “Fighting Discrimination – With Big Data,” The Hill, September 15, 2015, available
at [Link] See also Future of
Privacy Forum, “Big Data: A Tool for Fighting Discrimination and Empowering Groups,” September 2015.
39
CFPB recently revealed the methodology it uses to assess disparate impact for fair lending compliance. Consumer
Financial Protection Board, Using Publicly Available Information to Proxy for Unidentified Race and Ethnicity:
A Methodology and Assessment, Summer 2014, available at [Link] cfpb report
[Link] (CFPB Methodology). It does not mandate that anyone use this methodology but companies
seeking to assess fair lending compliance risk are now in a position to make these assessments more reliably.
40
John Maynard Keynes, “Economic Possibilities for Our Grandchildren,” in Essays in Persuasion, New York:
W. W. Norton & Co., 1963, pp. 358–373, available at [Link]
upload/Intro [Link].
41
See Erik Brynjolfsson and Andrew McAfee, “Will Humans Go the Way of Horses? Labor in the Second Machine
Age,” Foreign Affairs, July/August 2015, available at [Link]
way-horses. See also by the same authors, The Second Machine Age: Work, Progress, and Prosperity in a Time of
Brilliant Technologies, W. W. Norton & Company, 2014.
42
Susskind and Susskind, p. 289.
43
Carl Frey and Michael Osborne, “The Future of Employment: How Susceptible Are Jobs to Computerization?”
Oxford University, September 2013, available at [Link] Future
of [Link].
44
Melanie Arntz, Terry Gregory and Ulrich Zierahn, “The Risk of Automation for Jobs in OECD Countries: A Comparative
Analysis,” OECD Social, Employment and Migration Working Papers No. 189, 2016, available at [Link]
[Link]/social-issues-migration-health/the-risk-of-automation-for-jobs-in-oecd-countries 5jlz9h56dvq7-en.

available technologies could automate 30 percent or more of their constituent activities.45


The Council of Economic Advisers estimates that 83 percent of jobs making less than
$20 per hour would come under pressure from automation, as compared to only 4 percent of
jobs making above $40 per hour.46
The fear that the economy will motor on without human labor in a completely post-work society is far-fetched, but the chances of a less labor-intensive economy are significant enough to warrant serious attention from policymakers. In the short term, increased efforts for education and training are important. Policymakers should also consider income support measures, such as a universal basic income, that break the link between work and income and could provide a fair distribution of the abundance made possible by the advances of machine learning and artificial intelligence.47
Companies will need to examine how job redesign and process reengineering can make full use
of skilled human resources while taking advantage of the efficiencies of machine learning. One
strand of thought emphasizes technology that complements human skills, systems that augment
human ability rather than substitute for it.48 The National Science Foundation’s National Robotics Initiative provides incentives to develop systems that work alongside or cooperatively with workers.49

privacy
The history of privacy policy shows that policymakers need to adapt privacy principles in the face
of significant technological changes. In the 1890s, Warren and Brandeis developed the right to privacy as the right “to be let alone” in reaction to the development of the snap camera and mass print media. Sixty years of case law produced Prosser’s four privacy torts as a systematization of
the harms from different privacy invasions.50
These legal structures proved inadequate to deal with the arrival of the mainframe computer,
which allowed the collection, storage, and processing of large volumes of personal information
to improve operations in business, government and education. A regulatory paradigm of fair
information practices arose to fill this gap.51

45
Michael Chui, James Manyika and Mehdi Miremadi, “Four Fundamentals of Workplace Automation,” McKinsey
Quarterly, November 2015, available at [Link]
four-fundamentals-of-workplace-automation; and Michael Chui, James Manyika and Mehdi Miremadi, “Where
Machines Could Replace Humans – And Where They Can’t (Yet),” McKinsey Quarterly, July 2016, available at
[Link]
and-where-they-cant-yet#.
46
Jason Furman, “Is This Time Different? The Opportunities and Challenges of Artificial Intelligence,” Remarks at AI
Now: The Social and Economic Implications of Artificial Intelligence Technologies in the Near Term, July 7, 2016,
available at [Link] cea ai [Link].
47
See Study Report, p. 9: “It is not too soon for social debate on how the economic fruits of AI technologies should be
shared.”
48
One goal could be human–machine cooperation by design, where the developers would aim “to engineer a human/
machine team from the very beginning, rather than to design a highly automated machine to which a user must adapt.”
David A. Mindell, Our Robots, Ourselves: Robotics and the Myths of Autonomy, Penguin Publishing Group, 2015, p. 210.
49
“National Robotics Initiative (NRI): Program Solicitation 16–517,” National Science Foundation, December 15, 2015,
available at [Link]
50
See the short history of the evolution of privacy law in Paul Ohm, “Broken Promises of Privacy: Responding to the Surprising
Failure of Anonymization,” UCLA Law Review, vol. 57, p. 1701, 2010; University of Colorado Law Legal Studies Research
Paper No. 9–12, pp. 1731–1739, available at [Link] 1450006 (Ohm, Broken Promises).
51
Robert Gellman, “Fair Information Practices: A Basic History,” available at [Link]
[Link].

Today, artificial intelligence, machine learning, cloud computing, big data analytics and the
Internet of Things rest firmly on the ubiquity of data collection, the collapse of data storage costs,
and the astonishing power of new analytic techniques to derive novel insights that can improve
decision making in all areas of economic, social and political life. A reevaluation of regulatory
principles is needed in light of these developments.

Data Minimization
A traditional privacy principle calls for enterprises and others to limit their collection of information to the minimum amount needed to accomplish a clearly specified purpose and then to discard or anonymize this information as soon as that purpose is accomplished.52
In an era of small data sets, expensive memory and limited computing power, privacy policymakers could enforce this data minimization principle to reduce privacy risks without sacrificing
any significant social gains. With the increasing capacity of big data analytics to derive new
insights from old data, this principle of collecting the minimum amount of information and
throwing it away as soon as possible is no longer appropriate. The routine practice of data
minimization would sacrifice considerable social benefit.
Full data minimization is not an ultimate moral principle that should endure through
changes in technology. It is a practical guideline that grew up in a time when the dangers of
computer surveillance and information misuse were not matched with the potential gains from
data retention and analysis. Previously, we could not use data to perform the astonishing range of
activities made possible by new machine learning and artificial intelligence techniques, including high-level cognitive functions that rival or surpass human efforts. Now we can do these
things, provided we retain and analyze truly staggering amounts of information. It is now
sensible to retain this information rather than routinely discarding it.
This does not mean that all constraints on data collection and retention should be abandoned. Good privacy-by-design practice suggests that a prior risk-based assessment of the extent of data collection and retention could prevent substantial public harm.
An example illustrates the point. Before the introduction of credit cards using a chip, a common way to make a counterfeit card was to hack into a merchant database in the hope of finding enough information to do so. If the database contained the access
codes that had been read from the cards’ magnetic stripes, then the thieves could make the
counterfeit cards, but without these security codes the fake cards would not work at the point of
sale. This led to a very simple security rule: don’t store the access code. There was no business
reason for it to be retained and substantial risk in doing so.
This example suggests that a prior review of data collection and retention practices is
warranted to avoid retaining information that could create an unnecessary risk of harm. In these
cases, data controllers should assess the likely harm in retaining data compared to the likely gains
and should throw away information or de-identify it when the risks of harm are too great.
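A schematic Python sketch of such a retention rule appears below. The field names and the decision to hash the card number are hypothetical illustrations, not a prescription; the point is simply that a high-risk field with no business purpose never reaches storage.

# Sketch of a retention filter in the spirit of "don't store the access code":
# keep only fields with a business purpose and drop or pseudonymize the rest
# before the record is written to storage. Field names are hypothetical.

import hashlib

RETAIN = {"transaction_id", "amount", "merchant_id", "timestamp"}
PSEUDONYMIZE = {"card_number"}          # needed for refund matching, stored hashed
NEVER_STORE = {"magstripe_access_code"} # no business purpose, high breach risk

def prepare_for_storage(raw_record):
    stored = {}
    for field, value in raw_record.items():
        if field in NEVER_STORE:
            continue                    # discarded at the point of capture
        if field in PSEUDONYMIZE:
            # A real system would use keyed hashing or tokenization; a plain
            # hash is shown here only to keep the sketch self-contained.
            stored[field] = hashlib.sha256(value.encode()).hexdigest()
        elif field in RETAIN:
            stored[field] = value
    return stored

raw = {"transaction_id": "t-1", "amount": "19.99", "merchant_id": "m-7",
       "timestamp": "2016-05-01T12:00:00", "card_number": "4111111111111111",
       "magstripe_access_code": "123"}
print(prepare_for_storage(raw))   # the access code never reaches the database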

52
See, for instance, the European General Data Protection Regulation, where article 5(1)(c) provides that personal data
shall be “adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed” and
is labeled “data minimization.” A Regulation of the European Parliament and of the Council of the European Union on
the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such
Data, and Repealing Directive 95/46/EC (General Data Protection Regulation) April 27, 2016, Article 5(1)(c), available at
[Link] CELEX:32016R0679&from EN.

Secondary Use
Traditional privacy principles call for enterprises to request approval from data subjects when
seeking to use information collected for one purpose for a purpose that is inconsistent with,
unrelated to or even just different from the original purpose.53
One way to try to avoid a need to seek approval for secondary use would be to describe
completely the purposes for which an enterprise might want to collect data. But this is exactly
what cannot be done in an era of big data analytics. Often information gathered for one purpose
is found to be useful for additional purposes. For example, health information gathered for the
purpose of treatment has enormous value for medical research. Also, information used to assess
student learning can be used to examine the effectiveness of new educational tools and programs.
As one commentator put it: “Since analytics are designed to extract hidden or unpredictable
inferences and correlations from datasets, it becomes difficult to define ex ante the purposes of
data processing . . . a notice that explains all the possible uses of data is hard to be given to data
subjects at the time of the initial data collection.”54
A privacy principle restricting all secondary uses without further notice and consent would create
unnecessary procedural barriers to beneficial uses made possible by new data analysis techniques.
This does not mean that any further use of information is legitimate. Secondary uses of
personal information are legitimate when they do not pose a significant risk of harm to the data
subject or when they are consistent with the context of information collection and use. Good
privacy-by-design practice suggests a risk analysis of secondary uses to assess likely harms and benefits. When there is a significant risk of injury to data subjects, an appropriate level of control might be needed: prohibition, or opt-in or opt-out choice, depending on the severity of the risk. This risk-based secondary use principle is more appropriate to meet the challenges of privacy
protection in an age of big data.

Anonymization: Privacy and Utility


Traditional privacy policy emphasizes the role of anonymization or de-identification as a way to protect privacy. De-identification is sound information management practice when there is little need for keeping data in identified form. But there are substantial limitations to the effectiveness of this technique of privacy protection in an age of big data analysis.
De-identification involves removing identifying information from a dataset so that the remaining data cannot be linked with specific individuals. If an individual-level record is stripped of such obvious identifiers as name, social security number, date of birth and zip code, then privacy interests are no longer at stake because the record cannot be recognized as being about a specific individual. For this reason, several privacy rules exempt de-identified information.55 Various

53
The consideration of secondary use usually arises in the context of applying the principle of purpose limitation. See Article
29 Data Protection Working Party, Opinion 03/2013 on purpose limitation, April 2, 2013, available at [Link]
justice/data-protection/article-29/documentation/opinion-recommendation/files/2013/wp203 [Link].
54
Alessandro Mantelero and Giuseppe Vaciago, “Data Protection in a Big Data Society: Ideas for a Future Regulation,” Digital
Investigation, vol. 15, December 2015, pp. 104–109, Post-print version, available at [Link]
55
The restrictions on public release of student data imposed by the Family and Educational Records Privacy Act
(FERPA) do not apply to de-identified student records. See “Dear Colleague Letter about Family Educational Rights
and Privacy Act (FERPA) Final Regulations,” US Department of Education, December 17, 2008, available at http://
[Link]/policy/gen/guid/fpco/hottopics/ht12–[Link]. The privacy requirements of the Health Insurance
Portability and Accountability Act (HIPAA) Privacy Rule do not apply to de-identified health information. 42 CFR
164.514, available at [Link]

studies have shown, however, that it is often possible to use available techniques, including
information contained in other databases, to reidentify records in de-identified databases.56
The Federal Trade Commission (FTC) has addressed this issue through a policy that relieves
enterprises using de-identified databases of various privacy requirements such as notice and
consent, provided that their methods of anonymization are reasonable in light of current technical
developments in the field, that they commit to not attempting to reidentify records and that they
bind third parties to whom they make the data available to abide by the same commitment.57
A legislative proposal by Robert Gellman is similar to the FTC policy. It would have data
providers include clauses in their data use agreements noting that the data had been de-identified and requiring the data user to keep it in de-identified form. Recipients would then
face civil and criminal penalties if they attempted to reidentify the data.58
This contractual policy might work in certain circumstances when the data is released only to
qualified individuals who can be clearly identified by the data provider. It does not provide
sufficient protection when data is simply released to the public, which is legally required in
certain circumstances and is often in the public interest to enable socially important scientific,
social or medical research.
Moreover, reidentification techniques can be effective, even when the underlying data set is
completely private and the only public information is the statistical model derived from the
private data set. Often sensitive information is part of a model that can accurately predict
nonsensitive information that has been voluntarily provided. The model can be run in reverse,
using the value of the dependent variable and other independent variables in the model to
predict the value of the sensitive variable. In this way, a person’s health status, for instance, could
be inferred from non-health information voluntarily provided to a third party.
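A toy example in Python makes the "run the model in reverse" point concrete. The linear model, its coefficients, and the attribute values below are all invented; the point is only that anyone who knows a released model, its output for a person, and that person's non-sensitive inputs can sometimes solve for the sensitive input.

# Toy illustration of running a released model "in reverse". Suppose a published
# linear model predicts an innocuous outcome y from one sensitive attribute s
# and two non-sensitive attributes x1, x2. All numbers are invented.

w_s, w_x1, w_x2, b = 2.0, 0.5, -1.0, 3.0   # the released model's coefficients

def predict(s, x1, x2):
    return b + w_s * s + w_x1 * x1 + w_x2 * x2

# An individual's non-sensitive attributes and the model's output are known.
x1, x2 = 4.0, 1.0
y = predict(s=1.0, x1=x1, x2=x2)           # observed prediction (s itself is hidden)

# Inverting the model recovers the sensitive attribute exactly.
s_recovered = (y - b - w_x1 * x1 - w_x2 * x2) / w_s
print(s_recovered)                          # 1.0

Real machine learning models are not simple linear equations, but analogous inversion attacks have been demonstrated against far more complex released models.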
Absolute privacy cannot be guaranteed if the data set and the models derived from it are to be
publicly available and useful. A kind of relative privacy, differential privacy, can be obtained using various techniques that reduce the risk of reidentifying a person’s sensitive attribute to a level judged to be acceptably low.59 Such privacy-preserving techniques, however, can prevent the effective use of the information in the database. A trade-off must be made between the utility of the data and the
privacy of the data subjects. When the utility of the database is paramount, as in medical settings where
the priority is setting a safe and effective dose of a lifesaving medicine, the notion of balancing the risk
of revealing sensitive information versus the extra risk of mortality seems problematic. In these
circumstances, release of the data or the statistical models based on them might need to be avoided.60
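The best-known such technique is the Laplace mechanism, sketched below in Python for a simple count query over invented records. The epsilon parameter makes the trade-off explicit: smaller values mean stronger privacy but noisier, less useful answers.

# Sketch of the Laplace mechanism for a differentially private count query.
# The records are invented; epsilon controls the privacy/utility trade-off.

import math
import random

def laplace_noise(scale):
    # Inverse-transform sampling from a Laplace(0, scale) distribution.
    u = random.random() - 0.5
    return -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon):
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1    # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon)

patients = [{"condition": "flu"}, {"condition": "flu"}, {"condition": "asthma"}]
print(private_count(patients, lambda r: r["condition"] == "flu", epsilon=0.5))
print(private_count(patients, lambda r: r["condition"] == "flu", epsilon=5.0))
# With a large epsilon the answer stays near the true count of 2; with a small
# epsilon it wanders further, which is the privacy/utility trade-off at work.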

56
See Ohm, Broken Promises for examples. See also Simpson Garfinkel, De-Identification of Personal Information,
National Institute of Standards and Technology, NIST IR 8053, October 2015, available at [Link]
nistpubs/ir/2015/[Link].
57
Federal Trade Commission report, Protecting Consumer Privacy in an Age of Rapid Change: Recommendations for
Businesses and Policymakers, March 2012, p. 21, available at
[Link]
era-rapid-change-recommendations/[Link].
58
Robert Gellman, “The Deidentification Dilemma: A Legislative and Contractual Proposal,” Fordham Intellectual
Property, Media & Entertainment Law Journal, vol. 21, no. 33, 2010, available at [Link]
uploads/2013/09/Deidentifi[Link].
59
Cynthia Dwork, “Differential Privacy,” 33rd International Colloquium on Automata, Languages and Programming,
part II (ICALP 2006), pp. 1–12, available at [Link]
(Dwork, Differential Privacy).
60
See, for instance, Matthew Fredrikson et al., “Privacy in Pharmacogenetics: An End-to-End Case Study of Personal-
ized Warfarin Dosing,” Proceedings of the 23rd USENIX Security Symposium, August 20–22, 2014, available at https://
[Link]/system/files/conference/usenixsecurity14/[Link]. They conclude that
differential privacy mechanisms cannot always be used to release data sets and statistical models based on them.
“Differential privacy is suited to settings in which privacy and utility requirements are not fundamentally at odds, and

Information Externalities
The problem raised by the reidentification of public data sets is really part of a general problem
of information externalities, where information disclosed by some people reveals information
about others.61 In principle, this has been a commonplace for years. If the police know that a
perpetrator of a certain crime is left-handed and one of three people, and find out from the first two that they are right-handed, then they know that the third person is the guilty one, even
though the third person has disclosed nothing at all. If I know the average height of Lithuanian
women and that Terry Gross is two inches shorter than the average Lithuanian woman, I know
Terry Gross’s height.62
But information externalities are much more common than these curiosities suggest. Any
statistical regularity about people creates a potential for an information externality. Social
scientists know that people having a certain array of characteristics often have another characteristic that is of interest. The dependent variable of interest can be inferred from independent
variables, even when that fact about a person is highly sensitive, that person never disclosed it
and it cannot be found in public records.63
The new technology of big data analytics makes information externality the norm rather than
a curiosity. The standard examples are well known: pregnancy status can be inferred from
shopping choices;64 sexual orientation can be inferred from the characteristics of friends on
social networks;65 race can be inferred from name, and even more strongly from zip code and
name.66 As machine learning algorithms improve, they will be able to ferret out more and more personal traits of interest with ever greater accuracy. In the age of big data analytics, it will be
increasingly difficult to keep secret any personal characteristic that is important for classifying
and making decisions about people.
One recommendation to fix this problem, suggested earlier as a remedy for the de-anonymization problem, is to keep secret the statistical regularities, algorithms, and models that
allow information externalities. This can in some circumstances mitigate the extent of the issue.
After all, if the regularity is publicly available, then any researcher, enterprise or government
agency can use it to create this kind of information externality. Keeping it secret limits the group
of entities that can use it.
But secrecy is not the answer. If an organization develops a proprietary model or algorithm, but doesn’t make it generally available, it can still generate information externalities.
In many cases the algorithms that allow inference to previously hidden traits will be

can be balanced with an appropriate privacy budget . . . In settings where privacy and utility are fundamentally at odds,
release mechanisms of any kind will fail, and restrictive access control policies may be the best answer,” p. 19.
61
See Mark MacCarthy, “New Directions in Privacy: Disclosure, Unfairness and Externalities,” I/S: A Journal of Law
and Policy for the Information Society, vol. 6, no. 3 (2011), pp. 425–512, available at [Link]
people/maccartm/[Link].
62
Dwork, Differential Privacy.
63
This well-known property is sometimes called inferential disclosure. See OECD Glossary of Statistical terms, available
at [Link] 6932: “Inferential disclosure occurs when information can be inferred
with high confidence from statistical properties of the released data. For example, the data may show a high
correlation between income and purchase price of a home. As the purchase price of a home is typically public
information, a third party might use this information to infer the income of a data subject.”
64
Charles Duhigg, “How Companies Learn Your Secrets,” New York Times, February 16, 2012, available at [Link]
.[Link]/2012/02/19/magazine/[Link].
65
Carter Jernigan and Behram F. T. Mistree, “Gaydar: Facebook Friends Expose Sexual Orientation,” First Monday,
vol. 14, no. 10, October 5, 2009, available at [Link]
66
CFPB Methodology.

proprietary. The use will be restricted to a single company or its institutional customers and
will be available only to those who have the most interest and need to make the inferences
that expose previously hidden traits.
Information externalities challenge the traditional core privacy principle that individual
control over the flow of information is the front line of defense against privacy violations.
In the traditional view, privacy is just control over information.67 Privacy policymakers aim to
empower people to protect themselves and their privacy through a “notice and choice” mechanism. In practice, privacy policymakers focus on the best way to have companies tell people
about the use of information and provide them a choice of whether or not to release information
for that purpose.
But this tool of privacy policy will become increasingly ineffective in an age of ubiquitous
information externalities. Fully informed, rational individuals could make the choice not to
reveal some feature of their character or conduct, but as long as others are willing to reveal that
information about themselves and contribute it to the huge data sets that form the input for
increasingly sophisticated algorithms, data scientists will be able to make increasingly accurate
predictions about that hidden feature of an individual’s life.
This does not imply that notice and choice are always a mistake. They can sometimes provide
an effective privacy-protective mechanism. But policymakers are beginning to move away from heavy reliance on notice and choice. The Obama Administration’s 2014 big data report urged privacy policymakers “to look closely at the notice and consent framework that has been a central pillar of how privacy practices have been organized for more than four decades.”68 The
accompanying report from the President’s Council of Advisors on Science and Technology says
plainly that the notice and choice framework is “increasingly unworkable and ineffective.”69
As privacy moves away from reliance on notice and choice mechanisms, it moves toward
other policy areas. Law and policy on minimum wage, non-discrimination, information security,
occupational safety and health, and environmental protection, to name just a few, do not rely on
consent mechanisms. In these areas, individual choice undermines the policy goal of a high
level of equal protection for all. In the past, privacy policymakers have thought that privacy
choices were individual and idiosyncratic and that regulation should allow space for differences
in the value people placed on keeping information confidential. But the growing use of big data
algorithms makes it increasingly likely that privacy cannot be provided on an individualized
basis. In the age of big data analysis, privacy is a public good.70
Privacy policymakers can begin to rely more on two other approaches to supplement
traditional privacy principles. A consequentialist framework focuses on the likely outcome of a
proposed privacy requirement and uses an assessment of benefits and costs to decide when and
how to regulate.71 A social approach treats privacy as a collection of informational norms tied to

67
Alan Westin famously defined privacy as “the claim of individuals, groups, or institutions to determine for themselves
when, how, and to what extent information about them is communicated to others.” Alan Westin, Privacy and
Freedom, Atheneum, 1967, p. 7.
68
White House Big Data Report, p. 54.
69
Executive Office of the President, President’s Council of Advisors on Science and Technology, Report to the
President: Big Data and Privacy: A Technological Perspective, May 2014, p. 40, available at [Link]
.[Link]/sites/default/files/microsites/ostp/PCAST/pcast big data and privacy - may [Link].
70
Joshua A. T. Fairfield and Christoph Engel, “Privacy as a Public Good,” Duke Law Journal, vol. 65, p. 385, 2015,
available at [Link]
71
J. Howard Beales, III and Timothy J. Muris, “Choice or Consequences: Protecting Privacy in Commercial Infor-
mation,” University of Chicago Law Review, vol. 75, p. 109, 2008, especially pp. 109–120, available at [Link]
.[Link]/sites/[Link]/files/uploads/75.1/75 1 Muris [Link].

specific contexts such as medicine, education, or finance and regulates to maintain or enforce
these privacy norms.72 The Software & Information Industry Association (SIIA) has issued guidelines for privacy policymakers that attempt to meld these two frameworks as a way to
provide for effective privacy protection in an age of big data.73

fairness
Introduction
The increased use of big data analytics also raises concerns about fairness. Are the algorithms
accurate? Do they utilize characteristics like race and gender that raise issues of discrimination?
How do we know? Can people obtain redress if an algorithm gets it wrong or has a disparate
impact on protected classes?74
Policymakers have issued reports and held workshops on these questions over the last two
years. The Obama Administration’s 2016 report on big data and civil rights highlighted concerns
about possible discriminatory use of big data in credit, employment, education, and criminal
justice.75 The Federal Trade Commission held a workshop and issued a report on the possible
use of big data as a tool for exclusion.76
Even when companies do not intend to discriminate and deliberately avoid the use of
suspect classifications such as race and gender, the output of an analytical process can have a
disparate impact on a protected class when a variable or combination of variables is correlated
both with the suspect classification and the output variable. These correlations might be the
result of historical discrimination that puts vulnerable people at a disadvantage. The end result
is that analytics relying on existing data could reinforce and worsen past discriminatory
practices.77
Concerned that the new techniques of data analysis will create additional challenges for
minority groups, civil rights groups have developed principles aimed at protecting civil liberties
in an age of big data78 and have focused on the possibility that new techniques of analysis will be
used to target minorities for discriminatory surveillance.79

72
Helen Nissenbaum, Privacy in Context, Stanford University Press, 2009.
73
SIIA, Guidelines for Privacy Policymakers, 2016, available at [Link]
lines%20for%20Privacy%[Link].
74
See Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy,
Crown/Archetype, 2016 and Frank Pasquale, The Black Box Society: The Secret Algorithms That Control Money and
Information, Harvard University Press, 2015.
75
Executive Office of the President, Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights, May
2016, available at [Link] 0504 data [Link]
(Big Data and Civil Rights).
76
Federal Trade Commission, Big Data: A Tool for Inclusion or Exclusion? Understanding the Issues, January 2016,
available at [Link]
issues/[Link].
77
Solon Barocas and Andrew D. Selbst, “Big Data’s Disparate Impact,” California Law Review, vol. 104, p. 671, 2016,
available at [Link] 2477899.
78
Leadership Conference on Civil Rights, “Civil Rights Principles for the Era of Big Data,” 2014, available at http://
[Link]/press/2014/[Link].
79
On April 8, 2016, Georgetown Law and the Center on Privacy & Technology held a conference on this issue of minority
surveillance, entitled The Color of Surveillance: Government Monitoring of the African American Community. See
[Link]

Current Legal Framework


Any technology, new or old, can further illegal or harmful activities, and big data analysis is no
exception. But neither are the latest computational tools exempt from existing laws that
protect consumers and citizens from harm and discrimination.
The Fair Credit Reporting Act (FCRA) sets out requirements for credit reporting agencies,
including access, correction, and notification of adverse action.80 FCRA was put in place to deal
with computerized credit reporting agencies in the 1970s, but it applies to decisions made using
big data and the latest machine learning algorithms, and to third party companies that
combine social media data with other information to create profiles of people applying for jobs.81
A second element of the current legal framework is the prohibition on discrimination against
protected groups for particular activities. Among the statutory constraints on discrimination:

• Title VII of the Civil Rights Act of 1964 makes it unlawful for employers and employment
agencies to discriminate against an applicant or employee because of such individual’s
“race, color, religion, sex, or national origin.”82
• The Equal Credit Opportunity Act makes it unlawful for any creditor to discriminate
against any applicant for credit on the basis of “race, color, religion, national origin, sex or
marital status, or age.”83
• Title VIII of the Civil Rights Act of 1968, the Fair Housing Act, prohibits discrimination in
the sale, rental or financing of housing “because of race, color, religion, sex, familial status,
or national origin.”84 The act also protects people with disabilities and families with
children.
• The Age Discrimination in Employment Act of 1967 (ADEA) makes it unlawful for an
employer to refuse to hire or to discharge or to otherwise discriminate against any
individual because of the individual’s age.85
• The Genetic Information Nondiscrimination Act of 2008 prohibits US health insurance
companies and employers from discriminating on the basis of information derived from
genetic tests.86
• Section 1557 of the Affordable Care Act of 2010 prohibits discrimination in health care and
health insurance based on race, color, national origin, age, disability, or sex.87
These laws apply to the use of any statistical techniques, including big data analytics, as the
Obama Administration recognized when they recommended that regulatory agencies “should
expand their technical expertise to be able to identify practices and outcomes facilitated by big
data analytics that have a discriminatory impact on protected classes, and develop a plan for
investigating and resolving violations of law in such cases.”88

80
15 U.S.C. § 1681 et seq.
81
See Federal Trade Commission, “Spokeo to Pay $800,000 to Settle FTC Charges Company Allegedly Marketed Infor-
mation to Employers and Recruiters in Violation of FCRA,” press release, June 12, 2012, available at [Link]
.[Link]/opa/2012/06/[Link]. For more on the FCRA enforcement, see SIIA, How the FCRA Protects the Public,
2013 available at [Link] com docman&task doc download&gid 4767&Itemid 318.
82
42 U.S.C. § 2000e-2, available at [Link]
83
15 U.S.C. § 1691, available at [Link]
84
42 U.S.C. 3604, available at [Link]
85
29 U.S.C. § 623, available at [Link]
86
Pub. L. No. 110–233, 122 Stat. 881, available at [Link]
87
42 U.S.C. § 18116, available at [Link]
88
White House Big Data Report, p. 60.

It is true that big data analytics might have discriminatory effects, even when companies do
not intend to discriminate and do not use sensitive classifiers such as race and gender. But social
scientists and policymakers have long known that statistical techniques and inferences can have
discriminatory effects.89 When discrimination arises indirectly through the use of statistical
techniques, regulatory agencies and courts use disparate impact assessment to determine
whether the practice is prohibited discrimination.90
Current rules provide for reasonable, contextually appropriate amounts of due diligence to
ensure fairness in the use of statistical models. Credit scoring models, for instance, are routinely
tested for compliance with fair lending laws, and methodologies have been developed to assess
the risk of failing a disparate impact test.91 Reviews of some of these assessments have been made
public.92 Studies of disparate impact in the financial world include a Federal Trade Commission
study on insurance credit scores,93 a Payment Card Center study of credit cards,94 and a
Federal Reserve Board study of credit scores and the availability of credit.95
Existing rules for due diligence apply when newer techniques of big data analysis such
as machine learning algorithms are used. When these techniques are used in the
regulated contexts of housing, credit granting, employment and insurance, they are subject
to the same regulatory controls and validation requirements that apply to any statistical
methodology used in these contexts.

89
NIST points out that information externalities can produce this kind of harm: “Inferential disclosure may result in
group harms to an entire class of individuals, including individuals whose data do not appear in the dataset. For
example, if a specific demographic group is well represented in a data set, and if that group has a high rate of a
stigmatizing diagnosis in the data set, then all individuals in that demographic may be stigmatized, even though it may
not be statistically appropriate to do so.” NIST, p. 12.
90
Disparate impact analysis is controversial because it focuses on the effects of a policy, practice or procedure rather
than on its motivation or intent. Yet regulators and courts use disparate impact to assess discrimination in a wide
variety of circumstances. For instance, Title VII of the Civil Rights Act of 1964 forbids any employment practice that
causes a disparate impact on a prohibited basis if the practice is not “job related for the position in question and
consistent with business necessity” or if there exists an “alternative employment practice” that could meet the
employer or employment agency’s needs without causing the disparate impact (42 U.S.C. § 2000e-2(k)(1)), available
at [Link] On June 25, 2015, the Supreme Court, by a five-to-four margin,
upheld the application of disparate impact under the Fair Housing Act in Texas Department of Housing &
Community Affairs v. The Inclusive Communities Project, Inc., available at [Link]
ions/14pdf/13-1371 [Link].
91
See, for instance, Charles River Associates, Evaluating the Fair Lending Risk of Credit Scoring Models, February 2014,
available at [Link]
[Link]. The concern that big data analysis might discriminate inadvertently is explicitly recognized: “Ostensibly
neutral variables that predict credit risk may nevertheless present disparate impact risk on a prohibited basis if they are
so highly correlated with a legally protected demographic characteristic that they effectively act as a substitute for that
characteristic” (p. 3).
92
See, for instance, Center for Financial Services Innovation, The Predictive Value of Alternative Credit Scores,
November 26, 2007, available at [Link] id 330262.
93
Federal Trade Commission, Credit-Based Insurance Scores: Impacts on Consumers of Automobile Insurance, July 2007,
p. 3, available at [Link]
sumers-automobile-insurance-report-congress-federal-trade/p044804facta report credit-based insurance [Link].
94
David Skanderson and Dubravka Ritter, Fair Lending Analysis of Credit Cards, Payment Card Center Federal Reserve
Bank of Philadelphia, August 2014, available at [Link]
ment-cards-center/publications/discussion-papers/2014/[Link].
95
Board of Governors of the Federal Reserve System, Report to Congress on Credit Scoring and Its Effects on the Availability
and Affordability of Credit, August 2007, available at [Link]
[Link]. See also Robert Avery, et al., “Does Credit Scoring Produce a Disparate Impact?” Staff Working Paper,
Finance and Economics Discussion Series, Divisions of Research & Statistics and Monetary Affairs, Federal Reserve
Board, October 2010, available at [Link]

Algorithmic Transparency
To address these questions of fairness, some commentators have suggested moving beyond
current rules, calling for a policy of algorithmic transparency that would require the disclosure
of the source code embodied in a decision making or classificatory algorithm. In this view, one
of the major consequences of rendering decision making more computational under big data analysis
is that the standards and criteria for making decisions have become more opaque to public
scrutiny and understanding. Disclosure would allow outsiders an effective way to evaluate the
bases for the decisions made by the programs. Along with a right to appeal a decision before an
independent body, it would provide due process protection for people when algorithms are used
to make decisions about them.96
Transparency in this sense of public disclosure of source code would be a mistake. Commercial
algorithms are often proprietary and are deliberately kept as trade secrets in order to provide
companies with a competitive advantage. In addition, revealing enough about the algorithm so
that outside parties can predict its outcomes can defeat the goal of using the formula. For
instance, the process and criteria for deciding whom to audit for tax purposes or whom to select
for terrorist screening must be opaque to prevent people from gaming the system.97
In addition, transparency of code will not really address the problem of bias in decision
making. Source code is only understandable by experts. And even for them it is hard to
understand what a program will do based solely on the source code. In machine learning
algorithms, the decision rule is not imposed from outside, but emerges from the data under
analysis. Even experts have little understanding of why the decisional output is what it is. In
addition, the weights associated with each of the factors in a machine learning system change as
new data is fed into the system and the program updates itself to improve accuracy. Knowing what
the code is at any one time will not provide an understanding of how the system evolves in use.98
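The point can be illustrated with a minimal sketch, assuming a toy online learner trained on made-up data (nothing here describes any real system). The update rule below is the disclosable “source code,” yet the weights that actually determine outcomes keep shifting as new examples arrive, so a snapshot of the code says little about later behavior.

```python
# Minimal sketch (hypothetical data): the update rule below is the "source code,"
# but the weights that actually drive decisions change as each new example arrives.
import math
import random

random.seed(1)
weights = [0.0, 0.0]          # model parameters for two input features
bias = 0.0
learning_rate = 0.1

def predict(x):
    z = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1.0 / (1.0 + math.exp(-z))   # logistic score

def update(x, label):
    """One step of stochastic gradient descent on the log-loss."""
    global bias
    error = predict(x) - label
    weights[0] -= learning_rate * error * x[0]
    weights[1] -= learning_rate * error * x[1]
    bias -= learning_rate * error

# A stream of new data: the learned decision rule drifts with every batch.
for step in range(1, 1001):
    x = [random.uniform(0, 1), random.uniform(0, 1)]
    label = 1 if (x[0] + 0.5 * x[1] + random.gauss(0, 0.1)) > 0.8 else 0
    update(x, label)
    if step % 250 == 0:
        print(f"after {step} examples, weights = "
              f"[{weights[0]:.2f}, {weights[1]:.2f}], bias = {bias:.2f}")
```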
Still, some understanding of the “narrative” behind algorithms might accomplish the goals of
algorithmic transparency. Traditional credit scoring companies such as FICO routinely release
the general factors that power their models and the rough importance of these factors. For
instance, payment history contributes 35 percent to the overall score and amounts owed
contributes 30 percent.99 Designers and users of newer statistical techniques might consider the extent
to which they could provide the public with a story to accompany the output of their statistical
models. For instance, researchers at Carnegie Mellon University have developed a method for

96
See, for example, Danielle Keats Citron and Frank Pasquale, “The Scored Society: Due Process for Automated
Predictions,” University of Maryland Francis King Carey School of Law, Legal Studies Research Paper, No. 2014–8.
(2014) 89 Wash. L. Rev 1, [Link] id 2376209; Danielle Keats Citron, “Tech-
nological Due Process,” University of Maryland Legal Studies Research Paper No. 2007–26;Washington University Law
Review, vol. 85, pp. 1249–1313, 2007, available at SSRN: [Link] 1012360; Kate Crawford and Jason
Schultz, “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms,” Boston College Law
Review, vol. 55, p. 93, 2014, available at [Link]
97
Christian Sandvig, Kevin Hamilton, Karrie Karahalios and Cedric Langbort, “Auditing Algorithms: Research Methods for
Detecting Discrimination on Internet Platforms,” Data and Discrimination: Converting Critical Concerns into Productive
Inquiry, 2014, available at [Link]
20–%20ICA%202014%20Data%20and%20Discrimination%[Link] (Auditing Algorithms).
98
See the discussion in Joshua A. Kroll, Joanna Huey, Solon Barocas, Edward W. Felten, Joel R. Reidenberg, David G.
Robinson and Harlan Yu, “Accountable Algorithms,” 165 U. Pa. L. Rev. 633 (2017), available at [Link]
.[Link]/penn law review/vol165/iss3/3.
99
See, “What’s in My Credit Score,” FICO, available at [Link]

determining why an AI system makes particular decisions without having to divulge the
underlying workings of the system or code.100
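A hypothetical sketch of what such a “narrative” disclosure might look like follows. The factor categories, weights, and score range are illustrative assumptions in the spirit of FICO’s published breakdown, not any company’s actual model.

```python
# Hypothetical sketch of a "narrative" disclosure: publish the relative weight of broad
# factor categories and tell an applicant which factors pulled the score down,
# without publishing the model itself. All names, weights, and numbers are illustrative.
FACTOR_WEIGHTS = {
    "payment_history": 0.35,
    "amounts_owed":    0.30,
    "history_length":  0.15,
    "new_credit":      0.10,
    "credit_mix":      0.10,
}

def score_and_explain(subscores):
    """Combine normalized sub-scores (0 to 1) into a score and rank the weighted shortfalls."""
    total = sum(FACTOR_WEIGHTS[f] * subscores[f] for f in FACTOR_WEIGHTS)
    score = 300 + 550 * total                      # map onto a familiar 300-850 range
    shortfalls = {f: FACTOR_WEIGHTS[f] * (1.0 - subscores[f]) for f in FACTOR_WEIGHTS}
    reasons = sorted(shortfalls, key=shortfalls.get, reverse=True)
    return score, reasons

score, reasons = score_and_explain({"payment_history": 0.9, "amounts_owed": 0.4,
                                    "history_length": 0.7, "new_credit": 0.8,
                                    "credit_mix": 0.6})
print(f"score: {score:.0f}")
print("main reasons the score is not higher:", reasons[:2])
```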

Framework of Responsible Use


To help ensure algorithmic fairness, FTC Commissioner Terrell McSweeny has called for a
framework of “responsibility by design” that would test algorithms at the development stage
for potential bias. Fairness by design should be supplemented by audits after the fact to ensure
that properly designed algorithms continue to operate properly.101 The Obama Administration
called for a similar practice of “equal opportunity by design.”102
Individual enterprises could address the issues involved in the construction of a framework of
responsible use, and in the end it might be a matter of balancing the business needs, legal risk
and social responsibility in ways that best fit the context of the individual company. For instance,
care in the use of statistical models that rely on commute time can be a matter of a company’s
own individual determination of how to manage legal risk and public perceptions, and some
companies choose not to use that information in their hiring decisions.103
Nevertheless, a collaborative effort involving a range of stakeholders might sharpen the issues
and allow sharing of information and best practices in a way that would benefit all. In such a
collaborative effort, businesses, government, academics and civil rights and public interest
groups would come together to establish a clear operational framework for responsible use of
big data analytics. The tech industry has begun to organize itself for this task with the formation
of a Partnership on AI “to advance public understanding of artificial intelligence technologies
(AI) and formulate best practices on the challenges and opportunities within the field.”104
It might be that there is no single framework for responsible use of big data analytics. It is
certainly true that the risks and considerations involved in the use of a technique such as
machine learning depend on the domain of use.105 The legal standards differ as well.106 Still, a
stakeholder group might assess whether there are actionable general principles that could be
applied successfully in many fields.
It is important to begin to develop this framework now, and to get it right. The public needs to
be confident in the fairness of algorithms, or a backlash will threaten the very real and substantial

100
Byron Spice, “Carnegie Mellon Transparency Reports Make AI Decision-Making Accountable,” Carnegie Mellon
University, May 26, 2016, [Link]
making-accountable.
101
Terrell McSweeny, Keynote Remarks, “Tech for Good: Data for Social Empowerment,” September 10, 2015, available at
[Link] statements/800981/[Link].
102
Big Data and Civil Rights, p. 5.
103
Evolv, a workforce analytics company, “is cautious about exploiting some of the relationships it turns up for fear of
violating equal opportunity laws. While it has found employees who live farther from call-center jobs are more likely
to quit, it doesn’t use that information in its scoring in the U.S. because it could be linked to race.” Joseph Walker,
“Meet the New Boss: Big Data. Companies Trade In Hunch-Based Hiring for Computer Modeling,” Wall Street
Journal, September 2012, available at
[Link]
104
Partnership for AI, “Industry Leaders Establish Partnership on AI Best Practices,” Press Release, September 28, 2016,
available at [Link]
105
This dovetails with the idea that regulation of AI as such would be mistaken. See Study Group, p. 48: “attempts to
regulate ‘AI’ in general would be misguided, since there is no clear definition of AI (it isn’t any one thing), and the
risks and considerations are very different in different domains.”
106
For instance, disparate impact analysis in employment is subject to a business necessity test, but disparate impact for
age discrimination is subject to a less stringent “reasonableness” standard. See Smith v. City of Jackson, 544 U.S. 228
(2005), available at [Link]

benefits. Stakeholders need to do more to ensure the uses of the new technology are, and are
perceived to be, fair to all.
Whether implemented by a single enterprise or developed as a collaborative effort, a
framework would need to set out stakeholder roles, the metrics used to assess fairness, the
range of economic life to which these assessments should apply, the standards of fairness, and
what to do with a finding of disparate impact.

Stakeholder Roles
The framework would need to address the proper roles of the public, developers and users of
algorithms, regulators, independent researchers, and subject matter experts, including ethics
experts. How much does the public need to know about the inner workings of algorithms? What
are the different responsibilities of the developers of analytics, the furnishers of data and the
users? Should regulators be involved in the assessment of fairness? In areas where there are no
legal responsibilities, is there a role for government to act as a convener? What is the role of
independent researchers? Should they have access to websites to test them for fairness?

Metrics
Fairness involves protecting certain classes of people against disproportionate adverse impacts in
certain areas. But how do we measure this? Statisticians, economists, and computer scientists,
among others, are working in the growing field of metrics designed to measure disparate
impact.107 Deviation from statistical parity is one measure. But there are others.108 Sometimes
analysis of training data can reveal the possibility of disparate impact in the use of algorithms.109
Rules of thumb have been developed in some legal contexts, such as the 80 percent rule in
employment. It might be important to develop similar thresholds of disproportionate
burden that suggest possible illegal discrimination in other fields. It is also important to measure
how much loss of accuracy would result from using alternative statistical models. To assess the
central normative questions we need good measurement of the size of the disparate impact and
the loss of accuracy, if any, that might be involved in remedial action.
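Two of these measures are simple enough to state in a few lines of code. The sketch below uses hypothetical selection counts (not figures from any study cited here) to compute the statistical parity difference and the selection-rate ratio that underlies the 80 percent rule.

```python
# Minimal sketch, hypothetical counts only: two common disparate impact measures.
# selected[g] / totals[g] is the selection rate for group g.
totals   = {"protected": 400, "reference": 600}
selected = {"protected": 120, "reference": 270}

rate = {g: selected[g] / totals[g] for g in totals}

# Statistical parity difference: the gap between the two groups' selection rates.
parity_difference = rate["reference"] - rate["protected"]

# The "80 percent rule" of thumb compares the ratio of selection rates to 0.8.
impact_ratio = rate["protected"] / rate["reference"]

print(f"selection rates: {rate}")
print(f"statistical parity difference: {parity_difference:.2f}")
print(f"impact ratio: {impact_ratio:.2f} "
      f"({'below' if impact_ratio < 0.8 else 'at or above'} the 80 percent threshold)")
```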
Individual enterprises or stakeholders will need to survey these methods, develop a process for
keeping up with developments in the fast moving field, and integrate the most effective
methodologies into the process of auditing and testing.

Extent of Application
Current law does not protect people in suspect classifications in every area of economic and
social life. Nevertheless, those who design, implement and use data analytics systems should be
107
Kroll et al. have a good discussion of the field in Accountable Algorithms; Sandvig et al. discuss different auditing
techniques in Auditing Algorithms. A good general framework from a computer science perspective is set out in
Cynthia Dwork, et al., “Fairness through awareness,” Proceedings of the 3rd Innovations in Theoretical Computer
Science Conference, ACM, 2012, available at [Link] 2090255.
108
For a comprehensive summary of these alternatives, see A. Romei and S. Ruggieri, “A Multidisciplinary Survey on
Discrimination Analysis,” The Knowledge Engineering Review, pp. 1–57, April 3, 2013, available at [Link]
.[Link]/ruggieri/Papers/[Link]; see also Salvatore Ruggieri, “Data Anonymity Meets Non-Discrimination,”
2013 IEEE 13th International Conference on Data Mining Workshops, p. 876, available at [Link]
ruggieri/Papers/[Link].
109
Michael Feldman, Sorelle Friedler, John Moeller, Carlos Scheidegger and Suresh Venkatasubramanian, “Certifying
and Removing Disparate Impact,” arXiv:1412.3756v3 [[Link]], 2015, available at [Link]

thoughtful about the potential for discriminatory effects in any field. In many cases, businesses
would want to know whether their use of these tools has disproportionate adverse impacts on
protected classes. But a universal audit norm for all statistical models is too broad, since it would
extend to areas where there is no consensus that making distinctions is an issue. In the
stakeholder approach to developing a common framework of responsible use, interested parties
would need to discuss which areas of social and economic life to include in a responsible
use model.

Standards of Fairness
The FTC found that credit insurance scores have a disparate impact, and are also predictive of
automobile insurance risk, not simply proxies for race and ethnicity.110 The Federal Reserve
Board had a similar finding that traditional credit scores had a disparate impact and were also
accurate predictors of creditworthiness. These findings were viewed as confirmation that the
scores were not discriminatory.111 Still many think these uses are unfair and most states limit the
use of credit insurance scores.112
Assessments of algorithms appear to be about data, statistics, and analytics. But in reality they
are often disputes about contested standards of fairness. Is fairness a matter of reducing the
subordination of disadvantaged groups or avoiding the arbitrary misclassification of
individuals?113 Should analytics aim only at accurate predictions or should it also aim at the statistical
parity of protected groups?114 Is fairness simply accuracy in classification? Or does fairness call for
some sacrifice of accuracy in order to protect vulnerable groups?115 Are there some factors other
than suspect classifications that should not be taken into account even if they are predictive
because to do so would be unfair? Should we be judged solely by what we do and never by who

110
Federal Trade Commission, Credit-Based Insurance Scores: Impacts on Consumers of Automobile Insurance, July 2007,
p. 3, available at [Link]
sumers-automobile-insurance-report-congress-federal-trade/p044804facta report credit-based insurance [Link].
111
Commissioner Julie Brill said that the FTC and a companion Federal Reserve study, “found that the scores they
examined largely did not serve as proxies for race or ethnicity,” Remarks of FTC Commissioner Julie Brill at the FTC
Workshop, “Big Data: A Tool for Inclusion or Exclusion?” September 15, 2014, available at [Link]
files/documents/public statements/582331/[Link].
112
“As of June 2006, forty-eight states have taken some form of legislative or regulatory action addressing the use of
consumer credit information in insurance underwriting and rating.” FTC Study, p. 17.
113
Jack M. Balkin and Reva B. Siegel, “The American Civil Rights Tradition: Anticlassification or Antisubordination?”
University of Miami Law Review, vol. 58, p. 9, 2003, available at [Link]
[Link]. A study by Robinson + Yu frames the issue clearly: “Industry-standard credit scores
accurately reflect underlying differences in credit risk between racial groups, which are themselves a reflection of
social disparities and years of economic, political and other biases against racial minorities.” Robinson + Yu,
“Knowing the Score: New Data, Underwriting, and Marketing in the Consumer Credit Marketplace: A Guide for
Financial Inclusion Stakeholders,” October 2014, available at [Link] the
Score Oct 2014 v1 [Link].
114
See Jill Gaulding, “Race Sex and Genetic Discrimination in Insurance: What’s Fair,” Cornell Law Review, vol. 80,
p. 1646, 1995, available at: [Link] “From the efficient discrimination
perspective, we have a right not to be classified for insurance purposes unless the classification corresponds to an
accurate prediction of risk. From the anti-discrimination perspective, we have a right not to be classified for insurance
purposes on the basis of unacceptable classifiers such as race, sex, or genetic factors.”
115
Dwork joins the accuracy group with her definition of individual fairness as closeness to an accurate assessment of
ground truth about individuals and her rejection of statistical parity as an adequate notion of fairness. But her
framework allows for a form of “affirmative action” by seeking ways to preserve statistical parity with the minimum
sacrifice of accuracy.

we are?116 Should we allow new evidence from statistical models to change our pre existing
notions of what is relevant to a certain decision?
Some strong moral intuitions regarding fairness are widely shared and can form a basis for
responsible action even when they are not legal requirements. But this is an area where the
interests of different groups collide and moral intuitions diverge. One advantage of a
collaborative approach to a responsible use framework would be to facilitate discussions to air these
differences and seek commonality, most productively under the guidance of philosophers and
legal scholars who have a wide understanding of different approaches to ethical questions.
Coming to some consensus, or even isolating key differences in standards of fairness, is not a
technical exercise. These are normative questions and need to be approached as such in developing a
framework of responsible use.

Remedies for Disparate Impact


Enterprises also need to determine a course of action following a finding of disproportionate
adverse impact, discovered either in the design stage or as a result of an after the fact audit.
While companies need to make their own decisions, a stakeholder approach might help to
develop and share alternatives.
Current non discrimination law applies only to certain industries and contexts, and even in
those contexts, does not require designing away features of algorithms that pass a legal test for
disparate impact. Still, some designers and users of data analytics feel a need to do more than
reflect the realities of what was, and in many respects still is, a discriminatory society. They are
examining the extent to which they should take steps to free their algorithms as much as possible
of what they view as harmful biases.
A conversation among stakeholders would help to clarify what steps are appropriate and in
what situations they make sense.
These discussions must face squarely the question of what to do when steps to provide fair
treatment for protected classes through adjustments in algorithms might disadvantage other
citizens who feel that this treatment is itself a form of discrimination. These are contentious
issues concerning the extent to which institutional practice should be race conscious117 and the
extent to which efforts to avoid disparate impact run afoul of the duty to avoid discriminatory
treatment.118

116
Cathy O’Neil reflects widespread moral intuitions in saying that it is unjust to base sentencing on factors that could not be
admitted in evidence, such as the criminal record of a defendant’s friends and family. “These details should not be
relevant to a criminal case or a sentencing,” she says, because “(w)e are judged by what we do not by who we are.” See
Weapons of Math Destruction, at Kindle location 423. Eric Holder holds this view as well saying that criminal sentences
“should not be based on unchangeable factors that a person cannot control.” See Attorney General Eric Holder Speaks at
the National Association of Criminal Defense Lawyers 57th Annual Meeting, August 1, 2014, available at [Link]
.[Link]/opa/speech/attorney-general-eric-holder-speaks-national-association-criminal-defense-lawyers-57th.
117
In Fisher v. University of Texas at Austin, 579 U.S. (2016), available at [Link]
15pdf/14-981 [Link], the Supreme Court allowed the University of Texas to continue using a race-conscious
admissions program, ruling that the program was permitted under the equal protection clause.
118
Under the Supreme Court decision in Ricci v. DeStefano, 557 U.S. 557 (2009), available at [Link]
[Link]/opinions/08pdf/[Link], an employer might not be permitted to respond to a finding that an employ-
ment test has a disparate impact by taking steps that would consciously disadvantage other groups without a “strong
basis in evidence to believe that it will be subject to disparate impact liability” if it continues to use that employment
test. See Kroll, Accountable Algorithms, for further discussion of the idea that this “strong-basis-evidence” test
counsels for building fairness into algorithms in the design stage rather than revising them after discovering a
disparate impact in use.

Detecting disparate impact and designing alternatives that are less impactful are technical
questions. But the decision to modify an algorithm to be fair is not. It involves legal, ethical,
business and social matters that go beyond technical expertise in system design. For this reason,
people from many disciplines and with a broad array of knowledge, expertise and experience
need to be involved in assessing what to do with analytical structures that have or could have a
disparate impact. Since the risks, laws and other considerations vary from domain to domain, it is
unlikely that there will be one response to the question of what to do with an algorithm that has
a disparate impact.

conclusion
The powerful new tools of data analysis making their way through our social and economic life
are designed and used by people, acting in their institutional capacities.119 They can be designed
and used in ways that preserve privacy and are fair to all, but this will not happen automatically.
If we want these outcomes, we have to design these features into our algorithmic systems and use
the systems in ways that preserve these values.
We cannot ignore or resist the inherent normative nature of this conversation, or reduce it to
adventures in technical novelty. Ryan Calo got it right in his discussion of the use of a robot to
intentionally kill Micah Johnson, the person who had shot five Dallas police officers in July
2016 and was cornered in a parking garage, saying, “rather than focus on the technology, we
should focus on whether it was legitimate to kill Micah Johnson instead of incapacitating him.
Because robots could do either.”120
We are not driven willy nilly by technology. The choices are up to us, acting through our
current institutions, imperfect as they are, to put in place policies to protect privacy and preserve
fairness in an age of big data analysis.

119
Rob Atkinson emphasizes this key point of human agency: “AI systems are not independent from their developers
and, more importantly, from the organizations using them.” See Rob Atkinson, “Will Smart Machines Be Less Biased
than Humans?” Brink, August 15, 2016, available at [Link]
biased-than-humans/?mc cid feaec2cdf1&mc eid aa397779d1.
120
Ryan Calo, “Focus on Human Decisions, Not Technological Ethics of Police Robots,” New York Times, July 14, 2016,
available at [Link]
focus-on-human-decisions-not-technological-ethics-of-police-robots.
4

Education Technology and Student Privacy

Elana Zeide

Education is increasingly driven by big data. New education technology (ed tech) creates virtual
learning environments accessible online or via mobile devices. These interactive platforms
generate a previously unimaginable array and detail of information about students’ actions both
within and outside of classrooms. This information can not only drive instruction, guidance, and
school administration, but also better inform education related decision making for students,
educators, schools, ed tech providers, and policymakers. This chapter describes the benefits of
these innovations, the privacy concerns they raise, and the relevant laws in place. It concludes
with recommendations for best practices that go beyond mere compliance.
Data driven education tools have the potential to revolutionize the education system and, in
doing so, provide more access to better quality, lower cost education and broader socioeconomic
opportunity. Learners can access world class instruction online on demand without having to be
physically present or enroll in expensive courses. “Personalized learning” platforms, for example,
use detailed, real time learner information to adjust instruction, assessment, and guidance
automatically to meet specific student needs. Information collected during the learning process
gives researchers fodder to improve teaching practices.
Despite their potential benefits, data driven education technologies raise new privacy
concerns. The scope and quantity of student information has exploded in the past few years with the
rise of the ed tech industry.1 Schools rely on educational software created by private companies
that collect information about students both inside and outside of classroom spaces.
Three characteristics of the education context call for more stringent privacy measures than
the caveat emptor consumer regime. First, student privacy protects particularly vulnerable
individuals: maturing children and developing learners. Traditional rules seek to prevent
students’ early mistakes or mishaps from foreclosing future opportunities, the proverbial
“permanent record.”
Second, students rarely have a choice regarding educational privacy practices. In America,
education is compulsory in every state into secondary school. Most schools deploy technology
on a classroom and school wide basis due to practical constraints and a desire to ensure
pedagogical equality.
Third, the integration of for profit entities into the school information flow is still novel in the
education system. American education institutions and supporting organizations such as test

1
TJ McCue, Online Learning Industry Poised for $107 Billion in 2015, Forbes, 2014.


providers and accreditors have traditionally been public or non profit entities with a primary
mission to promote learning and academic advancement. Today, educators want to take
advantage of the latest technologies and give students the opportunity to develop digital literacy,
but don’t have the programming sophistication or resources to create data systems and apps
internally. They instead turn to ed tech companies that often spring out of for profit startup
culture or from Silicon Valley stalwarts such as Google and Microsoft.
Traditional student privacy regulations aren’t designed for an era of big data. They focus on
the circumstances under which schools can share student information and on giving parents and
older students the right to review and request corrections to their education record. Student
privacy regulations, most notably FERPA, require that parents, older students, teachers,
or school administrators approve the disclosure of personally identifiable student information.
In today’s age of big data, however, the volume,
variety, and velocity of data schools generate and share make it difficult for parents and schools
to keep track of information flow and data recipients’ privacy policies. Stakeholders fear that
companies share, or even sell, student data indiscriminately or create advertisements based on
student profiles. Many fear that, following in the steps of several for profit colleges, companies
will prioritize generating revenue over students’ educational interests.
Newer state laws supplement traditional student data protection. The most recent reforms
prohibit school technology service providers from using student data for anything other than
educational purposes. They don’t, however, cover more nuanced questions. These include
whether student data can be used by schools or for school purposes in ways that might still
undermine students’ education interests through unintentional bias or disparate impact. Data
driven ed tech allows schools and companies to test new approaches on student data subjects.
Finally, student privacy rules do not apply to platforms and applications that learners use outside
of school, which instead fall under the permissive consumer privacy regime. This chapter
suggests best practices to cultivate the trust necessary for broad acceptance of new ed tech and
effective learning spaces. These include accounting for traditional expectations that student
information stays in schools, in contrast to the caveat emptor underpinning of the commercial
context, as well as providing stakeholders with sufficient transparency and accountability to
engender trust.

today’s education technology


Schools, districts, and higher education institutions increasingly embrace digital and online services.
New tools take advantage of the widespread adoption of networked systems, cloud storage, and
mobile devices. Most outsource operation management to data driven vendors. Identification cards,
for example, use data to provide students with access to physical facilities and the ability to buy food
at “cashless” cafeterias. They use outside software and platforms. Some, like G Suite for Education
and Bing for Education, offer core functionalities like email, document creation, search engines,
and social media specifically designed for school environments. Administrators rely on student
information systems (SISs) to manage increasingly complex education records. These tools allow
schools to draw upon outside expertise and maintain their focus on their primary task of providing
students with high quality educational experiences.2

2
Samantha Adams Becker et al., NMC/CoSN Horizon Report: 2016 K-12 Edition (The New Media Consortium), 2016
[hereinafter Adams Becker et al., NMC/CoSN Horizon Report: 2016 K-12 Edition]; Samantha Adams Becker et al.,
NMC/CoSN Horizon Report: 2016 Higher Education Edition (2016) [hereinafter Adams Becker et al., NMC/CoSN

Instructors adopt apps to track individual progress, monitor behavior, and develop lessons
and learning materials. These are often free or freemium software that allow teachers to draw
upon collective knowledge or communal resources to inform their own teaching. They create
new pathways for educators to share information about pedagogical and classroom
management practices that until recently remained siloed in individual classrooms or educational
institutions.

Data Based Education Platforms


Ed tech is moving from hardware like whiteboards and class computers to data driven
instructional software. Early learning management systems, like Blackboard, created tools to deliver
digitized documents like syllabi and course readings. These evolved into interactive digital
environments that provide multimedia instructional content, assessment tools, and discussion
boards. Today’s “intelligent” technologies go a step further and use data to assess students
automatically and adapt instruction accordingly.3

Massive Open Online Courses (MOOCs)


In 2012, online education platforms gained public prominence through so called Massive Open
Online Courses (MOOCs). Huge enrollment in Stanford professors’ internet accessed
computer science course prompted educators to adopt similar models.4 They created companies or
consortia that offered thousands of students (massive) internet access (online) to learning
materials structured like traditional classes (courses) for free or nominal fees (open). Many
schools, particularly in higher education, used MOOC platforms to provide courses to enrolled
students.
The revolutionary aspect of MOOCs, however, lay in the fact that independent companies
such as Coursera and Udacity, or non profits such as edX, offered education directly and at no
cost to learners outside the traditional education system. Many saw MOOCs as a way to give
students in underserved communities or remote locations access to high quality educational
resources at minimal, if any, cost. They hoped that this availability would provide new pathways
that allowed learners to pursue their own educational goals regardless of their status and without
requiring enrollment or acceptance at a school or university.5

Virtual Learning Environments


MOOCs have since shifted from strictly massive, open, online, complete courses to a variety of
Virtual Learning Environments (VLEs) better tailored to specific institutions and student
populations. Today’s learning platforms may reach a smaller scope of students. They may limit
enrollment or charge fees. They often supplement online work with physical instruction or

Horizon Report: 2016 Higher Education Edition]; Elana Zeide, 19 Times Data Analysis Empowered Students and
Schools: Which Students Succeed and Why? (Future of Privacy Forum), Mar. 22, 2016, available at [Link]
content/uploads/2016/03/Final 19Times-Data [Link]; Jules Polonetsky & Omer Tene, Who Is Reading Whom
Now: Privacy in Education from Books to MOOCs, 17 Vand. J. Ent. & Tech. L. 927 (2014).
3
Vincent Aleven et al., Embedding Intelligent Tutoring Systems in MOOCs and E-Learning Platforms, Int’l Conf. on
Intelligent Tutoring Sys. 409 (Springer 2016).
4
Jane Karr, A History of MOOCs, Open Online Courses, N.Y. Times, Oct. 30, 2014.
5
Laura Pappano, The Year of the MOOC, N.Y. Times (Nov. 2, 2012).

assessment. Most offer education on specific topics in ways that are more modular than
traditional courses, letting users create their own learning playlists. The entities creating these
new technologies are predominantly for profit companies, with Khan Academy’s tutoring
platform and edX’s MOOCs as notable exceptions.
These VLEs are more integral to the education system than their predecessors. They
increasingly supplement and, in some cases, replace physical classes in lower and higher education.
Students can access thousands of explicitly education oriented offerings directly online or
through app stores. In the classroom, instructors increasingly employ digitally mediated learning
tools on school computers or students’ individual devices. They direct students to independent
online education resources to supplement in class activity.6
VLEs deliver coursework, references, and guidance, at scale and on demand. The
flexibility of these anywhere, anytime, self paced, playlist style offerings offers more
convenience to the nontraditional students who now make up the majority of the students in
America. These individuals may be adult learners or younger students who balance
education with family or work commitments. It also provides ways for workers to continue their
education as employers’ needs shift due to technological advances and increasingly
automated functions.7

Data Driven Education


With new technology, these interactive education platforms can conduct real time assessment of
student progress to adjust learning paths to suit students’ specific needs instead of providing one
size fits all instruction. VLEs are increasingly interactive.8 Students can choose to review
specific content, solve problems, and answer questions. The learning platforms continuously
collect information about students’ actions, including not only explicit responses to questions
and practice problems, but metadata about what pages they read, whether their mouse paused
over a wrong answer, and the length of each learning session.9 Education technologies capture
detailed information about students’ actions and performance during the learning process that
has never before been possible in physical classrooms.10
To track student progress more precisely, these platforms use real time monitoring and
learning analytics. Most systems use algorithmic “learning analytics” to transform the flow of
clickstream level data into models that reflect student progress in defined areas.11 They can, for
example, examine Susan’s answers to math problems over time to see how often she answers
questions related to a certain algebraic concept correctly. By tracking student performance

6
Adams Becker et al., NMC/CoSN Horizon Report: 2016 K-12 Edition, supra note 2; Adams Becker et al., NMC/CoSN
Horizon Report: 2016 Higher Education Edition, supra note 2.
7
Competency-Based Learning or Personalized Learning (Office of Educational Technology, U.S. Department of
Education); Competency-Based Education Reference Guide (U.S. Department of Education, 2016).
8
Dan Kohen-Vacs et al., Evaluation of Enhanced Educational Experiences Using Interactive Videos and Web
Technologies: Pedagogical and Architectural Considerations, 3 Smart Learn. Env’t. 6 (2016).
9
Daphne Koller: What we’re learning from online education (TED Talks, Aug. 1, 2012), [Link]
watch?v U6FvJ6jMGHU.
10
Barbara Means & Kea Anderson, Expanding Evidence Approaches for Learning in a Digital World (Office of
Educational Technology, U.S. Department of Education, 2013).
11
Ryan Baker, Using Learning Analytics in Personalized Learning, in Handbook on personalized learning for
states, districts, and schools (Center on Innovations in Learning, 2016); George Siemens, Learning Analytics:
The Emergence of a Discipline, 57 Am. Behav. Sci. 1380–1400 (2013).

related to specific skills and concepts, these platforms can create virtual “knowledge maps”
that provide detailed diagnostics of student competencies and gaps over time.12
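A stripped-down sketch conveys the basic mechanics, though real platforms use far richer statistical models. The response log, skill names, and running-proportion estimate below are all illustrative assumptions, not any vendor’s method.

```python
# Illustrative sketch only (not any platform's actual method): roll a stream of
# item responses, each tagged with a skill, into a simple per-skill mastery estimate.
from collections import defaultdict

# Hypothetical response log: (student, skill, answered_correctly)
responses = [
    ("susan", "linear_equations", True),
    ("susan", "linear_equations", False),
    ("susan", "linear_equations", True),
    ("susan", "fractions", True),
    ("susan", "fractions", True),
]

attempts = defaultdict(int)
correct = defaultdict(int)
for student, skill, is_correct in responses:
    key = (student, skill)
    attempts[key] += 1
    correct[key] += int(is_correct)

# A crude "knowledge map": the running proportion correct per skill. Real systems
# use richer models (for example, weighting recent attempts more heavily), but the
# idea of mapping clickstream-level events onto skill-level estimates is the same.
knowledge_map = {key: correct[key] / attempts[key] for key in attempts}
for (student, skill), mastery in sorted(knowledge_map.items()):
    print(f"{student} - {skill}: estimated mastery {mastery:.2f}")
```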
In doing so, they “embed” assessment within instruction. Instead of tests at the end of a
chapter or semester, teachers can use performance information to gauge student mastery. This
reduces reliance on periodic, high stakes exams that may not accurately reflect learners’ actual
competencies. Digital platforms can display results for teachers to use or create automated
systems that adapt instruction according to students’ performance.13
“Personalized” learning platforms use these assessments in real time to automatically
adapt the pace and content of instruction and assessment based on how students perform.14
This allows differentiation between learners at scale. Many educators see smart instruction and
tutoring systems as the way to move past the factory model of education that often neglects
individual learner differences.15 The US Department of Education foresees that these
technologies could eventually lead to an entirely new way of measuring and accounting for
education based on competencies instead of degrees.16 It is a stark contrast to the days
when education records consisted of a few pieces of paper in school filing cabinets, report cards,
and transcripts.
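The adaptation loop itself can be sketched schematically: serve the next activity for whichever skill the current estimates flag as weakest, and nudge difficulty up or down based on recent performance. The function below is a hypothetical illustration of that idea, not a description of how any particular product selects content.

```python
# Schematic sketch of the adaptation idea (not any vendor's algorithm): choose the
# next activity for the skill with the lowest current mastery estimate, and adjust
# difficulty up or down based on the student's recent results.
def next_activity(knowledge_map, recent_streak):
    """knowledge_map: {skill: mastery in [0, 1]}; recent_streak: consecutive correct answers."""
    weakest_skill = min(knowledge_map, key=knowledge_map.get)
    if recent_streak >= 3:
        difficulty = "harder"        # the student is coasting; raise the challenge
    elif recent_streak == 0:
        difficulty = "easier"        # the student just missed; step back
    else:
        difficulty = "same"
    return {"skill": weakest_skill, "difficulty": difficulty}

print(next_activity({"linear_equations": 0.55, "fractions": 0.90}, recent_streak=3))
# {'skill': 'linear_equations', 'difficulty': 'harder'}
```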

Data Driven Education Decision Making


Data collected and generated about students through ed tech also helps inform broader
education related decision making.17 “Academic analytics” allow parents and students,
institutions, companies, and policymakers to examine trends over time and according to different
types of student populations. This can help them make more evidence based decisions to
help students better maneuver the education system, schools make more informed
institutional decisions, and companies optimize technologies and innovate according to
marketplace needs.18
Students and parents can see what schools and degrees correspond with higher rates of
employment and income long after graduation. They can get a sense of the likelihood of
acceptance at a specific school. They can navigate the complicated options regarding
enrollment and federal aid to find the school that best meets their needs and their budgets.19

12
Bela Andreas Bargel et al., Using Learning Maps for Visualization of Adaptive Learning Path Components, 4 Int’l
J. Computer Info. Sys. & Indus. Mgmt. Appl. 228 (2012); Debbie Denise Reese, Digital Knowledge Maps: The
Foundation for Learning Analytics through Instructional Games, in Digital Knowledge Maps in Education 299
(Dirk Ifenthaler & Ria Hanewald eds., 2014); Kevin Wilson & Nicols Sack, Jose Ferreira, The Knewton Platform: A
General-Purpose Adaptive Learning Infrastructure (Knewton, Jan. 2015).
13
Aleven et al., supra note 3.
14
Baker, supra note 12; Christina Yu, What Personalized Learning Pathways Look Like at ASU, Knewton Blog (Apr. 2,
2013), [Link]
15
Andrew Calkins & Kelly Young, From Industrial Models and “Factory Schools” to . . . What, Exactly?, EdSurge
(Mar. 3, 2016), [Link]
Monica Bulger, Personalized Learning: The Conversations We’re Not Having (Data & Society Research Institute),
Jul. 22, 2106.
16
Competency-Based Learning or Personalized Learning, supra note 8.
17
Zeide, supra note 2.
18
Ellen B. Mandinach & Sharnell S. Jackson, Transforming Teaching and Learning through Data Driven
Decision Making (2012); Data Quality Campaign, Executive Summary: All States Could Empower Stakeholders to
Make Education Decisions with Data But They Aren’t Yet (2012); Suhirman et al., Data Mining for Education
Decision Support: A Review, 9 Int’l J. Emerging Tech. Learning 4 (2014).
19
Zeide, supra note 2.

Schools can see which courses in the curricula students struggle with the most and adjust
course requirements or tutoring resources accordingly.20 Data mining can supplement this
information by incorporating information from outside the classroom into analytics. Many schools,
for example, monitor information on students’ social media accounts. Others, particularly in
higher education, incorporate administrative and operational data showing when students
regularly attend school events or skip lunch to detect struggling learners and intervene before they
drop out.21
Tracking students in detail and over time gives educators and education providers more
insight into the learning process. Companies and other platform providers can determine what
variations promote faster student progress and pinpoint where students struggle. Researchers use
education data to analyze student and school success across the country to highlight lessons to
learn and issues that require more attention.22
Policymakers can observe patterns in college readiness, enrollment, and attainment and make
more informed reforms. They can use state longitudinal data systems to trace individual
academic and workplace trajectories. Advocates can gather evidence supporting specific
reforms. Recent reports, for example, revealed that many high school seniors say they intend
to go to college, yet one in ten of those students never apply. As a result, some schools have
simplified the federal student aid application process with positive outcomes.23

today’s student privacy concerns


Some of the demand for stricter privacy protection is the shock of the new. New student
information practices are often at odds with expectations about information practices in learning
environments. Today’s technology captures student information continuously and at a level of
detail previously unheard of in learning environments. Schools share student data daily with a
broad array of private service providers.24
Most students, parents, and educators have no specific sense of the data or privacy practices of
school software providers.25 They don’t know what data companies collect about them, how that
information might inform pedagogical or institutional decisions, and if anyone is truly looking
out for their interests given the fast pace of technological change. Many companies make
matters worse by being secretive about their practices or failing to consider the perspectives of
different stakeholders in the education community. This lends itself to speculation,
sensationalism, and confusion.
Poor preparation and communication brought down a $100 million, Gates Foundation
funded nonprofit which offered a secure data repository to districts and states to use at their own

20
Ben Kei Daniel, Overview of Big Data and Analytics in Higher Education, in Big Data and Learning Analytics in
Higher Education 1–4 (2017); Zeide, supra note 2; Maria Eliophotou Menon et al., Using Data to Improve
Higher Education: Research, Policy and Practice (2014).
21
Zeide, supra note 2.
22
Id.
23
Id.; see also Nudging for Success: Using Behavioral Science to Improve the Postsecondary Student Journey (ideas42),
Jun. 2016.
24
Joel R. Reidenberg et al., Privacy and Cloud Computing in Public Schools (Center on Law and Information Policy),
Dec. 2013.
25
New FPF Survey Shows Parents Overwhelmingly Support Using Student Data to Improve Education, Future of
Privacy (Sept. 21, 2015), [Link]
port-using-student-data-to-improve-education/.

discretion. The organization, inBloom, launched with considerable fanfare, trying to dazzle the
technology industry and education reformers with the platform’s capabilities.26 However, its
leaders did not consider how new practices might run counter to parental expectations and,
accordingly, arouse suspicion and fear. The public debate that followed suffered from rampant
speculation, media sensationalism, and outright confusion that conflated inBloom with the
controversial adoption of the Common Core State Standards.27 Most states and districts
withdrew their association with the organization, which closed soon afterwards.28

Pervasive Surveillance
Many parents and advocates consider continuous student monitoring intrusive, regardless of
the possible benefits of pervasive data collection. Schools increasingly incorporate audiovisual
information from security cameras or social media monitoring for pedagogical and research
purposes.29 As described earlier, they may also collect data from nonacademic sources. The
wealth of information available will expand as the Internet of Things generates more data from
wearables and connected devices.30
Concerns about student data collection surface even when information is publicly available.
In 2015, for example, reports indicated that Pearson, the textbook and test publisher, monitored
Twitter feeds to detect and prevent cheating on the Partnership for Assessment of Readiness for
College and Careers tests which were administered at different times to students across the
country.31 While doing so was certainly legal, and arguably part of Pearson’s duty to ensure a fair
testing environment, some members of the public reacted with alarm.32 While parents were
aware that their children’s posts were publicly accessible, they did not necessarily connect that
understanding to the possibility that a company might systematically collect and analyze specific
students’ posts. The intentional, systematic nature of the surveillance changed how parents
perceived the companies’ reported privacy practices.

Permanent Records
New education technology also creates uncertainty about the accuracy, representativeness, and
retention of student data. Data used to drive instruction and track student progress might be

26
Benjamin Herold, inBloom to Shut Down Amid Growing Data-Privacy Concerns, Educ. Week, 2014; Ki Mae
Heussner, Gates Foundation-Backed InBloom Frees up Data to Personalize K-12 Education, GigaOM (Feb. 5,
2013), [Link]
27
Elana Zeide, Parsing Student Privacy, Technology | Academics | Policy (Sept. 18, 2015), [Link]
.com/Blog/Featured-Blog-Post/[Link].
28
Herold, supra note 27.
29
Emmeline Taylor, Surveillance Schools: A New Era in Education, in Surveillance Schools: Security, Discipline
and Control in Contemporary Education 15 (2013); Amelia Vance & J. William Tucker, School Surveillance:
The Consequences for Equity and Privacy (National Association of State Boards of Education), Oct. 2016.
30
Max Myers, Can the Internet of Things Make Education More Student-Focused?, Government 2020 (Dec. 3, 2014),
[Link]
31
Audrey Watters, Pearson, PARCC, Privacy, Surveillance, & Trust, Hackeducation (Mar. 17, 2015), [Link]
[Link]/2015/03/17/pearson-spy; e-mail, Pearson is spying on students (Mar. 17, 2015).
32
Pearson Under Fire for Monitoring Students’ Twitter Posts, Bits Blog, [Link]
pearson-under-fire-for-monitoring-students-twitter-posts/; e-mail, Pearson is spying on students, supra note 33; Cynthia
Liu, Pearson Is Not Spying on Student Tweets; Instead Enlisting Public School Officials to Protect Its Tests, K 12 News
Network (Mar. 14, 2015), [Link]
enlisting-public-school-officials-to-defend-its-intellectual-property/.

inaccurate, nonrepresentative, or outdated. Because student information can be stored
indefinitely and capture mistakes in much greater detail, it raises the possibility of outdated information
unfairly limiting students’ academic and employment opportunities, a contemporary version of
the proverbial permanent record.33

Big Data Bias


Big data processing also creates new privacy concerns. Data mining risks incorporating irrelevant
or improper information into education decision making. Subtle bias may also be
unintentionally embedded into algorithmic data processing.34 Seemingly neutral big data decision making
may have a disparate impact on different communities in ways that promote, rather than
ameliorate, existing inequalities.35 Most of the individuals who successfully completed early
MOOCs, for example, were already educated, not part of the underserved community the
learning platforms sought to serve.36

Experimentation Ethics
Digitally data driven education is new and, like most innovative technologies, involves ongoing experimentation and improvement. Stakeholders want to know that ed tech actually achieves the promised result, that someone has tested a given system, hopefully over time and independently of the people who seek to profit from it. They want to know that developers and educators have considered potential unintended consequences.37 In the 1970s, similar concerns prompted academic researchers to create an ethical framework and review system governing human subject research.38 No such rules apply, however, to companies' "optimization" and "experimentation."39 Facebook, for example, was harshly criticized for conducting research that altered different readers' news feeds to promote positive or negative emotions.40

33
Anthony Cody, Will the Data Warehouse Become Every Student and Teacher's "Permanent Record"?, Educ. Week (May 20, 2013), [Link]; David Sirota, Big Data Means Kids' "Permanent Records" Might Never Be Erased, Motherboard (Oct. 24, 2013); Elana Zeide, The Proverbial Permanent Record (2014).
34
danah boyd & Kate Crawford, Six Provocations for Big Data (2011).
35
Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 Cal. L. Rev. 671 (2016).
36
John D. Hansen & Justin Reich, Democratizing Education? Examining Access and Usage Patterns in Massive Open
Online Courses, 350 Science 1245 (2015).
37
Michael Zimmer, Research Ethics in the Big Data Era: Addressing Conceptual Gaps for Researchers and IRBs, in
Beyond IRBs: Ethical Review Processes for Big Data Research (Future of Privacy Forum Washington, DC,
Dec. 2, 2015); Katie Shilton, Emerging Ethics Norms in Social Media Research, in Beyond IRBs: Ethical Review
Processes for Big Data Research (Future of Privacy Forum, Washington, DC, Dec. 2, 2015); Jacob Metcalf, Big
Data Analytics and Revision of the Common Rule, 59 Comm. ACM 31 (2016).
38
Department of Health, Education and Welfare Washington, DC, Ethical Principles and Guidelines for
the Protection of Human Subjects of Research (1979).
39
See Omer Tene & Jules Polonetsky, Beyond IRBs: Ethical Guidelines for Data Research, in Beyond IRBs: Ethical
Review Processes for Big Data Research (Future of Privacy Forum, Washington, DC, Dec. 2015).
40
Jeff T. Hancock, The Facebook Study: A Personal Account of Data Science, Ethics and Change, in Proceedings of
the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing 1 (CSCW ’15,
Mar. 14–18, 2015); Chris Chambers, Facebook Fiasco: Was Cornell’s Study of “Emotional Contagion” an Ethics
Breach?, The Guardian, Jul. 1, 2014, [Link]
nell-study-emotional-contagion-ethics-breach; danah boyd [sic], Untangling Research and Practice: What Facebook’s
“Emotional Contagion” Study Teaches Us, 12 Res. Ethics 4–13 (2016); James Grimmelmann, The Law and Ethics of
Experiments on Social Media Users, 13 J. Telecomm. & High Tech. L. 219 (2015).

Education Specific Considerations


As society grapples with issues surrounding surveillance, discrimination, and the ethics of big
data across sectors, several factors drive heightened concerns in the education context.41

The Difficulty of Opting Out of Mainstream School Data Policies


Using student or parental consent to ensure adequate privacy is particularly difficult in the
context of formal education. School attendance is compulsory in every state into secondary
school. Parents could theoretically switch schools or homeschool their children if they object to
schools’ technology and privacy policies, but few have the resources or incentive to do in
practice. Schools, districts, and an education model based on classroom wide instruction
incentivize students and parents to participate in mainstream technological use, regardless of
their individual privacy preferences.42 Students who opt out of using the default class technology
often end up at a social and academic disadvantage, so most parents eventually capitulate.

Vulnerable Children and Developing Learners


K 12 schools are sites of particular concern. Many students are children, an inherently vulnerable
population. In other contexts, the law recognizes that students may make mistakes as they
mature. The criminal justice system, for example, accords juveniles greater leniency and permits
their records to be sealed or expunged so that offenders can move past early mistakes. Traditional
student privacy rules against school disclosure function similarly by limiting the information
available to outsiders. Today parents fear that students’ second grade performance or behavior
might limit future opportunities through a modern day version of the proverbial permanent
record. As noted in a recent report from the Executive Office of the President, “As learning itself
is a process of trial and error, it is particularly important to use data in a manner that allows the
benefits of those innovations, but still allows a safe space for students to explore, make mistakes,
and learn without concern that there will be long term consequences for errors that are part of
the learning process.”43
Parents also have specific concerns about companies marketing to their children. The
Children’s Online Privacy Protection Rule (COPPA), for example, requires commercial web
sites to obtain parental consent prior to collecting, using, or disclosing personal information
about children under 13 years old.44 Another federal privacy statute, the Protection of Pupil
Rights Amendment (PPRA), requires federally funded schools to obtain written consent
before administering surveys that include such sensitive information as political beliefs or
religious practices as well as collection, disclosure, or use of personal information for marketing
or sales.45
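As a rough illustration of how a consent requirement like COPPA's translates into product logic, the following sketch shows a hypothetical pre-collection check that blocks data collection from users reported as under 13 unless verifiable parental consent has been recorded. The function, field names, and age handling are illustrative assumptions, not drawn from the statute's implementing regulations or any real codebase.

```python
from datetime import date

COPPA_AGE = 13  # children under this age trigger the parental-consent requirement

def may_collect_personal_info(birthdate: date, parental_consent: bool,
                              today: date) -> bool:
    """Hypothetical pre-collection gate loosely inspired by COPPA's consent rule."""
    # Compute age in whole years as of `today`.
    age = today.year - birthdate.year - (
        (today.month, today.day) < (birthdate.month, birthdate.day))
    if age < COPPA_AGE:
        # Under 13: collect only if verifiable parental consent is on record.
        return parental_consent
    return True

# Example: a child born in 2010 with no consent recorded as of 2018 is blocked.
assert may_collect_personal_info(date(2010, 6, 1), False, date(2018, 1, 1)) is False
```

A real implementation would also need to handle age screening without retaining the birthdate itself, which is beyond this sketch.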

41
Elana Zeide, Student Privacy Principles for the Age of Big Data: Moving beyond FERPA and FIPPs, 8 Drexel L. Rev.
339 (2016); Polonetsky & Tene, supra note 2; Elana Zeide, The Structural Consequences of Big Data-Driven
Education, 5 Big Data, 164–172 (2017).
42
Zeide, supra note 43.
43
John Podesta, Big Data: Seizing Opportunities, Preserving Values (Executive Office of the President), May 1, 2014.
44
15 U.S.C. § 6501.
45
20 U.S.C. § 1232h.

For-Profit Fears
After decades of scandals, Americans place little trust in for profit education providers.46 The
norm is still public and non profit schools that explicitly serve educational missions. The same is
true of the organizations that traditionally offer data dependent services, such as The College
Board and National Clearinghouse. Most ed tech providers handling student data are for profit
companies, with Khan Academy and edX as notable exceptions.47 Public schools and nonprofits
are explicitly oriented and legally bound to education oriented missions. For profit education
companies, in contrast, face the primary pressure of generating revenue, which they may do at
the cost of educational quality. For profit colleges, for example, have been repeatedly reprimanded for investing in marketing rather than in academics.48

student privacy protection


Regulation of School Information
Parents and schools traditionally consider student privacy in terms of confidentiality. This
informs the primary regulatory protection governing education data for the past forty years, the
Family Educational Rights and Privacy Act (FERPA) and similar state statutes.49 FERPA created
rules restricting how federally funded schools, districts, and state educational agencies control
access to personally identifiable student information.50 It provides parents and students with
nominal control over school disclosure of personally identifiable student information. In
practice, however, most schools share information under the statute’s School Official Exception,
which requires the data recipient to be performing services on the schools’ behalf and have what
educators determine to be a legitimate educational interest in doing so.51 FERPA's regulations also indicate that recipients should not re-disclose FERPA-covered personally identifiable student information, although they can do so with de-identified data.52
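FERPA's de-identification standard (see note 52) turns on more than removing direct identifiers: released data must not describe groups of students so small that members of the school community could recognize them. The sketch below illustrates that idea with a simple small-cell suppression pass over tabular records; the threshold of five and the field names are assumptions chosen for illustration, not values taken from the regulations.

```python
from collections import Counter

MIN_GROUP_SIZE = 5  # assumed threshold; FERPA does not prescribe a specific number

def suppress_small_groups(records, quasi_identifiers):
    """Drop records whose combination of quasi-identifiers describes too few students."""
    counts = Counter(tuple(r[k] for k in quasi_identifiers) for r in records)
    return [r for r in records
            if counts[tuple(r[k] for k in quasi_identifiers)] >= MIN_GROUP_SIZE]

# Example: grade level and zip code could jointly single out a student in a small
# community, so combinations appearing fewer than five times are withheld.
sample = [{"grade": 7, "zip": "12345", "score": 88},
          {"grade": 7, "zip": "12345", "score": 91}]
print(suppress_small_groups(sample, ["grade", "zip"]))  # [] -- both rows suppressed
```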

Today’s Student Data Privacy


FERPA and similar state laws, however, do not address many issues raised by today’s data driven
ed tech. They were designed for a world of paper records, not networked, cloud based platforms

46
Why Do Americans Mistrust For-Profit Universities?, The Economist, July 2013.
47
Michelle Molnar, K-12 For-Profits Increasingly “Cashing in on Kids,” Education Dive (Apr. 8, 2014), [Link]
.[Link]/news/k-12-for-profits-increasingly-cashing-in-on-kids/247645/.
48
See, e.g., Committee on Health, Labor, and Pensions, U.S. Senate, For Profit Higher Education: The Failure to
Safeguard the Federal Investment and Ensure Student Success, Vol. 1. No. 1–3. U.S. Government Printing Office,
2012.
49
20 U.S.C. § 1232g.
50
Personally identifiable information includes direct and indirect identifiers such as the student's name, the names of parents or other family members, and the address of the student or family, as well as other information that, "alone or in combination, is linked or linkable to a specific student that would allow a reasonable person in the school community, who does not have personal knowledge of the relevant circumstances, to identify the student with reasonable certainty" (34 CFR § 99.3).
51
§ 99.31(a)(1) (School Officials Exception); FERPA Frequently Asked Questions: FERPA for School Officials (Family
Policy Compliance Office, U.S. Department of Education); id.; Brenda Leong, Who Exactly IS a “School Official”
Anyway?, Future of Privacy Forum Jan. 19, 2016; The Family Educational Rights and Privacy Act: Guidance for
Reasonable Method and Written Agreements (Family Policy Compliance Office, U.S. Department of Education).
52
FERPA’s definition for de-identication is more stringent than simply stripping data of unique identifiers. The data also
can’t include information about such small groups of students that they would be easily recognizable by community
members. § 99.3.

that collect information automatically.53 This is a sharp contrast to long-standing expectations, supported by the federal student privacy regulatory regime, that personally identifiable information primarily remains within the confines of public or nonprofit educational institutions.
As discussed above, schools today outsource a broad variety of services that use student data, everything from administrative data systems to cafeteria billing. Individual educators adopt innovative apps, often without a concrete sense of what happens to the data they share.54 Private, for profit companies design and provide the vast majority of these technologies.
Networked and cloud based platforms continuously and automatically collect a wide array of
administrative and academic data about students. Even diligent educators find it difficult to
understand exactly what information they share with the companies that provide these tools,
let alone monitor and evaluate providers’ own privacy practices.55
Different rules govern the disclosure and subsequent sharing and use of learner information obtained through schools or directly from learners. Ed tech providers must meet different privacy requirements depending on whether they collect data about students using technologies in schools or about learners independent of formal education institutions. User information shared under the auspices of public and publicly funded schools is subject to the student privacy regime, most notably FERPA.56 The less restrictive consumer privacy rules apply if a company obtains the exact same data points directly from users.57 As a result, the Department of Education has no recourse against companies regarding their collection, use, disclosure, and protection of information received from schools.
State legislators have responded with a wave of regulation that imposes more requirements on
schools sharing student information. Some apply directly to commercial ed tech providers, limiting how they can use the data they receive.58
Several states have passed rules limiting company use of student data to "educational" purposes. Most prominently, California law confines operators who knowingly provide services used primarily for K 12 "school purposes" to only use information, including creating profiles, to
serve such purposes.59 It defines “K 12 school purposes” as those “that customarily take place at

53
Zeide, supra note 43; Elise Young, Educational Privacy in the Online Classroom: FERPA, MOOCs, and the Big Data
Conundrum, 28 Harv. J. Law & Tec. 549 (2015); Reidenberg et al., supra note 25; Daniel J. Solove, FERPA and the
Cloud: Why FERPA Desperately Needs Reform, LinkedIn (Dec. 11, 2012), [Link]
20121211124311-2259773-ferpa-and-the-cloud-why-ferpa-desperately-needs-reform.
54
Zeide, supra note 43; Young, supra note 55; Reidenberg et al., supra note 25; Solove, supra note 55; Natasha Singer,
Privacy Pitfalls as Education Apps Spread Haphazardly, N.Y. Times, Mar. 11, 2015, [Link]
.[Link]/2015/03/12/technology/learning-apps-outstrip-school-oversight-and-student-privacy-is-among-the-risks
.html; Khaliah Barnes, Why a "Student Privacy Bill of Rights" Is Desperately Needed, Washington Post, Mar. 6,
2014, [Link]
ately-needed/.
55
Zeide, supra note 43; Young, supra note 55; Reidenberg et al., supra note 25; Solove, supra note 55; Singer, supra note
56; Barnes, supra note 56.
56
While often discussed in terms of “prohibitions,” FERPA technically does not bar certain practices but instead
imposes conditions that schools, districts, and education agencies must meet in order to receive federal funding. 20
U.S.C. § 1232g(a)(1)(A); 20 U.S.C. § 1232g(a)(1)(B).
57
Steve Kolowich, Are MOOC-Takers “Students”? Not When It Comes to the Feds Protecting Their Data, Chron.
Higher Educ. Blogs (Dec. 3, 2014), [Link]; Zeide, supra note 43; Young, supra note 55.
58
Student Data Privacy Legislation: A Summary of 2016 State Legislation (Data Quality Campaign), Sept. 2016; State
Student Privacy Law Compendium (Center for Democracy and Technology), Oct. 2016.
59
The law specifically governs any “operator of an Internet Web site, online service, online application, or mobile
application with actual knowledge that the site, service, or application is used primarily for K–12 school purposes and
was designed and marketed for K–12 school purposes.” CAL. BUS. & PROF. CODE § 22584(a).

the direction of the K 12 school, teacher, or school district or aid in the administration of school
activities, including, but not limited to, instruction in the classroom or at home, administrative
activities, and collaboration between students, school personnel, or parents or are for the use and
benefit of the school.”60 The statute specifically carves out an exception for personalized
learning platforms, stating that its provisions do “not limit the ability of an operator to use
student data for adaptive learning or customized student learning purposes.”61 By regulating
vendors directly, these types of laws address some concerns about commercial use of student data
without increasing the burden on schools to keep track of data recipients’ privacy practices.
However, it remains to be seen how these laws will play out in practice.

Consumer Privacy Protection for “Learner” Data


Most student privacy protection, however, does not cover a wide variety of data collected directly
from “learners” by independent education providers. These frequently present themselves as
promoting educational missions, but are firmly within the commercial sphere, subject to financial pressures to turn a profit and to more lenient consumer privacy regulations. Under these
rules, companies can share and use data with anyone for any purpose in accordance with the
terms of service accepted by their users.62
Many scholars and advocates, as well as the President’s Council of Advisors on Science and
Technology, question whether user acceptance of the vague fine print in privacy policies
constitutes meaningful knowledge or consent.63 Reading privacy policies is not only time-consuming, but often provides little meaningful information about specific corporate information practices.64 Further, users may have no realistic alternative to using a specific data
driven tool.65 Finally, even providing such notice and consent may not be possible given the
size and continuous collection capacity of sensor based devices connected to the emerging
Internet of Things.66

tomorrow’s best practices


As discussed above, data driven education technologies can provide tremendous benefit to students and the broader education system and labor markets. However, the volume, velocity, and

60
CAL. BUS. & PROF. CODE § 22584(j); Amelia Vance, Data Privacy Laws Follow Lead of Oklahoma and California, 16 The State Education Standard 25 (May 2016).
61
CAL. BUS. & PROF. CODE § 22584(l); see also Jean-Louis Maritza, California Breaks New Ground in Education
Privacy Law with K-12 Student Data Privacy Bill, Nat. L. Rev, (Sept. 14, 2014).
62
The Federal Trade Commission (FTC) and state attorneys general may impose fines or bring suits if they determine
that companies have not upheld their promises and engage in deceptive business practices. The Federal Trade
Commission Act, 15 U.S.C. § 45(a)(2) (section 5).
63
President’s Council of Advisors on Science & Technology, Big Data and Privacy: A Technological Perspective, The
White House (May 1, 2014), [Link]/bigdata.
64
Joel R. Reidenberg et al., Disagreeable Privacy Policies: Mismatches between Meaning and Users’ Understanding, 30
Berkeley Tech. L. J. 39 (2015); Norman Sadeh et al., The Usable Privacy Policy Project (Technical Report, CMU-
ISR-13–119, Carnegie Mellon University, 2013).
65
Joel R. Reidenberg et al., Privacy Harms and the Effectiveness of the Notice and Choice Framework, 11 ISJLP 485
(2015); Robert H. Sloan & Richard Warner, Beyond Notice and Choice: Privacy, Norms, and Consent, 14 J. High
Tech. L. 370 (2014).
66
Carlo Maria Medaglia & Alexandru Serbanati, An Overview of Privacy and Security Issues in the Internet of Things, in
The Internet of Things 389 (2010).
82 Elana Zeide

variety of big data limit the capacity of individuals and institutions to control, consent to, and
oversee data collection, sharing, and quality that have traditionally protected “student privacy.”67
As a result, companies and educators often struggle to determine which practices are not only
legal but also acceptable to students, learners, and the education community. In the absence of
specific rules or social consensus, it rests on traditional educators and independent education
providers to determine what constitutes student privacy through their technological structures,
day to day protocols, and formally articulated standards. The following best practices consider
ing education stakeholders’ expectations, transparency, and accountability are excellent places
to start.

Consider Student Expectations and Education Values


It is important for commercial and private education technology providers and the educators and
policymakers who rely on their services to consider the perspectives of multiple stakeholders
across the educational context. This includes not only the schools and administrators who are their clients, but also students, parents, educators, and the broader community. When creating technologies and communicating with stakeholders and the broader community, ed tech providers and data processors need to take into account the considerations noted earlier, which make
expectations different in education. Data policies and practices should accord with the high
stakes of the education context and the trust required in learning environments. This may entail
minimizing collection and retention to only essential information, or prohibiting data practices
that do not explicitly serve school or student interests.

Transparency
A best practice to prevent panicked responses to new information practices is to provide as much
transparency as possible regarding specific information practices in ways that the general public
can understand, as well as providing more detailed information for those who seek more
specifics.68 In creating systems and implementing governance and ethical review protocols,
companies should consider ways they can make their practices more transparent in terms of (1)
data practices; (2) specific technologies; (3) decision making; (4) security; and (5) oversight. This
can be difficult given the velocity, variety, and volume of big data and the complexity of
algorithmic and automated analysis. It is also complicated by the fact that many companies
consider their data processing proprietary information they do not want to release for public
scrutiny. However, the black box of big data decision making creates considerable uncertainty
that feeds public and parental fears.

Accountability
Since the Facebook’s “emotional contagion” scandal in 2015, companies are under increasing
pressure to put similar protocols or review mechanisms in place to ensure that they do not

67
Zeide, supra note 43.
68
Privacy Technical Assistance Center, Protecting Student Privacy While Using Online Educational Services: Require-
ments and Best Practice (Privacy Technical Assistance Center, U.S. Department of Education 2014), Feb. 25, 2014;
Department of Education, Transparency Best Practices (PTAC 2014).

manipulate their users in exploitative or potentially harmful ways.69 Facebook, for example, has
implemented a formal review process before beginning certain user based experiments and publishing the results.70 Advocates and policymakers debate whether companies should be required to create such internal review systems or be accountable to external ethical review boards.71 Regardless, it behooves companies, especially those dealing with children, to create review systems to consider ways that information practices might disadvantage student data subjects.
Companies should be prepared to account for their decision making in terms of specific
individuals and broader patterns. While the small decisions that drive personalized learning
platforms or games might seem inconsequential, they have a growing impact on students' academic progress, attainment, and future opportunities. This is where self regulation, such as pledges and certifications backed up by auditing and accountability, can be useful.
In a regulatory system that is fractured and difficult for companies, let alone stakeholders, to
comprehend, many companies turn to self regulation to reassure students and stakeholders that
they take user privacy seriously. The Student Privacy Pledge, organized by the Future of Privacy Forum and the Software and Information Industry Association (SIIA), for example, has over 300 company and organization signatories who have promised to abide by ten principles, which include not selling personal information and "not build[ing] a personal profile of a student other than for
supporting authorized educational/school purposes or as authorized by the parent/student.”72
Signatories who fail to abide by these principles may be subject to FTC enforcement.
Other companies take advantage of nonprofits and advocacy groups who offer “seals” signaling
compliance with certain regulatory and security standards. These generally involve companies
passing an internal audit of their practices before obtaining a certificate of compliance that they
can then display on their websites and promotional material. Prominent privacy toolkit and audit providers include iKeepSafe and the Consortium for School Networking (CoSN).73 These
self regulatory mechanisms are, however, successful only to the degree that stakeholders trust that
participants are actually audited and held accountable for noncompliance.74

conclusion
Ed tech offers tremendous potential to provide a better education for students worldwide and of
all ages. The data collected allows learning platforms to adjust course content, pacing, and
testing automatically to suit individual student needs and better inform students, parents,

69
Tene & Polonetsky, supra note 41; Ryan Calo, Consumer Subject Review Boards: A Thought Experiment, 66 Stan.
L. Rev. Online 97 (2013).
70
Molly Jackman & Lauri Kanerva, Evolving the IRB: Building Robust Review for Industry Research, 72 Wash. & Lee
L. Rev. Online 442 (2016).
71
Tene & Polonetsky, supra note 41; Camille Nebeker et al., New Challenges for Research Ethics in the Digital Age, in
Beyond IRBs: Ethical Review Processes for Big Data Research (Future of Privacy Forum Washington, D.C.),
Dec. 10, 2015.
72
Pledge to Parents & Students (Future of Privacy Forum & Software & Information Industry Association), 2016.
73
FERPA [Link] (iKeepSafe); Consortium for School Networking, Protecting Student Privacy (Consortium of
School Networks).
74
Michele Molnar, Student-Privacy Pledge for Ed-Tech Providers Draws Praise, Criticism, Educ. Week (Oct. 12, 2014), [Link]; Sophia Cope & Gennie Gebhart, Loopholes and Flaws in the Student Privacy Pledge, Electronic Frontier Found. (Oct. 20, 2016), [Link]; Jules Polonetsky, Student Privacy Pledge Loopholes? Nope. We Did Our Homework., Future of Privacy Forum (Oct. 21, 2016), [Link].

schools, companies, and policymakers making education related decisions. At the same time,
schools’ routine sharing of student information with outsiders and independent education
providers’ collection of learner data raise questions about unintended inequity, unforeseen
consequences, and whose interests will be prioritized. Private companies can legally use student
data in ways that worry parents and advocates, even under newer state laws that restrict
processing information for noneducational purposes. Entities handling student data can take a
proactive stance by considering different stakeholder perspectives, providing meaningful transparency, and ensuring accountability. Doing so helps reassure students, educators, and the broader community not only of their good intentions, but also of their thoughtful implementation of data driven systems. The transformative potential of data driven ed tech requires sufficient stakeholder trust and support. Tailoring privacy practices to the specific considerations of learning
environments paves the way for truly revolutionary innovation.
5

Mobile Privacy Expectations

How Privacy Is Respected with Mobile Devices

Kirsten Martin and Katie Shilton

introduction
Privacy expectations about the information collected by our mobile devices have been a specter in
the popular press since the advent of smartphones in 2007 and in the academic literature for
years before that (Curry, 2002). In this time, little has been solved and concerns have only grown.
Consider recent news that antichoice groups have targeted women visiting Planned Parenthood
locations with antiabortion advertisements using a technique based on phone location data
(Coutts, 2016). Or news that a popular pregnancy app was exposing consumers to privacy threats
due to a software bug (Beilinson, 2016). Mobile application company Uber was hit with criticism
after journalists revealed it had access to all of its users' locations through a so-called "God view" (Bellware, 2016). A big data company gathered the mobile device IDs of Iowa caucus-goers and de-anonymized them by matching those IDs with real-world online profiles (Hill, 2016).
These incidents cause alarm because individuals use mobile devices to socialize, communicate, play, shop, bank, and monitor their own behavior, health, and moods. Mobile devices enable novel possibilities for human interaction while changing the privacy landscape in important ways. New types of information are available through smartphones, tablets, and e-readers, and mobile applications enable new data collection actors. Data may be collected
by application developers (e.g., Rovio Games), mobile providers (e.g., AT&T), operating system
providers (e.g., Google), and device manufacturers (e.g., Blackberry and Apple). These data may
also be shared with third party tracking or advertising companies.
Consumers and regulators both struggle to define their own expectations for the privacy of this
data. When is it acceptable for companies to track, store, and sell data? Regulatory bodies such as
the US Federal Trade Commission have suggested privacy approaches for developers (Federal
Trade Commission, 2012), and industry groups have begun to make recommendations (Future
of Privacy Forum & Center for Democracy & Technology, 2012). Meanwhile, watchdog groups
defending consumer privacy rights have brought cases on behalf of consumers against companies such as a flashlight app that was recording users' locations (Kang, 2013). The European
Union has taken a much more proactive regulatory stance. Beginning in 2018, the EU’s General
Data Protection Regulation will apply to any company processing the data of EU citizens, and
will require measures such as strong forms of notice, strong privacy defaults, privacy impact
assessments, and hiring of a data protection officer (Jones, 2016).
In this chapter, we focus on the ever changing ecosystem of applications, hardware, operating
systems, and telecommunication companies that collect increasing amounts of personal data


[Figure: diagram of the mobile device ecosystem showing government, fellow users, browsers, telecom/ISP, researchers, firms, mobile app advertising, app store, ad networks, app developers, data brokers, and phone/device.]
figure 5.1 The mobile device ecosystem

(see Figure 5.1). Mobile developers are key agents impacting consumer privacy, deciding what
data to collect and how to store and share it. Yet, application developers struggle to find a
common understanding of users’ privacy expectations (Greene & Shilton, 2017). In this chapter,
we explore what we know about the privacy challenges raised by mobile devices; user privacy expectations with regard to mobile devices; and developer responses to those challenges. In so doing, we illustrate the role of each actor in the mobile ecosystem in respecting privacy
expectations.

what we know about privacy and mobile devices


1 Mobile Technologies Have Technological Properties that Change Social Practices
What’s so new about mobile devices? They are host to increasingly prevalent digital activity.
Consumers use their phones for a wide range of application-enabled activities, from accessing
social media to banking and performing searches on health conditions. In 2015, 64% of Americans owned a smartphone (Smith, 2015). Between 2013 and 2015, mobile application usage
grew 90% and contributed to 77% of the total increase in digital media time spent by US
consumers. Two out of every three minutes Americans spend with digital media is on a mobile
device, and mobile applications constitute just over half of those minutes (Lella & Lipsman,
2015). In many ways, mobile devices are simply pocket-sized laptops. But in other ways, their
unique technological affordances – what actions they enable easily or disallow entirely (Friedman & Nissenbaum, 1996; Shilton, Koepfler, & Fleischmann, 2013) – matter because they shift
social practices.
First, new types of information are available through smartphones, tablets, and e-readers. Data
such as location, motion, communications content, in-application activities, and sound are
easily gathered through mobile devices. Much of the literature on privacy and mobile devices, in
particular, has focused on the fact that mobile devices uniquely enable the collection of location
information. Using cell-tower triangulation, WiFi mapping, and GPS, mobile phones can

record a user’s location frequently and accurately (Barkhuus & Dey, 2003; Decker, 2008;
He, Wu, & Khosla, 2004; Shilton, 2009).
Second, mobile applications enable new data collection actors. Data may be collected by a
variety of companies: application developers, mobile providers, operating system providers, and
device manufacturers. These data may also be shared with third party tracking or advertising
companies. Mobile data may be compelled from companies during law enforcement investigations (Soghoian, 2011), or "skimmed" from geographic locations using special devices that
imitate cell towers (“Stingray Tracking Devices,” 2016).
Finally, the smartphone or tablet is more personal and connected to the individual than are
desktops or other computing devices. Many consumers carry their mobile devices with them at
all times and the device is intimate to them in a way that differs from the shared personal
computer at home or work. Tracking of user information gathered from phones may feel more
personal, invasive, and ubiquitous than surveillance of work or home computers.
This personalization is important because how users relate to their mobile devices is different
from how they relate to their desktop. Users have a more intimate relationship with devices they
carry on their person, accessible everywhere (but also following them everywhere). This allows
users to share a constant flow of information (where they are, who they are with, what they are doing, and what they are worried about) with the mobile device and, possibly, the entire ecosystem in Figure 5.1. This change in social practices leads us to point #2: surveillance and
mobile devices.

2 Surveillance Is Highly Salient for Mobile Device Users


The personal and omnipresent nature of mobile devices, and their data collection capabilities,
lend themselves to ubiquitous tracking. Mobile phones not only track online behaviors such as
browsing and shopping habits, but they can also track previously offline activities such as
commuting routes and habits, frequented locations, and interpersonal contacts. Practically,
consumers’ online life is as deeply integrated into their social life and as radically heterogeneous
as their offline life (Nissenbaum, 2011). The ability to track both online and offline activities with
mobile devices lends itself to concerns about surveillance.
Jeffrey Rosen (2000) frames surveillance as the unwanted gaze from direct observations as well
as from searches on stored records. Famously, Foucault used the architectures of hospitals and
prisons as classic illustrations of surveillance, where persistent observation is used to maintain
control (Foucault, 1977). Foucault’s panopticon includes a centralized, hidden actor in a tall
guard tower to watch prisoners in surrounding prison cells (see also Bentham, 1791). The
individuals under observation begin to self monitor and to curtail prohibited or even perfectly
legal but socially stigmatized behaviors. Importantly, “spaces exposed by surveillance function
differently than spaces that are not so exposed” (Cohen, 2008, p. 194) by changing how
individuals behave and think due to the fear of being watched and judged by others.
As noted by Kirsten Martin (2016), surveillance is a concern because it frustrates the need of
individuals to be unobserved (Benn, 1984), discourages expressions of uniqueness, and complicates individuals' development of a sense of self (Bloustein, 1964; Fried, 1970; Rachels, 1975). Unobserved personal space permits "unconstrained, unobserved physical and intellectual movement," enabling critical, playful individual development and relationship cultivation (Cohen,
2008, p. 195).
Mobile surveillance is particularly effective in changing behavior and thoughts when individuals (1) cannot avoid the gaze of the watcher and (2) cannot identify the watchers (Cohen, 2008).

In other words, both the breadth of information gathered and the tactic of invisibility contribute
to the problem of surveillance with mobile devices (Martin, 2016). Further complicating the
problem of surveillance is that mobile devices support aggregating data across disparate contexts
and contribute to the perception that surveillance is impossible to avoid. Many of the actors in
the ecosystem pictured in Figure 5.1 (data brokers, ISPs, and device manufacturers) have the capability to create a data record that tells a richer, more personalized story than any individual data point. The mosaic theory of privacy explains why privacy scholars are concerned with all
elements of tracking, including transaction surveillance and purchasing behavior (Strandburg,
2011). The mosaic theory of privacy suggests that the whole of one’s movements reveals far more
than the individual movements comprising it (DC Circuit, p. 647; Kerr, 2012; United States v.
Jones, 2012), where the aggregation of small movements across contexts is a difference in kind
and not in degree (Strandburg, 2011).
Mobile devices allow for broad user surveillance across many areas of life and support tracking
not only online activity but also the user’s associated offline activity, such as where they are and
who they are with. As Brunton and Nissenbaum (2011) note, “Innocuous traces of everyday life
submitted to sophisticated analytics tools developed for commerce and governance can become
the keys for stitching disparate databases together into unprecedented new wholes.”
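To make the mosaic point concrete, the toy sketch below (invented for illustration, not drawn from any cited study) shows how individually innocuous, timestamped location observations can be aggregated into an inference, here a guess at a user's home area, that no single observation reveals on its own.

```python
from collections import Counter

# Each observation is innocuous on its own; the aggregate supports an inference.
observations = [
    {"hour": 23, "cell": "grid_17"},   # late evening
    {"hour": 2,  "cell": "grid_17"},   # overnight
    {"hour": 14, "cell": "grid_42"},   # midday, elsewhere
    {"hour": 1,  "cell": "grid_17"},   # overnight
]

# Infer a likely home area from where the device tends to sit overnight.
overnight = [o["cell"] for o in observations if o["hour"] >= 22 or o["hour"] <= 5]
likely_home, count = Counter(overnight).most_common(1)[0]
print(likely_home, count)  # grid_17 3 -- an inference available only from the whole
```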

3 Context Matters for Privacy Expectations for Mobile Devices


A popular, yet outdated (Martin & Nissenbaum, 2016), approach to privacy attempts to group
consumers by privacy preferences. Alan F. Westin famously claimed consumers ranged from
“privacy fundamentalists” with greater concern for privacy, to “privacy pragmatists,” willing to
supposedly trade privacy for certain benefits, to “privacy unconcerned” (Westin, 1970). However,
recent empirical research has shown Westin’s privacy categories to be relatively unimportant in
relation to contextual elements in privacy judgments (King, 2014; Martin & Nissenbaum, 2016).
In one study, even “privacy unconcerned” respondents rated data collection vignettes as not
meeting privacy expectations on average, and respondents across categories had a common
vision of what constituted a privacy violation. In fact, whether the information was used within the stated context or for commercial purposes drove whether privacy expectations were met, rather than differences in privacy 'concerns' (Martin & Nissenbaum, 2016).
These theories, and the empirical data that support them, apply to mobile application users as well. Much of the research on user privacy expectations for mobile devices resides in the
human computer interaction and usability research areas (Anthony, Kotz, & Henderson,
2007; Palen & Dourish, 2003). Though early work in this space attempted to sort consumers
into groups based on privacy or data sharing preferences or policies (Sadeh et al., 2009), recently,
much of this work has reflected the important movement in privacy scholarship that approaches
privacy expectations less as personal preferences, and more as social, contextually dependent
phenomena (Ahern et al., 2007; Khalil & Connelly, 2006; Mancini et al., 2009). This work posits
that privacy expectations are based on social norms within particular information contexts
(Nissenbaum, 2009). Those contextual privacy norms dictate what data it is acceptable to
collect, who can have access to it, whether it should be kept confidential, and how it can be
shared and reused.
When privacy expectations are context specific, norms around what information should be
disclosed and gathered and for what purpose are developed within a particular community or
context. Shopping online, talking in the break room, and divulging information to a doctor are
each governed by different information norms. As Nissenbaum states, “the crucial issue is not

whether the information is private or public, gathered from private or public settings, but
whether the action breaches contextual integrity” (Nissenbaum, 2004, p. 134).
This contextual approach is consistent with a social contract approach to privacy expectations
(Culnan & Bies, 2003; Li, Sarathy, & Xu, 2010; Martin, 2012; Xu, Zhang, Shi, & Song, 2009) in
which rules for information flow take into account the purpose of the information exchange as
well as risks and harms associated with sharing information. This approach allows for the
development of contextually dependent privacy norms between consumers and businesses.
These norms have been shown to take into account (Martin, 2015b; Nissenbaum, 2010):

• Who/Recipients: the people, organizations, and technologies who are the senders, recipients, and subjects of information.
• What/Information: the information types or data fields being transmitted.
• How/Transmission principles: the constraints on the flow of information.
• Why: the purpose of the use of information.
Key to all contextual definitions of privacy is how the main components work together: who receives the information, what type of information is involved, how it is used, and for what purpose within a particular context.
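One way to see how these parameters interact is to represent an information flow as a structured record and evaluate it against context-specific norms. The sketch below is a minimal, hypothetical encoding: the Flow fields mirror the who/what/how/why parameters above, but the norms shown are invented for illustration rather than taken from Nissenbaum's framework or the authors' studies.

```python
from dataclasses import dataclass

@dataclass
class Flow:
    context: str     # the social context of the exchange (e.g., "navigation")
    recipient: str   # who receives the information
    info_type: str   # what type of information is transmitted
    purpose: str     # why the information is used

# Invented example norms: which information types and purposes each context permits.
NORMS = {
    "navigation": {"info": {"location"}, "purpose": {"routing"}},
    "flashlight": {"info": set(), "purpose": set()},
}

def meets_contextual_norms(flow: Flow) -> bool:
    """Check a flow against the (illustrative) norms of its context."""
    norm = NORMS.get(flow.context)
    return (norm is not None
            and flow.info_type in norm["info"]
            and flow.purpose in norm["purpose"])

# A flashlight app collecting location for advertising violates its context's norms;
# a navigation app using location for routing does not.
print(meets_contextual_norms(Flow("flashlight", "ad network", "location", "advertising")))  # False
print(meets_contextual_norms(Flow("navigation", "app developer", "location", "routing")))   # True
```

Recipients and transmission principles could be checked in the same way; the point is simply that the parameters operate jointly, not as standalone labels of 'private' or 'public' data.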
Ongoing research that connects mobile devices to contextual privacy attempts to understand
what contexts, actors, values, transmission principles, and information uses matter to users’
mobile privacy expectations. For example, location data may be required for contexts such as
navigation but inappropriate for a flashlight application (Kang, 2013); anonymity may be
appropriate for a context such as Internet searching, but inappropriate in a context such as
social networking. Contextual privacy also stipulates that data types cannot be deemed ‘private’
or ‘public’ across contexts. Tactics such as behavioral advertising, data collection and retention,
and tracking may be appropriate and within the contextually defined privacy norms in one
context while inappropriate in another. Importantly, mobile phones cross contexts as users carry
them throughout their days, and use apps which may touch upon contexts such as banking,
education, socializing, and leisure time.
Researchers such as Lin et al. (2012) have begun this effort by measuring sensitive data types
and user reactions to the purpose of data collection in the mobile ecosystem. They found that
the purpose of information use was a powerful factor in meeting individuals’ expectations for an
app, and that explanations of information use could allay users’ concerns about data collection.
We have also conducted empirical research in this space, including a factorial vignette survey
that measured the impact of diverse real world contexts (e.g., medicine, navigation, and music),
data types, and data uses on user privacy expectations (Martin & Shilton, 2015; Martin & Shilton,
2016). Results demonstrate that individuals’ general privacy preferences are of limited signifi
cance for predicting their privacy judgments in specific scenarios. Instead, the results present a
nuanced portrait of the relative importance of particular contextual factors and information uses,
and demonstrate how those contextual factors can be found and measured. The results also
suggest that current common activities of mobile application companies, such as harvesting and
reusing location data, images, and contact lists, do not meet users’ privacy expectations.
In our study, each survey respondent was shown a series of vignettes that varied based on:

• Who: The data collection actor, the primary organization collecting information, such as the application developer, a third party placing an ad, the app store, or the mobile phone provider;
• What: The type of information received or tracked by the primary organization, such as location, accelerometer, demographic data, contacts, keywords, user name, and images;

• Why: The application context, e.g., playing games, checking weather, participating in social networking, navigating using maps, listening to music, banking, shopping, and organizing personal productivity;
• How (used): How the data is reused or stored, including the length of storage, whether the data was tied to a unique identifier for personalization, and the secondary use, such as retargeting ads, social advertising, or selling to a data exchange.
This generated vignettes such as the following; underlining highlights the factors (independent
variables) that would systematically change.

Targeting Vignette Sample:


While using your phone, you check updates on a social networking application that you have used
occasionally for less than a month.
The social networking app shows you an advertisement for another application they sell based on
your phone contact list.
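The factorial logic behind such vignettes can be sketched as a cross-product of factor levels. The snippet below is a simplified reconstruction, with only a few invented levels per factor and a generic template, rather than the authors' actual survey instrument.

```python
from itertools import product

# A few illustrative levels for each factor (the study used more).
actors = ["the app developer", "a third party placing an ad", "the app store"]
data_types = ["your location", "your contact list", "keywords you type"]
contexts = ["a navigation app", "a social networking app", "a banking app"]
uses = ["to retarget ads to you", "to sell to a data exchange"]

TEMPLATE = "While using {context}, {actor} collects {data} {use}."

vignettes = [TEMPLATE.format(context=c, actor=a, data=d, use=u)
             for a, d, c, u in product(actors, data_types, contexts, uses)]

print(len(vignettes))  # 3 * 3 * 3 * 2 = 54 vignette combinations
print(vignettes[0])    # one fully crossed combination
```

Rating a sample of such systematically varied vignettes is what lets the influence of each factor be estimated separately.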

We found that tracking scenarios met privacy expectations to a lesser extent than targeting scenarios (means = −42.70 and −18.01, respectively) on the 'meeting privacy expectations' scale of −100 to +100. In addition, some data types were particularly sensitive regardless of context or
at least particularly surprising to respondents. Harvest of both images and contact information
universally failed to meet user privacy expectations. This may be because it is not widely known
that these data can be harvested by mobile applications, or it may be that these data types are
particularly sensitive. Application developers engaged in privacy by design may wish to avoid
collecting, using, or selling images or contact information from phones.
Among the contextual factors, generally, the type of information mattered most to respondents' privacy judgments. The use of contact information (β = −70.10) and image information (−78.84) were the most (negatively) influential types of information, followed by the individual's name (−19.51), friend information (−20.35), accelerometer (−15.17), and location (−13.33). All of these data types negatively impacted meeting privacy expectations for targeted advertising compared to using demographic information. However, using keywords (11.49) positively impacted meeting privacy expectations for targeted advertising compared to using demographic information. For tracking scenarios, the secondary use of information was the most important factor impacting privacy expectations. Selling to a data exchange (β = −47.17) and using tracked information for social advertising to contacts and friends (−21.61) both negatively impacted meeting privacy expectations.
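Because the reported coefficients are expressed on the same −100 to +100 expectation scale, their relative effects can be compared directly. The snippet below is only a back-of-the-envelope reading of the targeted-advertising coefficients above (relative to the demographic-information baseline), not a re-analysis of the underlying data.

```python
# Targeted-advertising coefficients reported above, relative to vignettes that
# used demographic information (the reference category).
beta = {
    "contacts": -70.10, "images": -78.84, "name": -19.51, "friends": -20.35,
    "accelerometer": -15.17, "location": -13.33, "keywords": 11.49,
}

def shift_vs_demographics(info_type: str) -> float:
    """Change in the expectation rating (on the -100..+100 scale) vs. the baseline."""
    return beta[info_type]

# Swapping demographic data for harvested images pushes expectations sharply down,
# while keywords nudge them slightly up; the gap between the two is about 90 points.
print(shift_vs_demographics("images"))                                      # -78.84
print(shift_vs_demographics("keywords"))                                    # 11.49
print(shift_vs_demographics("images") - shift_vs_demographics("keywords"))  # -90.33
```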
The results indicate that very common activities of mobile application companies (harvesting
and using data such as location, accelerometer readings, demographic data, contacts, keywords,
name, images, and friends) do not meet users’ privacy expectations. But users are not monolithic
in their privacy expectations. For example, users expect navigation and weather applications to
use location and accelerometer data, and users expect a link between harvesting keywords and
targeted advertising. The results show that nuanced, contextual privacy concerns can be
measured within an industry. For example, navigation applications should feel confident
collecting users’ location data, but should not collect image data. Navigation applications can
make design changes to avoid data besides location, supporting privacy by design. These findings
are important because regulators and companies should not treat consumers as idiosyncratic

as to their privacy preferences or privacy expectations. Consumers, when asked, can be fairly
consistent as to the minimum protections they expect in regards to their information being
accessed, stored, shared, and used.
The need to define appropriate contextual boundaries for mobile data privacy highlights the
importance of "apps" as a contextual concept. Apps, single-purpose pieces of software, are a place where we can design for context in a device that otherwise spans multiple contexts.

challenges to meeting privacy expectations with mobile devices


4 Government Regulations on Mobile Surveillance Have Been Slow to Catch Up
Perhaps surprisingly, given the sensitivity of mobile data and the fact that mobile data collection has been growing for almost ten years, mobile data collection and use are largely unregulated in the United States. Some health applications may be regulated under the Health Insurance Portability and Accountability Act (HIPAA) (Martínez Pérez, Torre Díez, & López Coronado, 2014),
and all applications must ensure that they do not collect data from children under 13. Beyond
these restrictions, regulation of mobile data collection is left to the Federal Trade Commission
(FTC) under its jurisdiction to punish unfair and deceptive trade practices (Solove & Hartzog,
2014). For example, the FTC fined a flashlight application that was tracking the location of its
users (Kang, 2013) and has published recommended security best practice for mobile application
developers (Bureau of Consumer Protection, 2013).
The FTC, however, only regulates private companies. US government rules and regulations
about how the federal government may use mobile data for law enforcement are also in flux
(Pell & Soghoian, 2012). Law enforcement regularly uses either cell tower data requested from
phone providers, or increasingly, specialized devices called “stingrays” which mimic cell towers
to track the location of individuals (Michael & Clarke, 2013).
Court decisions are increasingly recognizing the privacy rights of users of mobile devices.
A 2014 Supreme Court case declared that law enforcement officers require a warrant to search
mobile devices (Liptak, 2014). But in 2016, an appellate court declared that a warrant was not
necessary to obtain a suspect’s historical cell phone data from a phone company (Volz, 2016).
The multi actor ecosystem in which mobile data resides was used as the justification for the
majority’s decision; judges argued that once user location was stored by a third party, it was no
longer protected under the Constitution’s Fourth Amendment. The third party doctrine domin
ates the current legal understanding of privacy expectations: information given to a third party is
not considered to have legal privacy expectations (Kerr, 2009). However, recent scholarship has
called into question the utility of relinquishing privacy protections of all information shared with
a third party (Richards & King, 2016).

5 Corporate Best Practices Also Slow to Adapt


Governments are not the only actors in the mobile ecosystem struggling with questions of
fairness and privacy while dealing with mobile data. How firms should best meet consumer
privacy expectations in the mobile space is an unanswered question. This growing market lacks
standards of practice for addressing privacy, as most uses of data collected by mobile applications
are unregulated (Federal Trade Commission, 2012; Shilton, 2009).
Fair Information Principles (FIP), and in particular notice and choice, serve as one source of
guidance for self regulation within the industry (Bowie & Jamal, 2006; Culnan & Williams, 2009;

Milne & Culnan, 2002), and could be adapted to the mobile sector (Federal Trade Commission, 2012). However, relying on notice and choice raises both practical and philosophical
problems. First, “choice” is a problematic concept when individuals perceive that opting out
of application usage has more costs than benefits. In addition, surveys and experiments have
shown that individuals make judgments about privacy expectations and violations regardless of
the content of privacy notices (Beales & Muris, 2008; Martin, 2013; McDonald & Cranor,
2008; Milne, Culnan, & Greene, 2006; Nissenbaum, 2011). As noted in a study on the role of
privacy notices in consumer trust (Martin, 2017), privacy notices are also frequently difficult to
read (Ur, Leon, Cranor, Shay, & Wang, 2012), misleading (Leon, Cranor, McDonald, &
McGuire, 2010), and difficult to find (Leon et al., 2012). Notices are also time-consuming (McDonald & Cranor, 2008) and not always targeted towards consumers. Mobile devices, with
their small screens and limited user interface features, exacerbate these issues (Schaub,
Balebako, Durity, & Cranor, 2015).
Companies also struggle to identify who, in a complex data sharing ecosystem, should be
responsible for user privacy. Platform providers such as Apple require independent app developers to provide privacy policies; Apple screens for these policies during its app store approval process. While adhering to a notice and choice model, the app store approval process
provides a clear signal that Apple views consumer privacy as a shared responsibility between
users (who must read notices), developers (who must set policies), and the platform itself
(which enforces compliance) (Shilton & Greene, 2016). In contrast, Google’s popular Android
marketplace does not screen apps submitted to its store (Google Play, 2016). While its
developer policies emphasize that developers should include privacy notices, the open nature
of its store signals much more reliance on developers and users to monitor their own
application choices.
Individual developers also struggle to define and implement best practice privacy. Low
barriers to entry enable a vibrant but deprofessionalized development ecosystem (Cravens,
2012), and surveys of application developers have revealed that many lack knowledge of current
privacy best practices (Balebako, Marsh, Lin, Hong, & Cranor, 2014). A recent study contrasting
iOS and Android applications found that 73% of Android apps tested, and 47% of iOS apps
tested, reported user location. Forty-nine percent of Android apps and 25% of iOS apps shared personally identifying information (Zang, Dummit, Graves, Lisker, & Sweeney, 2015).
In a study of privacy conversations in mobile development forums, Shilton and Greene (2016)
found that many developers, particularly in the Android ecosystem in which hobbyists are very
involved in development conversations, get feedback from users and take it quite seriously.
However, developers don’t have ready access to comprehensive or representative data on user
privacy expectations. As a result, developers argue about how much privacy matters to users, as
well as about privacy as an ethical principle more broadly.
Within these arguments, privacy tends to be defined rather narrowly. Developers focus on
data collection or notice and consent, rather than on the contextual variables (particularly the
actor collecting the data, the purpose of data collection, and the social context of the app) that
have been shown to matter to users (Shilton & Greene, 2016). In our analysis of the discussion
forums, it became evident that the platforms themselves, iOS and Android, were impacting how developers in each ecosystem defined privacy. iOS developers must pass a review by Apple
that includes privacy requirements. As a result, iOS developers tend to define privacy according
to Apple’s regulations, which focus on notice and consent. The most frequently cited privacy
definition in iOS discussions focused on transparency with users, particularly in the form of
notice and consent. Developers frequently defined most kinds of data collection as allowable as

long as users were informed. Developers also frequently credited Apple with authorizing
this particular definition of privacy.
Android developers are much less regulated by the platform. Android lacks the stringent app
store review process that was so critical to prompting privacy discussions in iOS. While Android
developers must agree to the Developer Distribution Agreement (Google Play, 2016), and are
asked to include privacy features such as a privacy policy and encryption for data in transmission,
the agreement explicitly states that Google does not “undertake an obligation to monitor the
Products or their content.” Instead, Google reserves the right to remove (called “takedowns”)
violating apps from the store at their discretion. Interestingly, discussion of app takedowns was
not prominent in the XDA forums. Instead, privacy was discussed as a feature that could provide
consumer choice and could distinguish an application in the competitive marketplace. Defining privacy as a feature, particularly one enabling user choice, led to a wealth of privacy
enhancing options within the Android ecosystem. But privacy enhancing or privacy enhanced
applications were often developed by hobbyists and open source developers working on their
own and then beta testing with like minded experts. This led to a marketplace in which
consumer choice ran amok, but there was little indication of the level of professionalization,
trustworthiness, product support, or long term prospects of most applications.

table 5.1 Mobile devices and privacy: What we know, why it's hard, what is next

What we know about privacy and mobile devices:
- Mobile Technologies Have Technological Properties that Change Social Practices: Users have a more intimate relationship with mobile devices, as the device is literally on their person, following them everywhere.
- Surveillance Is Highly Salient with Mobile Device Users: Mobile devices allow for broad user surveillance across many areas of life and support tracking not only of the user online but also of their associated offline activity, such as where they are and who they are with.
- Context Matters for Privacy Expectations for Mobile Devices: Tactics such as behavioral advertising, data collection and retention, and tracking may be appropriate and within the contextually defined privacy norms in one context while inappropriate in another. Importantly, mobile phones cross contexts as users carry them throughout their days and use apps which may touch upon contexts such as banking, education, socializing, and leisure time.

Challenges to meeting privacy expectations with mobile devices:
- Government regulations on mobile surveillance have been slow to catch up.
- Notice and choice are particularly limited with mobile.
- Developers struggle to define privacy in a new sociotechnical setting.

The future of privacy and mobile devices:
- Focus of users: growing mistrust.
- Focus of firms: techniques to meet privacy expectations.
- Focus of regulators: regulating themselves and others.

the future of privacy and mobile devices


Given the state of privacy and mobile devices and the challenges facing both industry and
regulators, we see trends in the future focus of users, firms, and regulators. Users are growing
more mistrustful of corporations and their use of mobile data. Firms should treat an increased
focus on privacy as one way of regaining consumer trust.

Users: Growing Mistrust


Users are becoming increasingly concerned with, frustrated by, and mistrustful of technology
companies. Users have long rejected tracking of their information for targeted advertising
(Turow, King, Hoofnagle, Bleakley, & Hennessy, 2009) and have become only more pessimistic
over time (Turow, Hennessy, & Draper, 2015). Consumers care about privacy but do not believe
their concerns will be addressed by commercial entities (Turow et al., 2015). These findings
are reinforced by surveys measuring changing expectations over time (Urban, Hoofnagle, &
Li, 2012).
We believe two additional issues will only become more prevalent with growing user
awareness of mobile surveillance. First is the challenging issue of when the massive stores of
information collected with mobile devices can be used beyond commercial purposes, for
example, for research (Metcalf & Crawford, 2017; Vitak, Shilton, & Ashktorab, 2016; Zimmer,
2010). The assumption that information collected by one party is available for any use or by any
other entity has been debunked (Martin & Nissenbaum, 2016). But this challenges researchers
who hope to use these datasets to understand human behavior, health, and practices. Should
corporations be the sole actors with access to these “small data” streams that can be useful for
individual health and wellness (Estrin, 2014) as well as population-level research (boyd &
Crawford, 2012)? If this data is to be made available for research, how can users ensure that
their rights and dignity as research subjects are respected?
Second, it is increasingly difficult for individuals to resist government surveillance (Calo,
2016) or remain obscure from public or private interests (Hartzog & Selinger, 2015). As consumers
seek to protect their information, their focus may turn to tools to obfuscate their activities.
For example, Nissenbaum and Brunton created a user's guide to obfuscation with practical tips
to thwart pervasive tracking (Nissenbaum & Brunton, 2015). Introducing widespread obfuscation
into data streams may frustrate governments, corporations, and researchers
reliant on this data. But it is a legitimate tool of protest for users who feel that their interests are
overlooked by powerful data collectors.

Firms: Techniques to Meet Privacy Expectations


For the reasons outlined above, user trust is a primary issue that should concern firms in the
mobile data ecosystem. Figure 5.2 illustrates the varying levels of trust within the mobile device
ecosystem. Not all actors in the ecosystem are equally close to consumers, and therefore different
actors might require different levels of trust. Gatekeepers, such as mobile applications, are the
mechanism by which consumers enter the ecosystem, and such firms carry an additional
obligation to respect consumer privacy expectations (Martin, 2016). Mobile devices and mobile
platform providers similarly carry an obligation to uphold the trust of consumers, as these firms
have the technological ability and unique regulatory position to enforce what sorts of data
collection and sharing are permitted on the device.

figure 5.2 Mobile device ecosystem by consumer relationship and trust. (The original figure arranges the ecosystem's actors by their closeness to the consumer; the labels shown include the mobile app, app developers, the app store, the phone/device, browsers, advertising firms, ad networks, telecom/ISPs, data brokers, researchers, and government.)

For firms interested in increasing consumer trust, we predict several areas of focus. First, given
the issues with adequate notice explained above, new types of notices around choice architecture,
framing, and layered privacy choices will become a focus for firms (Adjerid, Acquisti, &
Loewenstein, 2016). Increasingly, behavioral approaches and nudges are utilized for notice on
mobile devices (Balebako et al., 2011). In addition, when the notice is given matters: recent
research has shown that a notice displayed during app use is recalled significantly more often
than one shown in the app store, and that a notice is unlikely to be recalled at all if it appears
only in the app store (Balebako, Schaub, Adjerid, Acquisti, & Cranor, 2015).
Second, mobile platforms may become more interested in using their regulatory power to
maintain consumer trust. Apple has recently announced a new internal emphasis on differential
privacy, an innovative privacy protection technique that enables aggregate data use without
identifying individuals (Simonite, 2016). Apple has long shown a willingness to regulate privacy
policies for mobile applications, and their interest in incorporating advanced methods for
privacy protection into the mobile ecosystem could help enhance consumer trust.
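To give a concrete sense of how such techniques enable aggregate data use without identifying individuals, the minimal sketch below implements randomized response, a classic building block of local differential privacy, in Python. The truth probability, the feature being measured, and the data are hypothetical illustrations and are not a description of Apple's actual implementation.

```python
import random

def randomized_response(true_value, p_truth=0.75):
    """Report the user's true yes/no value with probability p_truth;
    otherwise report a random coin flip. Each individual report is deniable."""
    if random.random() < p_truth:
        return true_value
    return random.random() < 0.5

def estimate_rate(reports, p_truth=0.75):
    """Invert the known noise to estimate the true population rate.

    The expected report rate is: observed = p_truth * true_rate + (1 - p_truth) * 0.5.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# Hypothetical example: 10,000 users, 30% of whom truly use some feature.
random.seed(0)
truths = [random.random() < 0.30 for _ in range(10_000)]
reports = [randomized_response(t) for t in truths]

print(round(estimate_rate(reports), 3))  # close to 0.30, yet no single report is trustworthy on its own
```

Because the noise distribution is known, the aggregate rate can be recovered with reasonable accuracy even though no individual report reliably reveals that person's true answer.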
Consumer-facing organizations are uniquely positioned to manage consumer privacy, and
firms face the risk of consumer backlash if privacy is not respected. In the past, efforts to place
security concerns on the agendas of boards of directors have been fruitful in realigning consumer
trust and corporate practice.1 Privacy practices could soon be included in corporate governance.
As privacy concerns and the associated perceptions of vulnerability begin to look similar to
security concerns, corporate governance could begin to focus on the privacy practices of
corporations as a form of risk management.

1
[Link]

Regulators: Regulating Themselves and Others


US regulatory bodies are also starting to realize the importance of regulating privacy to maintain
user trust. Documents such as the White House’s Consumer Privacy Bill of Rights (Strickling,
2012) and the recent National Privacy Research Strategy (National Science and Technology
Council, 2016) signal the interest of the executive branch in engaging privacy as a regulatory
topic. Regulators have at least two areas of focus in the mobile data ecosystem: regulating
corporate use of mobile data and regulating government use of that same data.
Regulators such as the FTC have long enforced the use of adequate notice and consumer choice
(Solove & Hartzog, 2014). In addition, researchers' work on privacy by design has attracted the
attention of regulators. Regulators could move beyond the limitations of notice to promote more
creative approaches to privacy by design. For example, regulators might require corporate privacy
programs to include identifying privacy risks as a step in all product design decisions. Regulators
might also regulate the substance and adequacy of privacy notices, instead of only their compliance
with corporate policy (Mulligan & Bamberger, 2013; Mulligan & King, 2011). Asking firms to detail
their design decisions and ongoing data maintenance in official privacy programs will highlight the
importance of thoughtful privacy design. Focusing less on the degree to which firms are truthful in
their privacy notices (which is necessary, but not sufficient) will encourage firms to focus more on
the substance of the notices and their adherence to the expectations of consumers.
Finally, US courts, and perhaps the US Congress, will also need to decide the extent to which
governments can compel user data from private companies. As we have outlined, the specter of
government surveillance of mobile data is real and realized. Governments must limit their
access to this data for the good of national trust.

conclusion
In this chapter, we examined privacy and the ecosystem of applications, hardware, operating
systems, and telecommunication companies that collect increasing amounts of personal data.
We explored what we know about the privacy challenges raised by mobile devices; user privacy
expectations with regard to mobile devices; and developer responses to those challenges. Mobile
device users have predictable privacy expectations, but these vary based on data type and application
context. Mobile developers are positioned as key decision makers about consumer privacy, but
they struggle to find a common understanding of users' context-sensitive privacy expectations.
While mobile device technology is relatively new, recent research has revealed quite a bit
about privacy expectations and mobile devices. Firms and regulators can help to mitigate the gap
between developers and consumers by increasing attention to privacy by design at the level of
corporate governance, making privacy a first-order concern in protecting consumer trust.

acknowledgments
Thanks to Karen Boyd for assistance with background research for this article. This work was
supported in part by US National Science Foundation awards CNS-1452854 and SES-1449351,
and by a Google Faculty Research Award.

references
Adjerid, I., Acquisti, A., & Loewenstein, G. (2016). Choice architecture, framing, and layered privacy
choices. Available at SSRN: [Link]

Ahern, S., Eckles, D., Good, N. S., King, S., Naaman, M., & Nair, R. (2007). Over-exposed?:
Privacy patterns and considerations in online and mobile photo sharing. In Proceedings of the
SIGCHI Conference on Human Factors in Computing Systems (pp. 357 366). ACM. Retrieved from
[Link]
Anthony, D., Kotz, D., & Henderson, T. (2007). Privacy in location aware computing environments.
Pervasive Computing, 6(4), 64 72.
Balebako, R., Leon, P. G., Almuhimedi, H., Kelley, P. G., Mugan, J., Acquisti, A., . . . Sadeh, N. (2011).
Nudging users towards privacy on mobile devices. In Proceedings of the CHI 2011 Workshop on
Persuasion, Nudge, Influence and Coercion.
Balebako, R., Marsh, A., Lin, J., Hong, J., & Cranor, L. F. (2014). The privacy and security behaviors of
smartphone app developers. In USEC ’14. San Diego, CA: Internet Society. Retrieved from http://
[Link]/pubs/usec14 app [Link].
Balebako, R., Schaub, F., Adjerid, I., Acquisti, A., & Cranor, L. F. (2015). The impact of timing on the
salience of smartphone app privacy notices. In Proceedings of the 5th Annual ACM CCS Workshop on
Security and Privacy in Smartphones and Mobile Devices (pp. 63 74). ACM.
Barkhuus, L., & Dey, A. (2003). Location based services for mobile telephony: A study of users’ privacy
concerns. In Proceedings of the INTERACT 2003: 9TH IFIP TC13 International Conference on
Human Computer Interaction (Vol. 2003, pp. 709 712).
Beales, J. H., & Muris, T. J. (2008). Choice or consequences: Protecting privacy in commercial
information. The University of Chicago Law Review, 75(1), 109 135.
Beilinson, J. (2016, July 28). Glow pregnancy app exposed women to privacy threats, Consumer Reports
finds. Consumer Reports. Retrieved from [Link] security software/
glow pregnancy app exposed women to privacy threats/.
Bellware, K. (2016, January 6). Uber settles investigation into creepy “God View” tracking program. The
Huffington Post. Retrieved from [Link] settlement god view us
568da2a6e4b0c8beacf5a46a.
Benn, S. I. (1984). Privacy, freedom, and respect for persons. In F. Schoeman (Ed.), Philosophical
dimensions of privacy. (pp. 223 244). Cambridge, MA: Cambridge University Press.
Bentham, J. (1791). Panopticon or the inspection house (Vol. 2). Retrieved from [Link]
documents/castle style/bridewell/bridewell jeremy bentham panoption [Link].
Bloustein, E. J. (1964). Privacy as an aspect of human dignity: An answer to Dean Prosser. NYU Law
Review, 39, 962.
Bowie, N. E., & Jamal, K. (2006). Privacy rights on the internet: Self regulation or government regulation?
Business Ethics Quarterly, 16(3), 323 342.
Boyd, D., & Crawford, K. (2012). Critical questions for big data. Information, Communication & Society,
15(5), 662 679.
Brunton, F., & Nissenbaum, H. (2011). Vernacular resistance to data collection and analysis: A political
theory of obfuscation. First Monday, 16(5). Available at: [Link]
view/3493/2955.
Bureau of Consumer Protection. (2013, February). Mobile App Developers: Start with security.
Retrieved June 21, 2013, from [Link] mobile app developers start
security.
Calo, R. (2016). Can Americans resist surveillance? The University of Chicago Law Review, 83(1), 23 43.
Cohen, J. E. (2008). Privacy, visibility, transparency, and exposure. The University of Chicago Law Review,
75(1), 181 201.
Coutts, S. (2016, May 25). Anti choice groups use smartphone surveillance to target “abortion minded
women” during clinic visits. Rewire. Retrieved from [Link] choice
groups deploy smartphone surveillance target abortion minded women clinic visits/.
Cravens, A. (2012, September 26). A demographic and business model analysis of today’s app developer.
Retrieved March 19, 2013, from [Link] demographic and business model
analysis of todays app developer/.
Culnan, M. J., & Bies, R. J. (2003). Consumer privacy: Balancing economic and justice considerations.
Journal of Social Issues, 59(2), 323 342.
Culnan, M. J., & Williams, C. C. (2009). How ethics can enhance organizational privacy: Lessons from the
ChoicePoint and TJX data breaches. Management Information Systems Quarterly, 33(4), 6.

Curry, M. R. (2002). Discursive displacement and the seminal ambiguity of space and place. In
L. Lievrouw & S. Livingstone (Eds.), Handbook of new media. (pp. 502 517). London: Sage
Publications.
Decker, M. (2008). Location privacy: An overview. In Proceedings of the 2008 7th International Conference
on Mobile Business (pp. 221 230). Barcelona: IEEE Computer Society Press. Retrieved from http://
[Link]/persagen/[Link]?resourcePath=/dl/proceedings/&toc=comp/proceedings/
icmb/2008/3260/00/[Link]&DOI=10.1109/ICMB.2008.14.
Estrin, D. (2014). Small data, where n = me. Communications of the ACM, 57(4), 32 34. [Link]
10.1145/2580944.
Federal Trade Commission (2012). Protecting consumer privacy in an era of rapid change: Recommendations
for businesses and policymakers. Washington, DC: Federal Trade Commission.
Foucault, M. (1977). Discipline and punish: The birth of the prison. New York: Random House LLC.
Fried, C. (1970). An anatomy of values: Problems of personal and social choice. Cambridge, MA: Harvard
University Press.
Friedman, B., & Nissenbaum, H. (1996). Bias in computer systems. ACM Transactions on Information
Systems (TOIS), 14(3), 330 347.
Future of Privacy Forum, & Center for Democracy & Technology. (2012). Best practices for mobile
application developers. Washington, DC: Future of Privacy Forum. Retrieved from [Link]
[Link]/wp content/uploads/Best Practices for Mobile App Developers [Link].
Google Play (2016, July 2). Google Play developer distribution agreement. Retrieved August 9, 2016, from
[Link] us/about/developer distribution [Link].
Hartzog, W., & Selinger, E. (2015). Surveillance as loss of obscurity. Washington and Lee Law Review,
72(3), 1343.
He, Q., Wu, D., & Khosla, P. (2004). The quest for personal control over mobile location privacy. IEEE
Communications Magazine, 42(5), 130 136.
Hill, K. (2016, February 12). How this company tracked 16,000 Iowa caucus goers via their phones. Retrieved
from [Link] iowa caucus us 56c12cafe4b08ffac125b591.
Jones, M. L. (2016). Ctrl + z: The right to be forgotten. New York and London: New York University Press.
Kang, C. (2013, December 6). Flashlight app kept users in the dark about sharing location data: FTC.
The Washington Post. Retrieved from [Link]
light app kept users in the dark about sharing location data ftc/2013/12/05/1be26fa6 5dc7 11e3 be07
006c776266ed [Link].
Kerr, O. S. (2009). The case for the third party doctrine. Michigan Law Review, 107, 561 601.
Kerr, O. S. (2012). The Mosaic Theory of the Fourth Amendment. Michigan Law Review, 110, 311 354.
Khalil, A., & Connelly, K. (2006). Context aware telephony: Privacy preferences and sharing patterns.
In Proceedings of the 2006 20th anniversary conference on computer supported cooperative work
(pp. 469 478). New York: ACM. Retrieved from [Link]
King, J. (2014). Taken out of context: An empirical analysis of Westin’s privacy scale. In Symposium on
Usable Privacy and Security (SOUPS) 2014. Menlo Park, CA: ACM.
Lella, A., & Lipsman, A. (2015). The 2015 U.S. Mobile App Report. Retrieved from [Link]
Insights/Presentations and Whitepapers/2015/The 2015 US Mobile App Report.
Leon, P. G., Cranor, L. F., McDonald, A. M., & McGuire, R. (2010). Token attempt: The misrepresentation
of website privacy policies through the misuse of p3p compact policy tokens. In
Proceedings of the 9th annual ACM workshop on privacy in the electronic society (pp. 93 104).
New York: ACM.
Leon, P. G., Cranshaw, J., Cranor, L. F., Graves, J., Hastak, M., Ur, B., & Xu, G. (2012). What do online
behavioral advertising privacy disclosures communicate to users? In Proceedings of the 2012 ACM
workshop on privacy in the electronic society (pp. 19 30). New York: ACM.
Li, H., Sarathy, R., & Xu, H. (2010). Understanding situational online information disclosure as a privacy
calculus. Journal of Computer Information Systems, 51(1), 62.
Lin, J., Amini, S., Hong, J. I., Sadeh, N., Lindqvist, J., & Zhang, J. (2012). Expectation and purpose:
Understanding users’ mental models of mobile app privacy through crowdsourcing. In Proceedings of
the 2012 ACM Conference on Ubiquitous Computing (pp. 501 510). New York: ACM. [Link]
10.1145/2370216.2370290.

Liptak, A. (2014, June 25). Supreme Court says phones can’t be searched without a warrant. The New York Times.
Retrieved from [Link] court cellphones search [Link].
Mancini, C., Thomas, K., Rogers, Y., Price, B. A., Jedrzejczyk, L., Bandara, A. K., . . . Nuseibeh, B. (2009).
From spaces to places: Emerging contexts in mobile privacy. New York: ACM Press. [Link]
10.1145/1620545.1620547.
Martin, K. (2017). Do privacy notices matter? Comparing the impact of violating formal privacy notices and
informal privacy norms on consumer trust online. Journal of Legal Studies, 45(S2), 191 215.
Martin, K. (2013). Transaction costs, privacy, and trust: The laudable goals and ultimate failure of notice
and choice to respect privacy online. First Monday, 18(12).
Martin, K. (2015). Ethical issues in the big data industry. MIS Quarterly Executive, 14(2): 67 85.
Martin, K. (2016a). Understanding privacy online: Development of a social contract approach to privacy.
Journal of Business Ethics, 137(3), 551 569.
Martin, K. (2016b). Data aggregators, consumer data, and responsibility online: Who is tracking consumers
online and should they stop? The Information Society, 32(1), 51 63.
Martin, K. (2012). Diminished or just different? A factorial vignette study of privacy as a social contract.
Journal of Business Ethics, 111(4), 519 539. [Link] 012 1215 8.
Martin, K. E., & Shilton, K. (2015). Why experience matters to privacy: How context based experience
moderates consumer privacy expectations for mobile applications. Journal of the Association for
Information Science and Technology, 67(8), 1871 1882.
Martin, K. E., & Shilton, K. (2016). Putting mobile application privacy in context: An empirical study of
user privacy expectations for mobile devices. The Information Society, 32(3), 200 216. [Link]
10.1080/01972243.2016.1153012.
Martin, K., & Nissenbaum, H. (2016). Measuring privacy: Using context to expose confounding variables.
Columbia Science and Technology Law Review. Retrieved from [Link]
.cfm?abstract id=2709584.
Martínez-Pérez, B., Torre-Díez, I. de la, & López-Coronado, M. (2014). Privacy and security in mobile
health apps: A review and recommendations. Journal of Medical Systems, 39(1), 181. [Link]
10.1007/s10916 014 0181 3.
McDonald, A. M., & Cranor, L. F. (2008). The cost of reading privacy policies. I/S: A Journal of Law and
Policy for the Information Society, 4(3), 543.
Metcalf, J., & Crawford, K. (2016). Where are human subjects in Big Data research? The emerging ethics
divide. Big Data & Society, 3(1). [Link]
Michael, K., & Clarke, R. (2013). Location and tracking of mobile devices: Überveillance stalks the streets.
Computer Law & Security Review, 29(3), 216 228.
Milne, G. R., & Culnan, M. J. (2002). Using the content of online privacy notices to inform public policy:
A longitudinal analysis of the 1998 2001 US web surveys. The Information Society, 18(5), 345 359.
Milne, G. R., Culnan, M. J., & Greene, H. (2006). A longitudinal assessment of online privacy notice
readability. Journal of Public Policy & Marketing, 25(2), 238 249.
Mulligan, D. K., & Bamberger, K. A. (2013). What regulators can do to advance privacy through design.
Communications of the ACM, 56(11), 20 22.
Mulligan, D. K., & King, J. (2011). Bridging the gap between privacy and design. University of Pennsylvania
Journal of Constitutional Law, 14 (4), 989.
National Science and Technology Council. (2016). National privacy research strategy. Washington, DC:
National Science and Technology Council.
Nissenbaum, H. (2004). Privacy as contextual integrity. Washington Law Review, 79(1), 119 158.
Nissenbaum, H. (2010). Privacy in context: Technology, policy, and the integrity of social life. Stanford, CA:
Stanford University Press.
Nissenbaum, H. (2011). A contextual approach to privacy online. Daedalus, 140(4), 32 48.
Nissenbaum, H., & Brunton, F. (2015). Obfuscation: A user’s guide for privacy and protest. Cambridge, MA:
MIT Press.
Palen, L., & Dourish, P. (2003). Unpacking “privacy” for a networked world. In CHI 2003 (Vol. 5,
pp. 129 136). Ft. Lauderdale, FL: ACM.
Pell, S. K., & Soghoian, C. (2012). Can you see me now?: Toward reasonable standards for law enforcement
access to location data that Congress could enact. Berkeley Technology Law Journal, 27(1), 117 195.

Rachels, J. (1975). Why privacy is important. Philosophy & Public Affairs, 4(4), 323 333.
Richards, N. M., & King, J. H. (2016). Big data and the future for privacy. In F. X. Olleros & M. Zhegu
(Eds.), Research handbook on digital transformations. (p. 272). Northampton, MA: Edward Elgar
Publishing.
Rosen, J. (2011). The unwanted gaze: The destruction of privacy in America. New York: Vintage Books.
Sadeh, N., Hong, J., Cranor, L., Fette, I., Kelley, P., Prabaker, M., & Rao, J. (2009). Understanding and
capturing people’s privacy policies in a mobile social networking application. Personal and Ubiquitous
Computing, 13(6), 401 412.
Schaub, F., Balebako, R., Durity, A. L., & Cranor, L. F. (2015). A design space for effective privacy
notices. In Proceedings of the Eleventh Symposium On Usable Privacy and Security (SOUPS 2015)
(pp. 1 17).
Shilton, K. (2009). Four billion little brothers?: Privacy, mobile phones, and ubiquitous data collection.
Communications of the ACM, 52(11), 48 53. [Link]
Shilton, K., & Greene, D. (2017). Linking platforms, practices, and developer ethics: levers for privacy
discourse in mobile application development. Journal of Business Ethics (online first). [Link]
10.1007/s10551 017 3504 8.
Shilton, K., Koepfler, J. A., & Fleischmann, K. R. (2013). Charting sociotechnical dimensions of values for
design research. The Information Society, 29(5), 259 271.
Simonite, T. (2016, August 3). Breakthrough privacy technology invented by Microsoft gets its first big
test thanks to Apple. MIT Technology Review. Retrieved August 10, 2016, from [Link]
[Link]/s/602046/apples new privacy technology may pressure competitors to better protect our
data/?imm mid=0e6973&cmp=em data na na newsltr 20160810.
Smith, A. (2015, April 1). U.S. smartphone use in 2015. Retrieved from [Link]/2015/04/01/us
smartphone use in 2015/.
Soghoian, C. (2011, April 22). How can US law enforcement agencies access location data stored by Google
and Apple? Retrieved April 23, 2011, from [Link] can us law enforce
ment [Link].
Solove, D. J., & Hartzog, W. (2014). The FTC and the new common law of privacy. Columbia Law Review,
114: 583 676.
Stingray tracking devices: Who’s got them? (2016). Retrieved August 19, 2016, from [Link]
map/stingray tracking devices whos got them
Strandburg, K. J. (2011). Home, home on the web: The Fourth Amendment and technosocial change.
Maryland Law Review, 3, 614 680.
Strickling, L. (2012, June 15). Putting the Consumer Privacy Bill of Rights into practice. Retrieved from
[Link] consumer privacy bill rights practice.
Turow, J., Hennessy, M., & Draper, N. (2015). The tradeoff fallacy: How marketers are misrepresenting
American consumers and opening them up to exploitation (pp. 1 24). Annenberg School of Communication.
Retrieved from [Link] [Link].
Turow, J., King, J., Hoofnagle, C. J., Bleakley, A., & Hennessy, M. (2009). Americans reject tailored
advertising and three activities that enable it. Available at SSRN at [Link] or
[Link]
Ur, B., Leon, P. G., Cranor, L. F., Shay, R., & Wang, Y. (2012). Smart, useful, scary, creepy: Perceptions of
online behavioral advertising. In Proceedings of the 8th symposium on usable privacy and security (p. 4).
ACM.
Urban, J. M., Hoofnagle, C. J., & Li, S. (2012). Mobile phones and privacy (BCLT Research Paper Series).
Berkeley, CA: University of California at Berkeley Center for the Study of Law and Society. Retrieved
from [Link] id=2103405.
Vitak, J., Shilton, K., & Ashktorab, Z. (2016). Beyond the Belmont principles: Ethical challenges, practices,
and beliefs in the online data research community. In Proceedings of the 19th ACM Conference on
Computer Supported Cooperative Work and Social Computing (CSCW 2016). San Francisco, CA:
ACM.
Volz, D. (2016, May 31). U.S. court says no warrant needed for cellphone location data. Reuters. Retrieved
from [Link] usa court mobilephones idUSKCN0YM2CZ
Westin, A. F. (1970). Privacy and freedom. New York: Atheneum.

Xu, H., Zhang, C., Shi, P., & Song, P. (2009). Exploring the role of overt vs. covert personalization strategy
in privacy calculus. Academy of Management Proceedings, 2009(1), 1 6. [Link]
AMBPP.2009.44249857.
Zang, J., Dummit, K., Graves, J., Lisker, P., & Sweeney, L. (2015). Who knows what about me? A survey of
behind the scenes personal data sharing to third parties by mobile apps. Journal of Technology Science, 30.
Retrieved from [Link]
Zimmer, M. (2010). “But the data is already public”: On the ethics of research in Facebook. Ethics and
Information Technology, 12(4), 313 325.
6

Face Recognition, Real-Time Identification, and Beyond

Yana Welinder and Aeryn Palmer*

introduction
In China, you can access your bank account with your face.1 A Russian app allows users to take a
photo of a crowd and match people with their social media accounts.2 And countries all over the
world are adding face recognition software to the complement of tools used to identify travelers
at the borders.3
Technology companies are racing to outpace each other and discover new, innovative ways of
using face recognition technology. In the quest to discover what can be done, questions about
what should be done may be left behind. Privacy and security concerns related to the massive
scope of data collection and sharing are pushed aside, or addressed haphazardly with little
consideration. Consumers may not understand the implications of using this technology, while
regulators struggle to keep up.
Regulators and developers can both take steps to ensure that consumers understand the technology
and make informed choices about its use. Companies can design intuitive data practices,
minimize data collection and retention, and carefully protect biometric data from being misused.
They are in the best position to ensure good privacy practices given that they know what data they
collect and how they use it. They also have a business incentive to create solutions that build user
trust and preempt impulsive, overbroad government regulations that tend to be issued in response to
abusive practices. Regulators, for their part, can mandate meaningful consent and focus on

* We would like to thank Anisha Mangalick, Gaëtan Goldberg, James Buatti, Jane Pardini, Jennifer Grace, and Tiffany
Li for their excellent research assistance.
1
Zhang Yuzhe, Banks Face Obstacles to Using Biometric Data for ID Purposes, CᴀɪxɪɴOɴʟɪɴᴇ (May 25, 2015), http://
[Link]/2015–05–25/[Link]; see also Chinese regulators put brakes on facial-recognition for payment,
PYMNTS.ᴄᴏᴍ (May 26, 2015), [Link]
for-payment/.
2
Shawn Walker, Face Recognition App Taking Russia by Storm May Bring End to Public Anonymity, Tʜᴇ Gᴜᴀʀᴅɪᴀɴ
(May 17, 2016), [Link]
nymity-vkontakte.
3
See, e.g., Jim Bronskill, Candid Facial-Recognition Cameras to Watch for Terrorists at Border, Tᴏʀᴏɴᴛᴏ Mᴇᴛʀᴏ (Jan. 8,
2016), [Link]
[Link] (citing The Canadian Press); Peter B. Counter, FaceFirst Expands Border Control Deployment in
Panama, FɪɴᴅBɪᴏᴍᴇᴛʀɪᴄs (Sept. 18, 2014), [Link]
panama/; Melinda Ham, Face Recognition Technology, University of Technology Sydney Faculty of Law
(Nov. 17, 2015), [Link] Stephen Mayhew, Istan-
bul Atatürk Airport Deploys Biometric Border Control Gates, BɪᴏᴍᴇᴛʀɪᴄUᴘᴅᴀᴛᴇ.ᴄᴏᴍ (Jan. 26, 2015), [Link]
.[Link]/201501/istanbul-ataturk-airport-deploys-biometric-border-control-gates.


technology-neutral regulations that prevent harmful practices regardless of what technology they
employ, and don’t slow down innovation in specific classes of technology, such as computer vision.
This chapter will describe technological advances in the world of face recognition and
biometric data collection, before laying out some recent regulatory efforts to control the
technology’s use. Finally, we will make a series of policy recommendations to regulators and
technology companies, which could create a safer environment for consumers.

how face recognition technology works


Computer scientists have spent countless brain cycles to get computers to recognize faces. The main
appeal of face recognition technology is convenience. You don’t have to interact with a person to
identify her by asking for her name or fingerprint.4 More importantly, face recognition is how we
humans tend to recognize each other.5 So this research problem is a piece of the puzzle to get
computers to simulate, or preferably excel at, human vision on the path towards artificial intelligence.

The Process of Face Recognition Technology


Most automatic face recognition methods involve a general step-by-step process, analyzing
photos of already identified individuals to measure their facial features.6 The measurements
are “biometric data” that is compiled into a biometric database.7 Face recognition technology
refers to this database to be able to recognize the listed individuals in new photos.8 It allows a
user of the technology to recognize the listed individuals without actually knowing them.
The user only needs to capture a photo of an individual and apply the technology.9 The
technology detects a face in the new photo and matches it against the database.10 Traditionally,
the technology would transform the size, position, illumination, and color scale of the detected
face to compare its measurements to biometric data gathered under other conditions.11
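As a rough illustration of this step-by-step process (and not a description of any production system), the following Python sketch shows the enrollment-and-matching idea: a database of measurements for known individuals, and a matching function applied to the measurements of a newly detected, normalized face. The names, four-number vectors, and distance threshold are invented for illustration; real systems extract far richer biometric measurements.

```python
import numpy as np

# A toy "biometric database" built during enrollment: facial-feature measurements
# taken from photos of already-identified individuals. Real systems use far more
# measurements; four numbers per person are enough to illustrate the idea.
database = {
    "alice": np.array([0.42, 0.17, 0.88, 0.31]),
    "bob":   np.array([0.10, 0.95, 0.22, 0.67]),
}

def identify(probe, database, threshold=0.5):
    """Match a newly detected, normalized face's measurements against the database.

    Returns the closest enrolled person if the distance is small enough,
    otherwise None (no confident match).
    """
    name = min(database, key=lambda person: np.linalg.norm(database[person] - probe))
    if np.linalg.norm(database[name] - probe) < threshold:
        return name
    return None

print(identify(np.array([0.40, 0.20, 0.85, 0.30]), database))  # "alice"
print(identify(np.array([0.99, 0.01, 0.01, 0.99]), database))  # None: unknown face
```

The key design point is that recognition does not require knowing the person; it only requires that someone, at some earlier point, enrolled their measurements under a name.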
Over the last few years, the process of automatic face recognition has changed significantly
with the use of neural networks. A neural network is a machine learning method that can be
used to find an optimal function to solve a task from a large number of inputs.12 It is particularly
helpful for complex tasks that require such extensive data analysis that a human would struggle
to find the optimal function for the task.
As applied to face recognition, neural network models “learn” to recognize individuals based
on large data sets of images. One such face recognition process trains the neural network on
three images at a time, where two of the images are known to show the same person and the
third shows a different person.13 The network is instructed to extract vectors of biometric data

4
See Tanzeem Choudhury, History of Face Recognition, MIT Media Lab (Jan. 21, 2000), [Link]
tech-reports/TR-516/[Link].
5
Id.
6
See Handbook of Face Recognition 2–3 (Stan Z. Li & Anil K. Jain eds., 1st ed. 2005).
7
Id.
8
Id.
9
Article 29 Data Protection Working Party, Working Party 29 Opinion on Face Recognition in Online and Mobile
Service, 2012 00727/12 (WP 192) (EN), 2012 O.J. (L 727) 2 (EN) [hereinafter WP29 Opinion], [Link]
justice/data-protection/article-29/documentation/opinion-recommendation/files/2012/wp192 [Link].
10
See Handbook of Face Recognition, supra note 7, at 2–3.
11
See id.
12
Artificial Neural Network, Wikipedia, [Link] neural network.
13
Florian Schroff et al., FaceNet: A Unified Embedding for Face Recognition and Clustering, Cᴏᴍᴘᴜᴛᴇʀ Vɪsɪᴏɴ
Fᴏᴜɴᴅᴀᴛɪᴏɴ, [Link] cvpr 2015/papers/Schroff FaceNet A Unified 2015
CVPR [Link] (2015).

from each image in a way that would allow it to distinguish the two images of the same person
from the third image. It does this by extracting data such that the distance score between the vectors
for two images is smaller if the images show the same person than if they show different people.
This process is then repeated for millions or billions of images until the network establishes an
optimized process for analyzing faces for similarity and can be applied to images of previously
unidentified individuals. While certain parameters and the general architecture of the neural
network are predetermined by the developer of the network, the network decides how to analyze
each image to create the optimal score for determining similarity. Researchers have been able to
peek under the hood of neural networks to see that a network usually starts by determining
the edges of a face in an image in different orientations.14 But most of the analysis
that a network performs on an image to recognize faces is still a black box, and there is
ongoing research into understanding this complex analysis.15
The application of neural networks to the face recognition process has significantly improved
the accuracy of automatic face recognition. It is now fair to say that networks that rely on huge
datasets of images approach human capability in recognizing faces.16 When trained on a massive
data set of 4 million photos of 4,000 individuals, this process can identify faces with 97.35 percent
accuracy on a popular face recognition dataset.17
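For readers who want to see the triplet idea in code, the sketch below shows, in Python, how the training objective and the final recognition step operate on embedding vectors. The 128-dimensional toy vectors, the margin, and the distance threshold are hypothetical stand-ins for the output of a trained network and do not reproduce any particular company's system.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet objective on face embedding vectors.

    The loss is zero only when the anchor is closer to the positive image
    (same person) than to the negative image (different person) by at least
    `margin`; training adjusts the network so this holds across many triplets.
    """
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance, same person
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance, different person
    return max(d_pos - d_neg + margin, 0.0)

def same_person(embedding_a, embedding_b, threshold=1.1):
    """After training, recognition reduces to comparing a distance to a threshold."""
    return np.sum((embedding_a - embedding_b) ** 2) < threshold

def unit(vector):
    """Scale a vector to unit length, as embedding networks typically do."""
    return vector / np.linalg.norm(vector)

# Toy 128-dimensional embeddings standing in for a trained network's output.
rng = np.random.default_rng(42)
anchor = unit(rng.normal(size=128))                    # person A, photo 1
positive = unit(anchor + 0.02 * rng.normal(size=128))  # person A, photo 2
negative = unit(rng.normal(size=128))                  # person B

print(triplet_loss(anchor, positive, negative))  # prints 0.0 here: the triplet is already satisfied
print(same_person(anchor, positive))             # True
print(same_person(anchor, negative))             # False with near certainty
```

During training, the loss value is not merely reported but used to adjust the network's parameters, repeated over enormous numbers of triplets until the embedding space separates people reliably.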

Consumer Applications of Face Recognition Technology


As face recognition technology has improved over the years, it has started being used in various
consumer applications.18 At the most basic level, many digital cameras rely on parts of the face
recognition process to focus the lens on a face.19 Photo management apps have used
face recognition to help users organize their photos.20 Social networks such as Facebook and
Google+ have integrated face recognition technology features to allow users to automatically
identify their friends in photos that they upload and link to their friends’ online profiles.21

14
See Matthew Zeiler & Rob Fergus, Visualizing and Understanding Convolutional Networks, in Lᴇᴄᴛᴜʀᴇ Nᴏᴛᴇs ɪɴ
Cᴏᴍᴘᴜᴛᴇʀ Sᴄɪᴇɴᴄᴇ 8689, 818 (Springer International Publishing, 2014), [Link]
op view citation&hl en&user a2KklUoAAAAJ&citation for view a2KklUoAAAAJ:YsMSGLbcyi4C; Hakka Labs,
Visualizing and Understanding Deep Neural Networks by Matt Zeiler, YᴏᴜTᴜʙᴇ (Feb. 2, 2015), [Link]
[Link]/watch?v ghEmQSxT6tw.
15
See id.
16
Yaniv Taigman et al., DeepFace: Closing the Gap to Human-Level Performance in Face Verification, Rᴇsᴇᴀʀᴄʜ ᴀᴛ
Fᴀᴄᴇʙᴏᴏᴋ (June 24, 2014), [Link]
formance-in-face-verification/.
17
Id.
18
See, e.g., Alessandro Acquisti et al., Faces of Facebook: Privacy in the Age of Augmented Reality, Black Hat Webcast 1
(Jan. 9, 2012), [Link] Larry Magid, Google+
Adds Find My Face Feature, Forbes (Dec. 8, 2011, 1:59 PM), [Link]
google-adds-find-my-face-feature/. See also Douglas Gantenbein, Helping Kinect Recognize Faces, Microsoft
Research (Oct. 31, 2011), [Link]
19
See, e.g., Face Detection, Sᴏɴʏ Cʏʙᴇʀ-sʜᴏᴛ Usᴇʀ Gᴜɪᴅᴇ, [Link] guide/
eng/contents/05/02/15/[Link]; DCRP Review: Canon PowerShot S5 IS, Dɪɢɪᴛᴀʟ Cᴀᴍᴇʀᴀ Rᴇsᴏᴜʀᴄᴇ Pᴀɢᴇ, http://
[Link]/reviews/canon/powershot s5-review/.
20
See, e.g., Russell Brandom, Apple’s New Facial Recognition Feature Could Spur Legal Issues, Tʜᴇ Vᴇʀɢᴇ (June 16,
2016, 8:11 AM), [Link]
(addressing Apple’s announcement of a new facial recognition system cataloging pictures according to faces and
Google Photos’ ability to auto-tag photos); Tom Simonite, Apple Rolls Out Privacy-Sensitive Artificial Intelligence,
MIT Tᴇᴄʜɴᴏʟᴏɢʏ Rᴇᴠɪᴇᴡ (June 13, 2016), [Link]
tive-artificial-intelligence/.
21
See Larry Magid, supra note 18.

Some apps and devices have used face recognition instead of passwords to allow users to quickly
unlock and access their services.22 Gaming devices use face recognition to keep track of different
players so that friends can challenge each other in sports in their living rooms rather than just
exercising their thumbs with the more traditional forms of video games.23 So the applications of
the technology are many and diverse, and there is continuous innovation in this field.
The availability of ubiquitous camera phones with fast Internet connections has enabled
mobile applications also to take advantage of face recognition technology. This means a phone
user could take a photo of someone they see and instantly read information about that individual
on the phone. A number of mobile apps have tapped into Facebook’s massive photo database to
allow users to do just that. One example is KLIK, an iPhone app offered by a company called
[Link] in 2012.24 KLIK was short-lived. It was acquired by Facebook, which promptly pulled
the plug on the app after it turned out to have security vulnerabilities.25 There was a similar app
called Viewdle SocialCamera for Android users.26 These and other apps allowed users to upload
photos to social networks and link them to the user profiles of the individuals identified in the
photos. Photos taken with a mobile phone often include metadata about where and when the
photo was taken.27 So uploading the images to social networks allowed social networks to track
the location of the user, who presumably was there to take the photo, as well as the individuals
identified in the photo. The location may also have been available to other social network users,
depending on whether the social network in question made metadata publicly available.
And even without metadata, the location can sometimes be obvious from landmarks in the
background, potentially exposing more information than the uploader or people identified in
the photo would anticipate.28
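To illustrate how much location detail photo metadata can carry, the short Python sketch below reads the GPS tags that many camera phones embed in JPEG files. It uses the Pillow imaging library's EXIF accessor; the file name is hypothetical, and a photo without a GPS tag simply yields no coordinates.

```python
from PIL import Image, ExifTags

def gps_coordinates(path):
    """Return the (latitude, longitude) embedded in a photo's EXIF data, if any."""
    # _getexif() is Pillow's convenience accessor; it returns None if no EXIF data exists.
    exif = Image.open(path)._getexif() or {}
    gps_raw = exif.get(34853)  # 34853 is the EXIF tag number for GPSInfo
    if not gps_raw:
        return None
    # Translate numeric GPS tag ids into names such as "GPSLatitude".
    gps = {ExifTags.GPSTAGS.get(tag, tag): value for tag, value in gps_raw.items()}

    def to_degrees(dms, ref):
        degrees, minutes, seconds = (float(x) for x in dms)
        value = degrees + minutes / 60 + seconds / 3600
        return -value if ref in ("S", "W") else value

    return (to_degrees(gps["GPSLatitude"], gps["GPSLatitudeRef"]),
            to_degrees(gps["GPSLongitude"], gps["GPSLongitudeRef"]))

# Hypothetical usage: prints precise coordinates if the photo was geotagged.
print(gps_coordinates("crowd_photo.jpg"))
```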

emerging regulatory responses to face recognition technology


Governments around the world are reacting to the new prevalence of face recognition technology.
Some have created fairly specific rules for handling face recognition data. Others are

22
See e.g., John P. Pullen, How Windows 10 Could Kill Passwords Forever, Tɪᴍᴇ (Nov. 30, 2015), [Link]
windows-10-hello-facial-recognition/ (describing a Microsoft Windows 10 feature called “Hello,” which allows device
login using facial recognition).
23
Douglas Gantenbein, Helping Kinect Recognize Faces, Microsoft Research (Oct. 31, 2011, 9:30 AM), [Link]
.[Link]/web/20160428231310/[Link]
24
See David Goldman, Real-Time Face Recognition Comes to Your iPhone Camera, CNN Money (Mar. 12, 2012), http://
[Link]/2012/03/12/technology/iPhone-face-recognition/[Link].
25
See Ashkan Soltani, Facepalm, AshkanSoltani (June 18, 2012), [Link] (“Face
.com essentially allowed anyone to hijack a KLIK user’s Facebook and Twitter accounts to get access to photos and
social graph (which enables ‘face prints’), even if that information isn’t public.” (Emphasis in the original.)); Steven
Musil, Facebook Shuts Down [Link] APIs, Klik App, CNET News (July 8, 2012, 11:00 AM), [Link]
8301–1023 3–57468247–93/[Link]-apis-klik-app/.
26
See, e.g., Emily Steel, A Face Launches 1,000 Apps, Wall St. J. (Aug. 5, 2011), [Link]
[Link]?mod WSJ Tech LEFTTopNews; Viewdle, CrunchBase,
[Link]
27
See, e.g., Facebook Data Use Policy: Information We Receive and How It Is Used, Facebook, [Link]
.com/about/privacy/your-info#inforeceived (Facebook may get this information as a geotag uploaded with the photo,
containing its exact latitude and longitude). See also Kate Murphy, Web Photos That Reveal Secrets, Like Where You
Live, N.Y. Times, Aug. 12, 2010, at B6.
28
See [Link] Publishes Exclusive with John McAfee Reveals Location in iPhone Metadata (EXIF), Mobile Privacy
(Dec. 3, 2012), [Link]
in-iphone-metadata-exif/; see also Hanni Fakhoury, A Picture Is Worth a Thousand Words, Including Your Location,
Electronic Frontier Foundation (Apr. 20, 2012), [Link]
words-including-your-location.

already using the technology to their own ends, but have yet to create laws controlling its use.
In this section, we will detail some recent developments in legal responses to face recognition.
Divided by region, the survey is a sample of current laws and practices, and does not cover the
entire globe.

United States
In the United States, both state and federal laws regulate face recognition technology.29 Chief
among them are Section 5 of the Federal Trade Commission Act (FTC Act) and the Children’s
Online Privacy Protection Act (COPPA). Section 5 of the FTC Act protects consumers from
“unfair or deceptive acts or practices in or affecting commerce.”30 In practice, this requirement
means that technology companies must provide consumers with adequate notice and choice
about practices that may affect their privacy. COPPA requires that websites targeting children or
knowingly collecting information from them must ensure that they have parental consent.31
Neither law addresses face recognition specifically, although the FTC has released best practices
regarding face recognition data that emphasize the importance of choice, transparency, and
building in privacy at every stage of product development, a practice known as “Privacy by
Design.”32
Without comprehensive federal laws addressing the unique nature of biometric information,
some states have passed their own laws, which consumers or state attorneys general may
enforce.33 Here, we will mostly focus on recent consumer litigation. Many of the recent cases
were brought under the Illinois Biometric Information Privacy Act (BIPA), which provides rules
for the collection, storage, and handling of biometric data. Consumers may sue for violations of
the law, such as if their information is stored or used improperly, or disclosed without authorization.34
For example, in Licata v. Facebook, Inc., the plaintiffs claimed that Facebook's face
recognition feature, which helps users "tag" people in photos, violates the BIPA.35 Facebook
moved to dismiss, arguing that digital photographs were not "face geometry" under
the BIPA, but the court disagreed and the motion failed.36
This outcome mirrors the ruling on Shutterfly’s motion to dismiss a similar case, Norberg v.
Shutterfly, Inc., the first to address such a claim.37 There, the court held that online photos
contain face geometry information, and found that the plaintiff’s arguments could have merit:

29
For a summary of these laws through 2013, see Yana Welinder, A Face Tells More than a Thousand Posts: Developing
Face Recognition Privacy in Social Networks, 26 Harvard J. L. & T. 165 (2012); Yana Welinder, Facing Real-Time
Identification in Mobile Apps & Wearable Computers, 30 Santa Clara High Tech. L.J. 89 (2014).
30
15 U.S.C. § 45(a)(1) (2006).
31
15 U.S.C. § 6502.
32
Facing Facts: Best Practices for Common Uses of Facial Recognition Technologies, Fed. Trade Comm’n, [Link]
.[Link]/sites/default/files/documents/reports/facing-facts-best-practices-common-uses-facial-recognition-technologies/
[Link].
33
See, e.g., TEX. BUS. & COM. CODE ANN. § 503.001(a) (2009); 740 Iʟʟ. Cᴏᴍᴘ. Sᴛᴀᴛ. 14 (2008).
34
740 Iʟʟ. Cᴏᴍᴘ. Sᴛᴀᴛ. 14.
35
Venkat Balasubramani, Facebook Gets Bad Ruling in Face-Scanning Privacy Case In Re Facebook Biometric
Information Privacy Litigation, Tᴇᴄʜ. & Mᴋᴛɢ. Lᴀᴡ Bʟᴏɢ (May 6, 2016), [Link]
2016/05/facebook-gets-bad-ruling-in-face-scanning-privacy-case-in-re-facebook-biometric-information-privacy-litiga
[Link].
36
Id.; In re Facebook Biometric Information Privacy Litigation, 2016 WL 2593853 (N.D. Cal. May 5, 2016).
37
Venkat Balasubramani, Shutterfly Can’t Shake Face-Scanning Privacy Lawsuit, Tᴇᴄʜ. & Mᴋᴛɢ. Lᴀᴡ Bʟᴏɢ (Jan. 11,
2016), [Link]

Plaintiff alleges that Defendants are using his personal face pattern to recognize and identify
Plaintiff in photographs posted to Websites. Plaintiff avers that he is not now nor has he ever
been a user of Websites, and that he was not presented with a written biometrics policy nor has
he consented to have his biometric identifiers used by Defendants. As a result, the Court finds
that Plaintiff has plausibly stated a claim for relief under the BIPA.38

Tech companies have unsuccessfully lobbied to amend BIPA;39 without changes, the lawsuits
continue. Facebook is now trying a new argument to defeat the claims in Licata. It is using the
Supreme Court’s Spokeo v. Robins40 ruling to argue that any injury from scanning photographs
is too speculative to warrant damages.41 Another class action suit was filed in July 2016, this time
against Snapchat. The plaintiff alleges that Snapchat's "Lenses" feature, which adds humorous
animations to a person's image, is based upon scanning their face and retaining their biometric
identifiers, in violation of BIPA.42 Uses of face recognition technology will almost certainly
continue to multiply, and it is likely that consumer lawsuits will do the same.
In June 2016, the US Department of Commerce National Telecommunications and Information
Administration (NTIA) released best practices for commercial use of face recognition
technology.43 Among other considerations, the guide suggests that companies collecting, processing,
or storing face recognition data prioritize good data management and security practices,
and be transparent with consumers about the data’s handling and use.44 Perhaps if consumers
were provided with better notice, some BIPA litigation could be prevented.
The NTIA guidelines do not apply to government entities. A recent report from the US
Government Accountability Office begins to fill that gap by addressing FBI use of automatic
face recognition.45 The FBI has a database of roughly 30 million photos, mostly from state, local,
and other law enforcement agencies, to which face recognition analysis is applied in the course
of criminal investigations.46 The report found several ways in which oversight and transparency
could be improved. It noted, for example, that required Privacy Impact Assessments were not
carried out in a timely manner,47 oversight audits of some systems had not been completed,48
and the accuracy of some external analysis systems had not been assessed.49

38
Norberg v. Shutterfly, Inc., No. 15-CV-5351 (N.D. Ill. 2015). See also Venkat Balasubramani, Shutterfly Can’t Shake
Face-Scanning Privacy Lawsuit, Tᴇᴄʜ. & Mᴋᴛɢ. Lᴀᴡ Bʟᴏɢ (Jan. 11, 2016), [Link]
01/shutterfl[Link].
39
Id.
40
Spokeo, Inc. v. Robins, 578 U.S. (2016), [Link] [Link]
41
John J. Roberts, Facebook and Google Really Want to Kill This Face-Scanning Law, Fortune (10:17 AM EDT), http://
[Link]/2016/06/30/facebook-google-face-recognition-lawsuits/.
42
Cyrus Farivar, Does Snapchat’s Lenses Feature Violate Illinois’ Biometrics Law?, Aʀs Tᴇᴄʜɴɪᴄᴀ (July 17, 2016), http://
[Link]/tech-policy/2016/07/does-snapchats-lenses-feature-violate-illinois-biometrics-law/; Martinez v. Snapchat,
Inc., No. 2:16-CV-05182 (Cal. Super. Ct. 2016), [Link]
43
Hunton & Williams, NTIA Releases Facial Recognition Technology Best Practices, Pʀɪᴠᴀᴄʏ & Iɴfᴏʀᴍᴀᴛɪᴏɴ Sᴇᴄᴜʀɪᴛʏ
Lᴀᴡ Bʟᴏɢ (June 22, 2016), [Link]
best-practices/.
44
National Telecommunications and Information Administration, Privacy Best Practice Recommendations For Com-
mercial Facial Recognition Use, Pʀɪᴠᴀᴄʏ & Iɴfᴏʀᴍᴀᴛɪᴏɴ Sᴇᴄᴜʀɪᴛʏ Lᴀᴡ Bʟᴏɢ (June 15, 2016), [Link]
[Link]/wp-content/uploads/sites/18/2016/06/privacy best practices recommendations for commercial
use of facial [Link].
45
U.S. Gov’t Accountability Office, GAO-16–267, Face Recognition Technology: FBI Should Better Ensure Privacy and
Accuracy (May 2016), available at [Link]
680/[Link].
46
Id. at 10.
47
Id. at 18.
48
Id. at 23.
49
Id. at 30.

Whether these recent publications will affect general practice in either corporate or government
use of face recognition technology remains to be seen. For the moment, it is clear that
efforts to respond to the technology in the United States are coming both from government
agencies and also from consumers themselves. Those who use and develop the technology face a
somewhat uncertain regulatory future.

Europe
In Europe, data collection and processing are governed by a framework of national laws that
implement the European Union (EU) Data Protection Directive.50 Article 29 of the directive
establishes a working party to opine on how the directive should be applied to specific data
practices.51 In a 2012 opinion, the Article 29 working party provided guidance on the development
of technologies that process biometric data.52 The opinion provided specific requirements
for companies to bear in mind.53 These included a proportionality requirement and an accuracy
requirement, which stressed the importance of preventing identity fraud.54 Additionally, companies
were encouraged to retain biometric data no longer than necessary.55
The opinion specifically singled out tagging software on social networks as a source of concern:
Photographs on the internet, in social media, in online photo management or sharing applications
may not be further processed in order to extract biometric templates or enrol them into a
biometric system to recognise the persons on the pictures automatically (face recognition)
without a specific legal basis (e.g. consent) for this new purpose. . . . [If a data subject consents
to being tagged] biometric data not needed anymore after the tagging of the images with the
name, nickname or any other text specified by the data subject must be deleted. The creation of
a permanent biometric database is a priori not necessary for this purpose.56

EU regulators have been cautious about the use of face recognition software for social media
purposes: Facebook's "Moments" app was stripped of its automatic face scanning capabilities in
its EU release. Users may identify friends one by one, and the app looks for other images with
similarities, but does not compile a database of biometric information.57
The future of regulating biometric data in the EU depends upon the incoming General Data
Protection Regulation (GDPR), which will supersede the Data Protection Directive. The
GDPR has established baseline considerations for the processing of biometric information,
defining it broadly to include face recognition data: “‘biometric data’ means personal data
resulting from specific technical processing relating to the physical, physiological or behavioural

50
Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the Protection of
Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data, Recital 3,
1995 O.J. (L 281) 31 (EC), [Link] OJ:L:1995:281:0031:0050:EN:PDF.
It should be noted that this directive is now superseded by Regulation (EU) 2016/679 of the European Parliament and
of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and
on the free movement of such data.
51
Id. at art. 29.
52
Article 29 Data Protection Working Party, Opinion on “Developments in Biometric Technologies,” 2012 00720/12
(WP 193) (EN) 2012 O.J. (L 720) 17 (EN), [Link]
ion-recommendation/files/2012/wp193 [Link].
53
Id.
54
Id.
55
Id.
56
Id. at 7.
57
Russell Brandom, Facebook Strips Out Facial Recognition for the European Version of Moments, Tʜᴇ Vᴇʀɢᴇ (May 10,
2016, 5:38 PM), [Link]

characteristics of a natural person, which allow or confirm the unique identification of that
natural person, such as face images or dactyloscopic data.”58
Biometric information processed “in order to uniquely identify a person” is considered
sensitive data that requires particular protection.59 Processing of sensitive information is only
allowed under certain exceptions, for example if the data subject has given explicit consent,60 or
if the processing is necessary to uphold a “substantial public interest.”61 The GDPR makes it
clear that these caveats are only a floor; each country may create further limitations.62
The current Data Protection Directive does not include biometric information among the
categories of sensitive data. However, data protection authorities have already treated biometric
data as a special case. For example, the Irish Data Protection Commissioner has strongly
suggested that systems collecting biometric data in the workplace should undergo a privacy
impact assessment.63 The GDPR recommends similar assessments in its recitals.64
Outside of the EU, regulators in other European countries also think about how to address
face recognition technology. In Turkey, a draft law that would regulate the protection of
personal data has not yet been implemented.65 Turkish law generally defines personal information
as "any information relating to an identified or identifiable individual," using the
definition in Article 2(a) of the Strasbourg Convention for the Protection of Individuals with
Regard to Automatic Processing of Personal Data 1981.66 Until the draft law comes into force,
a patchwork of constitutional provisions and civil and criminal laws may affect the collection and
use of biometric and face recognition data in Turkey, where such technology is already in use. In 2015,
biometric gate controls were added to Istanbul’s Atatürk airport, with plans to expand the system
from fingerprints to face recognition data.67 The same year, Vodafone Turkey aimed to allow
users to login to a payment app via eyeprint.68
In contrast to Turkey, some European countries seek to regulate biometric data with more
specificity. For example, Azerbaijan has a specific law relating to biometric information, which
includes “fingerprints of hands and palms, the image of the person, the retina, fragments of the
voice and its acoustic parameters, analysis results of deoxyribonucleic acid (DNA), the sizes
of the body, the description of special signs and shortcomings of the body, handwriting, the
signature etc.”69 Similarly, Serbia’s draft law on personal data protection defines biometric data

58
Regulation (EU) 2016/679 of the European Parliament and of the Council on the Protection of Natural Persons with
Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/
EC, art. 4(14), (Apr. 27, 2016), [Link] CELEX:32016R0679&
from en [hereinafter “General Data Protection Regulation” or “GDPR”].
59
Id. at art. 9(1).
60
Id. at art. 9(2)(a).
61
Id. at art. 9(2)(g).
62
Id. at art. 9(4).
63
The Office of the Data Protection Commissioner in Ireland, Biometrics in the Workplace, Data Protection
Commissioner in Ireland [Link]
64
GDPR Recital 91.
65
Hakki Can Yildiz & Can Sözer, Data Protection in Turkey: Overview, Pʀᴀᴄᴛɪᴄᴀʟ Lᴀᴡ (Nov. 1, 2015), [Link]
.[Link]/7–520-1896#a490337.
66
Id. at § 3; See also Council of Europe, Convention for the Protection of Individuals with Regard to Automatic
Processing of Personal Data (“Strasbourg convention”), European Treaty Series – No. 108 (Jan. 28, 1981), https://
[Link]/en/web/conventions/full-list/-/conventions/rms/0900001680078b37.
67
Stephen Mayhew, Istanbul Atatürk Airport Deploys Biometric Border Control Gates, BɪᴏᴍᴇᴛʀɪᴄUᴘᴅᴀᴛᴇ.ᴄᴏᴍ (Jan. 26,
2015), [Link]
68
Justin Lee, Olcsan to Resell and Distribute EyeVerify Eye Recognition Products in Turkey and Europe,
BɪᴏᴍᴇᴛʀɪᴄUᴘᴅᴀᴛᴇ.ᴄᴏᴍ (June 8, 2015), [Link]
ify-eye-recognition-products-in-turkey-and-europe.
69
The Law of the Azerbaijan Republic about Biometric Information, 2008, No. 651-IIIG, Article 1.1.2), [Link]
[Link]/[Link]?rgn 24349.

as "particularly [including] data relating to appearance, voice, fingerprint, signature, hand geometry, eye pupil and cornea, as well as DNA."70
Russia's Personal Data Law, which regulates both government and private businesses, provides a more general definition of biometric data, and does not single out face imagery.
It encompasses “information characterizing physiological peculiarities of a human being and
on which basis it is possible to establish his identity.”71 This data can only be processed without
the data subject’s consent under certain conditions, such as upon crossing the border.72
Additionally, biometric data may only be stored using methods “that ensure protection of these
data from unlawful or accidental access to them, destruction, modification, blocking, copying,
distribution.”73
Use of face recognition technology in Russia is expanding. FindFace is a wildly popular Russian
app that allows users to discover social media profiles of individuals they photograph in public.74
Its inventors are in talks with Moscow city officials to install the program on CCTV systems.75
The software could be used by police to identify suspects by matching surveillance images with
court records or even social media photos. The company also plans retail applications, which
could allow shops to track individuals and market products to them based upon items in which
they show interest online.76 It remains to be seen how Russian law will handle such developments.

Canada
In Canada, the major federal privacy laws are the Privacy Act77 and the Personal Information
Protection and Electronic Documents Act (PIPEDA);78 the former applies to the public sector,
the latter to the private sector. Neither law specifically mentions face recognition. In 2013, the
Research Group of the Office of the Privacy Commissioner of Canada (OPC)79 issued a report
stating that the Privacy Act prohibits face recognition data from being used for purposes beyond
or inconsistent with those for which it was initially collected.80 Further, the report confirms that
PIPEDA would likely control any commercial use of face recognition technology.81
The report recommends that any party considering the use of face recognition technology
should examine factors such as the purposes of the use, consent from data subjects, and
reasonable security measures. It proposes a series of useful questions. For example, the OPC

70
Draft Law on Data Protection, art. 3(12), [Link] (providing
English and Serbian versions).
71
Federal Law on Personal Data, 2006, No.152-FZ, art. 11(1), [Link] center/Russian
Federal Law on Personal [Link].
72
Id. at art. 11(2).
73
Id. at art. 19(4).
74
Shaun Walker, Face Recognition App Taking Russia by Storm May Bring End to Public Anonymity, Tʜᴇ Gᴜᴀʀᴅɪᴀɴ
(May 17, 2016, 4:39 EDT), [Link]
public-anonymity-vkontakte.
75
Id.
76
Id.
77
Canada Privacy Act, R.S.C. 1985, c. P-21, [Link]
78
Canada Personal Information Protection and Electronic Documents Act, S.C. 2000, c. 5, [Link]
eng/acts/p-8.6/.
79
See generally Office of the Privacy Commissioner of Canada Website, [Link] [Link].
80
Research Group of the Office of the Privacy Commissioner of Canada, Automated Facial Recognition in the Public
and Private Sectors (Mar. 2013), § 5(3), [Link] 201303 e
.asp#heading-005-3 (addressing consistent use under facial recognition and the Privacy Act).
81
Id. at § 7, [Link] 201303 [Link]#heading-007 (discussing face
recognition and PIPEDA).

suggests that government agencies determine whether or not the use of the technology justifies
the potential intrusion on subjects’ privacy:
Is the measure demonstrably necessary to meet a specific need?
Is it likely to be effective in meeting that need?
Would the loss of privacy be proportionate to the benefit gained?
Is there a less privacy invasive way of achieving the same end?82

The report notes that face recognition systems could be added to existing video surveillance,
such as at the Canadian border.83 This prediction seems to be coming true. In 2016, the
Canadian Border Services Agency announced that it is testing the use of face recognition
technology on travelers at the border.84
In addition to federal privacy laws, there are provincial privacy statutes. Some expressly mention
face recognition or biometric data, and allow government use in association with an application
for an identification card.85 However, this use may not be unfettered; in Quebec, the collection of
biometric information for identification purposes is allowed only with the express consent of the
data subject.86 Further, “only the minimum number of characteristics or measurements needed
to link the person to an act and only such characteristics or measurements as may not be recorded
without the person’s knowledge may be recorded for identification purposes.”87
In the absence of provincial laws that specifically mention face recognition technology,
regulators must still evaluate its use. In 2001, the Ontario Alcohol and Gaming Commission
allowed casinos to employ face recognition software to identify known cheats, as long as patrons
had notice of the technology's use.88 More recently, in a 2012 report, the Office of the Information and Privacy Commissioner for British Columbia examined police use of face recognition technology to identify suspects following a hockey riot.89 The data in question had been
collected by the Insurance Corporation of British Columbia (ICBC).90 The report made several
recommendations, including that ICBC cease using its database to identify individuals in
images sent to them by police in the absence of a warrant or other order.91 The report also
encouraged ICBC to conduct a Privacy Impact Assessment and make structural changes that
would establish better accountability and consideration of data subjects’ privacy.92

82
Id. at § 5(c), [Link] 201303 [Link]#heading-005-2.
83
Id. at § 5, [Link] 201303 [Link]#heading-005.
84
Jim Bronskill, Candid Facial-Recognition Cameras to Watch for Terrorists at Border, Tᴏʀᴏɴᴛᴏ Mᴇᴛʀᴏ (Jan. 8, 2016), http://
[Link]/news/canada/2016/01/08/[Link] (citing
The Canadian Press).
85
See, e.g., Identification Card Regulation, Alberta Reg. 221/2003 (Can.), § 7.3, [Link]
221-2003/latest/[Link]; Manitoba, The Drivers and Vehicles Act, CCSM c D104 §§ 149.1 et seq.
86
An Act to Establish a Legal Framework for Information Technology, CQLR c. C-1.1 s. 44, [Link]
qc/laws/stat/cqlr-c-c-1.1/latest/[Link].
87
Id.
88
Ontario Alcohol and Gaming Commission, Investigation Report, PC-010005–1/2001 (Can.), [Link]
on/onipc/doc/2001/2001canlii26269/[Link].
89
See Elizabeth Denham, Investigation Report F12 01: Investigation into the Use of Facial Recognition Technology by
the Insurance Corporation of British Columbia, Iɴfᴏʀᴍᴀᴛɪᴏɴ ᴀɴᴅ Pʀɪᴠᴀᴄʏ Cᴏᴍᴍɪssɪᴏɴᴇʀ (Feb. 16, 2012), https://
[Link]/investigation-reports/1245.
90
Id. at 2.
91
Id. at 35.
92
Id. at 35–6.

Latin America
Face recognition technology has previously been used in Latin America for governmental and
law enforcement purposes. For example, in 2000, the Mexican government used face recognition information to prevent voter fraud.93 More recent activity in the country has involved the
creation of systems to make analysis of biometric information easier and more efficient.94
With use has come an interest in laws to regulate such technology. Peru’s personal data protection
law, passed in 2010, includes in its definition of sensitive data biometric information that can be used
to identify an individual.95 Colombia’s data protection law also classes biometric data as sensitive
information.96 Use of face recognition technology may be outpacing the passage of laws to regulate
it. Colombian buses are fitted with a system designed by FaceFirst that can identify bus passengers
whom the police may be seeking.97 FaceFirst has also established a system at Panama's Tocumen
International Airport, which picks out passengers who are wanted in the country, or by Interpol.98 In
2015, it was announced that Brazilian airports would see a similar project to improve security and
create a more efficient immigration experience for travelers.99 Brazil also provided officers with face
recognition goggles at the 2014 World Cup.100 Rio de Janeiro, host of the 2016 Summer Olympics,
refused to be outdone; the city contracted with a Chinese company that had supplied the 2014 World
Cup with face recognition technology for security purposes.101
Law enforcement agencies are not the only parties using face recognition data in Latin
America. Consumers are gaining access to it in their daily lives. For example, in Costa Rica,
Ecuador, and Peru, bank customers can authenticate their identity using their image.102 FacePhi
Biometría, a Spanish firm that has sold such technology to several banks in the region, describes

93
Mexican Government Adopts FaceIt Face Recognition Technology to Eliminate Duplicate Voter Registrations in
Upcoming Presidential Election, Tʜᴇ Fʀᴇᴇ Lɪʙʀᴀʀʏ (May 11, 2000), [Link]
ernment+Adopts+FaceIt+Face+Recognition+Technology+to. . .-a062019954 (citing Business Wire).
94
See, e.g., Ryan Kline, New Facial Recognition System in Mexico to Help Law Enforcement, SᴇᴄᴜʀᴇIDNᴇᴡs (July
23, 2007), [Link]
Katitza Rodriguez, Biometrics in Argentina: Mass Surveillance as a State Policy, Eʟᴇᴄᴛʀᴏɴɪᴄ Fʀᴏɴᴛɪᴇʀ Fᴏᴜɴᴅᴀᴛɪᴏɴ
(Jan. 10, 2012), [Link]
95
See DLA Piper, Data Protection Laws of the World: Peru, in Gʟᴏʙᴀʟ Dᴀᴛᴀ Pʀᴏᴛᴇᴄᴛɪᴏɴ Hᴀɴᴅʙᴏᴏᴋ, [Link]
.[Link]/[Link]#handbook/definitions-section/c1 PE, (accessed Sept. 5, 2015); see also Peru
Adopts New Data Protection Law, IT Lᴀᴡ Gʀᴏᴜᴘ, [Link] Data Protection
[Link] (last accessed Sept. 12, 2016).
96
L. 1377/2013, art. 3, June 27, 2013, Mɪɴɪsᴛʀʏ ᴏf Tʀᴀᴅᴇ, Iɴᴅᴜsᴛʀʏ ᴀɴᴅ Tᴏᴜʀɪsᴍ (Colom.), [Link]
knowledge center/DECRETO 1377 DEL 27 DE JUNIO DE 2013 [Link] (last accessed Sept. 12, 2016).
97
Peter B. Counter, FaceFirst Biometrics Deployed in Colombia, FɪɴᴅBɪᴏᴍᴇᴛʀɪᴄs (Mar. 19, 2015), [Link]
[Link]/facefirst-biometrics-deployed-in-colombia-23196/.
98
Peter B. Counter, FaceFirst Expands Border Control Deployment in Panama, FɪɴᴅBɪᴏᴍᴇᴛʀɪᴄs (Sept. 18, 2014),
[Link]
99
Justin Lee, NEC to Provide Facial Recognition Technology for 14 Brazilian Airports, BɪᴏᴍᴇᴛʀɪᴄUᴘᴅᴀᴛᴇ (July 16,
2015), [Link]
100
Ariel Bogel, Drones Are Keeping Their Eyes on the Ball, Too, Slate (June 13, 2014), [Link]
technology/future tense/2014/06/world cup security brazil has spent insane amounts on surveillance technol
[Link]; see also Robin Yapp, Brazilian Police to Use “Robocop-Style” Glasses at World Cup, The Telegraph
(Apr. 12, 2011), [Link]
[Link].
101
Wuang Sujuan & Sun Muyao, Made-in-China Security Equipment Safeguards Rio Olympics, China Daily (May 10,
2016), [Link] [Link].
102
Tanya Andreasyan, Banco Nacional of Costa Rica Implements FacePhi’s Facial Recognition Technology, IBS
Intelligence (Jan. 13, 2016), [Link]
ments-facephis-face-recognition-technology/.

the use as account access via selfie.103 As consumer applications continue to crop up, we can
expect more consumer biometric regulation in the region.

Asia and Oceania


In China, too, consumers may now use face recognition to access their bank accounts.104
With such innovations serving as a backdrop, Asian countries have made strides towards
covering face recognition data in their personal data protection laws in the past few years. Japan,
China, and Hong Kong have amended existing laws, passed new ones, and issued agency
guidance that may address the use of face recognition data.
China’s draft cybersecurity law includes a reference to “personal biometric information,”
which it fails to define.105 Generally, the term has been understood to refer to genetic information and fingerprints.106 Whether or not the draft law is intended to cover face recognition data, its regulation is already taking place. Face recognition is becoming a more popular method of identification in China; as mentioned earlier, banks are using it to verify customers' identities. However, regulators have expressed concern over the lack of a technological standard
and prevented banks from using face recognition to identify new customers. Existing customers,
however, can still use the technology to be identified by their bank for online transactions or,
in the phrasing of Alibaba affiliate Ant Financial Services Group, “smile to pay.”107
In contrast to China’s rather vague draft cybersecurity law, guidance issued by the Hong Kong
Office of the Privacy Commissioner for Personal Data has made it clear that face recognition
data is under the umbrella of personal information.108 The guidance urges caution in the
gathering of biometric data, and notes that individuals must have a “free and informed choice”
in its collection.109
In Japan, amendments to the Personal Information Protection Act were suggested in 2014 that
would expand the definition of “personal data.”110 Passed in October of the following year, the
amendments added biometric data to the definition of Personally Identifiable Information.111

103
Id.
104
Zhang Yuzhe, Banks Face Obstacles to Using Biometric Data for ID Purposes, Caixin Online (May 25, 2015), http://
[Link]/2015–05–25/[Link]; see also Chinese Regulators Put Brakes on Facial-Recognition for
Payment, PYMNTS (May 26, 2015), [Link]
tion-for-payment/.
105
See Article 72(5); Eric Carlson, Sheng Huang & Ashwin Kaja, China Releases Draft of New Network Security Law:
Implications for Data Privacy & Security, Inside Privacy (July 12, 2015), [Link]
ized/chinas-releases-draft-of-new-network-security-law-implications-for-data-privacy-security/. The law received a
second reading in June 2016, but the updated text does not provide more details. See Cybersecurity Law (draft)
(second reading draft), China Law Translate, [Link] en; see also
Second Reading of China’s Draft of Cybersecurity Law, Hunton Privacy Blog (June 30, 2016), [Link]
.[Link]/2016/06/30/second-reading-of-chinas-draft-of-cybersecurity-law/.
106
Carlson, Huang & Kaja, supra note 98.
107
Yuzhe, supra note 97; see also PYMTS, supra note 97.
108
See Office of the Privacy Commissioner for Personal Data, Hong Kong, Guidance on Collection and Use of Biometric
Data (July 2015), [Link] centre/publications/files/GN biometric [Link] (last
visited Aug. 29, 2016).
109
Id.
110
Mark Parsons & Peter Colegate, 2015: The Turning Point for Data Privacy Regulation in Asia? (Feb. 18, 2015),
Chronicle of Data Protection (Feb. 18, 2015), [Link]
privacy/2015-the-turning-point-for-data-privacy-regulation-in-asia/.
111
Matthew Durham, Japan Updates Privacy Law, Winston (Oct. 20, 2015), [Link]
corner/[Link].

This aspect of the regulation will soon be tested; Japan has made plans to implement face
recognition systems at its airports, in advance of hosting the 2020 Summer Olympics.112
To the south, Australia and New Zealand have begun using biometric data for national
security and immigration purposes. Both countries’ customs services use face recognition
software at the border.113 Australian law was amended in 2015 to explicitly allow this use. An
explanatory memorandum for the Migration Amendment (Strengthening Biometrics Integrity)
Bill defined personal identifiers as unique physical characteristics, such as fingerprints, iris scans,
or face images.114 The bill empowered the government to collect such information at the border,
from both citizens and noncitizens.115 Later in the year, Australia announced that it was creating
a National Facial Biometric Matching Capability, which will allow some police and other
government agencies to “share and match” photos in existing databases. Minister for Justice
Michael Keenan hailed the tool as Australia’s “newest national security weapon.”116

The Middle East and North Africa


In some Middle Eastern countries, where biometric data and its appropriate uses have not yet
been clearly defined in the law, face recognition is nevertheless part of the technological
landscape.
In Egypt, no specific law explicitly regulates the use of face recognition technology. It may
instead be covered by a patchwork of other laws.117 A 2014 Privacy International report notes that
“[t]he absence of definitions as to what consists personal data and sensitive personal data, [and] the
lack of an independent national authority responsible for data protection in Egypt . . . raise
significant concerns in view of the extensive access given to authorities of users’ personal data.”118
Egyptian authorities are already using biometric data to serve important governmental
functions, such as verifying the identity of voters.119 This is somewhat similar to the situation
in Lebanon, which also lacks a broad privacy regime.120 There has been little oversight by
Lebanese courts regarding biometric data.121 Yet, passports using face recognition information
have been introduced.122

112
Kamran Shah, Japan Launches “Facial Recognition” Technology to Thwart Terrorism: Report, INQUISITR (Mar. 23,
2016), [Link]
113
Melinda Ham, Face Recognition Technology, University of Technology Sydney Faculty of Law (Nov. 17, 2015),
[Link]
114
See Explanatory Memorandum, Migration Amendment (Strengthening Biometrics Integrity) Bill 2015, [Link]
[Link]/parlInfo/search/display/display.w3p;query Id%3A%22legislation%2Fems%2Fr5421 ems 2e28605d-
fbe5–401d–9039-ccead805c177%22 at 1.
115
Id. at, e.g., 34.
116
Ariel Bogle, Facial Recognition Technology Is Australia’s Latest “National Security Weapon,” Mashable (Sep. 11,
2015), [Link]
117
Dyson et al., Data Protection Laws of the World, DLA Piper, [Link]
section/c1 EG (last visited Aug. 25, 2016).
118
See Privacy International et al., The Right to Privacy in Egypt, Privacy International, [Link]
[Link]/sites/default/files/UPR [Link] at 11.
119
Stephen Mayhew, Mobile Biometric Solution MorphoTablet Secures the Voting Process in Egyptian Parliamentary
Elections, Biometric Update (Dec. 17, 2015), [Link]
morphotablet-secures-the-voting-process-in-egyptian-parliamentary-elections.
120
See Privacy International et al., The Right to Privacy in Lebanon, Privacy International, [Link]
[Link]/sites/default/files/Lebanon UPR 23rd session Joint Stakeholder submission [Link] (Mar. 2015).
121
Alexandrine Pirlot de Corbion, Lebanon: It’s Time to Turn Your International Position on Privacy into Action at the
National Level, Privacy International (May 26, 2015), [Link]
122
Adam Vrankulj, Lebanon to Introduce Biometric Passports in 2014, Biometric Update (Nov. 27, 2012), [Link]
.[Link]/201211/lebanon-to-introduce-biometric-passports-in-2014.

Other countries in the region have more established data protection laws. One example is
Morocco.123 While the Moroccan personal data law does not specifically refer to biometric or
face recognition information, it is likely that such data would be covered by the definition of
“personal information” in Article 1, which includes information relating to “several factors
specific to the physical, physiological, genetic, mental, economic, cultural or social identity
[of that natural person].”124
A 2014 report on data protection in various countries, commissioned by the United Kingdom
Centre for the Protection of National Infrastructure, found Moroccan law largely to be in line
with the EU Data Protection Directive.125 Time will tell whether Morocco will update its laws to
match the new GDPR's considerations regarding biometric data. Moroccan government officials have taken courses from the US FBI on topics including face recognition technology, so
they may be looking for ways to use these capabilities.126
A detailed data protection law can also be found in the Dubai International Financial
Centre (DIFC) in the United Arab Emirates. The DIFC Data Protection Law No. 1 of
2007,127 amended by the DIFC Data Protection Law Amendment Law No. 5 of 2012,128
prescribes rules for the handling of personal data. It also establishes a Commissioner of Data
Protection, who issues relevant regulations.129 Under the Data Protection Law, an "Identifiable Natural Person" is "a natural living person who can be identified, directly or indirectly,
in particular by reference to an identification number or to one or more factors specific to
his biological, physical, biometric, physiological, mental, economic, cultural or social
identity.”130
Despite the fact that more specific data protection laws are otherwise lacking in the United
Arab Emirates,131 one of the most famous recent uses of face recognition technology has
occurred in Abu Dhabi, where police have been using such software since 2008. Information
posted on the police department’s website in 2013 explains that:
with only a brief look from the individual in the direction of the camera, face characteristics
such as the position, size and shape of the eyes, nose, cheekbones and jaw are recorded and the
image is instantly secured. Biometric software allows for analysis and evaluation of the image by
anyone: technicians do not need extensive training, unlike with other biometric technology.

123
See Law no. 09–08 of 18 February 2009 relating to the protection of individuals with respect to the processing of
personal data, [Link] see also its implementation Decree n 2–09–
165 of 21 May 2009, [Link]
124
Law no. 09–08 of Feb. 2009 Article 1(1) (“plusieurs éléments spécifiques de son identité physique, physiologique,
génétique, psychique, économique, culturelle ou sociale”).
125
See Centre for the Protection of National Infrastructure, Personnel Security in Offshore Centres (Apr. 2014) at 132–33,
Centre for the Protection of National Infrastructure [Link]
75/[Link].
126
See Moroccan American Center for Policy, State Department Terrorism Reports Lauds Moroccan Counterterrorism
Strategy, Market Wired (June 23, 2015), [Link]
[Link].
127
See Dubai International Financial Centre Authority (DIFC) Data Protection Law, DIFC Law No. 1 of 2007, https://
[Link]/files/5814/5448/9177/Data Protection Law DIFC Law No. 1 of [Link].
128
See DIFC Data Protection Law Amendment Law No. 5 of 2012, DIFC, [Link]
Protection Law Amendment Law DIFC Law No.5 of [Link] (last visited Aug. 29, 2016).
129
See DIFC Data Protection Regulations, DIFC, [Link] Protection Regula
tions 0 0 [Link] (last visited Aug. 29, 2016).
130
See DIFC Data Protection Law Amendment Law, DIFC Law No. 5, 2012, Schedule 1.3, [Link]
5449/6834/Data Protection Law Amendment Law DIFC Law No.5 of [Link].
131
Dyson et al., Data Protection Laws of the World Handbook, DLA Piper, [Link]
book/law-section/c1 AE2/c2 AE.

This makes the Community Protection Face Recognition System, in addition to being highly accurate, exceptionally simple to use.132

Additional plans were made in 2015 to use face recognition technology at United Arab Emirates
airports as well, in order to implement more efficient interactions between travelers and
immigration officers.133

recommendations for policy-makers


As the previous section illustrates, there are currently no consistent and comprehensive rules
governing applications that rely on face recognition technology. This section provides some
general recommendations for how these applications should be addressed, recognizing that the
best regulation will carefully balance innovation and privacy rights.
As a model for good balance between innovation and privacy, we can look at how the law
around instantaneous photography has evolved over the past century. In a seminal 1890 piece
articulating principles that became the foundation of current privacy law in the United States,
Samuel Warren and Louis Brandeis observed that “since the latest advances in photographic art
have rendered it possible to take pictures surreptitiously, the doctrines of contract and of trust are
inadequate to support the required protection, and the law of tort must be resorted to.”134 Back
then, instantaneous photography challenged the law just as face recognition does today.135 It was
up to lawmakers and the public to determine the norms to govern photography. In the United
States, they did not in any way prohibit instantaneous photography or stop innovation in portable
cameras. But they did regulate specific situations, such as photographing and videotaping private
body parts without a person’s consent.136 We also witnessed the development of a body of case
law that regulates specific situations when photographing or publishing a photo may invade a
person’s privacy.137 Other jurisdictions have struck the balance differently based on the value
their culture places on privacy. In France, for example, a photographer needs to get a person’s
consent before taking a photo focusing on that person, even if the photo is taken in public.138
With information flowing freely between state boundaries in our information age, finding a
balance based on cultural norms will be more difficult and result in a battle of values.
But ultimately, some balance will need to be reached.

Technology Neutral Regulation


Our earlier publications on face recognition technology advise against a blanket prohibition on
the technology because it also presents useful applications, many of which we have yet to

132
See Abu Dhabi Police GHQ, Face Recognition (Apr. 8, 2013), Abu Dhabi Police [Link]
aboutadpolice/ourachievments/face.fi[Link].
133
Caline Malek, New Biometrics System to Speed Up Travel Through UAE, The National (Mar. 12, 2015), [Link]
.[Link]/uae/new-biometrics-system-to-speed-up-travel-through-uae.
134
Samuel D. Warren & Louis D. Brandeis, The Right to Privacy, 4 Harv. L. Rev. 193, 211 (1890).
135
See id.
136
Video Voyeurism Prevention Act, 18 U.S.C. § 1801 (2004).
137
See, e.g., Restatement (Second) of Torts § 652B cmt. b (1977) (“The intrusion itself makes the defendant subject
to liability, even though there is no publication or other use of any kind of the photograph or information outlined.”).
138
Logeais & Schroeder, at 526.

discover.139 Face recognition technology can help photographers organize their photos and enable more interactive video games, and certain aspects of the technology even provide basic functionality in digital cameras.140
Particular uses of face recognition technology, however, may raise privacy concerns. One
invasive use would be if a person with criminal intent could automatically recognize strangers in
public and access their personal information from a social network or a dating app, for example.
Applications that enable this behavior may need to be regulated to prevent harmful uses. Any
regulation that could overly burden or eliminate applications of technology needs to be
preceded by very careful analysis. But more importantly, any such regulation should narrowly
target specific uses, rather than classes of technology such as face recognition technology.
Technology neutrality is a well established regulatory principle that is particularly beneficial
for rapidly developing technologies.141 One example of tech neutral regulation that has applied
to specific uses of face recognition technology is the EU Data Protection Directive.142
The directive regulates automatic processing of personal data, which can be done with many different technologies. An advisory opinion of the EU Article 29 working party explains how this tech neutral directive applies to particular uses of face recognition technology.143 A German data protection agency has similarly enforced the directive, as implemented into German law, against Facebook's use of face recognition technology.144
similarly have been applied if Facebook started enabling its users to identify other Facebook
users in the street based on their scent or patterns in their fashion choices. At the same time, the
directive does not apply to a camera function that detects a person's face to focus the lens on the face, even though this function is one step of the face recognition process, because its application does not result in an identification.
In contrast to the EU directive, recommendations issued by the US FTC in 2012 broadly focus
on face recognition technology to provide guidance for developing various applications that use
the technology.145 The guidance is based on a workshop that examined face recognition
technology, rather than focusing on specific uses or particular privacy concerns.146

139
See Yana Welinder, A Face Tells More Than a Thousand Posts: Developing Face Recognition Privacy in Social
Networks, 26 Harv. J. L. & Tech. 165 (2012); see also Yana Welinder, Facing Real-Time Identification in Mobile Apps
& Wearable Computers, 30 Santa Clara High Tech. L.J. 89 (2014).
140
supra, at 104–105.
141
See Bert-Jaap Koops, Should ICT Regulation Be Technology-Neutral?, in 9 It & Law Series, Starting Points For
ICT Regulation, Deconstructing Prevalent Policy One Liners 77 (Bert-Jaap Koops et al., eds., 2006) (arguing
that “legislation should abstract away from concrete technologies to the extent that it is sufficiently sustainable and at
the same provides sufficient legal certainty”), [Link] 918746.
142
Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the Protection of
Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data, Recital 3,
1995 O.J. (L 281) 31 (EC), [Link] OJ:L:1995:281:0031:0050:EN:PDF.
It should be noted that this Directive is now superseded by the Regulation (EU) 2016/679 of the European Parliament
and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data
and on the free movement of such data.
143
See Opinion of the Article 29 Data Protection Working Party, 2012 O.J. (C 727), [Link]
protection/article-29/documentation/opinion-recommendation/files/2012/wp192 [Link].
144
See Jon Brodkin, Germany: Facebook Must Destroy Facial Recognition Database, Ars Technica (Aug. 15, 2012),
[Link]
145
Facing Facts: Best Practices for Common Uses of Facial Recognition Technologies, Fed. Trade Comm’n, [Link]
.[Link]/sites/default/files/documents/reports/facing-facts-best-practices-common-uses-facial-recognition-technologies/
[Link].
146
Face Facts: A Forum on Facial Recognition Technology, Fed. Trade Comm’n, [Link]
facefacts/.

Technology neutral regulation does not always mean that regulation has to be particularly
broad. It could, for example, specifically address the instantaneous processing of biometric data,
which would apply to real time identification as well as to other similar processes. It may seem
that broad regulation of face recognition technology will be more effective because it will cover
new face recognition technology implementations as they evolve. But while broad regulation of
automatic face recognition could provide regulation of new implementations as they crop up,
that regulation may not be suitable for them because those uses would not have been anticipated
when the regulation was developed. The regulation will likely unduly burden new product development or fail to address its real problems (if there are any such problems to be addressed).
In fact, technology neutral regulation may instead outlast seemingly timeless regulation of
face recognition technology.147 Consider, for example, a law that would regulate collection of
data indicating a person’s real time location. If well drafted, such a law today would also apply to
sensitive location data in geolocation apps, which raise similar concerns to apps that can identify
a person in real time using face recognition technology. It would also apply to future applications that would expose individuals in the same manner, such as technologies that would
identify individuals from a distance based on their smell or the rhythm of their heartbeat.148
It would be more targeted at the relevant harm and address all new technologies that have
similar uses. The law would not need to be translated into the language of the future.149
Conversely, the regulation of face recognition technology would be useless with respect to
future technologies even if they were to raise very similar concerns. Indeed, one day, regulation
of face recognition technology could sound just as outdated as the regulation of gramophones or
videocassette tapes sounds today.150

Mandating Meaningful Consent from the Right Person


Consent from the relevant person will be central to regulation of face recognition applications,
which often identify individuals other than the user of the application. This is a tricky concept
given that notice to and consent from the user of an application is such a fundamental
principle in much privacy law. But a few regulatory recommendations with respect to face
recognition applications have already emphasized the need to obtain consent from the person

147
See Koops, supra note 13; but see Christian Laux, Must RFID-Legislation Be Technology Neutral?, The Center for
Internet and Society at Stanford Law School (Apr. 12, 2007, 1:02 PM), [Link]
04/must-rfid-legislation-be-technology-neutral.
148
See Jacob Aron, Your Heartbeat Could Keep Your Data Safe, NewScientist (Feb. 11, 2012), [Link]
.com/article/[Link]; John R. Vacca, Biometric Technolo
gies and Verification Systems 215 (2007) (implying that odor recognition technology may one day recognize
individuals, provided that they have unique bodily odors); see also Paul Marks, Google Glass App Identifies You by
Your Fashion Sense, NewScientist (Mar. 7, 2013), [Link]
glass-app-identifi[Link]; See Koops, supra note 13 (noting that “particular attention must
be given to the sustainability of laws that target technology, because there is a greater risk than usual that changes in
the subject matter may soon make the law obsolete”).
149
See Lawrence Lessig, Code 157–169 (2d ed. 2006).
150
For example, the regulation of “video cassette tapes” in the Video Privacy Protection Act (VPPA) has caused the legislation
to quickly seem antiquated. However, the VPPA also regulates “similar audio-visual technology,” which essentially means
that this is regulation of a use rather than a technology. Therefore, it has been applied to various subsequent technologies
such as DVDs and online video. Yana Welinder, Dodging the Thought Police: Privacy of Online Video and Other Content
Under the “Bork Bill,” Harv. J. L. & Tech. Dig. (Aug. 14, 2012, 6:11 PM), [Link]
dodging-the-thought-police-privacy-of-online-video-and-other-content-under-the-bork-bill.

who is identified using the application rather than from the app user. For example, a
2012 report by the FTC states that "only consumers who have affirmatively chosen to participate in [a system that allows others to recognize them in public] should be identified."151
Similarly, an advisory opinion of the EU Article 29 working party explains the difference
between getting consent from a user of a face recognition application and the “data subject”
whose data is actually processed.152
To meaningfully consent, a person must know to what she is consenting. It is not reasonable
to use small print hidden in a privacy policy to try to put a person on notice that her social network profile can be used to identify her in the street with automatic face recognition. Users rarely read
online terms.153 So it is better to design consent around their general expectations of the services
they use.154 If the main purpose of an application is to organize photos by the individuals it spots
in those photos, users who provide photos of themselves will expect their photos to be used in
this manner. But if an application primarily serves a different purpose and its use of face
recognition is not obvious, separate notice and consent may be needed.155
To be meaningful, consent should also be obtained before a person’s data is processed for
automatic face recognition. But sometimes, prior consent may not be possible or reasonable.
For example, an app may need to match a face to a database to be able to determine whether
that person has consented to being recognized by face recognition technology.156 Similarly, face
recognition technology can be used to find missing persons or to identify an injured individual
who is unable to consent.157 In those cases, it’s important to limit the data processing to the
minimum necessary and to delete all biometric data once it’s no longer needed for the particular
and limited purpose. There may also be ways to allow individuals to preemptively opt out of even
these uses at the time that they originally provide images of themselves. One could also imagine
face recognition free zones similar to prohibitions on photographing found in restrooms and
gym locker rooms.158
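
To make the sequencing concrete, the following minimal sketch (in Python, with hypothetical store and function names of our own devising) illustrates one way an application might check for recorded opt-in consent before returning an identification and retain nothing about the probe when no consent is found; it is an illustration of the principle rather than a reference implementation.

    from typing import Optional

    # Hypothetical stores; a real system would persist these securely.
    CONSENT_OPT_IN = set()   # identifiers of data subjects who affirmatively opted in
    ENROLLED = {}            # identifier -> enrolled biometric template (bytes)

    def match_template(probe_template: bytes) -> Optional[str]:
        # Stand-in matcher using exact comparison; a real system would score similarity.
        for subject_id, enrolled_template in ENROLLED.items():
            if enrolled_template == probe_template:
                return subject_id
        return None

    def identify_if_consented(probe_template: bytes) -> Optional[str]:
        # The match is performed only to look up consent; nothing is kept if consent is absent.
        matched_id = match_template(probe_template)
        if matched_id is not None and matched_id in CONSENT_OPT_IN:
            return matched_id
        return None

The point of the design is that the transient probe is used solely to look up consent and is never added to the database or logged when the person has not opted in.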

Context Centric Regulation


A helpful framework for determining when an application of face recognition technology may
raise privacy concerns is Helen Nissenbaum’s theory of contextual integrity.159 Departing from
the traditional distinction between “private” and “public” information, she focuses on the

151
Facing Facts, at iii.
152
WP29 Opinion, at 5.
153
See Alessandro Acquisti & Ralph Gross, Imagined Communities: Awareness, Information Sharing, and Privacy on the
Facebook, 2006 Privacy Enhancing Tech. Workshop 16, available at [Link]
load?doi [Link].8177&rep rep1&type pdf. (“Among current members, 30% claim not to know whether [Face-
book] grants any way to manage who can search for and find their profile, or think that they are given no such
control.”).
154
See WP29 Opinion, at 7.
155
See id. The FTC has articulated a similar idea in its consumer privacy guidelines, which provide that separate consent
may not be required when a data “practice is consistent with the context of [a] transaction or the consumer’s existing
relationship with the business.” FTC Consumer Privacy Report, at 39.
156
See WP29 Opinion, at 5.
157
See EU Directive, art. 8(c) (providing an exception to the consent requirement when “processing is necessary to
protect the vital interests of the data subject or of another person where the data subject is physically or legally
incapable of giving his consent”).
158
Madelyn Chung, Playboy Model Dani Mathers Slammed for Publicly Body Shaming Nude Woman at Gym,
Huffington Post Canada (Jul. 15, 2016), [Link] n
[Link].
159
Nissenbaum, Privacy in Context, at 2.

context in which the information is shared and the norms governing that particular context.160
Her analysis considers:
1. The context in which the information is being shared,
2. The sender, recipient, and the subject of the information,
3. The nature of the shared information, and
4. The transmission principles underlying the information sharing.161
Based on the answers to these questions, Nissenbaum considers whether there are societal norms
for the information flow that have developed in analogous situations and applies those norms to
the situation at hand.162 If the flow doesn’t follow the relevant norms, it violates contextual
integrity.163 This often happens when a flow alters the nature of the information, shares the
information with additional individuals, or changes the transmission principles that were
understood at the time when the information was shared.164
Nissenbaum explains that breaking social norms can sometimes be desirable, but only if it
results in new norms that are morally or politically superior.165 The new norms must benefit the
society with respect to freedom, justice, fairness, equality, democracy, and similar important
concerns.166 Nissenbaum weighs the benefit to the society from the change against the interests
protected by the previous norms.167
To illustrate how the theory of contextual integrity would apply to face recognition applications, we apply it to the use of face recognition technology in social networks. Social network
applications may violate contextual integrity by transforming the nature of information from
photos that users share with their friends to biometric data that could be used by anyone to
identify them.168 If a mobile app further taps into the biometric database of the social network to
allow people to recognize the social network users on the street, it further violates contextual
integrity by changing the transmission principles from a strictly online context to offline tracking
of individuals.
This contextual analysis can be taken into account when designing regulation of face
recognition applications. For example, the FTC applied a similar analysis in its recommendations on face recognition technology.169 Noting that face recognition technology is not consistent with the context in which users share photos on a social network, the FTC stated that users must get separate notice and the ability to opt out before their photos are used for automatic face recognition.170 Although the FTC does suggest that the ability to opt out can be used for this, it is far from ideal. Social media users rarely use opt-out settings.171 A notice with the ability to opt in is therefore preferable when data will be used in a new way outside the context in which it was originally shared.

160
Id. at 125–126.
161
Id. at 149–150.
162
Id. at 138, 149–150.
163
Id. at 10.
164
Id. at 150.
165
Id. at 165.
166
Id. at 182.
167
Id. at 182.
168
Welinder, A Face Tells More than a Thousand Posts, at 186–88.
169
Facing Facts, at 18.
170
Id. at 18–19.
171
See Michelle Madejski et al., The Failure of Online Social Network Privacy Settings, Future of Privacy Forum (July
2011), [Link]

Consider Government Access to Privately Collected Biometric Data


When regulating collection and use of biometric data for consumer applications, we also need
to make sure that it doesn’t enable government agencies to unreasonably search private data.172
The US Fourth Amendment jurisprudence that protects against unreasonable searches mostly
developed before companies were collecting the vast amount of data that they collect today.173
While a government agency would need a warrant to look for photos in a person’s home, it only
needs a subpoena or a court order issued pursuant to a lower standard than a warrant to obtain
that person’s photos when they are stored by a company.174 The law in this area is clearly
outdated, given that the photos stored by the company are likely to include more personal
information than photos found in a person’s home: digital photos often include time and
location metadata and have labels identifying the people in the photos.
Photos stored for consumer applications enjoy even less protection from government search
when an agency purports to be investigating something related to foreign intelligence.175 As long
as a foreign intelligence issue is a “significant purpose” of an investigation, even if not the
primary purpose, ordinary electronic surveillance protections can be suspended.176
Regulation should plan for a future where government agencies continue tapping into
privately collected biometric data because governments’ own biometric databases may not be
as effective at identifying individuals. There have been multiple reports that government
agencies are developing their own biometric databases to identify suspects.177 But unlike many
consumer apps, the government’s face recognition system will not be able to contextualize the
matching of faces based on which people are more likely to appear in photos with particular
friends.178 Consumer apps can also rely on their users to confirm or deny automatic identification of their friends, training the identification algorithm every time. As a result, government
agencies will continue seeking access to photos and biometric data stored in consumer apps.
Privately collected biometric data is simply inseparable from issues of government surveillance. To avoid having consumer apps become conduits for surveillance of their users, regulation should encourage the companies to design services to avoid collecting or retaining

172
See, e.g., Laura K. Donohue, NSA Surveillance May Be Legal But It’s Unconstitutional, Wash. Post (June
21, 2013), [Link]
b9ddec20-d44d-11e2-a73e-826d299ff459 [Link].
173
Digital Duplications and the Fourth Amendment, 129 Harv. l. rev. 1046 (2016), [Link]
digital-duplications-and-the-fourth-amendment/.
174
Stored Communications Act, 18 U.S.C. § 2703(b). To obtain a court warrant under this provision, a law enforcement agency
only needs to provide “specific and articulable facts showing that there are reasonable grounds to believe that the contents of
a wire or electronic communication, or the records or other information sought, are relevant and material to an ongoing
criminal investigation” (18 U.S.C. § 2703(d)). Significantly, the agency does not need to show reasonable belief that evidence
of a crime exists in the location to be searched, as it would need to do for a search warrant.
175
See Foreign Intelligence Surveillance Act, 50 U.S.C. §§ 1804(a)(7)(B), 1823 (a)(7)(B), and 1881(a).
176
Id.; In re Sealed Case, 310 F.3d 717 (FISA Ct. Rev. 2002) (“FISA, as amended, does not oblige the government to
demonstrate to the FISA court that its primary purpose in conducting electronic surveillance is not criminal
prosecution.”).
177
Next Generation Identification, Fed. Bureau of Investigation, [Link] bio
metrics/ngi; FBI Criminal Justice Information Services Division Staff Paper: Update on Next Generation Identifica-
tion, Electronic Frontier Found (June 2012), [Link]
identification; FBI Performs Massive Virtual Line-up by Searching DMV Photos, Electronic Privacy Info.
Center (June 17, 2013), [Link]
178
This may become less valuable given the recent advancement in face recognition technology that relies on neural
networks.

unnecessary data.179 They can also be encouraged to implement local storage and end-to-end
encryption, when that is compatible with their services. This will ensure that government
agencies have to obtain the data directly from the user, which is more analogous to when law
enforcement collects fingerprints from a criminal suspect or searches for photos in the suspect’s
home pursuant to a specific warrant.

recommendations for developers


The best privacy practices come from carefully designed apps rather than from regulation.
Regulation of quickly evolving technology will always lag behind. This lag is actually desirable, because it gives regulators time to observe how the technology evolves, to adopt more meaningful regulations only when necessary, and to avoid blocking innovation. But that does not mean that companies should abuse personal data until their applications are regulated. Bad privacy practices result in poor service for users and erode user trust. That is ultimately bad for business, particularly for a new application that is trying to build up its reputation. It can also provoke knee-jerk regulations, blocking further innovation in the field.
Developers should aim to create applications that protect personal data given that they are in
the best position to know what data they collect and how they use it. In this section, we discuss
three design principles that can help with developing more privacy protective applications when
using face recognition technology. Additional design principles may later become relevant as
we start seeing new applications of face recognition technology.

Leverage User Experience Design as a Notice Tool


When designing applications with face recognition technology, developers need to consider
how their data collection and processing are experienced by users and those around the users.
The best scenario is when the use of face recognition technology is consistent with the general
context of the application as experienced by the user and the people they photograph. When the
use is less obvious, developers should consider how they can actively notify users and other
affected individuals through the interface. User experience design could help to instinctively
make a user aware of data collection without the need to read or understand a privacy policy.
It can also provide notice to individuals beyond the primary user of a product, which, as discussed in Section IV(B), is particularly relevant for face recognition technology.
People tend to know that they are being photographed when a camera directed at them emits
a shutter sound or a flash.180 Another example of intuitive design is security cameras that have a
video screen next to them directly showing people when they are being recorded.181 Similar
intuitive design could be created for devices equipped with face recognition technology. They
could, for example, loudly state a person's name as they recognize her, as well as the source of the
biometric data. Such an announcement would put the person on notice that she is being
recognized, and allow her to take action if she doesn’t want to be part of the biometric database.
The device could also send an electronic message to the identified person stating the time and
place of the identification as well as the identity of the user of the face recognition device to

179
See ECPA Reform: Why Now? Digital Due Process, [Link]
37940370–2551-11DF-8E02000C296BA163.
180
See Calo, at 1036–37 (“Analog cameras make a click and, often, emit a flash when taking a picture.”).
181
See Photo of Self-Checkout at Home Depot, Flickr (Apr. 19, 2011), [Link]
5635513442/in/photostream/.

provide mutual transparency. While the exact implementation of these types of features might
vary, the general idea of notifying people when they are automatically recognized is a palpable
example of privacy protective user experience design.
Another solution is to design an application to simulate people’s preexisting expectations
about who is able to recognize their face. For example, an application could allow a user to run
face recognition only for the individuals whom this user had previously identified. This design
would play on people’s expectations that a person they interact with may remember them next
time, no matter how brief the initial interaction. It would also allow the biometric data to be
stored locally on one device, making it less susceptible to abuse.182
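
As a rough sketch of this design pattern (in Python, with the file path and function names entirely hypothetical), an application might keep templates only for people the user has already tagged, store them on the device, and decline to recognize anyone else.

    import json
    from pathlib import Path
    from typing import Optional

    # Hypothetical on-device store; nothing is uploaded to a remote server.
    LOCAL_STORE = Path.home() / ".face_app" / "known_people.json"

    def load_known_people() -> dict:
        # Templates exist only for people the user previously tagged on this device.
        if LOCAL_STORE.exists():
            return json.loads(LOCAL_STORE.read_text())
        return {}

    def tag_person(name: str, template_hex: str) -> None:
        # Only an explicit tag by the user adds a template, and only to local storage.
        known = load_known_people()
        known[name] = template_hex
        LOCAL_STORE.parent.mkdir(parents=True, exist_ok=True)
        LOCAL_STORE.write_text(json.dumps(known))

    def recognize_locally(probe_hex: str) -> Optional[str]:
        # Strangers stay unidentified; exact comparison stands in for real similarity scoring.
        for name, template_hex in load_known_people().items():
            if template_hex == probe_hex:
                return name
        return None

Because the store never leaves the device, a compromise of the application's servers would not by itself expose the biometric data.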

Collect Less; Delete More


Excessive data collection and retention create the risk that the data can be misused down the
road. This is particularly problematic when dealing with sensitive data, such as biometrics.
Applications with face recognition technology should therefore actively limit data collection to
what is absolutely necessary to provide a service and regularly delete data when it is no longer
needed. If a user deletes her account with an application or turns off the face recognition function in an application with broader functionality, the biometric data is obviously no longer
needed and should be deleted. Importantly, the data should be deleted from the biometric
database and not only from the user interface of the application.183
When designing applications with face recognition technology, it is best to adopt data
retention policies early in the development process. That way, the policy can be optimized
before the application becomes overwhelmed with data. Having an effective data retention
policy will allow a company to prevent data from being misused within the company, help
safeguard the data from third parties, and make it easier to respond to law enforcement demands.
If an application collects limited data and promptly deletes it when it is not necessary to serve the
users, the company in charge of the application can easily reject government demands for data
that it does not store.184
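
For illustration, a minimal retention sketch (in Python, assuming a hypothetical biometric_templates table with timestamps stored as UTC ISO strings, and a thirty-day window chosen purely for the example) might couple an opt-out handler with a scheduled purge job, deleting templates from the database itself rather than merely hiding them in the interface.

    import sqlite3
    from datetime import datetime, timedelta, timezone

    # Hypothetical schema: biometric_templates(user_id, template, last_needed_at)
    RETENTION_WINDOW = timedelta(days=30)  # example value, set by the retention policy

    def disable_face_recognition(db: sqlite3.Connection, user_id: str) -> None:
        # User opts out or deletes the account: remove templates from the database itself.
        db.execute("DELETE FROM biometric_templates WHERE user_id = ?", (user_id,))
        db.commit()

    def purge_stale_templates(db: sqlite3.Connection) -> int:
        # Scheduled job: delete templates that are no longer needed to serve the user.
        cutoff = (datetime.now(timezone.utc) - RETENTION_WINDOW).isoformat()
        deleted = db.execute(
            "DELETE FROM biometric_templates WHERE last_needed_at < ?", (cutoff,)
        )
        db.commit()
        return deleted.rowcount

Adopting this kind of routine early also documents, for regulators and for law enforcement, exactly what the company can and cannot produce.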

Security by Design
The data used by face recognition applications must be safeguarded carefully given that, unlike a
compromised password or a stolen credit card, a person’s facial features cannot just be
replaced.185 Ideally, the biometric data should be stored locally in an encrypted format as much as possible.186 But many applications require the data to be transferred between multiple servers.
In that case, it is important to make sure that the data is encrypted and travels via encrypted
182
Russel Brandom, Apple’s New Facial Recognition Feature Could Spur Legal Issues, The Verge (June 16, 2016),
[Link]
183
See, e.g., Facebook Data Use Policy (stating that “some information may remain in backup copies and logs for up to
90 days” after an account is deleted); but see Facing Facts, at 18 n.70 (referring to Facebook’s testimony that
“Facebook deleted any previously collected biometric data” “if a user opted out of Facebook’s ‘Tag Suggest’ feature”).
184
See, e.g., Request for User Information Procedures & Guidelines, The Wikimedia Foundation (May 14, 2015), https://
[Link]/wiki/Requests for user information procedures %26 guidelines (explaining that user
data requests to the Wikimedia Foundation may be futile because it “collects very little nonpublic information (if
any) that could be used to identify its users offline and it retains that information for a limited amount of time”).
185
See Face Facts: A Forum on Face Recognition Technology, Fed. Trade Comm’n 1 (Dec. 8, 2011), [Link]
video-library/transcripts/120811 FTC [Link] (Alessandro Acquisti testifying, “It’s much easier to change your name
and declare ‘reputational bankruptcy’ than to change your face.”).
186
WP29 Opinion, at 8.

channels.187 The multiple servers can be leveraged to store different pieces of a person’s
biometric data to make it more difficult for a third party to access the biometric data when only
one of the servers is compromised.188
There has been significant research into designing biometric systems to protect personal
data.189 One approach is to distort an image during the face recognition process to avoid storing
original images of individuals.190 Another approach is to hash the biometric data in the course of
the recognition process.191 But both these methods have had a negative impact on the effectiveness of the face recognition process.192 A more effective method is to transform biometric data
into two components where the component that pertains to a person’s identity is encrypted and
can be entirely revoked if the system is compromised.193
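As an illustration of encryption at rest, the sketch below uses the widely available Python cryptography package to encrypt a face template before it is persisted. It is a sketch under stated assumptions, not any system's actual implementation: in practice the key would be kept in a hardware-backed keystore rather than alongside the ciphertext, and any transfer between servers would additionally run over an encrypted channel such as TLS.

```python
# Requires the third-party "cryptography" package (pip install cryptography).
# Function names are assumptions for the example only.
from typing import Tuple

from cryptography.fernet import Fernet

def encrypt_template(template: bytes) -> Tuple[bytes, bytes]:
    key = Fernet.generate_key()              # keep in a secure keystore
    ciphertext = Fernet(key).encrypt(template)
    return key, ciphertext                   # persist only the ciphertext locally

def decrypt_template(key: bytes, ciphertext: bytes) -> bytes:
    return Fernet(key).decrypt(ciphertext)
```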

conclusion
Biometric technology is rapidly evolving, both in its capabilities and in its use. Consumers are
presented with a dizzying array of advancements, from clever mobile apps to pervasive government surveillance. As difficult as it may be to track changes to the technology itself, it is almost as
difficult to stay abreast of changes to the laws regulating the technology.
While consumer privacy is not always considered in this development, there are signs that
consumers themselves, as well as governments, are working to address privacy concerns. For
better or for worse, lawsuits under the Illinois Biometric Information Privacy Act will likely
continue to be filed in the United States, and courts seem willing to hear them. Data protection
authorities have urged caution in gathering biometric data and said that consumers need the
ability to make informed choices. Regulators should continue to emphasize the importance of
transparency, notice, and consent. They should also focus on specific uses of face recognition
technology, rather than on the technology in general, to avoid stifling innovation. Further, they
need to ensure that law enforcement and other government agencies consider personal privacy
interests in their efforts to access privately held databases of face recognition information.
Independent of regulation, application developers can take steps to inform and protect
consumers. Good interface design can make it clear what data is being collected and when.
Implementing security by design can help keep the information safe. Generally, developers
should strive to collect less and delete more.
Face recognition technology is here to stay and so are efforts to regulate it and litigate about
it. We can only guess what steps the technology and the laws will take next; the last few years
suggest that it will be a rocky, but interesting, path.

187
Margarita Osadchy et al., SCiFI: A System for Secure Face Identification, Benny Pinkas, 1 (May 2010), [Link]
.net/PAPERS/scifi.pdf; see also WP29 Opinion, at 8.
188
Osadchy et al., at 1.
189
See T. Boult, Robust Distance Measures for Face-Recognition Supporting Revocable Biometric Tokens, University of
Colorado at Colorado Springs and Securics, Inc, Preprint (2006), available at [Link]
~tboult/PAPERS/[Link].
190
Ratha et al., Enhancing Security and Privacy in Biometrics-Based Authentication Systems, 2001, [Link].
191
Tulyakov et al., Symmetric Hash Functions for Secure Fingerprint Biometric Systems, 2004, Pattern Recognition Letters, 28(16):2427–2436, [Link].
192
See Boult, supra note 189.
193
Id. [Link]
7

Smart Cities: Privacy, Transparency, and Community

Kelsey Finch and Omer Tene*

introduction
At the beginning of the 20th century, a group of Italian artists and poets called the Futurists
sought to reshape the world around them to reflect a futuristic, technological aesthetic, transforming everything from cities, to train stations, to chess sets.1 In the city of the future, they
believed, technology would inspire and elevate the physical and mental world: “Trains would
rocket across overhead rails, airplanes would dive from the sky to land on the roof, and
skyscrapers would stretch their sinewed limbs into the heavens to feel the hot pulse of radio
waves beating across the planet.”2
But today’s cities, along with our train stations, self-driving cars, and chess sets, have moved far beyond artistic imagination. Today’s cities are already pervaded by growing networks of connected technologies that generate actionable, often real-time data about themselves and their citizens. Relying on ubiquitous telecommunications technologies to provide connectivity to sensor networks and set actuation devices into operation, smart cities routinely collect information on cities’
air quality, temperature, noise, street and pedestrian traffic, parking capacity, distribution of
government services, emergency situations, and crowd sentiments, among other data points.3
While some of the data sought by smart cities and smart communities is focused on environmental
or non-human factors (e.g., monitoring air pollution, or snowfall, or electrical outages), much of the
data will also record and reflect the daily activities of the people living, working, and visiting the city
(e.g., monitoring tourist foot traffic, or home energy usage, or homelessness). The more connected a
city becomes, the more it will generate a steady stream of data from and about its citizens.
Sensor networks and always-on data flows are already supporting new service models and
generating analytics that make modern cities and local communities faster and safer, as well as

* Kelsey Finch is Policy Counsel and Omer Tene is Senior Fellow at the Future of Privacy Forum. Tene is Vice
President of Research and Education at the International Association of Privacy Professionals and Associate Professor at
the College of Management School of Law, Rishon Lezion, Israel.
1
See Adam Rothstein, The Cities Science Fiction Built, Motherboard (Apr. 20, 2015), [Link]
read/the-cities-science-fiction-built; Italian Futurism, 1909 1944: Reconstructing the Universe (Vivian Greene,
ed., 2014).
2
Adam Rothstein, The Cities Science Fiction Built, Motherboard (Apr. 20, 2015), [Link]
the-cities-science-fiction-built (“This artistic, but unbridled enthusiasm was the last century’s first expression of
wholesale tech optimism.”).
3
See Shedding Light on Smart City Privacy, The Future of Privacy Forum (Mar. 30, 2017), [Link]
smart-cities/.


more sustainable, more livable, and more equitable.4 At the same time, connected smart city
devices raise concerns about individuals’ privacy, autonomy, freedom of choice, and potential
discrimination by institutions. As we have previously described, “There is a real risk that, rather
than standing as ‘paragons of democracy,’ [smart cities] could turn into electronic panopticons in
which everybody is constantly watched.”5 Moreover, municipal governments seeking to protect
privacy while still implementing smart technologies must navigate highly variable regulatory
regimes,6 complex business relationships with technology vendors, and shifting societal and
community norms around technology, surveillance, public safety, public resources, openness,
efficiency, and equity.
Given these significant and yet competing benefits and risks, and the already rapid adoption of
smart city technologies around the globe,7 the question becomes: How can communities
leverage the benefits of a data-rich society while minimizing threats to individuals’ privacy and
civil liberties?
Just as there are many methods and metrics to assess a smart city’s livability, sustainability, or
effectiveness,8 so too there are different lenses through which cities can evaluate their privacy
preparedness. In this article, we lay out three such perspectives, considering a smart city’s privacy
responsibilities in the context of its role as a data steward, as a data platform, and as a government
authority. While there are likely many other lenses that could be used to capture a community’s
holistic privacy impacts, exploring these three widely tested perspectives can help municipalities
better leverage existing privacy tools and safeguards and identify gaps in their existing frameworks. By considering the deployment of smart city technologies in these three lights, communities will be better prepared to reassure residents of smart cities that their rights will be respected
and their data protected.

city as data steward


In many ways, smart communities are no different than other data-rich entities: they act as data stewards, ensuring that data is always available, reliable, and useful to their organization. Data
stewardship is well established in the information management field, denoting an individual or
institution with control over the collection, handling, sharing, and analysis of data.9 Increasingly,
data governance in data-rich organizations concerns not only carrying out the day-to-day management of data assets, but also taking on fiduciary-like responsibilities to consider the

4
See, e.g., Smart Cities: International Case Studies, Inter Am. Dev. Bank, (2016), [Link]
emerging-and-sustainable-cities/international-case-studies-of-smart-cities,[Link] (last visited Mar. 31, 2017).
5
Kelsey Finch & Omer Tene, Welcome to the Metropticon: Protecting Privacy in a Hyperconnected Town, 41 Fordham
Urb. L.J. 1581, 1583 (2015), [Link] 2549&context ulj.
6
Including regulatory regimes and principles that create sometimes competing obligations to keep personal data private
while also making the data held by government more transparent and accessible to the public. See, e.g., Report of the
Special Rapporteur on the Right to Privacy, Annex II, Human Rights Council, U.N. Doc. A/HRC/31/64 (May 8, 2016)
(by Joseph A. Cannataci).
7
See Research & Markets, Global Smart Cities Market Insights, Opportunity Analysis, Market Shares and
Forecast 2017 2023 (Jan. 2017).
8
See ISO 37120:2014 Sustainable development in communities: City Indicators for Service Delivery and
Quality of Life (2014), [Link] briefing [Link].
9
See Mark Moseley, DAMA DBOK Functional Framework, The Data Mgmt. Ass’n. (Version 3.02, Sept. 10, 2008),
[Link] Functional Framework v3 02 [Link].

ethical and privacy impacts of particular data activities and to act with the best interests of
individuals and society in mind.10
All organizations dealing with data must ensure that their data assets are appropriately
secured, handled, and used. While privacy laws and commitments give organizations in both
the private and public sectors clear motivations to protect personally identifying information
(PII), non-personal data is oftentimes just as robustly safeguarded because of concerns for intellectual property and trade secrets. While this paper focuses on methods designed to protect PII, data governance and accountability mechanisms instituted throughout the data lifecycle often mitigate risks to both PII and non-PII.11
Companies, NGOs, and government agencies of all stripes are familiar to some extent with
the variety of roles and responsibilities that accompany the day-to-day use and maintenance of
data assets and that keep systems running smoothly: IT departments ensure that data is secure
and uncorrupted, lawyers oversee compliance with privacy and other legal regimes, engineers
architect new and better ways of developing data, researchers explore datasets for new insights,
and business units and policy teams determine what data to collect and how to use it. In the
municipal context, oftentimes it is a chief innovation officer (CIO), a chief technology officer
(CTO), a chief data officer (CDO), or increasingly a chief privacy officer (CPO) who oversees
this process and inculcates institutional norms around privacy and security.
Thus, the data steward model, and many of the data (and privacy) governance tools and terminology that accompany it, is already familiar to a wide set of IT, compliance, and privacy professionals in the private sector.12 It is also familiar to career civil servants in public sector entities, especially data-intensive environments such as national security, healthcare, and education.13 As municipalities expand their technology and data capabilities, many of the professionals they hire will bring with them experience with data and privacy governance.
Nevertheless, municipalities need to be forward-thinking and purposeful in planning, supervising, and controlling data management and use within and between their numerous departments, agencies, and public-private partnerships.
What tools and considerations, then, should smart cities take into account in their role as data
stewards?
Privacy management. As data stewards, cities must devise privacy management programs to
ensure that responsibility is established, accountability is maintained, and resources are allocated
to successfully oversee, govern, and use individuals’ data. Documenting and routinizing these
principles and practices throughout the entire data lifecycle are critical to ensuring
accountability.

10
See Jan Whittington et al., Push, Pull, and Spill: A Transdisciplinary Case Study in Municipal Open Government, 30
(3) Berkeley Tech. L.J. 1989 (2015), [Link] 3/1899–1966%[Link]; Jack
Balkin & Jonathan Zittrin, A Grand Bargain to Make Tech Companies Trustworthy, The Atlantic (Oct. 3, 2016),
[Link]
11
We note, as well, that the line between PII and non-PII is often indistinct, and that data traditionally considered non-
identifying may become PII in the future as look-up databases or technical systems emerge to link that data to
individuals. See Jules Polonetsky, Omer Tene & Kelsey Finch, Shades of Gray: Seeing the Full Spectrum of Practical
Data De-identification, 56 Santa Clara L. Rev. 593 (2016), [Link]
Given these risks, instituting a variety of forward-thinking safeguards throughout the full data lifecycle is critical.
12
See, e.g., Moseley, supra note 9.
13
See, e.g., Susan Baird Kanaan & Justine M. Carr, Health Data Stewardship: What, Why, Who, How, Nat’l Comm. on
Vital & Health Statistics, U.S. Dep’t of Health & Human Svcs. (Sept. 2009), [Link]
content/uploads/2014/05/[Link]; Data Governance and Stewardship, Privacy Technical Assistance Ctr., U.S.
Dep’t of Educ. (Dec. 2011), [Link]

The core of any privacy management program is establishing principles and practices that
apply to collecting, viewing, storing, sharing, aggregating, analyzing, and using personal data. In
order to strengthen public trust, some city leaders have used the process of developing core
privacy principles as an opportunity to engage their communities. For example, as the City of
Seattle developed its six citywide “Privacy Principles,”14 the Seattle IT department created a
Community Technology Advisory Board (CTAB), made up of local experts, business represen
tatives, and academics from the University of Washington,15 and invited their input, as well as
that of local privacy advocacy groups.16 The principles were ultimately adopted by City Council
Resolution 31570,17 and laid the foundation for a more in depth, public facing privacy policy
detailing the city’s privacy and security practices.18
Once their guiding principles are established, there are many models that city officials might
turn to in building workable, auditable municipal privacy programs. In 2016, the Federal Office
of Management and Budget (OMB) updated Circular A-130, the document governing the
management of federal information resources.19 Recognizing the impact of big data trends on
government data management, the updated OMB Circular requires federal agencies to:

• Establish comprehensive, strategic, agency-wide privacy programs;


• Designate Senior Agency Officials for Privacy;
• Manage and train an effective privacy workforce;
• Conduct Privacy Impact Assessments (PIA);
• Apply NIST’s Risk Management Framework to manage privacy risk throughout the information system development life cycle;
• Use Fair Information Practice Principles (FIPPs) when evaluating programs that affect
privacy;
• Maintain inventories of personally identifiable information (PII); and
• Minimize the collection and usage of PII within agencies.20

Another leading example in the United States has emerged from the Federal Trade Commission’s (FTC) body of privacy and security settlements. The FTC’s model is likely to influence
many of the technology vendors a city might partner with, who are expected to stay in line with
their primary regulator’s best practices or may be already under a settlement order themselves.21
In broad strokes, the FTC has increasingly required settling companies to maintain a privacy
program that:

14
See City of Seattle Privacy Principles, City of Seattle (Mar. 2015), [Link]
[Link].
15
See CTAB Blog, City of Seattle, [Link] (last visited Apr. 24, 2017).
16
See City of Seattle’s Tech Board Restarts Privacy Committee, Seattle Privacy Coal. (Sept. 27, 2016), [Link]
.[Link]/city-of-seattles-tech-board-restarts-privacy-committee/.
17
See Res. 31570, A resolution adopting the City of Seattle Privacy Principles governing the City’s operations, which will
provide an ethical framework for dealing with current and future technologies that impact privacy, and setting timelines
for future reporting on the development of a Privacy Statement and Privacy Toolkit for their implementation, Seattle
City Council (Mar. 3, 2015), [Link] [Link].
18
See Privacy, City of Seattle, [Link] (last visited Apr. 24, 2017).
19
See Revision of OMB Circular A-130, “Managing Information as a Strategic Resource,” FR Doc. 2016–17872 (July 28,
2016), [Link]
20
See Circular No. A-130, Managing Information as a Strategic Resource, Appendix II, Office of Mgmt. & Budget,
Exec. Office of the President (July 28, 2016), [Link]
OMB/circulars/a130/[Link].
21
See Daniel Solove & Woodrow Hartzog, The FTC and the New Common Law of Privacy, 114 Colum. L. Rev. 583
(2014), [Link] id 2312913.

• Is reasonably designed to: (1) address privacy risks related to the development and management of new and existing products and services for consumers, and (2) to protect the privacy and confidentiality of personal information;
• Is fully documented in writing;
• Contains privacy controls and procedures appropriate to the organization’s size and complexity, the nature and scope of its activities, and the sensitivity of the personal information;
• Designates an employee or employees to coordinate and be accountable for the privacy
program;
• Conducts a privacy risk assessment identifying reasonably foreseeable, material risks, both
internal and external, that could result in the organization’s unauthorized collection, use,
or disclosure of personal information, specifically including risks related to employee
training and management and product design, development, and research;
• Implements reasonable privacy controls and procedures to address the risks identified
through the privacy risk assessment, and regularly tests or monitors their effectiveness;
• Takes reasonable steps to select and retain service providers capable of maintaining
appropriate security practices, and requires service providers to contractually implement
appropriate safeguards; and
• Reevaluates and adjusts the privacy program in light of any new material risks, material
changes in the organization’s operations or business arrangements, or any other circum
stances that might materially impact the effectiveness of the privacy program.22
International standards and regulations add clarity on the importance of robust privacy programs. The OECD’s 1980 privacy guidelines, amended in 2013, dictate that data controllers
should “Have in place a privacy management programme that:

• Gives effect to these Guidelines for all personal data under its control;
• Is tailored to the structure, scale, volume and sensitivity of its operations;
• Provides for appropriate safeguards based on privacy risk assessment;
• Is integrated into its governance structure and establishes internal oversight mechanisms;
• Includes plans for responding to inquiries and incidents; [and]
• Is updated in light of ongoing monitoring and periodic assessment.”23
The new European General Data Protection Regulation (GDPR), which goes into force in May
2018, will also require the appointment of Data Protection Officers within organizations of all
shapes and sizes, and the establishment of structured accountability programs.24 Indeed, the
Article 29 Working Party believes that “the DPO is a cornerstone of accountability.”25 DPOs are
at a minimum tasked with monitoring their organizations’ compliance with the GDPR; advising
their organizations in the course of conducting data protection impact assessments (DPIAs);
taking risk based approaches to their data protection activities; and maintaining records of all of
their organizations’ data processing operations.26 While uncertainty remains as to the precise

22
See id. at 617.
23
See OECD Guidelines Governing the Protection of Privacy and Transborder Flows of Personal Data, The OECD
Privacy Framework, 16 (July 11, 2013), [Link]
24
See Regulation 2016/679, General Data Protection Regulation, art. 37, 2016 O.J. (L. 119) 1, [Link]
data-protection/reform/files/regulation oj [Link]; Art. 29 Data Protection Working Party, Guidelines on Data
Protection Officers (Dec. 13, 2016), [Link] society/newsroom/image/document/2016–51/
wp243 en [Link].
25
Art. 29 Working Party, supra note 24, at 4.
26
Id. at 16–18.

contours of the DPO role within EU data protection practice, the International Association of
Privacy Professionals (IAPP) estimates that at least 28,000 new DPO positions will be created in
the coming years in Europe alone in response to the GDPR.27
Privacy oversight. Designating a governance lead (such as a Chief Privacy Officer, a DPO or
a Senior Agency Official for Privacy28) who oversees privacy responsibilities can create an
authoritative hub where dedicated experts navigate relevant laws and regulations, advise other
officials and departments, create documentation and policies, look for and remediate violations,
and educate the public workforce on privacy policies and practices. As data stewards, smart cities
should clearly establish governance structures and oversight mechanisms for granting access to
data, analytics, and tracking technologies.
In addition to designating a privacy lead, smart cities should consider the value of establishing a
privacy committee made up of a range of stakeholders, including members of the public. Such
working groups are common within the privacy profession, and often stress interdisciplinary representation within the working groups to improve outcomes, make programs more inclusive, and generate buy-in throughout an organization.29 Seattle’s Community Technology Advisory Board is formalized in
the Seattle Municipal Code,30 and in January 2016 the Oakland City Council created and defined the
duties of a formal Privacy Advisory Commission, tasked with (among other things): providing advice
and technical assistance on privacy best practices for surveillance equipment and citizen data,
providing annual reports and recommendations on the city’s use of surveillance equipment, conducting public hearings, drafting reports, and making findings and recommendations to the city council.31
While only Seattle and Oakland have established formal citywide privacy advisory boards at
this date, specific agencies within local government have also turned to their communities for
input local libraries, for instance, have long been on the forefront of progressive citizen
engaged privacy policymaking.32 And the original Array of Things installation in Chicago, a
partnership between the City of Chicago, the University of Chicago, and the Argonne National
Laboratory, has convened independent governance boards responsible for overseeing the privacy
and security practices of its distributed sensor arrays and data processing activities.33
Privacy Risk Management. Robust data governance requires identifying, assessing, and
ultimately mitigating privacy risks. While many organizations have their own internal risk
management structures, privacy-specific frameworks are less systematic and, given the largely subjective nature of privacy harms, more difficult to quantify.34 One instructive risk management framework for municipal officials to consider is a recent effort by the US National Institute

27
See Warwick Ashford, GDPR Will Require 28,000 DPOs in Europe and US, Study Shows, Computer Weekly
(Apr. 20, 2016), [Link]
shows.
28
See Office of Mgmt. & Budget, supra note 20.
29
See IAPP-EY Annual Privacy Governance Report 7 (2015), [Link] -
IAPP ey privacy governance report 2015/$FILE/[Link].
30
Seattle Community Technology Advisory Board (CTAB) Membership and duties, Seattle Municipal
Code 3.23.060, [Link] code?nodeId TIT3AD SUBTITLE
IIDEOF CH3.23SEINTEDE 3.23.060SECOTEADBOCTEMDU (last visited Apr. 24, 2017).
31
Privacy Advisory Commission, City of Oakland, [Link]
PrivacyAdvisoryCommission/[Link] (last visited Apr. 24, 2017).
32
See, e.g., San Francisco Pub. Library Tech. & Privacy Advisory Comm., Summary Report: Radio Frequency
Identification and the San Francisco Public Library (Oct. 2005), [Link]
[Link].
33
See Array of Things Operating Policies (Aug. 15, 2016), [Link]
34
See NIST Internal Report (NISTIR) 8062, Privacy Risk Management for Federal Information Systems 1
(Jan. 4, 2017), [Link] 8062 [Link] (“Although existing tools such as
the Fair Information Practice Principles (FIPPs) and privacy impact assessments (PIAs) provide a foundation for taking

of Standards and Technology (NIST) to develop a comprehensive system for “Privacy Risk
Management for Federal Information Systems.”35 NIST explicitly modeled this effort on its
successful cybersecurity risk management framework (RMF) and accompanied the development of a privacy risk model with a foundation for “the establishment of a common vocabulary
to facilitate better understanding of and communication about privacy risks and the effective
implementation of privacy principles in federal information systems.”36
A recurring challenge for smart communities in deploying risk mitigation strategies is that
reducing privacy risk often entails impairing data utility, thus inhibiting potentially beneficial
uses of data. For smart communities and other organizations, considering the risks of a project is
merely one part of a balanced value equation; decision makers must also take into account the
project’s benefits in order to make a final determination about whether to proceed.37 In another
article, we suggested that in addition to conducting a Privacy Impact Analysis (PIA), therefore,
decision makers need to conduct a Data Benefit Analysis (DBA), putting a project’s benefits and
risks on an equal footing.38 This is especially true as cities, researchers, companies, and even
citizens engage in the sort of big data analyses that promise tremendous and often unexpected
benefits, but which also introduce new privacy and civil liberties concerns associated with large
scale data collection and analysis. On the one hand, if a city can provide free internet access for
thousands of un- or underserved individuals, for example, it may be legitimate to deploy such a
service even though not all of its privacy risks can be completely eliminated.39 On the other
hand, where smart city benefits are small or remote, larger privacy risks would not be justified.40
These assessments should take into account variables such as the nature of the prospective risk
or benefit, the identity of the impacted subject(s), and the likelihood of success. These
assessments should consider and document specific impacts to individuals, communities,
organizations, and society at large, in part to help determine whether risks and benefits are
accruing fairly and equitably across these populations.41 Cities must also negotiate the difficult
reality that social and cultural priorities and sensitivities may vary widely among their constituent
communities, and ensure that all interested members of the public can legitimately have their
voices heard on the potential impacts of civic projects.42
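One way a project team might record the inputs to such a side-by-side assessment is sketched below. The zero-to-one scales, the tolerance parameter, and the example entries are illustrative assumptions and are not drawn from the PIA or DBA frameworks cited above; a numeric score can support, but never replace, the qualitative judgments described in this section.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Impact:
    description: str
    magnitude: float   # 0.0-1.0: how large the benefit or harm would be
    likelihood: float  # 0.0-1.0: how likely it is to materialize

def expected_value(impacts: List[Impact]) -> float:
    return sum(i.magnitude * i.likelihood for i in impacts)

def proceed(benefits: List[Impact], risks: List[Impact],
            risk_tolerance: float = 1.0) -> bool:
    """Crude decision aid: go ahead only if expected benefits meet or exceed
    weighted expected privacy risks; fairness and community input still need
    qualitative review."""
    return expected_value(benefits) >= risk_tolerance * expected_value(risks)

# Hypothetical example: free public Wi-Fi for underserved neighborhoods.
benefits = [Impact("broadband access for underserved residents", 0.8, 0.9)]
risks = [Impact("location tracking of Wi-Fi users", 0.6, 0.4)]
print(proceed(benefits, risks))  # True under these illustrative scores
```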
Vendor management. Public-private partnerships have also emerged as a leading driver for smart city developments. Rather than simply outsourcing technical work to service providers, cities are increasingly co-organizing, co-operating, co-funding, and co-branding data-intensive projects with private enterprises.43 In such high-profile relationships, cities (and vendors) must

privacy into consideration, they have not yet provided a method for federal agencies to measure privacy impacts on a
consistent and repeatable basis.”).
35
Id.
36
Id. at 3.
37
Jules Polonetsky, Omer Tene & Joseph Jerome, Benefit Risk Analysis for Big Data Projects 1 (Sept. 2014),
[Link] DataBenefitAnalysis [Link] [hereinafter “DBA”].
38
Id.
39
See, e.g., Eillie Anzilotti, To Court a Skeptical Public, New York Sends Wi-Fi Ambassadors, CityLab (Aug. 12, 2016),
[Link]
40
See DBA, supra note 37, at 4.
41
Id. at 9.
42
Id. at 7 (“For example, the relative value of a health or national security benefit may differ from society to society.
Some societies may place a high value on individual benefits, while others give greater weight to community values.”).
43
See Inter-Sessional Panel on “Smart Cities and Infrastructure” and “Foresight for Digital Development,” U.N.
Conference on Trade & Dev. (Jan. 11–13, 2016), [Link] 2015 ppt07
Bufi [Link]; PPP for Cities Case Studies Quick Facts and PPP Learned Lessons, Specialist Centre on PPP in Smart
and Sustainable Cities (Nov. 17, 2016), [Link]

do their due diligence and clearly delineate each party’s responsibilities (and capacities) for
managing, using, sharing, securing, or destroying data; for communicating with the public about
privacy; and for supervising other contractors or subcontractors.
Even when initiating projects on their own, smart cities rely extensively on vendors, particularly in deploying, maintaining, and analyzing emerging data tools and technologies (including
procuring Internet of Things devices, maintaining sensor networks, and publishing open data to
platforms). As scholars have noted, “Vendors have different capabilities and incentives than a
municipal government; they may be more or less capable of keeping data secure, and are not
likely to be as responsive to residents as their city government . . . [and] stakeholders will
ultimately hold cities responsible as stewards and expect them to uphold constituent values.”44
An instructive example of how the dynamics between public sector agencies and technology
vendors can lead to gaps in individual privacy protection was presented in a study by the Center
for Law and Information Policy at Fordham Law School. In the study, researchers analyzed
contracts between US public schools and cloud computing service providers and found that
“only 25% of districts inform parents of their use of cloud services, 20% of districts fail to have
policies governing the use of online services, and a sizeable plurality of districts have rampant
gaps in their contract documentation, including missing privacy policies.”45 Their findings
showed that vendors often produced commercial boilerplate contracts that did not adequately
address the student data context46; that school districts lacked knowledgeable privacy officers and
staff; and that each party to a transaction expected that the other would raise any relevant privacy
and security concerns.47 These findings, and the fierce public backlash that ensued, should
serve as a warning sign for other smart community services, which leverage private sector vendor
technologies and business models but could create or exacerbate privacy risks.
Standard contract terms, such as data use limitations, data ownership, data security policies,
confidentiality statements, and commitments not to reidentify data, are crucial tools for ensuring individuals’ privacy in multilayered smart city ecosystems. Given that the technologies and data analysis and management tools required for smart city operations typically are sold or managed by third-party service providers, it is essential that city leaders carefully select, engage, and
supervise their vendors. It is also essential that privacy and security obligations flow with the
data, binding subcontractors to equivalent protections throughout the data lifecycle. City leaders
must continue to regularly monitor, assess, and audit whether service providers and other
partners continue to adhere to contracts and agreed upon practices.
Another reason that city leaders must be forward-thinking in selecting and contracting with their service providers is that, unlike dispersed public schools, cities are “market makers, not market takers.”48 Cities wield significant purchasing power, and by requiring commitments around privacy and security in their deals with vendors they can effectively set nationwide industry standards. Best practices and standard contractual clauses at the largest companies can then have a trickle-down effect. This is particularly true as cities turn to start-ups and smaller

44
See Whittington et al., supra note 10, at 1947.
45
See Joel Reidenberg et al., Privacy and Cloud Computing in Public Schools, Fordham Ctr. On L. and Info. Policy
(Dec. 12, 2013), [Link] US/[Link].
46
For example, including terms that would violate the Federal Educational Rights and Privacy Act, the Protection of
Pupil Rights Amendment, and the Children’s Online Privacy Protection Act. See id. at 35.
47
Id.
48
See Whittington et al., supra note 10, at 1954.

organizations for their technology services, which may not have the same institutional expertise
with privacy and security as large enterprise vendors.49
Data research and ethical reviews. Smart cities are becoming storehouses of hugely valuable
information for public and private researchers. But appropriating civic data that was originally
collected for another purpose without citizens’ knowledge or consent raises significant privacy
concerns and weighty ethical and technical questions.
Traditionally, privacy laws have envisioned researchers utilizing de-identification to unleash the value of data while protecting privacy.50 In recent years, however, advances in reidentification science and the increasing availability of external datasets with potentially revealing elements have led scientists and policymakers to doubt the reliability of de-identification measures to appropriately reduce the risk of an individual being reidentified from a dataset.51 A robust scholarly debate continues unabated to this day between data scientists, researchers, lawyers, and regulators over whether and to what extent data can be scientifically or legally considered de-identified.52
Nevertheless, even as the debate continues to rage, communities leveraging citizens’ data for
additional, secondary purposes, including conducting scientific research or business analytics,
must do so while respecting individuals’ privacy. If consent to use data in a particular manner is
not feasible to obtain, or de-identification unduly degrades the data or offers inadequate guarantees, urban data stewards must evaluate and document risk-benefit assessments as part
of a structured ethical review process.
In the United States, federal and federally supported institutions conducting human subject
research have been governed by the Common Rule since 1991, which is itself grounded in the
principles articulated by the Belmont Report of the 1970s. Under the Common Rule guidelines,
researchers who are studying human subjects seek the informed consent of their subjects or,
where that is not feasible, obtain the approval of an institutional review board composed of
trained experts from diverse backgrounds who are charged with balancing the risks to individuals
against the benefits of a research project. To the extent that cities accept federal funding, they
are also directly subject to the Common Rule.
At the same time, the sort of big data research and analysis that municipalities and even
corporate institutions increasingly are interested in has challenged existing legal and ethical
frameworks, including the Common Rule. For example, the Common Rule defines a human
subject as “a living individual about whom an investigator . . . conducting research obtains (1)
data through intervention or interaction with the individual, or (2) identifiable private information.”53 In the age of data-focused research, however, it is unclear whether research of large datasets collected from public or semi-public sources even constitutes human subject research, as it often requires no interaction with individuals or involves data that has been de-identified or

49
See, e.g., 80+ Startups Making Cities Smarter Across Traffic, Waste, Energy, Water Usage, and More, CB Insights
(Jan. 24, 2017), [Link] Jason Shueh, How Start-
ups Are Transforming the Smart City Movement, GovTech (Sept. 1, 2015), [Link]
[Link]; Ben Miller, 3 Reasons Some Local Governments Are Eschewing
Big Tech Vendors for Startups, GovTech (Oct. 27, 2016), [Link]
[Link].
50
See, e.g., Paul Schwartz & Dan Solove, The PII Problem: Privacy and a New Concept of Personally Identifiable
Information, 86 NYU L. Rev. 1814 (2011).
51
See Ira Rubinstein & Woodrow Hartzog, Anonymization and Risk, 91 Wash. L. Rev. 703 (2015), [Link]
.com/sol3/[Link]?abstract id 2646185.
52
See id.
53
45 C.F.R. 46.102(f ).

that was in the public domain.54 The size and scope of the data that researchers can now access, often involving the mining of massive datasets that can be years old and gathered from around the world, also tends to render traditional “informed consent” mechanisms ineffective.55 Furthermore, as research projects are initiated beyond traditional academic institutions,
new ethical review processes and principles will continue to develop.56 City officials who wish to
enable data research should be aware of the robust and ongoing discussions within the research
community about how to responsibly and ethically use data in the pursuit of knowledge.
Irrespective of the precise scope of the Common Rule, municipalities conducting data
research must consider ethical review processes and benefit-risk analyses as critical parts of
privacy and civil liberties protections, particularly when sensitive data or vulnerable populations
are concerned. Part of the difficult calculus of data-based research is determining when the risks
of using personal information in a particular way so strongly outweigh the benefits that a project
becomes unethical and should not be allowed to proceed or, conversely, when individuals
must assume some risk for the greater good. Without robust ethical review processes to help
researchers, data stewards, and publishers answer these questions, valuable data research results
could become locked away or research projects never even started for fear of public backlash or
regulatory action.57 Cities that seek to conduct research or to enable research by others must
address these difficult challenges, and should engage with researchers and ethicists to develop
new approaches to ethical review and big data research.58
As communities begin to actively push their data out into the hands of researchers, citizens,
businesses, and other civic constituencies, however, they move beyond the routine tools of
enterprise data management. When they act as platforms for data inputs and outputs, smart cities
must rely on additional tools to strike the right balance between protecting privacy and enabling
data use for the public good.

city as platform
Historically, communities have been governed by “nineteenth and twentieth century ideas of
civic organization and social norms . . . revolv[ing] around representative governance and
centrally directed bureaucracies overseen by experts using strict, formal rules of procedure.”59
Municipal data management has followed similar trends: data was often lost in siloed, incompatible systems, inaccessible to other agencies, let alone the public. Even where data was made public, it was often buried in labyrinthine city websites, in non-searchable or cross-linkable formats.60

54
Omer Tene & Jules Polonetsky, Beyond IRBs: Ethical Guidelines for Big Data Research 1
(Dec. 2015), [Link]
[Link].
55
See id.
56
See generally Conference Proceedings: Beyond IRBs: Ethical Guidelines for Big Data Research, Future
of Privacy Forum (Dec. 10, 2015), [Link]
12–[Link].
57
See Jules Polonetsky, Omer Tene & Joseph Jerome, Beyond the Common Rule: Ethical Structures for Data Research
in Non-Academic Settings, 13 Colo. Tech. L.J. 333, 336 (2015), [Link]
Polonetsky-Tene-fi[Link].
58
See Matthew Zook et al., Ten Simple Rules for Responsible Big Data Research, PLoS Comput. Bio. 13 (2017), http://
[Link]/ploscompbiol/article?id 10.1371/[Link].1005399.
59
David Bollier, The City as Platform: How Digital Networks Are Changing Urban Life and Governance, The Aspen Institute
Commc’ns & Soc’y Program (2016), [Link]
60
See id. at 8.

Today, however, technological progress is helping municipalities make a revolutionary shift. By mediating interactions between communities and their citizens, smart city technologies, apps, and
datasets are helping cities position themselves not as fixed, distant decision makers, but as vital,
central platforms that support the efforts of citizens, businesses, and other organizations to play a
direct role in community operations.61 Rather than relying on “separate islands of software that
don’t communicate,” cities are centralizing and interconnecting “all the digital functionality the
city needs to serve internal operating requirements and to engage with citizens.”62 In the process,
they are becoming massive data and computing platforms, controlling what can and cannot be
done with data, whether data will flow in and out of the city, and how privacy and security
protections will be embedded throughout its physical and digital infrastructure.
In an era in which “code is law,”63 municipalities should embrace their roles as digital
platforms and the opportunity to set norms and standards around privacy for emerging technologies. Smart cities sit at the convergence of every major technology trend: the Internet of Things, cloud computing, mobile connectivity, big data, crowdsourcing, artificial intelligence, algorithmic decision making, and more. Just as the Apple iTunes and the Google Play platforms
mediate interactions between consumers and apps,64 municipalities are creating platforms to
mediate interactions between citizens and the civic environment. Similarly to commercial
platforms, too, cities have an opportunity to write their own terms of service and to embed
privacy and security protections throughout their physical and digital infrastructures.
Given the political gridlock and the pace of technological advancement today, privacy policy is
seldom written by lawmakers in Washington, DC, or faraway state capitols, but rather is being
embedded into the deals that cities are striking with technology and analytics providers. Cities
already deploy sensor networks to monitor air pollution and reduce asthma rates; they support
smartphone apps to optimize bus routes (or 911 services, or snow removal, remedying potholes, or
any number of things); they develop facial recognition and algorithms to make policing more
efficient; they provide free (but often ad-supported) public Wi-Fi; they send drones to monitor road congestion; and they rely on citywide electric grids to self-report usage and maintenance needs.65
Many cities are already absorbing data from the urban environment (including existing infrastructure and systems, sensor networks, social media feeds, user-generated app data, and more) and then centralizing and repackaging it in accessible, usable interfaces for developers,
civic hackers, researchers, businesses, other cities, and citizens to take advantage of.66 Providing
a city’s many constituents with access to data from and about their lives promotes a more
engaged polity. Importantly, it also helps prepare citizens to consider their own data footprint
and how “the technologies that seamlessly connect individuals to their environments change
how they interact with the city and how the city interacts with the world at large.”67

61
See id.
62
See Barbara Thornton, City-As-A-Platform: Applying Platform Thinking to Cities, Platform Strategy (http://
[Link]/city-as-a-platform-applying-platform-thinking-to-cities/ (last visited Apr. 24, 2017).
63
Lawrence Lessig, Code Is Law: On Liberty in Cyberspace, Harvard Magazine (Jan. 1, 2000), [Link]
.com/2000/01/code-is-law-html.
64
See Adrian Fong, The Role of App Intermediaries in Protecting Data Privacy, 25 Int’l J. L. & Info Tech. 85 (2017),
doi: 10.1093/ijlit/eax002.
65
See Shedding Light on Smart City Privacy, The Future of Privacy Forum (Mar. 30, 2017), [Link]
smart-cities/.
66
See, e.g., Rob van der Meulen, Developing Open-Data Governance in Smart Cities, Gartner (June 21, 2016), https://
[Link]/smarterwithgartner/developing-open-data-governance-in-smart-cities/ (“CitySDK, a project by the
European Commission, and the Smart Nation API coLab from Singapore are two examples already in progress.”).
67
Matt Jones, The City Is a Battlesuit for Surviving the Future, Io9 (Sept. 20, 2009), [Link]
the-city-is-a-battlesuit-for-surviving-the-future.

Another significant, technology-driven aspect of this policy shift is that many of these efforts
rely on novel combinations of municipal data, consumer data, and corporate data. For example,
the City of Los Angeles entered into a data sharing partnership with Waze, the smartphone app
that tracks traffic in real time, in which “Data flows upward from motorists to Waze about traffic
accidents, police traps, potholes, etc., and the City shares with Waze its data about construction
projects, big events and other things that may affect traffic.”68 Where partnerships with private
companies are not established, local regulatory authorities may instead require data sharing by
law. In New York, San Francisco, and São Paulo, for example, local governments have revised
rules or brought bills requiring Uber to share granular data about individual trips for such
purposes as guiding urban planning, regulating driver fatigue, or reducing traffic congestion.69
Thus, smart cities will increasingly find themselves situated as intermediaries in an ecosystem
of complex data and analytics flows. Given cities’ unique regulatory, market, and social
positioning, they will become important gatekeepers, dictating when, how, and for what
purposes civic data may flow from one entity to another. What tools and considerations, then,
should smart cities take into account in their role as a platform?
Data mapping. Before cities can begin mediating the complex flows of municipal data to and
from individuals and other entities, city officials need to understand what data they are collecting
and how it is used throughout the entire smart city ecosystem. While this task can be daunting
for an entity with the size and complexity of a municipality, a central component of a successful
privacy and security program is knowing what data is collected, how sensitive it is, how
identifiable it is, where it is stored, who has access to it, when and how it might be disposed
of, and what it is or will be used for. When the city is acting as a platform for sharing data, the
complexity of mapping data increases, as do the consequences, should a city fail to understand
the legal, economic, and social impacts of citizens’ data spilling out without clear oversight.70
In particular, although these categories remain hotly debated, it is important to classify data as
personally identifiable, pseudonymous, or de-identified.71 Whether data can be used to identify or single out an individual within the city will have major legal implications and will determine under what conditions and for what purposes data may be used. This sort of identifiability classification is increasingly popular as part of open data schemas,72 even as cities should be aware that there is significant debate within privacy and data science communities over when and how to regard data as “de-identified” or “anonymous.”73
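A data map of this kind can be captured in a simple structured inventory. The sketch below is illustrative only: the field names, the example dataset, and the classification labels track the discussion above rather than any published municipal schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

class Identifiability(Enum):
    PERSONALLY_IDENTIFIABLE = "personally identifiable"
    PSEUDONYMOUS = "pseudonymous"
    DE_IDENTIFIED = "de-identified"

@dataclass
class DatasetEntry:
    name: str
    owning_department: str
    identifiability: Identifiability
    storage_location: str
    access_roles: List[str]
    retention_days: int
    permitted_purposes: List[str]

# Hypothetical inventory entry; a real data map would cover every system.
inventory = [
    DatasetEntry(
        name="transit_smartcard_trips",
        owning_department="Transportation",
        identifiability=Identifiability.PSEUDONYMOUS,
        storage_location="city data warehouse",
        access_roles=["transit-analytics"],
        retention_days=180,
        permitted_purposes=["service planning"],
    ),
]

# Which datasets need the strictest handling before any sharing?
pii_sets = [d.name for d in inventory
            if d.identifiability is Identifiability.PERSONALLY_IDENTIFIABLE]
```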
Urban datascapes raise difficult questions about data ownership. Who owns civic data: the
individual who generates the data? The technology system provider? The municipality that
contracted for the technology system? The public? Cities and those in privity with them need to
navigate these complex waters, while keeping abreast of related issues such as what capacity
parties have to secure data systems (e.g., a small public school may lack the expertise of a global
68
David Bollier, The City as Platform, [Link] (Feb. 19, 2016), [Link]
69
See Assembly Bill A6661, 2017–2018 (NY), [Link] Regulating Individ-
ual Transportation in Sao Paolo: What Is at Stake?, InternetLab (Jan. 12, 2016), [Link]
opinion/regulating-individual-transportation-in-sao-paulo-what-is-at-stake/; Joe Fitzgerald Rodriguez, SF Wants Access
to Uber and Lyft to Tackle Traffic Congestion, SF Examiner (Mar. 31, 2017), [Link]
access-uber-lyft-data-tackle-traffic-congestion/; Lauren Smith, NYC Taxi & Limousine Commission Proposal Requiring
Drop-Off Location Data Raises Privacy Concerns, Future of Privacy Forum (Dec. 30, 2016), [Link]
30/privacy-implications-collecting-hire-vehicle-drop-off-location-data/.
70
See Whittington et al., supra note 10.
71
See Jules Polonetsky, Omer Tene & Kelsey Finch, supra note 11.
72
See, e.g., Data Classification Policy, Office of the Chief Tech. Officer (Mar. 30, 2011), [Link]
default/files/dc/sites/octo/publication/attachments/DataClassifi[Link].
73
See Part III below (“Open Data”).

technology firm); who should be liable in the event of a data breach; what extra legal privacy or
security commitments entities have made; what data can or may cross territorial borders; and
under what circumstances a particular party might be compelled to turn data over to a third party
(e.g., a company holding citizen data may provide data to law enforcement subject to a
subpoena or warrant, and a municipality may provide it to an individual subject to a freedom
of information request).
Privacy notices. Every city, no matter how sizable its smart technology stores, should also
establish and make publicly available a comprehensive privacy policy. These policies help all
community stakeholders understand how and why data will be collected and used throughout
the city, encouraging accountability and building public trust. While the science of effective
disclosures continues to develop,74 privacy policies remain a foundational tool for businesses and
government organizations.
These public-facing policies should describe urban data practices, including, but not limited to, the following key provisions (a machine-readable sketch follows the list):

• How data is collected, stored, used, secured, shared, and disclosed


• For what purposes data is collected and used
• Which data sets are owned by which stakeholders, and what data rights and protections
accompany them
• Which data sets are private or require individuals’ consent before being used
• Which data sets can be shared with the city or with authorized third parties
• How de-identified data can be shared
• What options, if any, individuals have to access, correct, or request the deletion of their
personal data
• If personal data will be used for automated decision making, meaningful information about
the logic involved and the significance and envisaged consequences of such processing for
the individual
• How data holders will respond to law enforcement requests for data
• The identity and the contact details of the data controller
• Whether data will be used for research
• The period for which personal data will be stored
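As a rough illustration of the machine-readable sketch referenced above, the following example records a handful of these provisions in structured form; every field name and value is a hypothetical placeholder rather than a published schema or any real city's policy.

```python
import json

# Illustrative assumptions only: the point is that the provisions listed above
# can be captured in a structured, machine-readable form alongside the prose.
policy_summary = {
    "controller": {"name": "Example City Department of IT",
                   "contact": "privacy@example-city.gov"},
    "data_collected": ["pedestrian counts", "air quality", "Wi-Fi session metadata"],
    "purposes": ["traffic management", "environmental monitoring"],
    "retention_days": {"Wi-Fi session metadata": 60},
    "sharing": {"open_data": "de-identified only",
                "third_parties": ["regional transit agency"]},
    "individual_rights": ["access", "correction", "deletion request"],
    "automated_decision_making": None,
    "law_enforcement_requests": "reported in an annual transparency report",
}

print(json.dumps(policy_summary, indent=2))
```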
Cities can and should experiment with additional features such as layered notices, just-in-time
notifications, and explanatory illustrations and data visualizations, as well as ensure that privacy
policies are consistent, easy to understand, and accessible to all members of the public,
including individuals with disabilities. Determining where and how to place notices in public
spaces can be challenging. For example, in many places around the world, CCTV cameras are
accompanied by a printed card with the legal authority, a statement that the camera is in
operation, and contact details for obtaining additional details about the data processing.75 Once
cities begin expanding to distributed devices and sensors, however, notice will become even
more difficult: should citizens look for a license plate on a drone overhead? Should there be a
sign at the entrance to every subway station with mobile location analytics systems? Should cities

74
See, e.g., Lorrie Cranor, Reflections on the FTC’s Disclosure Evaluation Workshop, Fed. Trade Comm’n (Nov. 30,
2016), [Link]
75
See Data Protection and CCTV, Data Protection Comm’r (Ireland), [Link]
Protection-CCTV/m/[Link] (last visited Apr. 24, 2017).

provide an app that will pop up notices about active data collection?76 With a vast array of
devices and sensors hovering around the cityscape, the public sphere can quickly become
cluttered with an incomprehensible cacophony of privacy notices.
In addition to the challenge of balancing comprehensive disclosures against readable disclosures, smart city officials have sometimes struggled to draft disclosures that are temporally
appropriate. That is, sometimes privacy policies describe not just the city’s current technological
and data collection capabilities, but also those that it hopes to roll out in the future (sometimes
seeking to look years forward).77 While cities should be commended for thinking about and
making public the potential privacy impact of their new technologies and services suitably far in
advance, making such disclosures in a privacy policy, without additional discussion, can muddy the water.
Much like their corporate counterparts, city attorneys that are not sure precisely how a new
feature will work in practice often find themselves drafting privacy policies with the broadest
possible terms, providing flexibility for when the city does finally roll out an ambitious project.
Until the feature is available, however, such broad, permissive policies may give citizens,
communities, and consumer and privacy advocates (and anyone else who reads the privacy
policy)78 cause for concern. Confusion and mistrust are only likely to compound when city
officials (understandably) cannot describe any concrete privacy controls for as yet inactive
features.
This is not to say that cities should withhold information about prospective privacy-impacting
technologies or services; after all, constant updates to a privacy policy every time a new feature
comes online may also fail to satisfy (or adequately notify) citizens and advocates. Rather, cities
should aspire to publish privacy policies that are current and timely, but also to supplement
them with additional transparency mechanisms.
Transparency. In a smart city, privacy policies are necessary but not sufficient for informing
citizens about how data is collected and used. Municipalities must strive to be truly transparent
to the public they serve. This requires engaging with communities and stakeholders early and
often, seeking creative ways to alert citizens to data-driven activities, and increasing data literacy throughout the city’s population.
When cities literally act as data platforms, they gain additional leverage to promote transparency throughout the smart city data ecosystem. Through their terms of service or contractual
dealings, cities may condition other parties’ access to civic data on maintaining appropriate
transparency mechanisms. Consider, for example, the critical role played by the Apple iTunes
and Google Play platforms, which require apps to provide privacy policies and link to them from

76. See, e.g., Art. 29 Data Protection Working Party, Opinion 8/2014 on the Recent Developments on the Internet of Things (Sept. 16, 2014), [Link] 1088; Art. 29 Data Protection Working Party, Opinion 01/2015 on Privacy and Data Protection Issues relating to the Utilisation of Drones (June 16, 2015), [Link]ion-recommendation/files/2015/wp231 [Link]; Opening remarks of FTC Chairwoman Edith Ramirez, Privacy and the IoT: Navigating Policy Issues, Int’l. Consumer Electronics Show (Jan. 6, 2015), [Link]documents/public statements/617191/[Link].
77. For example, early versions of the privacy policy for LinkNYC, which offered free public Wi-Fi through sidewalk kiosks, contained language about camera and facial recognition for months before the cameras were turned on. See Brady Dale, Meet the Brave Souls Who Read LinkNYC’s Two Different Privacy Policies, Observer (July 28, 2016), [Link].
78. Cf. Aleecia McDonald & Lorrie Cranor, The Cost of Reading Privacy Policies, 4 I/S: A J. of L. and Policy for the Info. Soc’y 543 (2008), [Link].

within the platforms themselves.79 App platforms also tackle more sensitive data collection by
requiring and prompting users with just-in-time notifications about particular data uses, such as
location tracking or access to address book contacts.
Cities can also leverage their unique control over both physical and digital spaces to deploy
multifarious messages about how civic data is collected and used. City officials should proactively and preemptively assess in what manner to provide information about data collection and use for every initiative that involves citizens’ PII. Some uses may be appropriately disclosed via municipal publications or announcements, while other, more sensitive uses may require specific, on-site public disclosures or signage. The physical environment for cities’ connected
devices also provides creative opportunities, such as lights or noises, to indicate when data is
being collected. Some devices will be more obvious than others; for example, a streetlight
triggered by a motion detector likely does not need a more specific notification. A streetlight
passively sensing a nearby smartphone’s MAC signal, on the other hand, would raise a more
significant concern.80
Cities can also invest in digital literacy and education campaigns, to help citizens understand
and take advantage of technological offerings more generally while bridging the digital divide.
Purpose specification and data minimization. Finally, when designing systems to collect or
use personal data, smart cities should specify the purpose of data collection and ensure data
minimization to avoid collecting beyond what is necessary for those purposes. In situations where
notice and choice may not be practical, or where data may leave the city’s specific control and
enter an unregulated ecosystem, narrowing collection and use of personal data in accordance with
these principles will safeguard individuals’ privacy and bar indiscriminate surveillance. At the same
time, these principles should not be rigidly applied to hinder the ability of smart cities to improve
and develop innovative new services. Smart cities should consider looking to the federal Privacy Act of 1974 as a model for addressing the privacy impact of government databases.
City officials must also keep in mind their influence as custodians of a data and technology
intermediary, where data practices, permissions, and assumptions written into a city’s code
can become de facto laws.81 By controlling the pipeline of civic data and restricting the types
and classes of data going into and out of their data platforms (whether internal or public-facing), cities will have tremendous power to set privacy-protective standards, norms, and
technologies in place to enforce or, conversely, to undercut the principles of purpose
specification and data minimization. This requires cities not only to implement these
principles, but to enforce them and monitor data recipients and systems for compliance. At
the same time, however, cities should be cognizant of their own limitations: data that enters
city hands may be more susceptible to being made public via freedom of information requests
or open data mandates.

79. See, e.g., Google API Terms of Service, Google Developers (Dec. 5, 2014), [Link]; Terms and Conditions, Apple Developer, [Link] (last visited Apr. 24, 2017); FPF Mobile Apps Study, Future of Privacy Forum (Aug. 2016), [Link]Apps-Study fi[Link] (showing upward trend of privacy policies in app stores).
80. See, e.g., Mobile Location Analytics Opt-Out, Smart Places, [Link] (last visited Apr. 24, 2017); Wi-Fi Location Analytics, Info. Comm’r’s Office (U.K.) (Feb. 16, 2016), [Link][Link]; U.S. v. InMobi Pte Ltd., Case No.: 3:16-cv-3474 (N.D. Cal. 2016), [Link]files/documents/cases/[Link]; WiFi Tracking Technology in Shops and on Public Roads by Bluetrace: Investigation by the Dutch Data Protection Authority, Autoriteit Persoonsgegevens (Oct. 2015), [Link][Link]/sites/default/files/atoms/files/conclusions bluetrace [Link].
81. See Martjin de Waal, The City as an Interactive Platform, The Mobile City (Oct. 9, 2009), [Link]2009/10/09/593/.

Collecting data without an intent to use it in specific ways, or storing it after it has served its
purpose, is risky behavior for any organization, but cities especially hold the keys to highly
sensitive data, often from vulnerable populations.82 The New York City municipal ID program,
for example, was designed to help undocumented immigrants integrate into the city’s residential
fabric. Upon the election of President Trump and a policy shift by the federal government
towards actively tracking and deporting undocumented immigrants, New York City officials
have struggled over what to do to protect their database: challenge in court any attempt by the
federal government to access it? Destroy it?83 When cities choose to collect personal and sensitive data,
they must consider how that information could be reused by others. It is common privacy gospel
that if sensitive data cannot be adequately safeguarded, it should not be collected in the
first place.

city as government
Even the most technologically advanced city in the world84 is still ultimately a political entity,
accountable to the needs and desires of its constituents. Unlike private sector data stewards or
platforms, cities cannot pick and choose which populations to serve, and every misstep can have
huge and lasting impacts on the urban life of citizens. The technologists’ desire to “move fast
and break things” is dangerous when real lives and the public interest are at stake.
In their more traditional role as a local governmental entity, cities must navigate their
obligations to ensure the safe, efficient, and equitable administration of city services; to govern
transparently; and to protect the civil liberties of city residents. Often, these goals need to be
balanced against each other, as for example, transparency mandated by freedom of information
laws may run up against individuals’ privacy rights. Complicating this, cities must account for
the competing public values and preferences of their highly diverse constituents: some citizens
broadly support the use of body worn cameras by police for improving accountability and public
safety, for example, while others distrust and reject this measure for increasing government
surveillance of vulnerable populations.85
Furthermore, municipalities’ reliance on data-driven decision-making raises concerns that “technocratic governance” could supplant citizen-centered political processes.86 Municipalities that are overeager to deploy technologically oriented solutions may inadvertently find themselves prioritizing some citizens over others.87 The City of Boston, for example, developed a
smartphone app that would use the phone’s accelerometer and GPS data to automatically report

82. See, e.g., Deepti Hajela & Jennifer Peltz, New York City Could Destroy Immigrant ID Card Data After Donald Trump Win, The Denver Post (Nov. 15, 2016), [Link]card-data/.
83. See Liz Robbins, New York City Should Keep ID Data for Now, Judge Rules, N.Y. Times (Dec. 21, 2016), [Link].[Link]/2016/12/21/nyregion/[Link].
84. According to one recent study, Tokyo. IESE Cities in Motion Index 25 (2016), [Link][Link]/.
85. See Harvard Law Review, Considering Police Body Cameras, 128 Harv. L. Rev. 1794 (Apr. 10, 2015), [Link][Link]/2015/04/considering-police-body-cameras/.
86. See Rob Kitchin, The Real-Time City? Big Data and Smart Urbanism, 79 GeoJournal 1–14 (2014), [Link].[Link]/6e73/[Link].
87. See Jathan Sadowski & Frank Pasquale, The Spectrum of Control: A Social Theory of the Smart City, 20 First Monday (2015), [Link] (“To take some obvious examples: should new forms of surveillance focus first on drug busts, or evidence of white-collar crime, or unfair labor practices by employers? . . . Do the cameras and sensors in restaurants focus on preventing employee theft of food, stopping food poisoning, and/or catching safety violations?”).

potholes to the city’s Public Works Department as users drove around town.88 Before launching
the app, however, the city and the app developers realized that variances in smartphone
ownership could foster inequities in road improvement. The populations that were most likely to own a smartphone (the young and the wealthy) were at risk of diverting city services away from poor and elderly neighborhoods.89 Instead, the city modified its rollout plans, “first
handing the app out to city road inspectors, who service all parts of the city equally, relying on
the public for only additional supporting data.”90
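In rough outline, such an app watches the accelerometer stream for sharp vertical jolts and reports the GPS coordinates of each jolt to the city. The short Python sketch below illustrates the idea only; the threshold, field names, and coordinate rounding are illustrative assumptions rather than details of Boston’s actual Street Bump implementation.

```python
import json
from dataclasses import dataclass

# Hypothetical jolt threshold (in g) above which a road defect is suspected;
# a real system would calibrate this per device and filter out false positives.
JOLT_THRESHOLD_G = 1.5

@dataclass
class SensorSample:
    timestamp: float          # seconds since epoch
    vertical_accel_g: float   # vertical acceleration, in g
    lat: float
    lon: float

def detect_pothole_reports(samples):
    """Flag samples whose vertical jolt exceeds the threshold and
    package them as candidate pothole reports (time and location only)."""
    reports = []
    for s in samples:
        if abs(s.vertical_accel_g) >= JOLT_THRESHOLD_G:
            reports.append({
                "timestamp": s.timestamp,
                "lat": round(s.lat, 4),   # coarsen location to limit precision
                "lon": round(s.lon, 4),
            })
    return reports

if __name__ == "__main__":
    trip = [
        SensorSample(1_700_000_000.0, 0.3, 42.3601, -71.0589),
        SensorSample(1_700_000_001.0, 1.9, 42.3603, -71.0591),  # suspected pothole
    ]
    print(json.dumps(detect_pothole_reports(trip), indent=2))
```

Even this minimal payload (a timestamp and a location) can reveal travel patterns, which is one reason the equity and privacy review described above matters.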
Municipalities must ever be conscious of how the deployment of data collecting technologies
will shift the balance of power maintained between citizens and the city. What tools and
considerations, then, should smart cities take into account to protect individual privacy in their
role as local government?
Open data. Many federal, state, and municipal governments have committed to making their
data available to city partners, businesses, and citizens alike through Open Data projects and
portals.91 Open data efforts characterize themselves as providing the social, economic, and
democratic values that cities often seek to embody92: they are about “living up to the potential
of our information, about looking at comprehensive information management and making
determinations that fall in the public interest,” “unlock[ing] troves of valuable data that
taxpayers have already paid for,” and establishing “a system of transparency, public participation,
and collaboration.”93 As a practical matter, too, governments are uniquely situated to give back
to their communities due to the quantity and centrality of the government’s data collection, as
well as the fact that most government data is public data by law.94
In the spirit of civic innovation and reform, many cities are not only making their databases
public, they are increasingly doing so by default. The City of Louisville has a standing executive
order for all data to be open,95 for example, and the mayor and city council of the City of Palo
Alto have also recently decreed data to be open by default.96 The City of Seattle, which finds
itself attempting to balance a stringent public records law against a robust civic tech ethos, has
decreed that data will be made “open by preference.”97 Indeed, all levels of government are
encouraging open data: in 2013, “President Obama signed an executive order that made open
and machine readable data the new default for government information,”98 and the federal

88. See Exec. Office of the President, Big Risks, Big Opportunities: the Intersections of Big Data and Civil Rights (May 2016), [Link] 0504 data [Link]; John D. Sutter, Street Bump App Detects Potholes, Tells City Officials, CNN (Feb. 16, 2012), [Link]
89. See Kelsey Finch & Omer Tene, supra note 5, at 1604.
90. Id.
91. See Roberto Montano & Prianka Srinivasan, The GovLab Index: Open Data, GovLab (Oct. 6, 2016), [Link].org/govlab-index-on-open-data-2016-edition/.
92. See, e.g., Jane Jacobs, The Death and Life of Great American Cities (1961) (“Cities have the capability of providing something for everybody, only because, and only when, they are created by everybody.”).
93. See Open Government Initiative, The White House, [Link] (last visited Apr. 24, 2017); Why Open Data?, Open Data Handbook, [Link] (last visited Apr. 24, 2017).
94. See id.
95. Mayor of Louisville, Executive Order No. 1, Series 2013, An Executive Order Creating an Open Data Plan (Oct. 11, 2013), [Link].
96. City of Palo Alto, Proclamation of the Council Proclaiming the City of Palo Alto as Open [Data] by Default (Feb. 10, 2014), [Link].
97. Office of the Mayor, City of Seattle, Executive Order 2016–01 (Feb. 27, 2016), [Link]uploads/2016/02/[Link].
98. See Open Government Initiative, The White House, [Link] (last visited Apr. 24, 2017).

data.gov catalog contains 5,610 datasets from nineteen contributing city governments, 1,392 datasets
from seven county governments, and 9,619 datasets from twenty one state governments.99
City leaders must also carefully evaluate local public records laws100 to ensure that individuals’
personal data is not inadvertently made public by open programs. The breadth of any relevant
Freedom of Information Act or similar laws should also be considered in determining what
personal information a city can or should collect. While freedom of information laws uniformly
include exceptions to protect individuals from the “unwarranted invasion of personal privacy,”101
they predate the advent of big data and smart city technologies. Consequently, cities and
governments more broadly have struggled to adapt to the realities of modern de-identification
and reidentification science in determining what constitutes protected personal information. In
2013, for example, the New York Taxi and Limousine Commission collected “pickup and drop
off times, locations, fare and tip amounts, as well as anonymized (hashed) versions of the taxi’s
license and medallion numbers” for every taxi ride in the city.102 The data was obtained via a
freedom of information request and subsequently made public, at which point industrious data
scientists began reidentifying trips made by particular celebrities (including exact fare and
tipping data), as well as, more salaciously, detailing the travels of everyone who took a taxi to
or from Larry Flynt’s Hustler Club, “pinpointing certain individuals with a high probability.”103
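The weakness in that release is easy to demonstrate. Because medallion numbers come from a small, well-known format, hashing them offers little protection: an attacker can simply hash every possible value and look the released hashes up. The Python sketch below illustrates that dictionary attack under an approximated medallion format; it is a simplified, hypothetical reconstruction, not the analysts’ actual code.

```python
import hashlib
import itertools
import string

def md5_hex(value: str) -> str:
    return hashlib.md5(value.encode()).hexdigest()

def build_reverse_table():
    """Precompute hashes for one (approximate) medallion format, e.g. '5A32':
    digit, letter, digit, digit. The full keyspace is tiny, so this is fast."""
    table = {}
    for d1, letter, d2, d3 in itertools.product(
        string.digits, string.ascii_uppercase, string.digits, string.digits
    ):
        medallion = f"{d1}{letter}{d2}{d3}"
        table[md5_hex(medallion)] = medallion
    return table

if __name__ == "__main__":
    reverse = build_reverse_table()
    leaked_hash = md5_hex("5A32")       # stand-in for a hash from the released data
    print(reverse.get(leaked_hash))     # -> '5A32': the "anonymized" ID is recovered
```

The lesson is general: hashing an identifier drawn from a small, enumerable space pseudonymizes it in name only.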
Open and accessible public data benefits citizens by helping cities uphold their promises
towards efficient and transparent governance, but also poses a significant risk to individual
privacy. One of the greatest risks of opening government datasets to the public is the possibility
that individuals may be reidentified or singled out from those datasets, revealing data about them
that would otherwise not be public knowledge and could be embarrassing, damaging or even life-threatening.104 Recent advances in smart city technologies, reidentification science, data marketplaces, and big data analytics raise those reidentification risks.105
These concerns loom all the larger as open data efforts continue to mature, no longer simply
publishing historic data and statistics but increasingly making granular, searchable, real-time
data about the city’s and its citizens’ activities available to anyone in the world. Databases of
calls to emergency services for 911, or fire departments, or civil complaints about building
codes, restaurants, and even civil rights violations are all obvious risks for the leakage of
sensitive data. Data sets that are more bureaucratic may fail to raise the same privacy red flags,
while still leaving individuals just as exposed. In 2017, for example, a parent who was examining
expenditure files on the Chicago Public Schools website discovered that deep within the tens of
thousands of rows of vendor payment data were some 4,500 files that identified students with
Individualized Educational Programs, revealing in plain text the students’ names, identification

99. Data Catalog, [Link], [Link] (last visited Jan. 3, 2017).
100. See, e.g., State Freedom of Information Laws, Nat’l Freedom of Info. Coal., [Link]information-laws (last visited Apr. 24, 2017).
101. See, e.g., Exemption 6, Dep’t of Justice Guide to the Freedom of Info. Act, [Link]default/files/oip/legacy/2014/07/23/exemption6 [Link] (last visited Apr. 24, 2017).
102. Anthony Tockar, Riding with the Stars: Passenger Privacy in the NYC Taxicab Database, Neustar Research (Sept. 15, 2014), [Link]
103. Id.
104. Report of the Special Rapporteur on the Right to Privacy, Annex II, Human Rights Council, U.N. Doc. A/HRC/31/64 (May 8, 2016) (by Joseph A. Cannataci); Ben Green et al., Open Data Privacy Playbook (Feb. 2017), https://[Link]/publications/2017/02/opendataprivacyplaybook.
105. See, e.g., Arvind Narayanan & Edward Felten, No Silver Bullet: De Identification Still Doesn’t Work (July 9, 2014), [Link]

numbers, the type of special education services that were being provided for them, how much
those services cost, the names of therapists, and how often students met with the specialists.106
Governments and scholars have only recently begun to tackle the difficult question of
publishing and de-identifying record-level government data.107 In 2016, the National Institute of Standards and Technology released a guide to de-identifying government datasets,108 and de-identification expert Dr. Khaled El Emam published an “Open Data De-Identification Protocol.”109 The City of San Francisco also published the first iteration of an “Open Data Release Toolkit,” which walks city officials through the process of classifying data’s appropriateness for public output, identifying direct and indirect identifiers, applying de-identification techniques, and balancing the residual risks to individual privacy against the potential benefit and utility of the data.110 The City of Seattle is currently producing an “Open Data Risk Assessment,” which, in collaboration with a community advisory board and local academics, examines the city’s open
data program, organizational structure, and data handling practices and identifies privacy risks
and mitigation strategies.111
De-identification may be the single most difficult tool for cities to implement, and yet also one of the most important if data continues to be made open.112 In addition to risk assessments, cities should consider alternatives to the “release and forget” model that most open data portals use.113 Where possible, cities may condition access to data on the signing of a data use agreement (for example, prohibiting attempted reidentification, linking to other data, or redistribution of the data), or set up a data enclave where researchers can run queries on de-identified information
without ever acquiring it directly.114
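To make the de-identification step concrete, the sketch below shows one simple technique such a release toolkit might apply before publication: generalizing quasi-identifiers and suppressing any combination shared by fewer than k records (k-anonymity). The field names and the choice of k are assumptions for illustration, not the San Francisco toolkit’s actual procedure, and k-anonymity alone is not a complete safeguard.

```python
from collections import Counter

K = 5  # minimum group size required before release (illustrative choice)

def generalize(record):
    """Coarsen quasi-identifiers: keep only birth year and a 3-digit ZIP prefix."""
    return {
        "birth_year": record["birth_date"][:4],
        "zip3": record["zip"][:3],
        "service_type": record["service_type"],  # the statistic being published
    }

def k_anonymous_release(records, k=K):
    """Return only generalized rows whose quasi-identifier combination
    (birth_year, zip3) is shared by at least k records; suppress the rest."""
    generalized = [generalize(r) for r in records]
    counts = Counter((g["birth_year"], g["zip3"]) for g in generalized)
    return [g for g in generalized
            if counts[(g["birth_year"], g["zip3"])] >= k]

if __name__ == "__main__":
    sample = [{"birth_date": "2009-04-17", "zip": "60622", "service_type": "speech"}] * 6
    sample.append({"birth_date": "2010-01-02", "zip": "60601", "service_type": "OT"})  # unique: suppressed
    print(len(k_anonymous_release(sample)))  # -> 6
```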
Communications and engagement strategies. As smart city residents begin to interact with
new technologies and data driven services in their environment, misunderstandings around
what data is collected and fears about how it may be used could risk the viability of valuable
urban projects.
Smart city leaders should develop proactive strategies that anticipate potential privacy concerns and seek to address them in public. Materials that are easy for the public to access and
understand should be available from the outset of a program to explain the purposes and societal
benefits of using data in a particular way, as well as the range of safeguards available to mitigate
residual privacy risks. Citizens should be given opportunities to comment publicly on the

106. See Lauren FitzPatrick, CPS Privacy Breach Bared Confidential Student Information, Chicago Sun Times (Feb. 25, 2017), [Link] (further database details on file with authors); Cassie Creswell, How a Parent Discovered a Huge Breach by Chicago Public Schools of Private School Students with Special Needs, Parent Coal. for Student Privacy (Mar. 5, 2017), https://[Link]/how-a-parent-discovered-a-huge-breach-by-chicago-public-schools-of-private-school-students-with-special-needs/.
107. Cf. The U.S. Census Bureau, which has been producing aggregated statistics and engaging with cutting-edge statistical disclosure control science for decades. Statistical Disclosure Control, U.S. Census Bureau, [Link].[Link]/srd/sdc/ (last visited Apr. 24, 2017).
108. Simson Garfinkel, NIST Special Publication 800 188, 57 (2nd Draft): De Identifying Government Datasets (Dec. 2016), [Link] 188 [Link].
109. Khaled El Emam, A De-Identification Protocol for Open Data, Privacy Tech (May 16, 2016), [Link]a-de-identification-protocol-for-open-data/.
110. Erica Finkle, DataSF: Open Data Release Toolkit, [Link]NyNDA/view (last visited Apr. 24, 2016).
111. See Green et al., supra note 105, at 57.
112. See Garfinkel, supra note 108.
113. See Simson Garfinkel, NISTIR 8053: De Identification of Personal Information 14 (Oct. 2015), https://[Link]/nistpubs/ir/2015/[Link].
114. Id.

development and deployment of smart city technologies, particularly where data will be
collected in new or different ways. Where possible, cities should also consider including citizens
in the development process through user research, hackathons, and other participatory design
events, which will give them an opportunity for deeper and more collaborative engagement than
a public comment period alone. These responses, together with a proactive and responsive
communications strategy, can help explain the urban data strategy and alleviate public concerns.
One instructive example is the City of Chicago’s “Array of Things” project. In partnership
with the University of Chicago and the Argonne National Laboratory, the city wanted to deploy
“a network of interactive, modular sensor boxes around Chicago to collect real time data on the
city’s environment, infrastructure, and activity for research and public use.”115 Given the breadth
and novelty of this urban sensing network, concerns about privacy loomed large.116
In addition to taking technical measures to minimize any personally identifying data being
captured by the sensors, and instituting a variety of governance tools, the Array of Things also
developed a sophisticated civic engagement plan. Its goals were fourfold: to “educate Chicagoans about the Array of Things project, process, the potential of the research, and the sensors’
capacities; inform future generations of the Array of Things sensors; understand what the people
want out of the Internet of Things & these neighborhood data; and collect resident feedback on
privacy and governance policies for Array of Things.”117 The project team partnered with local
community organizations to engage and educate Chicagoans, provided several easily accessible
in-person and digital mechanisms for individuals to comment on its draft privacy policy, and
developed a curriculum for a local high school to educate students on the Array of Things,
developing their technology skills and engaging them with their city’s real time data flows.118
The result was a more sophisticated privacy policy, an engaged and informed populace, and a
positive model for cities around the world.
In contrast, the implications of failing to communicate effectively and in a timely manner can be stark, as demonstrated by the quick rise and even quicker demise of education technology vendor inBloom. Only a year after its public launch, the high-profile educational nonprofit inBloom folded in the face of sustained parent, press, and regulatory pressure about student privacy.119 inBloom, in fact, had more sophisticated privacy and security processes than many of the public
schools whose data it sought to warehouse “so that school officials and teachers could use it to
learn about their students and how to more effectively teach them and improve their performance in school.”120 Yet the organization largely failed to communicate proactively and respond to concerns about privacy, assuming that the value proposition of its data tools was self-evident. In doing so, it lost sight of the need to involve parents in the creation and implementation of the project and failed to prepare its partners (school districts and states) to talk about privacy and
new technologies at a time when student data analytics were new to many stakeholders
(including students, parents, and teachers). The results of failing to engage and communicate

115. What Is the Array of Things, Array of Things, [Link] (last visited Apr. 24, 2017).
116. See Amina Elahi, City Needs More Detail in Array of Things Privacy Policy, Experts Say, Chicago Tribune (June 20, 2016), [Link].html.
117. Array of Things Civic Engagement, Smart Chicago, [Link]of-things-civic-engagement/ (Apr. 24, 2017).
118. Id.
119. See Natasha Singer, InBloom Student Data Repository to Close, N.Y. Times (Apr. 21, 2014), [Link].com/2014/04/21/inbloom-student-data-repository-to-close/.
120. See Dan Solove, Why Did inBloom Die? A Hard Lesson About Education Privacy, LinkedIn (Apr. 29, 2014), https://[Link]/pulse/20140429042326-2259773-why-did-inbloom-die-a-hard-lesson-about-education-privacy.

with citizens were a $100 million failed project, a skeptical and distrusting populace, and a wave
of legislation permanently restricting the sharing of student data.121
Surveillance and individual control. While ubiquitous and persistent monitoring technologies are increasingly available to cities (including CCTV and body-worn cameras, stingrays, facial recognition, and automated license plate readers), the important goals of security and efficiency should not open the door to unlimited surveillance of urban residents.122 A recent
study by the Georgetown Law Center on Privacy and Technology suggests that half of all US
residents are in a police facial recognition database, but that the systems are typically unregulated and hidden from the public.123 Troublingly, the report notes that “of 52 agencies, only four (less than 10%) have a publicly available use policy. And only one agency, the San Diego Association of Governments, received legislative approval for its policy,” and that “only nine of 52 agencies (17%)” had an intent to log and audit officers’ face recognition searches for improper use.124 Further, the report underscores racial disparities built into the facial recognition
technologies, which both “include a disproportionate number of African Americans” and
“may be less accurate on black people.”125 The lack of transparency, lack of strict oversight, sensitivity of the data, and power imbalance inherent in surveillance programs significantly threaten civil liberties and undercut public trust in all other civic technology programs. The report, along with further testimony before the House Committee on Oversight and Government Reform, also highlights the enhanced risk to privacy and transparency when local government joins forces with federal agencies, in this case by allowing the FBI access to state
DMV photo databases through contractual memoranda with state governments.126
Wherever possible, cities should strive to give citizens detailed information and legitimate
choices about how their data is collected and used. In some situations, however, cities may be
faced with technologies and data services that make it impractical or infeasible to offer citizens
traditional notices or choices. If citizens could opt out of automated tolling enforcement, or
security cameras, or court records, important public safety and accountability goals could not be
met.127 If cities needed to inform citizens of every instance in which their smartphones’ mobile
identifiers were collected by a city-run Wi-Fi connection,128 individuals would constantly be
bombarded by information and grow unreceptive even to important notices. Nevertheless, cities
should sometimes be prepared to trade off perfect data for individual privacy, in order to build
trust. While citywide smart grids will not be as efficient without 100 percent participation, and
many citizens may be perfectly happy to share their utility information for lower costs overall,

121. See Brenda Leong & Amelia Vance, inBloom: Analyzing the Past to Navigate the Future, Data & Soc’y (Feb. 2, 2017), [Link]
122. See U.S. v. Jones, 132 S. Ct. 945 (2012) (J. Sotomayor, concurring).
123. Clare Garvie, Alvaro Bedoya & Jonathan Frankle, The Perpetual Line Up (Oct. 2016), [Link][Link]/.
124. Id.
125. Id.
126. See Law Enforcement’s Policies on Facial Recognition Technology: Hearing Before the H. Comm. on Oversight and Gov’t Reform, 115th Cong. (2016), [Link]nology/.
127. Paradoxically, in order to maintain such opt-outs the government would need to maintain a database of individuals who did not want to be tracked in order to effectuate those choices.
128. See, e.g., Steven Irvine, Wifi Data Trial Understanding London Underground Customer Journeys, Transport for London Digital Blog (Nov. 23, 2016), [Link]ground-customer-journeys/.

cities have nevertheless found ways to offer free and easy opt-outs for those citizens who do not
wish to participate.129
If cities cannot notify citizens about a specific data collection in advance or at the time of
collection, they should consider alternatives to safeguard individual privacy and bar indeterminate surveillance. If notice or choice is not provided in a particular instance, cities should:
• Conduct a privacy impact assessment and document in writing why notice or choice was not provided130 (and revisit the decision on a regular basis),
• Implement processes to aggregate or de-identify data as soon as possible131 (see the sketch following this list),
• Seek the input and approval of an independent ethical review board,
• Provide individuals with information about how their data was used within a reasonable period of time after it had been collected,132 and/or
• Minimize data to only what is necessary for a particular purpose.
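As one concrete illustration of the second item above, a city-run Wi-Fi or location-analytics system can avoid retaining raw device identifiers by hashing each MAC address with a salt that rotates daily and storing only aggregate counts. The Python sketch below is a minimal example under assumed parameters (daily salt rotation, count-only output); it does not describe any particular city’s deployment.

```python
import hashlib
import os
from collections import Counter
from datetime import date

# A fresh random salt per day: yesterday's pseudonyms cannot be linked to today's.
_DAILY_SALTS = {}

def _salt_for(day: date) -> bytes:
    if day not in _DAILY_SALTS:
        _DAILY_SALTS[day] = os.urandom(16)
    return _DAILY_SALTS[day]

def pseudonymize_mac(mac: str, day: date) -> str:
    """Replace a raw MAC address with a salted hash that is stable only within one day."""
    digest = hashlib.sha256(_salt_for(day) + mac.encode()).hexdigest()
    return digest[:16]  # truncate: enough to count unique devices, harder to reverse

def hourly_unique_device_count(sightings):
    """sightings: iterable of (hour, mac). Store only aggregate counts, never raw MACs."""
    per_hour = Counter()
    seen = set()
    today = date.today()
    for hour, mac in sightings:
        key = (hour, pseudonymize_mac(mac, today))
        if key not in seen:
            seen.add(key)
            per_hour[hour] += 1
    return dict(per_hour)

if __name__ == "__main__":
    data = [(9, "AA:BB:CC:11:22:33"), (9, "AA:BB:CC:11:22:33"), (9, "DE:AD:BE:EF:00:01")]
    print(hourly_unique_device_count(data))  # -> {9: 2}
```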
Indeed, given that citizens may have no reasonable alternatives to opt out of municipal information systems, smart cities should seek to minimize data collection, or otherwise restrict the use
and retention of personal data. As we have discussed previously, “one of the fundamental
principles of informational privacy is to prevent the creation of secret databases.”133
In addition to broader public access to open government datasets that provide aggregate data
on city and citizen activities, individual access rights are critical drivers for establishing trust and
support in smart city technologies. They can ensure that smart city surveillance is not adversarial
and secretive by empowering users to see for themselves what information has been collected
about them. Where cities rely on algorithms to make substantive decisions with individual
impacts, they should make efforts to reveal which databases they maintain and what criteria are
used in their decision making processes. If individuals cannot understand how and why civic
institutions use their data, individual access rights may ring hollow.
Equity, fairness, and antidiscrimination. City leaders increasingly rely on big data analytics
and algorithms to make cities, e-government, and public services faster, safer, and more efficient.

129. See, e.g., U.S. Energy Info. Admin., Smart Grid Legislative and Regulatory Policies and Case Studies (Dec. 2011), [Link] Cassarah Brown, States Get Smart: Encouraging and Regulating Smart Grid Technologies, Nat’l Conference of State Legislatures (July 2013), [Link] (listing states with legislative action creating smart grid opt-outs); Nancy King & Pernille Jessen, For Privacy’s Sake: Consumer Opt-Outs for Smart Meters, 30 Computer L. & Security Rev. 530 (2014), [Link]bitstream/handle/1957/55599/KingNancyBusinessForPrivacy’[Link]; jsessionid 9D15BA5022E662CA12B15F1FAC292B49?sequence 1.
130. For example, the FBI’s Privacy Impact Assessments often include this question: “Clear and conspicuous notice and the opportunity to consent to the collection and use of individuals’ information provides transparency and allows individuals to understand how their information will be handled. Describe how notice for the system was crafted with these principles in mind, or if notice is not provided, explain why not.” See, e.g., Privacy Impact Assessment for the FIRST (Firearms Information, Registration & Shooter Tracking) Application, Fed. Bureau of Investigation, Dep’t of Justice (July 2013), [Link]impact-assessments/first.
131. See, e.g., Info. Comm’r’s Office (U.K.), supra note 79 (recommending data from Wi-Fi analytics be aggregated or have identifiable elements removed as soon as possible); Autoriteit Persoonsgegevens, supra note 79 (suggesting that Wi-Fi tracking within shops would be “less intrusive if the personal data processed will be made anonymous as soon as possible, or at least within 24 hours.”).
132. See, e.g., Lukasz Olejnik, Switzerland’s New Surveillance Law, Security, Privacy, & Tech Inquiries (Sept. 26, 2016), [Link]
133. See Kelsey Finch & Omer Tene, supra note 5, at 1613 (“From its inception, information privacy law has been modeled to alleviate this concern, which arose in the Watergate period in the United States and the Communist era in Eastern Europe when secret databases were used to curtail individual freedoms.”).

It is important that smart cities also use these tools to make their environments fairer, and not
unfairly distribute resources or inadvertently discriminate against certain groups, including not
only historic minorities, but also any group of individuals with a smaller digital footprint, who
may otherwise be left out of data-driven analytics.134 City leaders will need to be particularly forward-thinking as they consider the societal impact of revolutionary new technologies that may
have conflicting impacts on different populations: automated vehicles, for example, promise to
bring new freedom and mobility to the elderly and people with disabilities, and to rededicate
urban spaces to people, rather than parking, but at the same time may eat up thousands of
driving jobs.135
As municipal governments begin to gain real-time awareness of the people and activities within the city, that information feeds into policies with “profound social, political and ethical effects: introducing new forms of social regulation, control and governance; extending surveillance and eroding privacy; and enabling predictive profiling, social sorting and behavioural nudging.”136 With increasingly robust data and analytics, cities will be more equipped than ever to “nudge” their citizens to modify their behavior through subtle, low-cost interventions, hopefully for the common good.137 For example, when the Center for Economic Opportunity in New York City implemented its $aveNYC initiative, it relied on behavioral economics to nudge low-income households to open savings accounts, tying the accounts to financial
incentives in the form of a 50 percent savings match, with results showing that “half of the
program’s participants reported no history of savings, 80% saved for at least one year to receive
the match and 75% continued to save thereafter.”138 City leaders must be careful to nudge
individual behavior in ethical ways, rather than in ways that will constrain individual behavior,
or profile and discriminate against a certain class of people.
As cities increasingly rely on data to automate their decision making, they must be careful to
think holistically about why and how data is being used: bad data can lead to bad policies, even
(or especially) in “smart” systems. Predictive policing and predictive sentencing, for example,
have repeatedly been undercut by studies revealing racial bias in both their inputs (historic arrest
and recidivism data, respectively) and their outputs, leading to institutional racial profiling.139
As we have discussed previously, big data and increased data flows may both exacerbate and
alleviate governmental discrimination, whether intentional or inadvertent. Given this, it is more
important than ever that cities engage citizens early and often in designing such systems, and

134. See Exec. Office of the President, Big Risks, Big Opportunities: the Intersections of Big Data and Civil Rights (May 2016), [Link] 0504 data discrim [Link]; John D. Sutter, Street Bump App Detects Potholes, Tells City Officials, CNN (Feb. 16, 2012), [Link]/2012/02/16/tech/street-bump-app-detects-potholes-tells-city-officials/[Link].
135. See Jackie Ashley, The Driverless Car Revolution Isn’t Just about Technology: It’s about Society Too, The Guardian (Jan. 1, 2017), [Link]lane.
136. See Rob Kitchin, Getting Smarter About Smart Cities: Improving Data Privacy and Data Security, Data Protection Unit, Department of the Taoiseach (Jan. 2016), [Link] 2016/Smart Cities Report January [Link].
137. Monika Glowacki, Nudging Cities: Innovating with Behavioral Science, Data Smart City Solutions (May 17, 2016), [Link] (“At the 15th Convening of the Project on Municipal Innovation, mayoral chiefs of staff and leaders in the field discussed how behavioral science can be used as a tool to improve public policy.”).
138. Id.
139. See Julia Angwin et al., Machine Bias, ProPublica (May 23, 2016), [Link]risk-assessments-in-criminal-sentencing; Kelsey Finch & Omer Tene, supra note 5, at 1602–1603.

ensure that individuals who may be adversely impacted are informed of the criteria used in the decision-making processes, if not necessarily of the raw data or code behind the determination.
Municipalities must be constantly vigilant to ensure they are serving all of their citizens,
however difficult it may be to strike a balance between smart city beneficiaries and smart city
casualties.

conclusion
As Jane Jacobs said half a century ago, “Cities have the capability of providing something for
everybody, only because, and only when, they are created by everybody.”140 The goal of local
governments, technology developers, and community organizations should be to empower and
engage citizens to ensure that the cities of the future are created by everybody. And while many
technological innovations are emerging first in urban spaces, they hold the potential to transform communities of all shapes and sizes.
Smart and data-driven technologies launch new conversations and new ways to converse between community leaders and community residents, creating room for the cultural growth and democratic impulses that have caused modern cities to flourish. Through open data programs, hackathons, participatory governance, and innovative community engagement processes, local governments are giving individuals new ways to interact with themselves, each other, and the world around them. When individuals have more control of their own data for their own purposes, a culture of data-driven decision-making, civic participation, and empowerment takes hold.
At the same time, if citizens do not trust that their data will be protected or do not see the
benefits of new technologies, they could begin to fear the smart city’s sensors and services as tools
of discipline and surveillance, rather than cherish them as vehicles for transparency and
innovation. City officials will need to learn how to make thoughtful decisions about providing
appropriate notices, choices, and security measures to protect citizens’ data, and to compete on
accountability and transparency as much as on technological advancement. They will need to
act as data stewards, diligently and faithfully protecting the personal data that the city and its
partners collect. They should embrace their roles as platforms for data and technology, setting
the bar high for privacy and security practices. And they must always strive to govern both their
citizens and their citizens’ data legally, fairly, and ethically.
If city leaders, technology providers, community organizations, and other stakeholders work
together to address core privacy issues and principles, they will be able to leverage the benefits of
a data-rich society while minimizing threats to individual privacy and civil liberties.

140. Jane Jacobs, The Death and Life of Great American Cities 238 (1961).
part iii

Ethical and Legal Reservations about Tracking Technologies


8

Americans and Marketplace Privacy

Seven Annenberg National Surveys in Perspective

Joseph Turow

The arrival of the commercial internet on a broad scale in the mid-1990s marked the beginning of new opportunities for advertisers and retailers to interact with and follow shoppers. Beforehand, advertisements were typically one-way phenomena. A person could read, hear or view a media-delivered commercial message, but the sending organization couldn’t immediately note
the audience members and their responses; nor could it reply to those individuals. To be sure,
the actual act of selling products was not devoid of immediate knowledge about purchasers or of
a marketer’s ability to present feedback to them. Store clerks have a tradition of working with
visitors and even in individual cases remembering their preferences. Moreover, catalog firms
have long been collecting data on the purchase patterns of customers and sending them catalogs
based on previous purchases. Too, chain retailers such as supermarkets, department stores, and
discount chains have since the 1980s been keeping tabs on the purchase patterns of repeat
customers by linking their loyalty cards to the goods’ barcode scans at checkout. Nevertheless,
most retailers didn’t explore their customers’ shopping habits in detail, and none followed them
individually through the aisles.
The commercial internet changed all that. Virtually from its start, it allowed an unprecedented level of interactivity and surveillance for both the advertising and retailing industries.
Retailers’ needs got early attention. It was for tracking the same person’s multiple purchases on
the same retailing website using a desktop browser that Netscape software experts created the
cookie in 1994.1 Advertising firms then quickly realized they could introduce cookies to sites on
which they bought commercial messages and use the cookies to recognize individuals (or at least
their devices) across different sites. Over the next twenty years, advertisers and retailers pushed
the boundaries of tracking technology beyond the desktop to the laptop, smartphone, tablet,
gaming console, television set, smart watch, and an accelerating number of devices worn by
individuals or in their homes. The aim was to gain immediate feedback about what audiences
and shoppers were doing on web browsers and in apps so they could learn about and target
messages to small segments of the population and even individuals in ways previously impossible
via earlier media. By the early 2010s, retailers and their technology partners had begun to
introduce ways to bring to physical stores the kinds of interactions and tracking that had become
common on the web and in apps.2

1. See Joseph Turow, The Daily You (New Haven: Yale University Press, 2011), p. 47.
2. See Joseph Turow, The Aisles Have Eyes (New Haven: Yale University Press, 2017), pp. 66–106.


Public advocates have worried about this commercial surveillance from its start. They have
complained to the government and the press that the activities are ethically problematic, are
clearly raising concerns among the citizenry, and would raise greater alarm if the companies
involved told the public openly what they are doing. Industry spokespeople have
responded that their key motivation for tracking is relevance. They reason that if people find
advertising and retailing messages relevant, they will be happy to receive them. Moreover,
marketers have claimed that people understand the trade-off of their data for the benefits of
relevant messages and discounts. They argue that young people, especially, don’t mind because
the upcoming generation takes such tracking for granted.
In the context of growing surveillance by advertisers and retailers, the aim of this chapter is to
interrogate their justifications based on the results of seven nationally representative telephone
surveys I have conducted with colleagues from 1999 through 2015.3 The purpose of the surveys
was to assess Americans’ understanding of their commercial internet environment as well as to
evaluate the claims by marketers that relevance and trade-offs compensate in people’s minds for
the tracking that takes place. In the following pages, I present findings that tend to refute
marketers’ justifications for increased personalized surveillance and targeting for commercial
purposes. I also argue that Americans resist data collection and personalization based on data
collection because they don’t think those activities are right, even when not confronted with immediate quantifiable harm resulting from them. I argue further that, contrary to the claim that
a majority of Americans consent to data collection because the commercial benefits are worth
the costs, our data support a quite different explanation: a large pool of Americans feel resigned
to the inevitability of surveillance and the power of marketers to harvest their data. When they shop, it merely appears that they are interested in trade-offs. The overall message of the surveys is that
legislators, regulators, and courts ought to rethink the traditional regulatory understanding of
harm in the face of a developing American marketplace that ignores the majority of Americans’
views and is making overarching tracking and surreptitious profiling a taken-for-granted aspect of
society.

the commercial surveillance environment


In the late 2010s, the advertising and retailing industries in the US are developing their abilities
to monitor, profile, and differentially interact with individuals in ways that extend far beyond
their capabilities during the early years of the cookie. Technology firms that didn’t exist a decade
ago are deeply involved in helping to create what many in the business call a new marketing
ecosystem that gives advertisers and merchants increased ability to discriminate among niche
population segments and individuals. The opportunity to target messages increasingly takes

3. Joseph Turow and Lilach Nir, The Internet and the Family 2000 (Philadelphia: Annenberg Public Policy Center, 2000); Joseph Turow, Americans and Online Privacy: The System Is Broken (Philadelphia: Annenberg Public Policy Center, 2003); Joseph Turow, Lauren Feldman, and Kimberly Meltzer, Open to Exploitation: American Shoppers Online and Offline (Philadelphia: Annenberg Public Policy Center, 2005); Joseph Turow, Jennifer King, Chris Jay Hoofnagle, and Michael Hennessy, Americans Reject Tailored Advertising (Philadelphia: Annenberg School for Communication, 2009); Joseph Turow, Michael X. Delli Carpini, Nora Draper, and Rowan Howard-Williams, Americans Roundly Reject Tailored Political Advertising (Philadelphia: Annenberg School of Communication, 2012); Joseph Turow, Michael Hennessy, and Nora Draper, The Tradeoff Fallacy (Philadelphia: Annenberg School of Communication, 2015). The first six reports are collected in Joseph Turow, Americans, Marketers, and the Internet, 1999–2012, [Link] id 2423753. The Tradeoff Fallacy is available at https://[Link]/sol3/[Link]?abstract id 2820060. Methodological and statistical specifics accompany every report.

place in so-called programmatic marketplaces where advertisers bid to reach individuals with
specific characteristics, often in real time, as they are entering websites or apps. In tandem with
the growth of new ways and places to reach people, the past few decades have seen the rise of a
data cornucopia that marketers can combine for all sorts of personalized offers and other
messages. Most programmatic domains claim to be able to match shopper names, e-mail
addresses, cookies, or other unique customer documentation with the individuals whom the
websites and apps are offering for targeting. They also tout the ability to find large numbers of
“lookalikes” who have the characteristics of those known individuals.
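At its core, the “lookalike” step is a propensity model: a classifier is trained to separate a seed audience of known customers from a background sample, everyone else is scored against it, and the highest-scoring profiles are kept. The sketch below is a generic, hypothetical illustration of that idea using scikit-learn; the features, sample sizes, and cutoff are assumptions, not any platform’s actual method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Assumed toy features per profile: [age, visits_per_week, past_purchases]
seed_audience = rng.normal(loc=[35, 6, 4], scale=[8, 2, 2], size=(200, 3))   # known customers
background = rng.normal(loc=[40, 2, 0.5], scale=[12, 2, 1], size=(2000, 3))  # random sample

X = np.vstack([seed_audience, background])
y = np.concatenate([np.ones(len(seed_audience)), np.zeros(len(background))])

# Fit a propensity model: P(profile resembles the seed audience)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Score a fresh pool of candidate profiles and keep the top slice as "lookalikes"
candidates = rng.normal(loc=[38, 4, 2], scale=[10, 2, 2], size=(5000, 3))
scores = model.predict_proba(candidates)[:, 1]
lookalikes = candidates[scores >= np.quantile(scores, 0.95)]  # top 5% most similar

print(f"{len(lookalikes)} candidate profiles selected as lookalikes")
```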
Marketers who want even more information about their customers, including information
that can potentially identify them on their sites and apps on different devices, can buy it from
data brokers such as Acxiom and Experian who at mid decade had put together their own ways
of tracking individuals across digital devices. Acxiom, for example, introduced an offline/online
cross device tracking system in 2013 that, it said, continually gathers information about approxi
mately 700 million identifiable individuals from three sources: Fortune 100 companies’ records
of people who “purchased something or signed up for a mailing list or some kind of offer”;
“every data attribute you could gather publicly about consumers from public records”; and its
own cross device cookie like system that can match the same individuals across a huge number
of digital properties and gives Acxiom “access to the last 30 days of behavior on more than one
billion consumers.” The Acxiom executive in charge reported, “For every consumer we have
more than 5,000 attributes of customer data.” Mediapost editor in chief Joe Mandese asked him
and the company’s chief executive about the possibly “creepy” nature of Acxiom’s aggressive
quantification of individuals. As Mandese paraphrased their answers, he saw the work as “just a
fact of modern consumer life, and all Acxiom is trying to do is make it more scientific so that the
data works the way it should, friction is taken out of the process for marketers and agencies, and
consumers at least get the most relevant offers and messages targeted at them.” The Acxiom
executive contended his firm can predict individuals’ future behaviors because it knows demo
graphic information about them, has actual offline and online purchase data about them, and
can follow what they do on different digital devices. “We know what your propensity is to buy a
handbag,” he said. “We know what your propensity is to go on vacation or use a loyalty card.”4
In addition to third-party database firms such as Acxiom that traffic in identifiable individuals,
marketers can turn to digital advertising networks that claim to bring together data about
individuals for marketers from a wide variety of places, find those people or people like them
and send persuasive messages to them on a wide gamut of devices. Google and Facebook do this
kind of work all the time. They follow individuals who “log in” to them on multiple devices. (Facebook requires personal identification to enter, while Google requires it for Gmail, Google+, and a few others of its services.) Individuals typically need to log in once, after which the identification and tracking are automatic. Based on these actions, the networks claim “deterministic” knowledge that they are locating the same person on their smartphone, tablet, laptop,
and desktop, for example.
But Google, Facebook and a relatively few other firms are distinctive in their ability to insist
that visitors identify themselves personally. Many networks, apps, and sites recognize people’s
unwillingness to take the time to register on every site or app that requires it. One solution is to
piggyback their logins on Google+, Facebook or Twitter identifications. Janrain is one firm that
facilitates the process by helping sites and apps exploit social network logins, in which people

4. Joe Mandese, “Supplier of the Year: Acxiom,” Media, January 8, 2014, [Link]216930/supplier-of-the-year-acxiom-whos-on-fi[Link], accessed September 16, 2016.

register and sign in to the location with their passwords from one of the social networks to which
they belong. Janrain claims that “more than half of people” worldwide do it, with 81 percent of
them going through Facebook or Google+. One cost to people using the technique is that the
social network logging them in learns where they are on the internet. Another is that “social
login identity providers” such as Facebook and Google offer the site or app owners the ability to
learn each person’s name as well as “select specific pieces of customer data that go beyond the
basic name and verified email address to include elements like a customer’s birthday, Likes,
relationship status, photos, and friends’ lists, to name a few.” Moreover, the social network
updates the data every time the person returns through the login. “Your customers simply select
their preferred social identity from Facebook, Twitter, or other networks,” Janrain notes, “and
choose whether to share some of their information with you.” Logging into the retailer on every
device with the same social login gives the person a persistent identity that can be helpful in
noting the person’s interactions with the retailer across different devices.5
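Mechanically, these social logins are typically OAuth 2.0 flows in which the site asks the identity provider for named permission “scopes.” The sketch below builds an illustrative authorization URL that requests only basic profile and e-mail scopes; the client ID, redirect URI, endpoint, and scope names are placeholders, and a real integration would follow the provider’s current documentation.

```python
import secrets
from urllib.parse import urlencode

# Placeholder values: a real site registers its own with the chosen identity provider.
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://retailer.example/login/callback"
AUTH_ENDPOINT = "https://identity-provider.example/oauth2/authorize"

def build_login_url(scopes=("openid", "email", "public_profile")):
    """Construct an OAuth 2.0 authorization URL that asks only for the
    minimal scopes needed to recognize a returning customer."""
    state = secrets.token_urlsafe(16)  # CSRF protection; verify it on the callback
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": " ".join(scopes),
        "state": state,
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}", state

if __name__ == "__main__":
    url, state = build_login_url()
    print(url)  # the user is sent here; broader data (likes, friends) would require extra scopes
```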
A drawback to the site or app using this approach is that in recent years privacy advocates have
convinced Facebook, Google, and other social platforms to tell their users what kinds of data
they transfer when they log in with their social accounts. Calls for greater control over what gets
sent led Facebook in 2015 to allow individuals to check off the specific types of data they don’t
feel comfortable sharing when logging into a site or app. A Janrain report noted, “Some
consumers are discouraged from using social login when the personal data requested seems
excessive for or irrelevant to the intended transaction.” The best way to allay customer concerns,
it said, is by asking the social media site to transfer only the data that will allow the retailer to
recognize the individual. Once that basic material has crossed the threshold and allowed
persistent identity, the data floodgates would be open. At that point, Janrain advised, “build a
supplemental strategy to collect everything else.”6
Janrain listed nine ways to collect “everything else” when cross-device identities are persistent. E-mail service providers can provide the site with each individual’s forwarding behavior, e-mail frequency preferences, newsletter opt-in, stated preferences, and inferred preferences.
Content management systems can provide the person’s content viewed, shared, and saved.
Contest platforms can inform the marketer of sharing behavior. Games can point out referrals
through sharing. E-commerce platforms can list products purchased, frequency of purchase, recentness of purchase, average order amount, discount code used, and the reviews the individual posted. Chat and comment possibilities on a website can shed light on an individual’s sentiment and frequency of participation. Firms that provide “social listening” services can report on an individual’s follows and un-follows on Twitter, the person’s sentiments on Facebook, and the person’s views and comments on YouTube. Ratings and review platforms can enlighten a marketer about an individual’s sentiments regarding particular products, services, and companies. And on-site behavioral analytics can yield a person’s
frequency of visits, time spent, content viewed, abandonment rate, and the site from which
the person arrived to that site.7

5. “Identity-Driven Marketing: Best Practices for Marketing Continuity,” Janrain, 2014, p. 6, [Link]resources/white-papers/identity-driven-marketing-best-practices-for-marketing-continuity/, accessed September 26, 2016; [Link], accessed September 26, 2016.
6. “Marketing Continuity,” Janrain, May 1, 2014, p. 15, [Link] accessed September 26, 2016.
7. “Identity-Driven Marketing: Best Practices for Marketing Continuity,” Janrain, 2014, p. 6, [Link]resources/white-papers/identity-driven-marketing-best-practices-for-marketing-continuity/, accessed September 26, 2016.

Not all targeting that takes place involves known individuals. The continual matching of people and their characteristics for the purpose of commercial messaging takes place in an odd world where the individuals traded across sites, apps, and devices might be personally identified or anonymous. Some marketers, sensitive to being accused of trading their customers but wanting to make money from the information, will get around the issue by insisting that the companies buying the data scrub the personally identifiable information from the records. Although the database companies that buy those data do not have the deterministic knowledge of an individual that a Google or Facebook has, they claim that by using advanced statistics they can indeed identify and follow the same person across sites, apps, and devices.
Conversant, a subsidiary of the big database marketing firm Experian, is one firm that revels in such statistical inferences. Its website states “We accurately recognize consumers like no one else,” because of its use of “an unprecedented amount of data, including anonymized transactional [store] data (both online and off) from more than 4,000 brands.” Because it has so many data points on every individual (including data about their devices), Conversant claims a 96 percent accuracy rate in statistically determining that a person it is following in one domain is the same as the person in another. “We track over 1 million online actions per second to build each profile across more than 7,000 dimensions including web browsing, app usage, video plays, email activity, cross-screen engagement, life events, hobbies, ad interactions and product interests.” In Conversant’s constantly updating world, marketers can find individuals who do or could use their products and target them with ads personalized to their profiles and delivered “across 3.3 million websites . . . and on 173,000+ mobile apps.” The company adds that “our reach is on par with Google. And our competitors can’t come close to our scale or persistence of recognition.”8
When anonymity is touted, left unstated is that in many cases the lack of personally identifiable information such as name and address may not matter for marketers’ targeting purposes. Because people lead so much of their lives on Internet spaces, for many advertisers it is enough that Facebook, Conversant, and many other audience suppliers can deliver the targets they need as they interact with websites or apps on a variety of digital devices, and lead them to respond to the advertisements sent their way. It is common, moreover, for marketers to match anonymous ID tags (such as cookies) with providers such as BlueKai that match them with other tags, add more information about the individuals to the original cookies, and return the enriched cookies to the marketers. The marketer thereby learns a growing amount about the individual and has a broad ability to reach that person despite not knowing the name and address. Marketing executives realize, too, that commercial anonymity online need not at all be permanent. It is quite possible to cajole people who are anonymous to reveal their identities. One way is to send an ad encouraging the person to sign up for a sweepstakes or free report using name and e-mail address. Another is to match a cookie or other ID of the anonymous person with the cookie for that person at a website or app that knows the person’s identity.
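
The cookie-matching step can be pictured as a small piece of server logic. The sketch below is conceptual rather than a description of any particular vendor’s system: the endpoint name, parameters, and in-memory table are hypothetical stand-ins for what, in practice, are pixel servers and large ID-mapping datastores.

    # Hypothetical cookie-matching endpoint. A marketer's pixel calls it with the
    # marketer's own cookie ID; the data provider reads its cookie for the same
    # browser and records the linkage, so each party can later enrich and
    # re-identify that browser. Names and storage are illustrative only.
    from flask import Flask, request, redirect

    app = Flask(__name__)
    id_map = {}  # marketer cookie ID -> provider cookie ID (a datastore in practice)

    @app.route("/cookie-sync")
    def cookie_sync():
        marketer_id = request.args.get("marketer_id")
        provider_id = request.cookies.get("provider_uid", "new-provider-uid")
        if marketer_id:
            id_map[marketer_id] = provider_id
        # Redirect back so the marketer can store the provider's ID on its side too.
        return redirect(f"https://marketer.example/sync?provider_id={provider_id}")

Each round trip of this kind adds another row to the mapping, which is why anonymous profiles grow richer, and easier to re-identify, over time.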
The armamentarium of techniques to track and identify people with the aim of sending them
digital messages that convince them to buy continues to grow. Over the past half-decade many retailers have been increasingly involved in bringing the process into their physical stores. Although most also sell online, the merchants are aware that the great majority (in the US, about 90 percent) of retail sales still take place in the brick-and-mortar domain. They are
consequently accelerating their ability to exploit tracking and data as people move with their

8
Matt Martella, “Making True 1:1 Marketing Happen, at Scale,” Conversant, December 1, 2015, pp. 5–6, [Link]/insights/making-true-11-marketing-happen-scale, accessed September 7, 2016.

smartphones from their homes through out-of-home places and into the retailers’ aisles. The
technologies to do that include cellular, GPS, WiFi, Bluetooth Low Energy, sound impulses
that interact with the phone’s microphone, light systems that work with the phone’s camera, and
indoor mapping systems that work with the phone’s accelerometer. These tools are typically used
to determine a device’s indoor and outdoor locations with the aim of sending a shopper messages
about the retailer’s goods, or personalized discount coupons. One way retailers can reach
potential customers is to bid on reaching them when they are near their stores; programmatic
marketplaces auction the outdoor locations of devices along with certain characteristics of their
owners in a process called geofencing. When known customers who use certain apps walk
through the stores, merchants can detect them in front of certain products and persuade them
with personalized messages and discriminatory price discounts based on the store’s profile of them and its evaluation of their importance.
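
Geofencing itself reduces to a distance test on the device’s reported coordinates. The following sketch, with made-up coordinates and radius, shows the kind of check a location marketplace might run before auctioning an impression to a nearby retailer.

    # Simplified geofence check; coordinates and the 200-meter radius are invented.
    from math import radians, sin, cos, asin, sqrt

    EARTH_RADIUS_M = 6_371_000

    def distance_m(lat1, lon1, lat2, lon2):
        """Great-circle (haversine) distance between two points, in meters."""
        dphi = radians(lat2 - lat1)
        dlmb = radians(lon2 - lon1)
        a = sin(dphi / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlmb / 2) ** 2
        return 2 * EARTH_RADIUS_M * asin(sqrt(a))

    def inside_geofence(device, store, radius_m=200):
        """True if the device's reported location falls inside the store's fence."""
        return distance_m(device["lat"], device["lon"], store["lat"], store["lon"]) <= radius_m

    store = {"lat": 40.7527, "lon": -73.9772}
    device = {"lat": 40.7536, "lon": -73.9780}
    print(inside_geofence(device, store))  # True: eligible for a proximity-targeted ad bid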
It should be clear that none of these activities is straightforward; the process of tracking,
profiling, and targeting is fraught with challenges for the entire digital ecosystem. Many in-store recognition systems, for example, depend on the shoppers’ downloading of an app (not necessarily the retailer’s app) that can interact with the location technologies in the store. (Facial recognition and other biometric identification systems exist but have so far not been distributed widely.) Apart from the need for shoppers’ participation in some in-store tracking activities, there are the difficulties that the technology firms helping advertisers reach people with messages on websites and apps have with click fraud and ad blocking. Click fraud takes place in pay-per-click
advertising when the owners of digital locales that post ads are paid an amount of money based
on the number of visitors to those domains who click on the ads. An unscrupulous site or app
owner will pay people or create automated scripts to click on ads simply to accumulate money
deceptively.9 Ad blocking is the act of using a type of software (less commonly, computer
hardware) to remove advertising content from a webpage or app.10 While industry players debate
the specific reasons for these activities, they agree they cause substantial economic losses.11
A report from Distil Networks estimated that in 2015, for every $3 spent on advertising, $1 went to ad fraud, costing the industry about $18.5 billion annually.12 As for ad blocking, one
report estimated that around a quarter of Americans and Europeans block ads. A study from
Adobe and PageFair concluded that the number of people engaged in the activity worldwide
rocketed from roughly 21 million in 2009 to 198 million in 2015, with $21.8 billion in global ad
revenues stymied in the first six months of 2015.13
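
Reading those figures together gives a rough sense of scale. The short calculation below simply works backward from the report’s stated ratio and loss estimate; the implied total spend is an inference, not a number the report gives.

    # Back-of-the-envelope arithmetic from the figures cited above; the implied
    # total ad spend is derived from the ratio, not reported directly.
    fraud_loss_annual = 18.5e9   # reported annual loss to ad fraud, in dollars
    fraud_share = 1 / 3          # "for every $3 spent on advertising, $1 went to ad fraud"

    implied_total_spend = fraud_loss_annual / fraud_share
    print(f"Implied ad spend covered by the estimate: ${implied_total_spend / 1e9:.1f} billion")
    # -> roughly $55 billion, of which about one-third is lost to fraud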

the rhetoric of trade-offs and relevance


These developments greatly worry US marketing organizations such as the Interactive Advertising Bureau, and they have mobilized to address them.14 At the same time, the business’ leaders

9
[Link] fraud, accessed September 13, 2016.
10
[Link] blocking, accessed September 13, 2016.
11
George Slefo, “Report: For Every $3 Spent on Digital Ads, Fraud Takes $1,” Advertising Age, October 22, 2015, http://
[Link]/article/digital/ad-fraud-eating-digital-advertising-revenue/301017/, accessed September 13, 2016.
12
Greg Sterling, “Ad-Blocking Report,” Marketing Land, August 10, 2015, [Link]
nearly-200-million-users-22-billion-in-lost-ad-revenue-138051, accessed September 13, 2016.
13
Greg Sterling, “Ad-Blocking Report,” Marketing Land, August 10, 2015, [Link]
nearly-200-million-users-22-billion-in-lost-ad-revenue-138051, accessed September 13, 2016.
14
Ginny Marvin, “Google, Facebook, IAB & Major Brands Form Coalition for Better Ads,” MarketingLand, September
16, 2016, [Link], accessed September 16, 2016.

are confronting public advocates and policymakers who decry the very elements of digital targeting the executives see as succeeding: the tracking of actual individuals, the digital dossiers marketers create about them based on various forms of tracking, and the personalized messages that circulate daily to hundreds of millions of people based on ideas about them that those individuals don’t know about and might not even approve of. Marketing, retailing, database, and technology executives and organizations such as the Interactive Advertising Bureau that represent some of their firms insist the public accepts their activities even as they acknowledge some Americans feel discomfort regarding the data firms gather about them.15
Central to their arguments is the privacy paradox. It’s the idea that, as New York Times reporter
Brad Stone put it, “normally sane people have inconsistent and contradictory impulses and
opinions when it comes to their safeguarding their own private information.”16 The Accenture consulting firm highlighted this phenomenon in a March 2015 web survey of 1,000 US adults. The firm found that while “nearly 60% of consumers want real-time promotions and offers,” only 20 percent want retailers to “know their current location” so the retailers could tailor those offers.17 The McCann Worldwide advertising network’s Truth Central project underscored the same contradiction from “a global research study surveying over 10,000 people in eleven countries,” including the United States. Not breaking down the results by country or detailing the survey method, McCann stated that while “71% worry about the amount online stores know about them, 65% are willing to share their data as long as they understand the benefits for them.”18 Such seeming contradictions have led firms, including McCann and Accenture, to argue that what people do should trump what they say when it comes to marketers’ uses of their data. The editor for mCommerceDaily interpreted the findings to mean “the tracking of consumers all comes down to the trade-off in value.”19 Along the same lines, the president and chief strategy officer of Mobiquity, a mobile strategy consultancy, wrote in 2012 that “the average person is more than willing to share their information with companies if these organizations see the overall gain for end users as a goal, not just for themselves.”20 A May 2014 report by Yahoo Advertising followed this logic in interpreting its survey of “6,000 respondents ages 13–64, a representative sample of the U.S. online population.” It highlighted the finding, “Roughly two-thirds of consumers find it acceptable or are neutral to marketers using online behavior or information to craft better ads.” Digitally connected Americans, the study concluded, “demonstrate a willingness to share information, as more consumers begin to recognize the value and self-benefit of allowing advertisers to use their data in the right way.”21 Industry executives further argue that young adults (some specifically say Millennials, the huge population of Americans

15
Pam Baker, “Shoppers OK with Online Tracking, Not So Much With In-Store Tracking,” FierceRetailIT, July 15,
2013, [Link] accessed
September 26, 2016.
16
Brad Stone, “Our Paradoxical Attitudes toward Privacy,” New York Times, July 2, 2008, [Link]
2008/07/02/our-paradoxical-attitudes-towards-privacy/, accessed April 6, 2015.
17
“US Consumers Want More Personalized Retail Experience and Control Over Personal Information, Accenture Survey
Shows,” Accenture, March 9, 2015, [Link]
[Link], accessed December 17, 2017.
18
“The Truth About Shopping,” McCann Truth Central, August 20, 2014, [Link]
uploads/2014/09/McCann Truth About Shopping [Link], accessed May 8, 2015.
19
Chuck Martin, “What the Shopper Gets Out of Being Tracked,” mCommerceDaily, May 28, 2014, [Link]
.[Link]/publications/article/226734/[Link], accessed May 8, 2015.
20
Scott Snyder, “Mobile Devices: Facing the ‘Privacy Vs. Benefit’ Trade-Off,” Forbes, August 3, 2012, [Link]
.com/sites/ciocentral/2012/08/03/mobile-devices-facing-the-privacy-vs-benefit-trade-off/, accessed May 8, 2015.
21
“The Balancing Act: Getting Personalization Right,” Yahoo! Advertising, May 2014, p. 11, [Link]
.com/Insights/[Link], accessed May 8, 2015.

born from 1980 through 1996) are much more likely to be comfortable with data capture and the notion of trade-offs.
Marketers also insist relevance is an argument for carrying out tracking and targeting despite
the audience’s professed discomfort. Virtually all search engines, social media, and retailers use
the term to justify to the public their digital tracking and targeting activities. Google’s “privacy
policy” states right up front, for example, that “When you share information with us, for example
by creating a Google Account, we can make those services even better to show you more
relevant search results and ads, to help you connect with people or to make sharing with others
quicker and easier.”22 Similarly, Facebook’s “data policy” states that “We use the information we
have [about you] to improve our advertising and measurement systems so we can show you
relevant ads on and off our Services and measure the effectiveness and reach of ads and
services.”23
Relevance is also part of the rhetoric that firms within the tracking and targeting ecosystem
use to increase the data exchange business. A contention going back to the dawn of the
commercial web is that the best way to engage possible customers with a brand is to follow
them with messages that are tailored to be as close as possible to their behaviors, interests,
backgrounds, and relationships.24 An Acxiom brochure tells potential clients to “Imagine being
able to match your customers at various life stages and across touch points to be an accurate,
comprehensive view.” It adds, “Now that you are marketing to them in personal and relevant
ways they are more willing to offer their trust and loyalty to your brand.”25 Carrying this idea
forward, a report from the Forrester Research consultancy urges marketers to link relevance to
trade-offs. “Provide services that customers find useful,” the report exhorted, “and get back data on product use and customer affinities.” Extending the idea, the managing director of Mobile at the Mindshare agency suggested that his clients persuade their audiences to give up their data through use of wearable technologies (e.g., digital watches and exercise bands) by asserting that they will find the data as relevant as the marketers do. “The Truth will be present in everything,” was the trade-off rhetoric he said they should use with their targets. “You’ll know everything about yourself and your loved ones if you opt in.”26

what the public knows and thinks of tracking, relevance and trade-offs

This image of a powerful consumer making rational decisions in the face of considered concerns
has become a way for marketers and the publishers who serve them to claim to policymakers and
the media that Americans accept widespread tracking of their backgrounds, behaviors, and
lifestyles across devices. Industry actors, moreover, cite studies they say indicate that Americans, while wary, are nevertheless quite willing in actual circumstances to countenance robust amounts of tracking, data storage, and profiling, in trade for messages and benefits personally

22
“Privacy Policy,” Google, August 29, 2016, [Link] accessed September 16, 2016.
23
“Data Policy,” Facebook, undated, [Link] accessed September 16, 2016.
24
See Joseph Turow, The Daily You (New Haven: Yale University Press, 2011), Chapter 2.
25
“Welcome to a World of Marketing to One,” Acxiom, 2016, [Link]
BRO-AbiliTec [Link], accessed September 16, 2016. See also Adobe, “New Day for TV Advertising,” paid (TBrand-
Studio) post in New York Times, 2016, [Link] nyt
2016-august-nytnative morein-adobe-0824–0831?module MoreInSection&version PaidPostDriver&region Footer&
pgType article&action click. accessed December 17, 2017.
26
“The Wearable Future,” PricewaterhouseCoopers US, [Link]
[Link], p. 42, accessed September 26, 2017.

relevant to them. Unfortunately the specific questions and methods of these studies often don’t
get reported with the findings, so it’s hard to evaluate the results carefully. Sometimes the
respondents are volunteers responding to web ads, and their comments have no statistical
relationship to the population as a whole. Connected to this difficulty with the reported research is that the respondents are recruited online. It is quite possible that volunteers who fill out online surveys are more comfortable with giving up data than is the population at large. To cap it off, some of the survey results are inconsistent, and even marketing executives are sometimes loath to fully champion the trade-off view. An Accenture executive interpreted his company’s survey to mean: “If retailers approach and market personalization as a value exchange, and are transparent in how the data will be used, consumers will likely be more willing to engage and trade their personal data.”27 The Bain consultancy was even more cautious about its results, saying “customers’ trust cannot be bought by companies offering compensation in exchange for selling or sharing personal data.”
So what do Americans know and what are their attitudes when it comes to commercial
tracking, trade offs, and personalization? Over the years, valuable noncommercial research has
touched on parts of this question. The focus here will be on the Annenberg National Internet
Surveys, conducted seven times from 1999 through 2015, because they have uniquely drilled
down into key issues relating to Internet commercialism. Together, they yield a wide as well as
deep view of Americans’ understanding of and attitudes toward the digital marketing world.
I created the surveys with the help of a team at the school and, at times, colleagues at other academic venues, notably Berkeley Law School. Major polling firms Roper, ICR, and Princeton Research Associates International asked the questions in twenty-minute interviews with random, nationally representative samples of (typically) 1,500 Americans, eighteen years and older. The survey firms reached respondents by landline phone in the early studies and, in recent years, by a combination of cell phones and landlines to account for the rise of mobile-phone-only households. We asked a number of questions across more than one survey to assess the questions’ reliability and, sometimes, the extent of changing knowledge and attitudes. In our reports we present the specific questions with the results.28
The answers are consistent, possibly surprising, and may contradict marketers’ key contentions. Below I present an overview of key interconnected findings from various surveys, leading up to data that challenge the pro-marketing implications of personalized relevance and trade-offs.
1. People are nervous / concerned / disagreeable about the data marketers hold about them and
their family members.

As Table 8.1 indicates, in the early surveys we asked about general nervousness and concerns.
Later surveys probed the extent of respondents’ agreement or anger regarding specific activities
by “companies” and “stores” as well as by Internet firms such as Facebook and Google. The
findings are quite clear and consistent across the decade and a half. In high percentages, people
acknowledge being nervous about firms having information about them, especially when the
firms retrieve it online. People believe that what companies know about them can hurt them
and find it unacceptable if a store charges a price based on what it knows about them, if an

27
“US Consumers Want More Personalized Retail Experience and Control over Personal Information, Accenture
Survey Shows,” Accenture, March 9, 2015, [Link]
[Link], accessed April 26, 2015.
28
See Joseph Turow et al., “Americans, Marketers, and the Internet, 1999–2012,” Annenberg School for Communi-
cation report, 2014, [Link] id 2423753, accessed November 29, 2015.

table 8.1 Percentage of Americans agreeing or agreeing strongly to statements about the data marketers hold about them and their family members

1999: The internet is a safe place for my children to spend time: 26%
2000: I am nervous about websites having information about me: 73%
2000: I worry more about what information a teenager would give away to a website than a younger child under 13 would: 61%
2003: I am nervous about websites having information about me: 70%
2003: I would worry more about what information a teenager would give away to a web site than a younger child under 13 would: 58%
2005: I am nervous about websites having information about me: 79%
2005: I am more concerned about giving away sensitive information online than about giving away sensitive information any other way: 65%
2005: It would bother me if websites I shop at keep detailed records of my buying behavior: 57%
2005: What companies know about me won’t hurt me: 17%
2005: It’s OK if a store charges me a price based on what it knows about me: 8%
2009: It is ok for internet mapping websites, such as MapQuest or Google Maps, to include a photo of my home on an online map: 40%
2012: If I found out that Facebook was sending me ads for political candidates based on my profile information that I had set to private, I would be angry: 75%
2012: If I knew a website I visit was sharing information about me with political advertisers, I would not return to the site: 77%
2015: It is OK if Facebook links what people do when they access the internet on a laptop computer with what they do on their cell phone’s or tablet’s apps: 28%
2015: It’s OK if a store charges me a price based on what it knows about me: 14%
2015: What companies know about me from my behavior online cannot hurt me: 25%

internet mapping firm shows a photo of their home, and if Facebook sends them political ads
based on their private profile information. (All of these take place regularly for many if not most
Americans.) We found that people fret about the safety of their children online. They worry
more about teens than children under age thirteen giving up data online. The responses also
show Americans specifically pushing back against a retail environment in which companies
collect personal information. In 2005 and 2015, low percentages of Americans (17 percent and 14
percent, respectively) agreed with the idea of a store charging prices based on what it knows
about them.
2. Most people know they are being tracked but Americans don’t grasp the complexity of what
happens to data behind the screen.

They don’t understand data mining, the way companies plumb and merge data about individuals to come to broader conclusions about them. Our 2005 study found that 80 percent of Americans know that “companies today have the ability to follow my activity across sites on the web,” and 62 percent know that “a company can tell I have opened its e-mail even if I don’t respond.” Similarly, in our 2003 study, 59 percent of adults who used the internet at home knew that websites collect information about them even if they don’t register. They did not, however, understand more granular aspects of the tracking activities. For example, 49 percent could not detect illegal “phishing,” the activity where crooks posing as banks or other firms send e-mails to consumers asking them to reveal crucial information about their accounts. Yet when
presented with an everyday scenario of the way sites track, extract, and share information to
make money from advertising, 85 percent of the respondents did not agree to accept it even on a
site they valued. When offered a choice to obtain content from a valued site with such a policy
or pay for the site and not have it collect information, 54 percent of adults who go online at
home said that they would rather leave the web for that content than do either. Among the
85 percent who did not accept the policy, one in two (52 percent) had earlier said they gave or would likely give the valued site their real name and e-mail address, the very information a site needs to begin creating a personally identifiable dataset about them.
We did not repeat the scenario in subsequent years to see if this general awareness of tracking
along with a low level of understanding about its particulars still holds. But answers to true-false questions about the nature of internet activities indicate it does. In 2015, we found (almost exactly as in 2003) that 61 percent of Americans know “that a company can tell I have opened its e-mail even if I don’t respond.” Further, 74 percent of Americans know that “it is possible for Facebook to link what people do when they access the internet on a laptop computer with what they do on their cell phones’ or tablets’ apps.” Yet in 2015 fully 66 percent got the phishing question wrong, much higher than in 2005. Moreover, 64 percent did not know the answer (false) to the statement,
“If I want to be sure not to be tracked on my cell phone, I should clear the phone’s cookies.” By far
the strongest indication that most Americans still don’t understand the particulars of tracking and
its relation to data mining comes from the following consistent finding:

3. Most Americans do not understand the meaning of the label “privacy policy.”

In five separate national surveys from 2003 through 2015, we presented to our representative sample of US adults a true-false question regarding a statement about the meaning of the label
privacy policy. In 2014, the Pew Internet and Society program did the same.29 As Table 8.2
indicates, our surveys in 2009, 2012, and 2015 presented virtually the same false statement. In
2009 and 2012 we said, “If a website has a privacy policy, the site cannot share information about
you with other companies, unless you give the website your permission.” In 2015 the phrasing
was, “When a website has a privacy policy, it means the site will not share my information with
other websites or companies without my permission.” The Annenberg studies of 2003 and 2005,
as well as the Pew study, phrased the concept differently. They omitted the point about
permission, and the 2003 question asked whether the respondent agreed or disagreed with the statement, as opposed to presenting the true-false formulation of the later years.
Table 8.2 shows that over 50 percent of Americans in every year affirmatively chose the wrong
answer, that is, answered true (or agree in 2003) rather than false, disagree, or don’t know. In two
cases, the percentage of people choosing the wrong answer reached above 70 percent, and in
two years it passed 60 percent. The years with relatively lower percentages involved the most
unusual approaches to the question: the 2003 request for an agree/disagree answer, and the
2014 Pew statement, which used an exceptionally strict formulation of the privacy policy’s
meaning via the phrase “ensures that the company keeps confidential all the information it
collects.” The overall impression is clear, though: despite different characterizations of the
label’s meaning, well over half of American adults in six surveys across thirteen years get
it wrong.
The most obvious implication of accepting the truth of these statements is that the label’s
mere presence on a site or app leads people to believe they are safe from unwanted sharing of

29
See Aaron Smith, “Half of Online Americans Don’t Know What a Privacy Policy Is,” Pew Research Center, December 4,
2014, [Link] accessed
September 21, 2016.

table 8.2 Percent of incorrect answers to privacy policy questions, 2003–2015

2003 Annenberg (N=1,155): “When a web site has a privacy policy, I know that the site will not share my information with other websites or companies.” Incorrect answer: 59% agree or agree strongly.
2005 Annenberg (N=1,257): “When a website has a privacy policy, it means the site will not share my information with other websites or companies.” Incorrect answer: 71% true.
2009 Annenberg (N=842): “If a website has a privacy policy, it means that the site cannot share information about you with other companies, unless you give the website your permission.” Incorrect answer: 73% true.
2012 Annenberg (N=1,228): “If a website has a privacy policy, it means that the site cannot share information about you with other companies, unless you give the website your permission.” Incorrect answer: 65% true.
2014 Pew (N=1,034): “When a company posts a privacy policy, it ensures that the company keeps confidential all the information it collects on users.” Incorrect answer: 54% true.
2015 Annenberg (N=1,399): “When a web site has a privacy policy, it means the site will not share my information with other websites or companies without my permission.” Incorrect answer: 63% true.
* Sample sizes do not include Don’t Know/Not Sure or No Response.

data with third parties. As noted earlier, that’s simply not the case. Search engines, social media
sites, and retailers routinely trade data about their visitors in one way or another. Another
possible consequence of many Americans’ belief in the label’s reassuring message is that they
don’t go on to try to read the actual documents and get a sense of the litany of surveillance
activities taking place behind the screen. The operative words are “try to read.” Lawyers see
privacy policies as contractual documents, designed (a number have told me) to broadly protect
the right of a site or app to carry out its business. The policies are filled with industry jargon such
as affiliates, third parties, tags, beacons, and cookies, and their very length makes them difficult to
digest, as Cranor and others have found.30 Nevertheless, the label may well diminish even the
chance of gaining knowledge about the tracking, profiling, selective targeting, and data sharing
that many firms carry out. Lack of knowledge is certainly the case among contemporary
American adults:

4. Large percentages of Americans don’t know the rules of the new digital marketplace, and they
overestimate the extent to which the government protects them from discriminatory pricing.

Our 2005 and 2015 surveys focused on these topics through true-false questions. Slightly higher
percentages of American adults knew the correct answers in the latter year than in the former, so
perhaps a bit more of the population is getting conversant with the new digital commercial
30
See, for example, Jonathan A. Obar and Anne Oeldorf-Hirsch, “The Biggest Lie on the Internet: Ignoring the Privacy
Policies and Terms of Service Policies of Social Networking Services,” July 2016, [Link]
.cfm?abstract id 2757465; Solon Barocas and Helen Nissenbaum, “On Notice: The Trouble With Notice And
Consent.” In Proceedings of the Engaging Data Forum: The First International Forum on the Application and
Management of Personal Electronic Information (2009), [Link] SII
On [Link]; and Aleecia McDonald and Lorrie Faith Cranor, “The Cost of Reading Privacy Policies,”
I/S: A Journal of Law and Policy for the Information Society, 4:3 (2008).

world. The lack of knowledge about basic topics is, however, still widespread. Our 2015 study
noted that

• 49 percent of American adults who use the internet believe (incorrectly) that by law a supermarket must obtain a person’s permission before selling information about that person’s food purchases to other companies;
• 69 percent do not know that a pharmacy does not legally need a person’s permission to sell information about the over-the-counter drugs that person buys;
• 55 percent do not know it is legal for an online store to charge different people different prices at the same time of day;
• 62 percent do not know it is legal for an offline or physical store to charge different people different prices at the same time of day; and
• 62 percent do not know that price comparison sites like Expedia or Orbitz are not legally required to include the lowest travel prices.
Despite this pervasive misunderstanding of the rules of their digital commercial environment,
Americans have consistent and strong opinions when these issues are brought to their attention.
Especially noteworthy, and not at all predictable, are their attitudes about various forms of
personalization that are increasingly hallmarks of that new world:
5. Most people don’t think personalization by marketers in general or retailers in particular is a
good thing, especially when they are told how the data for personalization are obtained.

We found definitively in the 2010 survey that, contrary to what many marketers claim, most adult
Americans do not want advertisements, news, and discounts “tailored to their interests.” The
percentages were 66 percent, 57 percent, and 51 percent, respectively. Even the people who
accepted the idea pushed back when they found out how the personalization is carried out. We
presented these people with three common ways marketers gather data in order to tailor ads
“tracking the website you have just visited,” “tracking you on other websites,” and “tracking you
offline (for example, in stores).” We found that when people understand how their data are
mined for personalization, the overall percentage objecting to the activity skyrockets. For
example, when the people who earlier approved of tailored advertising learned the data would
come from tracking them on the website they had just visited, the percentage of all respondents
saying they don’t want personalized ads jumped to 73 percent. When told the data would come
from watching what they had done offline, the total number saying no jumped even higher, to
86 percent. With discounts, the corresponding percentages were 62 percent and 78 percent, and
with news, resistance rose from 71 percent to 85 percent.
Our 2012 survey replicated these results with respect to consumer ads, discounts, and news. Our
aim in that election year was to add political advertising to the list. The findings in that sphere
showed even more definitive public distaste for personalization: fully 80 percent of Americans said
they do not want political campaigns to tailor advertising to their interests. When the relatively few
who agreed to political tailoring were then told the data would come from following them on the
website they had just visited, the total number saying no to personalization jumped to 89 percent.
When told the data would come from watching three of their activities offline (purchases in stores, political party affiliation, and whether they voted in the past two elections), the total disapproving hit 93 percent, 90 percent, and 91 percent, respectively.
Americans’ distaste for tracking and the personalization that often goes along with it was also
reflected in answers to the 2010 survey that presented them with targeting activities and asked
whether or not (given the choice) they would allow them. Only 12 percent said that they would

allow marketers to follow them “online in an anonymous way in exchange for content,” only
10 percent said they would allow marketers to “share information about [their] internet use in
exchange for free content,” and only 9 percent said they would allow marketers to “use information about [their] internet activities to deliver advertisements on [their] cell phone or video
game system in exchange for free content.” Of course, internet marketers carry out these
activities continually (and did back in 2010). The respondents’ answers suggest that if they had
their way these activities would not take place. Americans similarly indicated their dislike for
common political tracking and personalization scenarios. In response to questions during our
2012 survey, 64 percent of Americans said their likelihood of voting for a candidate they support
would decrease (37 percent said decrease a lot, 27 percent said decrease somewhat) if they
learned a candidate’s campaign organization buys information about their online activities and
their neighbor’s online activities and then sends them different political messages it thinks will
appeal to them. In a similar vein, 77 percent of Americans agreed (including 35 percent who
agreed strongly) that “if I knew a website I visit was sharing information about me with political
advertisers, I would not return to the site.”
A broad set of findings, then, calls into question marketers’ claim that people value the personalization that, marketers argue, leads to relevant commercial messages. Additional findings from the surveys push back against another major pillar of marketers’ justification of their
tracking activities: that Americans agree that they are getting value in trading their data for free
content and other internet benefits. To the contrary, we found that

6. Most people philosophically do not agree with the idea of trade-offs

In 2015 we presented to a random cross-section of Americans everyday circumstances in which marketers collect people’s data. We phrased the situations as trade-offs and learned that very many feel those trade-offs are unfair.

• 91 percent disagreed (77 percent of the total sample strongly) that “if companies give me a
discount, it is a fair exchange for them to collect information about me without my
knowing”;
• 71 percent disagreed (53 percent strongly) that “it’s fair for an online or physical store to
monitor what I’m doing online when I’m there, in exchange for letting me use the store’s
wireless internet, or Wi-Fi, without charge”; and
• 55 percent disagreed (38 percent strongly) that “it’s okay if a store where I shop uses
information it has about me to create a picture of me that improves the services they
provide for me.”
Further analysis of these responses indicates that a very small percentage of Americans agrees with the overall concept of trade-offs. In fact, only about 4 percent agreed or agreed strongly with all three propositions. If we use a broader definition of a belief in trade-offs (the average value of all three statements), even then only 21 percent of the respondents accept the idea. Despite this principled disagreement with trade-offs, we found in our survey that many would act as if they agreed with them. We presented the people interviewed with a real-life trade-off case, asking whether they would take discounts in exchange for allowing their supermarket to collect information about their grocery purchases. We found that 43 percent said yes to trade-offs there, more than twice as many as believe in the concept of trade-offs according to the broader definition. Underscoring the inconsistency, we found that 40 percent of the people who said they would accept the grocery discount deal did not agree with the third trade-off statement listed earlier, even though the type of exchange it suggests is similar.

So what is going on? Why the inconsistency? Based on examining the findings carefully, our
answer is:

7. Contrary to the claim that a majority of Americans consent to discounts because the commercial benefits are worth the costs, we find a new explanation: a large pool of Americans feel resigned to the inevitability of surveillance and the power of marketers to harvest their data. When they shop, it merely appears they are interested in trade-offs.

The meaning of resignation we intend is, to quote a Google dictionary entry, “the acceptance of
something undesirable but inevitable.”31 And, in fact, our 2015 study reveals that 58 percent of
Americans agreed with the statement “I want to have control over what marketers can learn
about me online,” while at the same time they agreed “I’ve come to accept that I have little
control over what marketers can learn about me online.” Rather than feeling able to make
choices, Americans believe it is futile to manage what companies can learn about them.
People who are resigned do not predictably decide to give up their data. Watching what
shoppers do doesn’t reveal their attitude. We found that while people who believe in trade-offs are quite likely to accept supermarket discounts, we couldn’t predict whether a person who is resigned to marketers’ data-gathering activities would accept or reject the discounts. Marketers want us to see all those who accept discounts as rational believers in trade-offs. But when we looked at those surveyed who agreed to give up their data for supermarket discounts, we found that well over half were resigned rather than being believers in trade-offs. Ironically, and
contrary to many academic claims about the reasons people give up their information, those
who knew the most about these marketing practices were more likely to be resigned. Moreover,
resigned people’s decisions to accept supermarket discounts even when the supermarket collects
increasingly personal information were also positively related to knowledge. When it comes to
protecting their personal data, then, our survey found those with the wherewithal to accurately
calculate the costs and benefits of privacy are likely to consider their efforts futile.

8. Sex, education, income, race, and age tend not to separate Americans substantially when it
comes to privacy issues.

Age and gender showed no differences with regard to resignation. When it came to education
and race, statistically significant differences did appear: higher resignation percentages for
Whites compared to non-Whites and for more educated people compared with respondents
with a high school education or less. Yet those comparisons still showed that one half or more of
the individuals in most categories of respondents were resigned. The same tendency showed up
with regard to our questions relating to knowing the meaning of privacy policy as well as those
about tracking and serving tailored political ads: despite some differences, most Americans of all
these categories revealed misplaced assurance about giving up their data when they see the
privacy policy label. Similarly, concern with aspects of tailored or targeted political advertising
never fell below 50 percent for any of the social groupings and was frequently far above that
proportion. In fact, the proportions of demographic segments saying no were typically in the
80 to 90 percent range with respect to the central question about the desire for tailored political
advertising.
It is important to stress that being a young adult (aged twenty-two to twenty-four) makes little
difference compared to other age groups when it comes to views on resignation, understanding of
privacy policy, and attitudes toward tracking and tailoring with regard to political advertising.

31
“Resignation,” Google, [Link] rd ssl#q resignation, accessed May 18, 2015.

Marketing executives are wont to claim that young people “are less concerned with maintaining
privacy than older people are.”32 One reason is that media reports teem with stories of young
people posting salacious photos online, writing about alcohol-fueled misdeeds on social networking sites, and publicizing other ill-considered escapades that may haunt them in the future. Some commentators interpret these anecdotes as representing a shift in attitude away from information privacy among Millennials (those born between 1980 and 1996) and Generation Z (those born after 1996) compared to older Americans.
The findings noted earlier contradict this assertion, at least with regard to the young cohort of
Millennials. A deeper analysis of our 2009 data with colleagues from Berkeley Law School found
that expressed attitudes towards privacy by American young adults (aged eighteen to twenty-four)
are not nearly as different from those of older adults as many suggest. With important exceptions,
large percentages of young adults are in harmony with older Americans when it comes to
sensitivity about online privacy and policy suggestions. For example, a large majority of young
adults

• said they refused to give information to a business where they felt it was too personal or not
necessary;
• said anyone who uploads a photo of them to the internet should get their permission first,
even if the photo was taken in public;
• agreed there should be a law that gives people the right to know all the information websites
know about them; and
• agreed there should be a law that requires websites to delete all stored information about an
individual.
In view of these findings, why would so many young adults act in social networks and elsewhere
online in ways that would seem to offer quite private information to all comers? Some research
suggests that people twenty-four years and younger approach cost-benefit analyses related to risk differently than do individuals older than twenty-four.33 An important part of the picture, though, must surely be our finding that higher proportions of eighteen-to-twenty-four-year-olds
believe incorrectly that the law protects their privacy online and offline more than it actually
does. This lack of knowledge in a tempting environment, rather than a cavalier lack of concern
regarding privacy, may be an important reason large numbers of them engage with the digital
world in a seemingly unconcerned manner.

concluding remarks
Over the past two decades, large sectors of the advertising and retailing industries have built
opaque surveillance infrastructures to fuel ever-escalating competitions aimed at the efficient
targeting of messages to likely customers. When critics and policymakers have confronted them
about these activities, industry representatives have often played down the concerns. They
present a portrait of a nation of rational shoppers uneasy about marketers’ data capture in the
abstract but at the same time quite aware of the data transfer taking place and willing to give up

32
Ariel Maislos, chief executive of Pudding Media, quoted in Louise Story, “Company Will Monitor Phone Calls to Tailor
Ads,” New York Times, September 24, 2007, available at: [Link]
33
See Margo Gardner and Laurence Steinberg, “Peer Influence on Risk Taking, Risk Preference, and Risky Decision
Making in Adolescence and Adulthood: An Experimental Study,” Developmental Psychology 41:4 (2005), 625–635. No
one 23 or 24 years of age was in the sample; and Jennifer Barrigar, Jacquelyn Burkell, and Ian Kerr, “Let’s Not Get
Psyched Out of Privacy,” Canadian Business Law Journal, 44:54, pp. 2006–2007.

information about themselves in exchange for relevant information and discounts. The Annenberg surveys over the past decade and a half indicate consistently that this view of the American
public is highly flawed. Most people are certainly nervous about data marketers capture about
them, but they do not display the reasoned, knowing perspective the marketing establishment
claims. They don’t understand the particularities of tracking; alarmingly, most cannot detect a
description of phishing or basic aspects of data mining. They consistently mistake the label
privacy policy for an assurance that marketers will not use information gleaned about them
without their permission. And, surely in part because they misunderstand privacy policy, they
believe it is illegal for marketers to gather data and personalize prices in ways that are actually
quite common.
Even if Americans are approaching the new commercial digital world through rational lenses,
then, our surveys show that the knowledge they have to apply their reasoning is slim and in
important cases wrong. The Annenberg Internet Surveys also reveal the fallacy in marketers’
insistence that individuals accept the idea of trade-offs. Quite the contrary: Americans reject the idea of trade-offs as unfair even as they sometimes act in their shopping as if they are making the trade-off between the qualms they have about giving up information and the desires they have to get good deals. We found, instead, that for most the explanation for giving up the data is resignation. They see no opportunity to get what they really want from digital marketers: the ability to control what the marketers know about them. In that context, their unpredictable responses to the blandishment of discounts reflect futility rather than rational trade-off strategies.
The resignation finding was not unanticipated. Members of our team have long heard discussions
by people around us that reflect a dislike of marketers’ surveillance, a powerlessness to do anything
about it, and a need to adapt to the new environment. (Often the discussion relates to Facebook.)
Scholarly descriptions have also suggested resignation is at work in the new surveillance world.34 In
the face of marketers’ quite different claims, systematic national surveys show that most Americans do
not feel in control of the information commercial entities have about them and would like to be.
The logical next question is how to change that. One solution is that regulators should force greater
corporate transparency and more personalized controls on data use. Unfortunately, the Annenberg
surveys suggest that most people’s knowledge about digital commerce is so low that it is highly
unrealistic to expect them to get and keep up to speed with the ever-changing data-collecting
dynamics of the advertising and retailing environments. Another solution is for regulators to step in
to do that with the public interest in mind. Here we confront a key dilemma that has bedeviled the
Federal Trade Commission and other government bodies: the nature of public harm involved in the
retrieval of individuals’ data for commercial purposes. In spite of a growing literature on the social as
well as the individual functions of privacy, US regulatory bodies still define harm in terms that
require evident and often quantifiably determined injury. Fifteen years of Annenberg national
surveys strongly suggest that Americans dispute this view of harm. Americans plainly resist data
collection and personalization based on data collection because they don’t think it is the right thing
to do even if they are not confronted with immediate and/or quantifiable harm. It will be useful to
learn more about this set of expectations. Clearly, however, a desire for dignity, control, and respect
pervades many of the responses we received about data use over the years. Implementing this desire
should be a crucial next step for marketers, regulators, and others interested in a twenty-first century where citizens themselves have respect for the emerging new commercial sphere and the government officials who regulate it.

34
See, for example, Julie E. Cohen, Configuring the Networked Self (New Haven: Yale University Press, 2011), p. 108; and Helen Nissenbaum, Privacy in Context (Stanford: Stanford University Press, 2007), pp. 1, 3, 20, and 65.
9

The Federal Trade Commission’s Inner Privacy Struggle

Chris Jay Hoofnagle

introduction
At the Federal Trade Commission (FTC), all privacy and security matters are assigned to a
consumer protection economist from the agency’s Bureau of Economics (BE). The BE is an
important yet often ignored element of the FTC. Advocates and others operating before the
commission have been inattentive to the BE, choosing to focus instead on persuading commissioners on policy matters, and staff attorneys, on case selection. This chapter shows how the BE’s
evaluative role is just as important as attorneys’ case selection role.
This chapter describes the BE, discusses the contours of its consumer protection theories, and shows how these theories apply to privacy matters. I explain why the FTC, despite
having powerful monetary remedy tools, almost never uses them: this is because the BE sees
privacy and security injuries as too speculative, because the FTC’s lawyers prefer settlement
for a variety of logistical and strategic reasons, and because the FTC’s remedies come too late
to deter platform-age services. The BE is also skeptical of information privacy rights because
of their potential impact on innovation policy and because privacy may starve the market of
information. In this, the BE hews to certain interpretations of information economics,
ignoring research in traditional and behavioral economics that sometimes finds benefits
from the regulation of information. Not surprisingly, calls for the BE to expand its role from
case evaluation to case selection typically come from those wishing to curb the
FTC’s privacy-expanding enforcement agenda. Those calls may be strategic, but are not
without merit.
We should expect President Donald Trump’s administration to expand the role of the BE and
to make its role more public. With newfound powers, the BE will argue that more cases should
be pled under the unfairness theory. This will have the effect of blunting the lawyers’ attempts to
expand privacy rights through case enforcement.
But the answer is not to avoid the BE’s preferred pleading theory. Instead, we need to foster a
BE that can contemplate invasions of privacy and security problems as causing injuries worthy of
intervention and monetary remedy. This chapter concludes with a roadmap to do so. Among
other things, the roadmap includes the consideration of existing markets for privacy as a proxy for
the value of personal information. For example, tens of millions of Americans pay money to
keep nonsensitive information, such as their home address, secret. Additionally, the FTC’s civil
penalty factors, which consider issues such as how to deny a defendant the benefits from illegal

activity, could justify interventions to protect privacy and security. Finally, the BE could explore
how existing information practices have inhibited the kinds of control that could lead to a
functioning market for privacy.

the bureau of economics


The Bureau of Economics is tasked with helping the FTC evaluate the impact of its actions by
providing analysis for competition and consumer protection investigations and rulemakings, and
by analyzing the economic impact of government regulations on businesses and consumers.
With commission approval, the BE can exercise spectacular powers. The BE can issue compulsory processes to engage in general and special economic surveys, investigations, and reports.
Congress required the BE to perform some of its most interesting recent privacy activities, such
as a study of accuracy in consumer reports. The study found that 13 percent of consumers had
material errors in their files, meaning that tens of millions of Americans could be affected by
inaccuracy in their credit reports.1
The BE is divided into three areas focusing on antitrust law, research, and consumer
protection. About eighty economists educated at the PhD level work for the BE. Twenty-two economists and eight research analysts are tasked to the over 300 attorneys focused on the consumer protection mission. The economists help design compulsory process, evaluate evidence collected from process, provide opinions on penalties to be levied in cases, conduct
analyses of cases independent of the lawyers, serve as expert witnesses, support litigation, and
provide perspective on larger policy issues presented by enforcement. In this last category, the
BE has been an important force in eliminating state laws that restrict certain types of price
advertising.2
By deeply integrating principles of cost-benefit analysis in the FTC’s decision-making, the BE
has a disciplining effect on the agency’s instinct to intervene to protect consumers.3 As former
Chairman William E. Kovacic and David Hyman explained, the BE “is a voice for the value of
competition, for the inclusion of market oriented strategies in the mix of regulatory tools, and for
awareness of the costs of specific regulatory choices . . . BE has helped instill within the FTC a
culture that encourages ex post evaluation to measure the policy results of specific initiatives.”4
According to Kovacic and Hyman, this disciplining effect is good. The duo explains that the
BE’s tempering role stops the agency from adopting an interventionist posture, warning that
sister agencies (such as the Consumer Financial Protection Bureau) may become overzealous
without economists acting in an evaluative role.
The most comprehensive history of the BE was written in 2015 by Dr. Paul Pautler, longtime
FTC employee and deputy director of the BE.5

1
FTC, Section 319 of the Fair and Accurate Credit Transactions Act of 2003: Third Interim Federal Trade
Commission Report to Congress Concerning the Accuracy of Information in Credit Reports (Dec. 2008).
2
For a general discussion of these contributions, see Janis K. Pappalardo, Contributions by Federal Trade Commission
Economists to Consumer Protection: Research, Policy, and Law Enforcement, 33(2) J. Pub. Pol’y & Mktg 244 (2014).
3
Jonathan Baker, Continuous Regulatory Reform at the Federal Trade Commission, 49(4) Admin. L. Rev. 859 (1997).
4
David A. Hyman & William E. Kovacic, Why Who Does What Matters: Governmental Design, Agency Performance,
the CFPB and PPACA, 82 Geo. Wash. L. Rev. 1446 (2014).
5
See Paul A. Pautler, A History of the FTC’s Bureau of Economics, AAI Working Paper No. 15–03, ICAS Working Paper
2015–3 (Sept. 2015).

The BE’s Conceptions of Consumer Injury


If the BE is skeptical of privacy harms, why is it that the FTC brings so many privacy cases
without evidence of pure fraud or out-of-pocket monetary loss? The answer is that staff-level
FTC lawyers have broad discretion in target selection, and the lawyers have focused on
expanding pro-privacy norms through enforcement. Privacy enforcement has often focused on
large, mainstream, reputable companies such as Google, Facebook, and Microsoft rather than
more marginal companies.
While the lawyers select the cases, the economists evaluate them and recommend remedies to
the Commission. The BE has developed substantial policy thinking surrounding remedies. The
BE wishes to achieve deterrence, both specific and general, with an emphasis on avoiding
over-deterrence. This is tricky because risk of detection affects deterrence, and the FTC’s small
staff means that the vast majority of illegal practices will go undetected and unremedied. One
thus might conclude that penalties should be massive, but large penalties might cause others to
overinvest in compliance, making the entire economy less efficient.6
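The tension described here tracks a standard expected-penalty calculation from the economics of enforcement. The sketch below illustrates that textbook logic under hypothetical numbers; it is not a description of the BE’s actual methodology.

    # Illustrative deterrence arithmetic (hypothetical figures, not BE methodology).
    # A rational firm weighs its gain from an illegal practice against the expected penalty,
    # where expected_penalty = probability_of_detection * fine.
    gain_from_violation = 10_000_000   # hypothetical benefit the firm captures
    detection_probability = 0.05       # hypothetical: most violations go undetected

    # Fine needed so that the expected penalty at least offsets the gain:
    deterrent_fine = gain_from_violation / detection_probability
    print(f"Fine needed for deterrence: ${deterrent_fine:,.0f}")  # $200,000,000

    # The over-deterrence worry: if fines are set this high but enforcement is noisy,
    # law-abiding firms may overinvest in compliance to avoid any enforcement risk.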
A number of factors are considered in the difficult calculus of balanced remedies. The BE
weighs options that could make the consumer whole, by putting the consumer in the position
she occupied before the illegal transaction. The BE also considers how a deception shapes demand
for a product, thereby inducing individuals to buy who would not have purchased absent an
illegal practice, or whether customers paid more for a product because of a deception.
In its evaluative activities, the BE’s lodestar is “consumer welfare” and its economists claim
that they have no other social agenda in their activities. The BE’s approach “has traditionally
focused on fostering ‘informed consumer choice’ in well-functioning markets.”7
The special dynamics of personal information transactions make it difficult for the BE to
justify monetary remedies in privacy cases. Consider a fraud where consumers are promised an
18 karat gold trinket but are delivered a 10 karat one. The FTC can easily calculate the injury to
the consumer based on the price differential between the two products. A market exists that
clearly differentiates between these products and assigns a higher price to the 18 karat object.
The transaction is a simple, bounded one.
Turning to privacy cases, the calculation is not as simple. Many services provided to a
consumer lack a price tag because they are “free.”8 The alleged deception might be unrelated
to price, but rather to a subtle feature, such as the degree of publicity given to some fact about
the user. Complicating matters is that the boundaries of the transaction are unclear because
services change over time, and in the process, shift consumer expectations and desires.
Furthermore, individual privacy preferences vary. Some consumers may never have considered
privacy attributes in their service selection or may not care a great deal about privacy.
Unlike something as salient as the purity of a gold object, specific information uses may not
enter into the consumer’s awareness when selecting a service. These uses of information may
never come into the consumer’s mind until something goes wrong. When that happens, users
often cannot point to an economic injury from unwanted disclosures. All of these problems are
compounded by the fact that many services do not offer an alternative, “privacy-friendly” feature
set or comparative price point.

6
Mallory Duncan, FTC Civil Penalties: Policy Review Session (1980).
7
Paul A. Pautler, A Brief History of the FTC’s Bureau of Economics: Reports, Mergers, and Information Regulation, 46
Rev. Ind. Org. 59 (2015).
8
John M. Newman, The Myth of Free, 86 Geo. Wash. L. Rev. (2017).

The above discussion shows that assigning a dollar value to a privacy violation is not a simple
exercise. But other dynamics cause the BE to be skeptical of information privacy cases more
generally.9 This skepticism is expressed both in methods and in larger ideological issues. For
instance, lawyers may point to surveys as evidence of privacy harm, but the BE systematically
dismisses survey research in this field, because decisions about privacy can implicate complex
short- and long-term trade-offs that are not well presented in surveys. Sometimes economists will
argue that consumer behavior belies stated preferences for privacy. One oft-stated rationale is
that if consumers really cared about privacy, they would read privacy notices.
Ideologically the BE has had a reputation of hewing to conservative economic norms.10 This
may be in part a problem of disciplinarity. Pautler’s 2015 history of the BE notes that when it
became active in consumer protection in the 1970s, economists had just started considering the
topic. Similarly, Ippolito’s 1986 survey cites only three pre-1970 works on consumer protection
economics.11 This narrow view is strange, because the consumer protection literature spanned
the 20th century, often describing economic problems in the language of psychology or
marketing. Popular, pro-advertising works, such as David Ogilvy’s Confessions of an Advertising
Man (1963), provide credible insights about consumer psychology, decision-making, and the
effect of FTC regulation. Similarly, Samuel Hopkins Adams’s 1905 work explains the economic
conflicts that prevented market forces from policing patent medicines.12 Yet these kinds of works
are not defined as being in the discipline.
Aside from a narrow view of disciplinary relevance, the literature has a conservative lens.
Scanning BE literature reviews, one finds notable omissions of liberal and even centrist works on
consumer protection: Albert Hirschman, Arthur Leff, Arthur Kallet and F.J. Schlink, Ralph
Nader, David A. Aaker and George S. Day’s multi-edition compilations on “consumerism,” and the
“ghetto marketplace” research (some of which was generated by the BE itself) of the 1960s.
President Reagan’s appointment of economist James Miller to the chairmanship of the FTC
in 1981 also added to the BE’s reputation as conservative. The Miller-era leadership strengthened
the FTC in some ways, making it more enforcement-oriented. But Miller also scaled back many
consumer protection efforts and pursued aggressive policies reflecting great faith in contractual
freedom.13 Miller installed economists in consumer protection leadership positions to influence
how the agency weighed case policy. Also, relevant to today’s debates about case selection,
Miller turned away from normative causes that the FTC might have pursued in favor of policing
pure fraud cases.
Wendy Gramm was a director of the BE during Miller’s tenure. For a taste of
Miller-era consumer protection policy, consider Gramm’s defense of debt collection tools such
as the “blanket security interest.” These were agreements that empowered creditors to show up at
debtors’ homes and seize household goods unrelated to the debt. The record showed that some
9
Peter P. Swire, Efficient Confidentiality for Privacy, Security, and Confidential Business Information, Brookings
Wharton Papers on Financial Services 306 (2003)(“. . . based on my experience in government service, graduate
training in economics is an important predictor that someone will not ‘get’ the issue of privacy protection.”).
10
Patrick E. Murphy, Reflections on the Federal Trade Commission, 33(2) J. of Pub. Pol’y & Mktg 225 (2014)(The
economists had a “more conservative mindset [than the lawyers]; in general, they were more reluctant to support
cases unless some economic harm could be proved. There seemed to be an ongoing battle between these two
groups.”).
11
Pauline M. Ippolito, Consumer Protection Economics: A Selective Survey, in Empirical Approaches to Consumer
Protection Economics pp. 1–33 (Pauline M. Ippolito and David T. Scheffman, eds)(1986).
12
Among other reasons, advertisers banned publishers from running anti-patent-medicine content. See Samuel Hopkins
Adams, The Patent Medicine Conspiracy against the Freedom of the Press, Collier’s, in The Great American
Fraud pp. 147 (American Medical Association) (n.d).
13
Thomas O. McGarity, Freedom to Harm: The Lasting Legacy of the Laissez Faire Revival (2013).

creditors lorded the agreements over debtors, causing debtors psychological terror through the
risk of arbitrary seizure of their possessions, most of which were valueless and would not satisfy
the debt obligation. But Gramm reasoned that consumers accepted blanket security agreements
in order to send important signals about the commitment to repay. If consumers really wanted to
avoid the risk of their things being seized, perhaps they would shop elsewhere for credit. If
denied the choice to agree to security agreements, perhaps consumers could not get credit at all.
There are three important points about the Miller-era BE ideology. First, institutions are
shaped by people. The BE is typically directed by an academic economist with impeccable
credentials.14 But a thesis of my book, Federal Trade Commission Privacy Law and Policy, is that
FTC staff, who often remain at the agency for decades, have a profound role, one perhaps more
powerful than even the appointed political leaders of the FTC.15 The current staff leadership of
the BE’s consumer protection division all joined the FTC in the 1980s. Miller, and his similarly
oriented successor, Daniel Oliver, hired all three of the economists currently responsible for
privacy and security cases.
Second, one should not confuse Miller-era policy instincts with mainstream economics.
I expand on this point in the next part of this chapter. For now, it is sufficient to observe that
support for privacy and security rights and rules can be found outside the sometimes-maligned
field of behavioral economics.16 The BE marches to a different drum and has not incorporated
scholarship from traditional economic fields that finds benefits to social welfare from privacy.
Third, the Miller-era emphasis on contractual freedom and consumer savvy frames consumer
harm as a foreseeable risk assumed by calculating, even wily consumers. Returning to the
example of the blanket security interest, through the Gramm/Miller lens, consumers in such
relationships agreed to be subject to the indignity of having their things taken. The mother who
had her baby furniture taken17 may be harmed, but on the other hand there is some risk of moral
hazard if consumers think the government might intervene in private ordering. When public
attention turned to the unfairness of blanket security agreements, Gramm commented, “Consumers
are not as ignorant as you might suspect.”18 Translated into consumer privacy, this
attitude holds that consumers are happy to enjoy the benefits of free services that trade in
personal information and have calculated the risks flowing from these services.
Finally, the Miller era had a partial revival with the election of President George W. Bush,
who appointed Timothy Muris, a protégé of James Miller, as chairman in 2001. An eminently
qualified Chairman, Muris focused the FTC on a “harms-based” approach. This approach was
shaped by concerns about innovation policy, and in part by a kind of naïve belief in the power of
information to lead markets to correct decisions.19 A trade in personal information is necessary
and indeed beneficial for enabling modern economic infrastructures, such as consumer
reporting. Thus, the harms-based approach allowed information flows presumptively, and

14
FTC Office of the Inspector General, Evaluation of the Federal Trade Commission’s Bureau of
Economics, OIG Evaluation Report 15 03 (June 30, 2015).
15
Chris Jay Hoofnagle, Federal Trade Commission Privacy Law and Policy p. 82 (Cambridge Univ. Press
2016).
16
Alessandro Acquisti, Curtis R. Taylor, & Liad Wagman, The Economics of Privacy, 54(2) J. Eco. Lit. 442 (Jun. 2016).
17
Federal Trade Commission, Credit practices: final report to the Federal Trade Commission and
proposed trade regulation rule (16 CFR part 444) (1980).
18
Michael deCourcy Hinds, The Rift over Protecting Consumers in Debt, p. F8, N.Y. Times, May 8, 1983.
19
Muris was part of a chorus of thinkers who downplayed the risks of the housing bubble, arguing that richer
information in credit reporting enabled safe lending to “underserved” (i.e. subprime) prospects. See e.g. Fred H.
Cate, Robert E. Litan, Michael Staten, & Peter Wallison, Financial Privacy, Consumer Prosperity, and the Public
Good (AEI-Brookings Joint Center for Regulatory Studies 2003).

intervention was limited to situations where “harm” was present. “Harm,” a thorny concept that
seemingly has expanded over time, was never defined in a satisfying way. The Bush FTC found
telemarketing to be “harmful” and adopted a dramatic policy intervention for sales calling: the
National Do Not Call Registry. Yet, when it came to privacy, the Bush FTC’s idea of harm did
not justify adoption of a rights based framework. Pautler marks 2008 as the end of the harms
based era.
Using the Freedom of Information Act, I obtained training materials for the BE and a
literature review of privacy papers apparently used by the BE during the harms-based-approach
era. Some of the microeconomic work showing the costs to consumers from a lack of privacy
protection, as well as work in behavioral economics or law regarding consumer challenges in
shopping for privacy, makes no appearance in the paper list, including articles by some of the
best-known scholars in the field and articles published in journals familiar to economists who
work on consumer protection.20 Instead, the BE’s literature had a distinctly laissez-faire bent;
the first paper on the list was the product of an industry think tank supported by five- and six-figure
donations from telecommunications companies and Silicon Valley firms.

The Bureau of Economics versus the Bureau of Consumer Protection


There is tension between the lawyers of the Bureau of Consumer Protection (BCP) and the
economists of the BE over consumer injury, and thus case selection.21 It is not obvious why
lawyers and economists would be at loggerheads over damages in consumer cases. Lawyers are
comfortable allowing judges and juries to determine damages for inherently subjective injuries,
such as pain and suffering, and the loss of marital consortium. The law also provides remedy for
mere fear of harm (such as assault).22
Yet, economists may have an even broader view of harm than do lawyers. As Sasha Romanosky
and Alessandro Acquisti explain, “economic considerations of privacy costs are more
promiscuous [than those of tort law]. From an economic perspective, the costs of privacy
invasions can be numerous and diverse. The costs and benefits associated with information
protection (and disclosure) are both tangible and intangible, as well as direct and indirect.”23
Romanosky and Acquisti’s observation positions economists as potentially more open to
recognizing consumer injury than are lawyers. Their point is growing in persuasiveness as legal
impediments to consumer lawsuits expand, particularly those requiring more proof of “injury” to
gain standing, and thus jurisdiction in court. In a case decided by the Supreme Court in 2016,
several information-intensive companies argued that they should not be subject to suit unless the
consumer suffers financial injury, even if the company violates a privacy law intentionally.24
Many consumer lawsuits for security breaches and other privacy problems have been tossed out

20
Surprising omissions from the list include, James P. Nehf, Shopping for Privacy on the Internet, 41 J. Consumer Aff.
351 (2007); Alessandro Acquisti & Hal R. Varian, Conditioning Prices on Purchase History, 24(3) Mktg. Sci.
367 (2005).
21
Joshua L. Wiener, Federal Trade Commission: Time of Transition, 33(2) J. Pub. Pol’y & Mktg 217 (2014)(“Prior to
working at the FTC, I naively thought in terms of FTC versus business. I quickly learned that a more adversarial
contest was lawyers versus economists.”).
22
Ryan Calo, Privacy Harm Exceptionalism, 12(2) Colo. Tech. L. J. 361 (2014); Ryan Calo, The Boundaries of Privacy
Harm, 86 Ind. L. J. 1131 (2011).
23
Sasha Romanosky and Alessandro Acquisti, Privacy Costs and Personal Data Protection: Economic and Legal
Perspectives, 24(3) Berk. Tech. L. J. 1060 (2009).
24
See amicus curie brief of eBay, Inc., Facebook, Inc., Google, Inc., and Yahoo! Inc. in Spokeo v Robins, No. 13–1339
(SCT 2015).

of court on jurisdictional grounds for lacking “injury,”25 but economists may view these same
cases as meritorious.
The BE sees each case selected as an important policy decision. From the BE’s lens, those
policy decisions should focus not on rule violations, but on the harm suffered. The BE’s
approach is thus more evaluative of, and more critical of, legal rules. The BE wants to see some
detriment to consumer welfare as a result of rule-breaking. This reflects a revolution in thinking
about business regulation also present in the FTC’s antitrust approaches. With per se antitrust
rule violations out of favor, the FTC now focuses on rule-of-reason-style approaches with more
evaluation of consumer harm. In addition to reflecting policy commitments, adopting a harm
approach empowers the economists structurally, because a focus on harm causes economists to
be more deeply involved in consumer protection cases.26
Lawyers, on the other hand, are more moralistic and likely to view a misrepresentation as an
inherent wrong. Lawyers are trained and indeed ethically bound to uphold legal processes. In
fact, many lawyers see the prosecution of cases as merely being “law enforcement,” and are
unwilling to acknowledge the policy issues inherent in case selection, as the BE correctly does.
The problem with the lawyers’ approach is that the law can be applied inefficiently and
produce perverse outcomes. The lawyers’ approach can be rigid and out of touch with the
market. The problem with the economists’ approach is that it can supplant democratic processes.
The word “harm” appears nowhere in Title 15 of the US Code, which governs the FTC,
yet the economists have read the term into the fabric of the agency. Sometimes democratic
processes create ungainly regulatory approaches, but setting these aside and reading harm into
the statute is governance by philosopher-king rather than rule of law.
The BE has a more academic culture than the BCP as well. Since at least the 1990s, the
economists have been able to obtain leave for academic positions and for academic writing.27
The economists are free to express their opinions, and even press them in situations where they
are in disagreement with the FTC’s actions. This internal questioning can cause attorneys to
think that the economists are not fully participating in the consumer protection mission, and
instead frustrating it by trying to engage in academic discourse about situations attorneys see as
law enforcement matters.
Attorneys know that the agency’s hand is weakened in litigation when it is apparent that a
matter is controversial within the FTC. Attorneys also see economists as serving in an expert
witness role, a service function that should be deferential to the strategic decisions of the
litigators. Kenneth Clarkson and former Chairman Timothy Muris explain: “The economists’
role is controversial. Many attorneys, sometimes even those at the top of the bureau, are dissatisfied
with the economists’ substantive positions, with their right to comment, and what they perceive
as the undue delay that the economists cause.”28 But if they truly are to be independent advisors,
the kind accepted by courts as legitimate experts, the economists need to have the very comforts
that attorneys find discomforting.

25
Lexi Rubow, Standing in the Way of Privacy Protections: The Argument for a Relaxed Article III Standing Requirement
for Constitutional and Statutory Causes of Action, 29 Berkeley Tech. L.J. 1007, 1008 (2014).
26
Paul A. Pautler, A History of the FTC’s Bureau of Economics, AAI Working Paper No. 15–03, ICAS Working Paper
2015–3 (Sept. 2015).
27
Paul A. Pautler, A History of the FTC’s Bureau of Economics, AAI Working Paper No. 15–03, ICAS Working Paper
2015–3 (Sept. 2015).
28
Kenneth W. Clarkson & Timothy J. Muris, Commission Performance, Incentives, and Behavior 280–306, in The
Federal Trade Commission Since 1970: Economic Regulation and Bureaucratic Behavior (Kenneth
W. Clarkson & Timothy J. Muris, eds., 1981).

The lawyers’ instinct to intervene also causes tension between the BCP and the BE.
Economists are more likely to take a long view of a challenge, allowing the marketplace to
work out the problem even where the law prohibits certain practices or gives the agency tools to
redress the problem. The BE may also credit consumers with more sophistication in interpreting
advertising than the lawyers do.
Beliefs about consumer sophistication and the ability to leave advertising representations to
the market can go to extremes, however. Consider John Calfee, a longtime expert with the
American Enterprise Institute and former Miller-era BE advisor. Calfee thought that most
regulation of advertising was perverse and thus that consumer advocates harmed the public interest
by attempting to police it. To press the point, he used cigarette advertising, the bête noire of
consumer advocates, as a model. He argued that the cigarette industry’s own health claims
actually undermined tobacco companies. For instance, an advertising claim that there was “Not
a single case of throat irritation due to smoking Camels”29 is interpreted differently by consumers
and lawyers. Lawyers assume that consumers are more ovine than vulpine. A lawyer
views the claim as a simple form of deception that should not appear in advertising. But
according to Calfee, consumers may read the same sentence and think that cigarettes are
generally dangerous: after all, at least some of them cause throat irritation.
In Calfee’s view, cigarette advertising that mentioned any health issue taught consumers that
all smoking was unhealthful. In fact, no amount of regulation could tell consumers about
smoking’s danger more effectively than the very ads produced by tobacco companies. According
to Calfee, FTC regulation caused tobacco companies to stop mentioning health completely,
and the industry’s advertising became less information-rich. In short, Calfee argued that
regulation caused smoking to be portrayed in a kinder light.30 But to the more legalistic culture
of the BCP, Calfee’s reasoning rejects the FTC’s statutory mandate of preventing deceptive
practices and false advertising.
Perhaps the different views of consumer sophistication also explain why the FTC has not
updated guidelines on various forms of trickery for decades. The guidelines surrounding the use
of the word “free” were introduced in 1971 and never updated. The “bait and switch” and “price
comparison” (“sales” that misrepresent the regular price of an item) guidance have never been
updated since their introduction in 1967. Within the commission, there is fear that updating
these different informational remedies would cause them to be watered down by the BE. Yet,
any user of the internet can see that free offers, bait-and-switch marketing, and fake price
comparisons are rampant online.
Finally, the lawyers too can steer the FTC away from monetary awards and other dramatic
remedies. Pursuing such remedies may force the agency into litigation. The FTC is a risk-averse
litigant because it has more to lose from bad precedent than do other actors, such as class-action
attorneys. The burdens of litigation can consume precious staff attorney time, slowing down or
even stopping the investigation of other cases. In addition, a 1981 study by Sam Peltzman found
that FTC actions, even those without civil penalties, have a dramatic, negative effect on

29
R. J. Reynolds Corp., Not One Single Case (1947) in Stanford School of Medicine, Stanford Research into the
Impact of Tobacco Advertising, available at [Link] main/[Link]?token2 fm st069
.php&token1 fm [Link]&theme file fm [Link]&theme name Scientific%20Authority&subtheme
name Not%20One%20Single%20Case.
30
John H. Calfee, Fear of Persuasion (1997); Posner too expressed qualified support for this reasoning, and argued
that low-tar, improved filters, and new technology, such as lettuce-based cigarettes, might reduce the harms of
smoking. See ABA, Report of the ABA Commission to Study the Federal Trade Commission (Sept. 15,
1969)(Separate Statement of Richard Posner).

respondents.31 FTC attorneys may thus feel satisfied that respondent companies are punished
enough by the bad press and legal bills that invariably come from a settled case.

the bureau of economics’ economics of privacy and security


The FTC has resolved over 150 matters involving privacy and security, using its authority to bring
cases against deceptive and unfair trade practices. The BE is involved in every case to a varying
degree. Under the Federal Trade Commission Act, “unfair practices” clearly call for cost-benefit
analysis. The FTC has to show that a practice causes “substantial injury” and that it is not
outweighed by benefits to consumers or competitors. This balancing between injury and
benefits is nicely suited to economists’ strengths.
The FTC’s power to police deception is less burdened than the unfairness test. There is
essentially no balancing involved, because across ideological lines, deception is believed to harm
consumers and the marketplace. Because deception cases are easier to bring (indeed, only
consumer “detriment” need be proven instead of injury), it is no surprise that the FTC relies on
its deception power when wading into new areas, such as privacy. Doing so has another strategic,
internal benefit for the lawyers: framing a wrong as deceptive essentially circumvents the BE.
Deception cases receive much less economic attention.
There is growing tension at the FTC surrounding cases where the lawyers clothe unfairness
cases in deception garb. Typically, this happens where a company engages in normatively
objectionable behavior and some minor deception is present. The FTC enforces against the
deception in order to quash the normatively objectionable practice. For instance, consider the
2015 Jerk.com matter, where the FTC brought an administrative action against a company that
created a website that allowed users to rate people as “jerks.” The FTC’s basis for the matter was
the false representation that the site was based on organic, user-generated content, when in
reality, the profile data were scraped from Facebook. In another case, a company tracked
consumers by monitoring unique identifiers emitted from phones. The company, Nomi,
violated the law not because it tracked people, but because it failed to live up to promises of
providing notices of its activities.
Why would anyone care about whether Jerk.com’s data were organically generated user
content? Why would anyone care about whether Nomi faithfully posted privacy notices? The
real issue underlying these cases is our normative commitment to privacy: Do we really want
websites that label people jerks or companies that collect unique identifiers from phones? The
unfairness theory better fits the privacy problems presented by Jerk and Nomi. But the BCP
lawyers realized that if they styled these practices as unfair, the BE would have to be convinced
that overall consumer welfare was harmed by their activities. The easily satisfied deception
power gave the FTC a simple path to policing these objectionable practices.
Returning to unfairness, the FTC has alleged such substantial injury in dozens of privacy and
security cases. For instance, many FTC security cases involve the exposure of millions of credit
card, debit card, and checking account identifiers. Yet, only a handful of security cases have
involved monetary remedies of any type.
FTC observers might conclude that the lack of fines can be attributed to the agency’s limits
on civil penalties (for the most part, the FTC cannot levy civil penalties in privacy and security
matters). But the FTC has a broad range of monetary and other remedies in addition to civil

31
Sam Peltzman, The Effects of FTC Advertising Regulation, 24(3) J. L. Econ. 403 (Dec. 1981).

penalties. It can seek restitution, redress, disgorgement, asset freezes, the appointment of
receivers, and the rescission of contracts.
There are several reasons why various remedies go unused. First, the BE does not believe
there is a market for privacy. This leads the BE to undervalue privacy wrongs. Without some
kind of penalty, companies may find it economically efficient to violate privacy, in particular
because privacy violations are so difficult to detect. Second, the BE’s focus on providing
information to the consumer at service enrollment finds its roots in standard, physical product
marketing. Today, the approach is antiquated and deficient because so many transactions are
based on personal information, with the ultimate goal of establishing a platform rather than
selling a specific product or service. The next sections explain these problems in greater detail.

No Monetary Damages in a World with No Privacy Market


The BE’s methods of evaluating relief drive monetary penalties to zero in most privacy matters.
And even where civil penalties are applied, they tend to be too low to serve retributive or
deterrent goals. One illustration comes from the agency’s case against Google. In it, Google was
found to have deceived users of the Apple Safari browser by tracking these users despite
promising not to. Google was fined $22.5 million, one of the largest privacy-related recoveries
by the commission.32 Google’s behavior was intentional, and the company was already under a
consent decree for other privacy violations (thus making it possible for the FTC to apply civil
penalties, as explained above).
Google derived clear benefits from tracking Apple users. Apple is a luxury brand in technology,
and thus Apple users are attractive to advertisers. In addition, eroding Apple’s efforts to shield its
users from Google tracking may have been strategically and psychologically valuable. Detection
of Google’s tracking required forensic analysis on a specific kind of software, and thus there was
little risk that regulators would discover the practice. Google clearly had the ability to pay a
much larger fine. In a way, the fine created incentives for bad behavior by setting such a low
penalty for intentional misbehavior.
To a BE analyst the fine could be seen as disproportionately high. Consumers do not pay with
money when they use search engines, and there is no option to pay extra to avoid the kind of
tracking that Google used. Thus, the market did not set a price to avoid Google’s deception.
While millions of consumers who use both Safari and Google would have been affected by the
practice, perhaps few of them had ever read Google’s privacy policy, known of Google’s
statements on the matter, or even chosen Safari because of its privacy features. Only a small
number were actually deceived by the representation and subsequent tracking. In sum, the
practice justified a relatively small fine because any price on the tracking would be speculative,
and because many who were tracked probably did not care about it. The absence of any kind of
monetary damages in this and other privacy cases points to a general inability of the BE to
consider privacy invasion a harm in itself.

Economic Reasoning for Physical-World Products in the Platform Age


The BE’s privacy work appears still to operate in a pre-platform-economy era, with a fixation on
price and on the information available to the user at enrollment in a service rather than on the
complex interdependencies that develop between users and services as time goes on (this is not

32
In the Matter of Google, FTC File No. 102 3136 (2011).

true of the BE’s antitrust work).33 For instance, a 2014 BE working paper modeled a market in
which privacy policies were transparent and well understood by consumers, two key assumptions
refuted by a wealth of research in consumer privacy.34 The BE authors concluded that,
under the two assumptions, a competitive marketplace could provide consumers with privacy
options.35
But the 2014 study is important for an entirely separate reason. The study reveals the shading of
the BE’s privacy lens. Recall from section 2 that the BE’s economics is not necessarily
“traditional,” but rather grounded in relatively conservative economic work. This is reflected
in the 2014 study’s references. Reading over those references, one sees little overlap with the
literature discussed in Acquisti et al., The Economics of Privacy.36 Instead, the authors refer to the
above-mentioned training materials and the advertising literature rather than the privacy
literature.
Two problems emerge from the BE’s view of the literature. First, it ignores the diverse array of
traditional and empirical economic work that explores the potential welfare gains from privacy
protection. Second, the focus on the economics of advertising is misplaced because privacy
policies are not like price or product attribute advertising. Privacy features are much more
complex, hidden, and most importantly, changeable. Today’s technology market is not so much
about an individual, discrete product. Instead, consumers are bargaining with platforms that are
attempting to mediate many different aspects of consumer experience. These platforms are
trying to influence how consumers understand and expect rights from technology.
If firms are strategic, they will compete to both capture benefits and deny them to competitors.
Through this lens, Google’s tracking of Safari users could be motivated by a desire to capture
benefits from tracking, but also to deny Apple the ability to compete on privacy. Denying Apple
the competitive benefit could also affect the psychology of consumers, leading them to think no
company can protect privacy. This is what Joe Farrell has called a dysfunctional equilibrium,37 a
situation in which no firm is trusted to deliver on privacy, and therefore no one can compete
on it.
Companies that are competing to be the dominant platform are constantly changing the
bargain with the consumer through continuous transactions over time. Platforms build huge
user bases with promises of privacy, often ones that distinguish the company from competitors
on privacy. Once a large user base is obtained and competitors trumped, the company switches
directions, sometimes adopting the very invasive practices protested against.38

33
Although according to a critique by Maurice Stucke and Allen Grunes, antitrust authorities have systematically
avoided examining the consumer-side of multi-sided transactions in data-driven mergers and acquisitions, leading to a
focus on competitive effects on advertisers but not on privacy and quality issues that affect consumers. Maurice E.
Stucke & Allen P. Grunes, Big Data and Competition Policy 103–4, 114, 153–154, 224 (Oxford Univ. Press 2016).
34
Daniel J. Solove, Privacy Self-Management and the Consent Dilemma, 126 Harv. L. Rev. 1880 (2013); Aleecia M.
McDonald & Lorrie Faith Cranor, The Cost of Reading Privacy Policies, 4 I/S J. L. Pol’y Info. Soc’y 543, 564 (2008);
James P. Nehf, Shopping for Privacy on the Internet, 41 J. Consumer Aff. 351 (2007); George R. Milne, Mary J.
Culnan, & Henry Greene, A Longitudinal Assessment of Online Privacy Notice Readability, 25 J. Pub. Pol’y
Marketing 238, 243 (2006) (based on the growing length and complexity of privacy policies, a user would have to
read eight pages of text per competitor to evaluate their privacy choices); Paul M. Schwartz, Internet Privacy and the
State, 32 Conn. L. Rev. 815 (2000).
35
Daniel P. O’Brien & Doug Smith, Privacy in Online Markets: A Welfare Analysis of Demand Rotations, FTC Bureau
of Economics Working Paper No. 323 (Jul. 2014).
36
Alessandro Acquisti, Curtis R. Taylor, & Liad Wagman, The Economics of Privacy, 54(2) J. Econ. Lit. 442 (Jun. 2016).
37
Joseph Farrell, Can Privacy Be Just Another Good, 10 J. Telecomm. High Tech. L. 251 (2012).
38
Paul Ohm, Branding Privacy, 97 Minn. L. Rev. 907 (2013)(describing the “privacy lurch”).

Network effects, lock-in, and the power of platforms to shift user expectations enable dramatic
policy lurches. But the BE’s tools, forged in the era of valuing jewelry, the sizes of television
screens, and so on, need adaptation to be applied to the problems posed by internet services.
In fact, the BE approach militates against remedy, because of the bureau’s method for analysis
of marketplace effects of remedies. Simply put, remedies are unlikely to be effective by the time
the FTC gets involved, investigates a case, and litigates it. The delay involved in FTC processes
gives respondents time to establish their platform and shut out competitors. By the time these
steps are achieved, the BE is correct to conclude that remedies are unlikely to improve privacy
options in the marketplace, because no competitors are left standing.

how academics could help shape the be’s privacy efforts


The BE is proud of its engagement with the academic community. Unlike BCP attorneys, BE
economists have biographies online that feature academic publications. BE economists also
have academic traditions, such as taking leave from the FTC to visit at a college. The BE holds
an annual conference on microeconomics open to outside academics. The Trump
administration is likely to elevate the role of the BE, making it more central to case selection, but
also more public. The BE’s posture gives academics opportunities to shape and expand the
FTC’s privacy outlook.

Documenting the Market for Pro-Privacy Practices


There are tremendous opportunities for research that would assist the BE and the American
consumer. Inherently, the BE’s monetary relief calculations are impaired because it perceives
there to be no market for pro-privacy practices. Academics could document the contours of the
privacy market where it currently exists, most notably in the privacy differential between free,
consumer-oriented services and for-pay, business-oriented services.
One example comes from Google, which offers a free level of service for consumers and
another for businesses that is $5 a month. Google explains, “Google for Work does not scan your
data or email . . . for advertising purposes . . . The situation is different for our free offerings and
the consumer space.”39 Of course privacy is just one feature that flows from the $5 charge, yet it
serves as evidence that the market puts some value on the avoidance of communications
surveillance (Google’s representation concerns the actual scanning of data and not just absence
of advertising). Such surveillance must involve human review of e-mail at times in order to train
advertising-targeting systems. The inferences from automated scanning could contribute to
Google’s competitive intelligence.40 Those who owe confidentiality duties to customers or
clients need communications privacy, and so some portion of that $5 could be interpreted as
a valuation of privacy.
Elucidating areas where some valuation of privacy exists, particularly in business-to-business
scenarios where actors actually read policies and have the resources and time to protect rights,
could help establish a value for privacy.
Another source for harm signals comes from the plaintiff bar, which has developed methods
for measuring how consumers conceive of the value of personal information. For instance, in
one case involving the illegal sale of driver record information, an economist polled citizens to

39
Google, Google for Work Help: Privacy (2016), [Link] en.
40
Maurice E. Stucke & Allen P. Grunes, Big Data and Competition Policy (Oxford Univ. Press 2016).

explore what kind of discounts they would accept in renewing their driver’s license in exchange
for this information being sold to marketers. In the state in question, drivers had to pay a $50 fee
to renew their license. However, 60 percent of respondents said they would reject an offer of a
$50 discount on their license in exchange for allowing the sale of their name and address to
marketers.41 Meanwhile, the state was selling this same information at $0.01 per record.
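A rough comparison of the two figures in this example shows how far apart the survey-implied valuation and the state’s selling price sit. The sketch below assumes a hypothetical state with 10 million licensed drivers and treats the rejected $50 discount as a lower bound on valuation; both are my assumptions rather than findings from the litigation.

    # Comparing the survey-implied value of driver records with the state's price.
    # The driver count is hypothetical; the $50 and $0.01 figures come from the case above.
    licensed_drivers = 10_000_000
    share_rejecting_discount = 0.60   # rejected a $50 discount to keep their data from marketers
    implied_value_per_driver = 50     # lower bound for those who rejected the offer
    state_price_per_record = 0.01

    aggregate_implied_value = licensed_drivers * share_rejecting_discount * implied_value_per_driver
    aggregate_state_revenue = licensed_drivers * state_price_per_record
    print(f"Implied consumer valuation (lower bound): ${aggregate_implied_value:,.0f}")  # $300,000,000
    print(f"Revenue at the state's selling price:     ${aggregate_state_revenue:,.0f}")  # $100,000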
This survey method represented a plausible, real-life, bounded expense concerning information
that is not even considered sensitive. Now, one may object to the survey as artificial:
consumers, when presented in the moment with a $50 discount, may behave differently and
allow the sale of personal information. But on the other hand, given the prevalence of domestic
violence and stalking, among other problems, it seems obvious that many drivers would be
willing to pay $0.01 to prevent the sale of this information to others. There is thus some value to
this information. There is also increased risk of harm to those whose home address is spread to
others indiscriminately. The data could be copied endlessly and resold to entities not in privity
with the state, making it impossible for people to trace stalkers or swindlers back to the sale of
personal information by the state.
Some economists have studied the value of privacy options to individuals. Perhaps the most
popular privacy option of all time was the FTC’s establishment of the Telemarketing Do Not
Call Registry. In the 1990s, technological advances in telemarketing made it easier for sales
callers to ring many numbers at the same time, changing the fundamental dynamics of
telemarketing. As Peter Swire explained, these calls externalized costs to consumers who were
displeased with the calling, but also may have reduced the value of having a phone in general,
because defensive techniques to avoid unwanted callers, such as call screening and not
answering the phone, could get in the way of desirable calls.42 One could also account for the
positive value to consumers from avoiding these calls. Professor Ivan Png estimated this value to
households as being between $13 and $98. Png’s low estimate for the welfare created by
telemarketing avoidance was $1.42 billion.43
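Png’s aggregate figure is consistent with his per-household range once a household count is supplied. The check below assumes roughly 109 million U.S. households around the time of the study; that count is my assumption for illustration, not a number given in the chapter.

    # Consistency check on the Do Not Call welfare estimate (assumed household count).
    us_households = 109_000_000    # assumption: approximate U.S. household count circa 2007
    low_value_per_household = 13   # Png's low per-household estimate
    high_value_per_household = 98  # Png's high per-household estimate

    low_total = us_households * low_value_per_household
    high_total = us_households * high_value_per_household
    print(f"Low estimate:  ${low_total / 1e9:.2f} billion")   # about $1.42 billion
    print(f"High estimate: ${high_total / 1e9:.2f} billion")  # roughly $10.7 billion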
Apart from telemarketing, there are many examples where individuals pay money in order to
have enhanced information privacy options. For instance, while many people consider address
information public, some homeowners go to considerable expense to protect this information.
“Land trusts” are used extensively by the affluent to shield home addresses from real estate
websites and public records. Similarly, the private mailbox is a significant expense, often used to
shield home addresses from marketers and others. One’s listing in the phone book has been
public for decades, yet about 30 percent of Americans pay $1.25 to $5.50 a month to unlist this
information. The expenses from these interventions add up. Consider that paying the minimum
unlisting fee for 10 years would be $150. Private mailboxes can cost more than that in a single
year. These expenditures demonstrate that for tens of millions of Americans, privacy is worth real
money, even for the protection of “public” data.
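These individual expenditures can also be rolled up into a rough aggregate. In the sketch below, the per-month fees are the range quoted above, but the assumption that the paying population amounts to roughly 35 million households is mine and is only meant to convey orders of magnitude.

    # Rough aggregate spending on unlisted phone numbers (illustrative assumptions).
    unlisting_households = 35_000_000               # assumption: households paying to unlist
    monthly_fee_low, monthly_fee_high = 1.25, 5.50  # fee range quoted in the text

    annual_low = unlisting_households * monthly_fee_low * 12
    annual_high = unlisting_households * monthly_fee_high * 12
    print(f"Annual spending on unlisting: ${annual_low / 1e9:.1f} to ${annual_high / 1e9:.1f} billion")

    # Ten years at the minimum fee for a single household, as in the text:
    print(f"Ten-year minimum per household: ${monthly_fee_low * 12 * 10:.0f}")  # $150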
Finally, sophisticated actors use legal agreements in order to prevent secondary use of
personal information. The New York Times reported in 2015 that Silicon Valley technology
executives who scoop up information with the most alacrity use nondisclosure agreements in
many contexts where domestic workers are employed.44

41
Richard Fresco v. Automotive Directions Inc, et al. 2004 WL 3671355 ([Link].)(expert affidavit of Henry Fishkind).
42
Peter P. Swire, Efficient Confidentiality for Privacy, Security, and Confidential Business Information, Brookings
Wharton Papers on Financial Services (2003).
43
I.P.L. Png, On the Value of Privacy from Telemarketing: Evidence from the “Do Not Call” Registry (white paper)
(Sept. 2007).
44
Matt Richtel, For Tech Titans, Sharing Has Its Limits BU4, N.Y. Times, Mar. 14, 2015.

More Emphasis on the FTC’s Civil Penalty Factors


A second area ripe for documenting injury in privacy cases comes from the economic dynamics
in the FTC’s civil penalty factors, which must be considered when the FTC seeks fines.45 The
factors inherently call for economic perspective and could be used more prominently in case
evaluation. This chapter is largely a critique of the BE’s emphasis on the second civil penalty
factor: the injury to the public from the illegal practice. Courts consider four other factors, three
of which could also benefit from academic analysis.
One factor concerns the “desire to eliminate the benefits derived by a violation.” Recall the
discussion earlier concerning the differences between physical-world products and platform-era
services. In an era of platforms, denying the benefits of an illegal practice is a much more
complex effort than addressing physical-world swindles. A physical-world swindle often can be
cured by the reputational effects of an FTC action combined with disgorgement and restitution to
victims. However, platform-economy actors use a form of bait and switch that allows them to
benefit from the momentum gained from a large base of subscribers who took the bait.
Both Facebook and Google are platforms that benefitted from a bait and switch. Facebook
attracted a huge user base with promises of exclusivity and control but then relaxed these very
features. The company changed its disclosure settings, making user profiles dramatically more
public over time, while masking its own economic motives with claims that users wanted to be
“more open.” By the time Facebook made its major privacy changes in 2009, it had such a
command of the market and such powerful network effects that users could not defect.
Google announced its search engine wearing opposition to advertising and its influence on
search on its sleeve. The company’s founders promised revolutions in both search and advertising.
Google even presented its search service as more privacy-protective than those of competitors
because it did not take users’ browsing history into account when delivering search results.46
Consider how different the Google approach is today. It quietly started using behavioral data
in search without telling the public.47 It runs paid search ads prominently at the top of organic
search results, mimicking the very thing it considered evil in the 1990s. Google even uses
television-style commercials on YouTube, but these new commercials are worse because they
can automatically pause if not kept in focus and because they track you individually.
Academics could provide research on just how much intervention is needed to address these
platform-era bait and switches. Some of the tools used to police natural monopoly may be
appropriate.
The interventions may need to be severe to undo the momentum gained from platform status.
Consider the findings of a study written in part by two BE authors on the Suntasia Marketing
case. That company enticed consumers with “free” trial offers to reveal their checking account
numbers, but then Suntasia made many small, fraudulent charges on the checking accounts.
The court allowed Suntasia to continue business but, in the process, the court segmented

45
Several courts have approved a five-factor test for evaluating the reasonableness of FTC penalties: “(1) the good or bad
faith of the defendants; (2) the injury to the public; (3) the defendant’s ability to pay; (4) the desire to eliminate the
benefits derived by a violation; and (5) the necessity of vindicating the authority of the FTC.” United States v. Reader’s
Digest Ass’n, Inc. [1981] 662 F.2d 955, 967 (3d Cir.).
46
Chris Jay Hoofnagle, Beyond Google and Evil: How Policy Makers, Journalists and Consumers Should Talk Differently
about Google and Privacy, 14(4) First Monday (2009), [Link]
47
Recall that Google presented its search services, which did not track users over time, as a privacy-friendly alternative to
competitors. When Google changed strategies and used historical search data for targeting results, it did so secretly
and the shift was discovered by an outside analyst. Saul Hansell, Google Tries Tighter Aim for Web Ads, C1,
N. Y. Times, Jun. 27, 2008.

Suntasia’s consumers into two groups, thereby setting up a natural experiment. Some Suntasia
customers had to opt in to stay subscribed, while others were retained unless the customer opted
out. Almost all of the customers who were required to opt in let their subscriptions cancel. But
only about 40 percent of those given opt out notices canceled, and thus the remainder kept on
being charged for “essentially worthless” products. Minorities from low socioeconomic status
(SES) areas were 8 percent less likely to opt out than whites in high SES areas.48 These findings
speak to the idea that companies in continuous transactions with the consumer (such as
platforms or companies that possess personal information) may require dramatic intervention
to deny the benefits from deceptive practices.
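The Suntasia natural experiment can be translated into a rough revenue comparison that shows why the default matters so much to a respondent. The sketch below uses hypothetical subscriber and billing figures; only the retention rates (near zero under opt-in, roughly 60 percent under opt-out) track the study described above.

    # Illustrative revenue retained under the two court-ordered defaults (hypothetical scale).
    subscribers = 100_000      # hypothetical number of customers in each group
    monthly_charge = 20.00     # hypothetical charge for the "essentially worthless" products
    months = 12

    retention_opt_in = 0.02    # assumption: "almost all" canceled, so retention near zero
    retention_opt_out = 0.60   # about 40 percent canceled, so roughly 60 percent were retained

    revenue_opt_in = subscribers * retention_opt_in * monthly_charge * months
    revenue_opt_out = subscribers * retention_opt_out * monthly_charge * months
    print(f"Revenue retained under an opt-in default:  ${revenue_opt_in:,.0f}")   # $480,000
    print(f"Revenue retained under an opt-out default: ${revenue_opt_out:,.0f}")  # $14,400,000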
Another civil penalty factor concerns whether the respondent company acted in good or bad
faith. This raises the need for research into what kinds of fines are enough to deter bad faith or
whether fines can deter at all. Deterrence may vary based on industry, and on the size and
maturity of the respondent company.
The final civil penalty factor concerns “the necessity of vindicating the authority of the FTC.”
Inherently, this factor considers respect for the law and for the consumer. The law presumes that
natural and fictitious people are rational actors and that they respond sensibly to incentives and
disincentives. Yet, we impose fines with almost no due process or economic analysis against
natural persons for many violations of the law. The criminal law imposes drastic penalties on
individuals even though many criminals lack the capacity to act rationally. Administrative
penalties, such as the $50 parking ticket for forgetting to pay a $1 meter fee, are keyed to
municipal revenue goals rather than economic loss to society or actual proof of harm. Oddly,
such a disproportionate penalty would never survive constitutional review if applied against a
company.
Turning to wrongdoing by companies, an economic analysis of harm and other factors is
appropriate. But there is something substantively unfair and strange in how these analyses result
in recommendations for no monetary penalties. The FTC need only make a “reasonable
approximation” when specifying monetary relief,49 and thus need not surmount a particularly
high threshold to find that damages are in order. In addition, companies receive ex ante legal
advice and engage in serious planning when deciding what to do with data. Many privacy lapses,
such as Facebook’s settings changes, are deliberate in a way that criminal acts and parking meter
lapses are not. It would seem that economic actors would make the best case for monetary
penalties in order to engender respect for the law.

Fostering a Market for Privacy


Finally, the BE could explore ways to foster a market for privacy. Part of that effort should
concern the FTC’s traditional approach of ensuring effective disclosures to consumers. But the
more difficult challenge comes in addressing industry players who do not have incentives to
fairly use data. For instance, data brokers engage in practices, such as reverse data appends, that
render consumers’ attempts at selective revelation ineffective. That is, reverse appends make it
impossible to avoid having a retailer learn personal information about a consumer. The BE
could use its empirical might to study how these information flows in the data broker market

48
Robert Letzler, Ryan Sandler, Ania Jaroszewicz, Issac T. Knowles, and Luke M. Olson, Knowing when to Quit:
Default Choices, Demographics and Fraud, 127 Econ. J. 2617–2640 (2017). doi:10.1111/ecoj.12377.
49
FTC v. [Link] Corp, 475 F. App’x 106, 110 (9th Cir. 2012).

undermine alternatives that could result in better incentives and business practices more in line
with consumer preferences.
Another area for rethinking BE approaches comes from behavioral economics. As early as
1969, Dorothy Cohen called for the creation of a “Bureau of Behavioral Studies,” with the
mission of gathering and analyzing data on “consumer buying behavior relevant to the regulation
of advertising in the consumer interest.”50 The BE embraced this recommendation in
several areas,51 most visibly in false advertising. In the 1970s, the FTC began a program where
marketing professors were embedded in the BCP. This led to greater sophistication in the
interpretation of advertising, and, perhaps, the first agency use of copy testing (the evaluation of
consumer interpretation of advertising by survey and lab experiments) in a matter.52
Today, when analyzing what a person might understand from a marketing representation, the
FTC is quite humanistic in its outlook. It does not limit itself to disciplinary borders. It eschews
rational choice theories and the idea that the consumer reads the small print. The FTC focuses
on the overall impression of an advertisement. It acknowledges that consumers are not perfectly
informed, and that they have limited resources to investigate advertising claims. However, this
expansive view of consumer behavior and the subtleties of information products does not appear
to have informed the BE’s own privacy work.

conclusion
This chapter has provided an overview of the Bureau of Economics, explained its case evaluation
role in relation to the lawyers’ case selection role, and summarized reasons why BE
economists might conclude that there is no harm from many privacy disputes.
The BE is key to effective enforcement of consumer privacy. Academics and advocates should
pay more attention to this important institution because it shapes how privacy and security are
protected. In the Trump administration, it is likely to have a more public role, and it
will perform cost-benefit analysis in more privacy cases. Helping the BE see economic injury in
privacy and security violations could strengthen the agency’s agenda and introduce disgorgement
and restitution in matters currently settled with no monetary damages. The BE could also
map an enforcement strategy that stimulates a market for privacy, one that helps consumers
assign a different value to the attention and data they pour into “free” online services.

50
Dorothy Cohen, The Federal Trade Commission and the Regulation of Advertising in the Consumer Interest, 33(1)
J. Mktg 40 (1969).
51
Consider the multidisciplinary approach taken in the FTC’s tome on information remedies. FTC, Consumer
Information Remedies: Policy Review Session (1979).
52
William L. Wilkie, My Memorable Experiences as a Marketing Academic at the Federal Trade Commission, 33(2)
J. Pub. Pol’y & Mktg 194 (2014).
10

Privacy and Human Behavior in the Information Age*

Alessandro Acquisti, Laura Brandimarte, and George Loewenstein

If this is the age of information, then privacy is the issue of our times. Activities that were once
private or shared with the few now leave trails of data that expose our interests, traits, beliefs, and
intentions. We communicate using e-mails, texts, and social media; find partners on dating sites;
learn via online courses; seek responses to mundane and sensitive questions using search engines;
read news and books on the cloud; navigate streets with geotracking systems; and celebrate our
newborns, and mourn our dead, on social media profiles. Through these and other activities, we
reveal information both knowingly and unwittingly to one another, to commercial entities, and
to our governments. The monitoring of personal information is ubiquitous; its storage is so durable
as to render one’s past undeletable (1): a modern digital skeleton in the closet. Accompanying the
acceleration in data collection are steady advancements in the ability to aggregate, analyze, and
draw sensitive inferences from individuals’ data (2).
Both firms and individuals can benefit from the sharing of data once hidden and from the
application of increasingly sophisticated analytics to larger and more interconnected databases
(3). So too can society as a whole, for instance, when electronic medical records are combined
to observe novel drug interactions (4). On the other hand, the potential for personal data to be
abused for economic and social discrimination, hidden influence and manipulation, coercion, or censorship is alarming. The erosion of privacy can threaten our autonomy, not merely
as consumers but as citizens (5). Sharing more personal data does not necessarily always translate
into more progress, efficiency, or equality (6).
Because of the seismic nature of these developments, there has been considerable debate
about individuals’ ability to navigate a rapidly evolving privacy landscape, and about what, if
anything, should be done about privacy at a policy level. Some trust people’s ability to make self-interested decisions about information disclosing and withholding. Those holding this view tend
to see regulatory protection of privacy as interfering with the fundamentally benign trajectory of
information technologies and the benefits such technologies may unlock (7). Others are
concerned about the ability of individuals to manage privacy amid increasingly complex trade-offs. Traditional tools for privacy decision-making, such as choice and consent, according to this
perspective, no longer provide adequate protection (8). Instead of individual responsibility,
regulatory intervention may be needed to balance the interests of the subjects of data against
the power of commercial entities and governments holding that data.

* This chapter previously appeared as Acquisti A, Brandimarte L, Loewenstein G, Privacy and human behavior in the
age of information, Science vol. 347, no. 6221 (2015), 509–514.

Are individuals up to the challenge of navigating privacy in the information age? To address
this question, we review diverse streams of empirical privacy research from the social and
behavioral sciences. We highlight factors that influence decisions to protect or surrender privacy
and how, in turn, privacy protections or violations affect people’s behavior. Information technologies have progressively encroached on every aspect of our personal and professional lives.
Thus, the problem of control over personal data has become inextricably linked to problems of
personal choice, autonomy, and socioeconomic power. Accordingly, this chapter focuses on the
concept of, and literature around, informational privacy (that is, privacy of personal data) but
also touches on other conceptions of privacy, such as anonymity or seclusion. Such notions all
ultimately relate to the permeable yet pivotal boundaries between public and private (9).
We use three themes to organize and draw connections between streams of privacy research
that, in many cases, have unfolded independently. The first theme is people’s uncertainty about
the nature of privacy trade-offs, and their own preferences over them. The second is the powerful
context-dependence of privacy preferences: The same person can in some situations be oblivious
to, but in other situations be acutely concerned about, issues of privacy. The third theme is the
malleability of privacy preferences, by which we mean that privacy preferences are subject to
influence by those possessing greater insight into their determinants. Although most individuals
are probably unaware of the diverse influences on their concern about privacy, entities whose
interests depend on information revelation by others are not. The manipulation of subtle factors
that activate or suppress privacy concern can be seen in myriad realms such as the choice of
sharing defaults on social networks, or the provision of greater control on social media which
creates an illusion of safety and encourages greater sharing.
Uncertainty, context-dependence, and malleability are closely connected. Context-dependence is amplified by uncertainty. Because people are often “at sea” when it comes to
the consequences of, and their feelings about, privacy, they cast around for cues to guide their
behavior. Privacy preferences and behaviors are, in turn, malleable and subject to influence in
large part because they are context-dependent and because those with an interest in information
divulgence are able to manipulate context to their advantage.

uncertainty
Individuals manage the boundaries between their private and public spheres in numerous ways:
via separateness, reserve, or anonymity (10); by protecting personal information; but also through
deception and dissimulation (11). People establish such boundaries for many reasons, including
the need for intimacy and psychological respite and the desire for protection from social
influence and control (12). Sometimes, these motivations are so visceral and primal that
privacy-seeking behavior emerges swiftly and naturally. This is often the case when physical
privacy is intruded upon, such as when a stranger encroaches on one’s personal space (13–15) or
demonstratively eavesdrops on a conversation. However, at other times (often including when
informational privacy is at stake), people experience considerable uncertainty about whether,
and to what degree, they should be concerned about privacy.
A first and most obvious source of privacy uncertainty arises from incomplete and asymmetric
information. Advancements in information technology have made the collection and usage of
personal data often invisible. As a result, individuals rarely have clear knowledge of what
information other people, firms, and governments have about them or how that information is
used and with what consequences. To the extent that people lack such information, or are aware
of their ignorance, they are likely to be uncertain about how much information to share.
186 Alessandro Acquisti, Laura Branimarte, and George Lowenstein

Two factors exacerbate the difficulty of ascertaining the potential consequences of privacy
behavior. First, whereas some privacy harms are tangible, such as the financial costs associated
with identity theft, many others, such as having strangers become aware of one’s life history, are
intangible. Second, privacy is rarely an unalloyed good; it typically involves trade-offs (16). For
example, ensuring the privacy of a consumer’s purchases may protect her from price discrimination but also deny her the potential benefits of targeted “offers and advertisements.”
Elements that mitigate one or both of these exacerbating factors, by either increasing the
tangibility of privacy harms or making trade-offs explicit and simple to understand, will generally
affect privacy related decisions. This is illustrated by one laboratory experiment in which
participants were asked to use a specially designed search engine to find online merchants
and purchase from them, with their own credit cards, either a set of batteries or a sex toy (17).
When the search engine only provided links to the merchants’ sites and a comparison of the
products’ prices from the different sellers, a majority of participants did not pay any attention to
the merchants’ privacy policies; they purchased from those offering the lowest price. However,
when the search engine also provided participants with salient, easily accessible information
about the differences in privacy protection afforded by the various merchants, a majority of
participants paid a roughly 5 percent premium to buy products from (and share their credit card
information with) more privacy protecting merchants.
A second source of privacy uncertainty relates to preferences. Even when aware of the
consequences of privacy decisions, people are still likely to be uncertain about their own privacy
preferences. Research on preference uncertainty (18) shows that individuals often have little
sense of how much they like goods, services, or other people. Privacy does not seem to be an
exception. This can be illustrated by research in which people were asked sensitive and
potentially incriminating questions either point-blank, or followed by credible assurances of
confidentiality (19). Although logically such assurances should lead to greater divulgence, they
often had the opposite effect because they elevated respondents’ privacy concerns, which
without assurances would have remained dormant.
The remarkable uncertainty of privacy preferences comes into play in efforts to measure
individual and group differences in preference for privacy (20). For example, Alan Westin (21)
famously used broad (that is, not contextually specific) privacy questions in surveys to cluster
individuals into privacy segments: privacy fundamentalists, pragmatists, and unconcerned.
When asked directly, many people fall into the first segment: They profess to care a lot about
privacy and express particular concern over losing control of their personal information or others
gaining unauthorized access to it (22, 23). However, doubts about the power of attitudinal scales
to predict actual privacy behavior arose early in the literature (24). This discrepancy between
attitudes and behaviors has become known as the “privacy paradox.”
In one early study illustrating the paradox, participants were first classified into categories of
privacy concern inspired by Westin’s categorization based on their responses to a survey dealing
with attitudes toward sharing data (25). Next, they were presented with products to purchase at a
discount with the assistance of an anthropomorphic shopping agent. Few, regardless of the
group they were categorized in, exhibited much reluctance to answer the increasingly
sensitive questions the agent plied them with.
Why do people who claim to care about privacy often show little concern about it in their daily
behavior? One possibility is that the paradox is illusory: that privacy attitudes, which are defined
broadly, and intentions and behaviors, which are defined narrowly, should not be expected to be
closely related (26, 27). Thus, one might care deeply about privacy in general but, depending on
the costs and benefits prevailing in a specific situation, seek or not seek privacy protection (28).
Privacy and Human Behavior in the Information Age 187

This explanation for the privacy paradox, however, is not entirely satisfactory for two reasons. The
first is that it fails to account for situations in which attitude-behavior dichotomies arise under high
correspondence between expressed concerns and behavioral actions. For example, one study
compared attitudinal survey answers to actual social media behavior (29). Even within the subset
of participants who expressed the highest degree of concern over strangers being able to easily find
out their sexual orientation, political views, and partners’ names, 48 percent did in fact publicly
reveal their sexual orientation online, 47 percent revealed their political orientation, and 21 percent
revealed their current partner’s name. The second reason is that privacy decision making is only in
part the result of a rational “calculus” of costs and benefits (16, 28); it is also affected by misperceptions of those costs and benefits, as well as social norms, emotions, and heuristics. Any of these factors
may affect behavior differently from how they affect attitudes. For instance, present bias can cause
even the privacy-conscious to engage in risky revelations of information, if the immediate gratification from disclosure trumps the delayed, and hence discounted, future consequences (30).
Preference uncertainty is evident not only in studies that compare stated attitudes with
behaviors, but also in those that estimate monetary valuations of privacy. “Explicit” investigations
ask people to make direct trade-offs, typically between privacy of data and money. For instance,
in a study conducted both in Singapore and the United States, students made a series of
hypothetical choices about sharing information with websites that differed in protection of
personal information and prices for accessing services (31). Using conjoint analysis, the authors
concluded that subjects valued protection against errors, improper access, and secondary use of
personal information between $30.49 and $44.62. Similarly to direct questions about attitudes
and intentions, such explicit investigations of privacy valuation spotlight privacy as an issue that
respondents should take account of and, as a result, give increased weight in their responses.
Implicit investigations, in contrast, infer valuations of privacy from day-to-day decisions in which
privacy is only one of many considerations and is typically not highlighted. Individuals engage in
privacy-related transactions all the time, even when the privacy trade-offs may be intangible or when
the exchange of personal data may not be a visible or primary component of a transaction. For
instance, completing a query on a search engine is akin to selling personal data (one’s preferences and
contextual interests) to the engine in exchange for a service (search results). “Revealed preference”
economic arguments would then conclude that because technologies for information sharing have
been enormously successful, whereas technologies for information protection have not, individuals
hold overall low valuations of privacy. However, that is not always the case: Although individuals at
times give up personal data for small benefits or discounts, at other times they voluntarily incur
substantial costs to protect their privacy. Context matters, as further discussed in the next section.
In fact, attempts to pinpoint exact valuations that people assign to privacy may be misguided, as
suggested by research calling into question the stability, and hence validity, of privacy estimates. In
one field experiment inspired by the literature on endowment effects (32), shoppers at a mall were
offered gift cards for participating in a nonsensitive survey. The cards could be used online or in
stores, just like debit cards. Participants were either given a $10 “anonymous” gift card (transactions done with that card would not be traceable to the subject) or a $12 trackable card (transactions done with that card would be linked to the name of the subject). Initially, half of the
participants were given one type of card, and half the other. Then, they were all offered the
opportunity to switch. Some shoppers, for example, were given the anonymous $10 card and were
asked whether they would accept $2 to “allow my name to be linked to transactions done with the
card”; other subjects were asked whether they would accept a card with $2 less value to “prevent
my name from being linked to transactions done with the card.” Of the subjects who originally
held the less valuable but anonymous card, five times as many (52.1 percent) chose it and kept it
188 Alessandro Acquisti, Laura Branimarte, and George Lowenstein

over the other card than did those who originally held the more valuable card (9.7 percent). This
suggests that people value privacy more when they have it than when they do not.
The consistency of preferences for privacy is also complicated by the existence of a powerful
countervailing motivation: the desire to be public, share, and disclose. Humans are social animals,
and information sharing is a central feature of human connection. Social penetration theory (33)
suggests that progressively increasing levels of self-disclosure are an essential feature of the natural and
desirable evolution of interpersonal relationships from superficial to intimate. Such a progression is
only possible when people begin social interactions with a baseline level of privacy. Paradoxically,
therefore, privacy provides an essential foundation for intimate disclosure. Similar to privacy, self-disclosure confers numerous objective and subjective benefits, including psychological and physical
health (34, 35). The desire for interaction, socialization, disclosure, and recognition or fame (and,
conversely, the fear of anonymous unimportance) are human motives no less fundamental than the
need for privacy. The electronic media of the current age provide unprecedented opportunities for
acting on them. Through social media, disclosures can build social capital, increase self-esteem (36),
and fulfill ego needs (37). In a series of functional magnetic resonance imaging experiments, self-disclosure was even found to engage neural mechanisms associated with reward; people highly value
the ability to share thoughts and feelings with others. Indeed, subjects in one of the experiments were
willing to forgo money in order to disclose about themselves (38).

context-dependence
Much evidence suggests that privacy is a universal human need (Box 1) (39). However, when people
are uncertain about their preferences they often search for cues in their environment to provide
guidance. And because cues are a function of context, behavior is as well. Applied to privacy, context-dependence means that individuals can, depending on the situation, exhibit anything ranging from
extreme concern to apathy about privacy. Adopting the terminology of Westin, we are all privacy
pragmatists, privacy fundamentalists, or privacy unconcerned, depending on time and place (40).

Box 1. Privacy: A Modern Invention?


Is privacy a modern, bourgeois, and distinctly Western invention? Or are privacy needs a universal
feature of human societies? Although access to privacy is certainly affected by socioeconomic
factors (87) (some have referred to privacy as a “luxury good” (15)), and privacy norms greatly differ
across cultures (65, 85), the need for privacy seems to be a universal human trait. Scholars have
uncovered evidence of privacy-seeking behaviors across peoples and cultures separated by time and
space: from ancient Rome and Greece (39, 88) to preindustrialized Javanese, Balinese, and Tuareg
societies (89, 90). Privacy, as Irwin Altman (91) noted, appears to be simultaneously culturally
specific and culturally universal. Cues of a common human quest for privacy are also found in the
texts of ancient religions: The Quran (49:12) instructs against spying on one another (92); the
Talmud (Bava Batra 60a) advises home builders to position windows so that they do not directly
face those of one’s neighbors (93); and the Bible (Genesis 3:7) relates how Adam and Eve
discovered their nakedness after eating the fruit of knowledge and covered themselves in shame
from the prying eyes of God (94) (a discussion of privacy in Confucian and Taoist cultures is
available in [95]). Implicit in this heterogeneous selection of historical examples is the observation
that there exist multiple notions of privacy. Although contemporary attention focuses on
informational privacy, privacy has been also construed as territorial and physical, and linked to
concepts as diverse as surveillance, exposure, intrusion, insecurity, and appropriation, as well as
secrecy, protection, anonymity, dignity, or even freedom (a taxonomy is provided in [9]).
Privacy and Human Behavior in the Information Age 189

The way we construe and negotiate public and private spheres is context-dependent because
the boundaries between the two are murky (41): The rules people follow for managing privacy
vary by situation, are learned over time, and are based on cultural, motivational, and purely
situational criteria. For instance, we may usually be more comfortable sharing secrets with
friends, but at times we may reveal surprisingly personal information to a stranger on a plane
(42). The theory of contextual “integrity” posits that social expectations affect our beliefs
regarding what is private and what is public, and that such expectations vary with specific
contexts (43). Thus, seeking privacy in public is not a contradiction; individuals can manage
privacy even while sharing information, and even on social media (44). For instance, a longitudinal study of actual disclosure behavior of online social network users highlighted that over
time, many users increased the amount of personal information revealed to their friends (those
connected to them on the network) while simultaneously decreasing the amounts revealed to
strangers (those unconnected to them) (Figure 10.1) (45).

[Figure 10.1 image: two panels plotting, for 2005 through 2011, the percentage of profiles on the Carnegie Mellon University Facebook network that publicly revealed their birthday (top panel) and their high school (bottom panel); the y-axis ranges from 0.0 to 1.0.]

figure 10.1 Endogenous privacy behavior and exogenous shocks. Privacy behavior is affected both
by endogenous motivations (for instance, subjective preferences) and exogenous factors (for instance,
changes in user interfaces). Over time, the percentage of members in the Carnegie Mellon University
Facebook network who chose to publicly reveal personal information decreased dramatically. For
instance, more than 80% of profiles publicly revealed their birthday in 2005, but less than 20% in 2011.
The decreasing trend is not uniform, however. After decreasing for several years, the percentage of
profiles that publicly revealed their high school roughly doubled between 2009 and 2010 – after
Facebook changed the default visibility settings for various fields on its profiles, including high school
(bottom), but not birthday (top) (45).
[Figure 10.2 image: relative admission rates (y-axis, 0.5 to 1.5), by experimental condition (professional vs. unprofessional survey interface), in an experiment testing the impact of different survey interfaces on willingness to answer questions about engagement in various sensitive behaviors.]

figure 10.2 The impact of cues on disclosure behavior. A measure of privacy behavior often used
in empirical studies is a subject’s willingness to answer personal, sometimes sensitive questions – for
instance, by admitting or denying having engaged in questionable behaviors. In an online experiment
(47), individuals were asked a series of intrusive questions about their behaviors, such as “Have you
ever tried to peek at someone else’s e-mail without them knowing?” Across conditions, the interface of
the questionnaire was manipulated to look more or less professional. The y-axis captures the mean
affirmative admission rates (AARs) to questions that were rated as intrusive (the proportion of
questions answered affirmatively) normed, question by question, on the overall average AAR for the
question. Subjects revealed more personal and even incriminating information on the website with a
more casual design, even though the site with the formal interface was judged by other respondents to
be much safer. The study illustrates how cues can influence privacy behavior in a fashion that is
unrelated, or even negatively related, to normative bases of decision-making.
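The norming described in the caption can be written out concretely. The short Python sketch below is purely illustrative (the data and function names are hypothetical, not drawn from the study): it divides each condition's per-question admission rate by the pooled rate for that question and then averages across the intrusive questions.

```python
# Illustrative sketch of the relative admission-rate norming described above.
# The data and names here are hypothetical; they are not the study's data.

def affirmative_rate(answers):
    """Proportion of affirmative (1) answers to a single question."""
    return sum(answers) / len(answers)

def relative_admission_rates(responses, intrusive_questions):
    # Overall AAR per question, pooling respondents across all conditions.
    overall = {
        q: affirmative_rate([a for cond in responses.values() for a in cond[q]])
        for q in intrusive_questions
    }
    # Norm each condition's per-question AAR by the overall AAR for that
    # question, then average across the intrusive questions.
    return {
        name: sum(affirmative_rate(cond[q]) / overall[q]
                  for q in intrusive_questions) / len(intrusive_questions)
        for name, cond in responses.items()
    }

# Made-up example: two interface conditions, two intrusive questions,
# answers coded 1 = admitted the behavior, 0 = denied it.
responses = {
    "professional":   {"q1": [1, 0, 0, 0], "q2": [0, 0, 1, 0]},
    "unprofessional": {"q1": [1, 1, 0, 1], "q2": [1, 0, 1, 1]},
}
print(relative_admission_rates(responses, ["q1", "q2"]))
# A value above 1.0 means a condition elicited more admissions than average.
```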

The cues that people use to judge the importance of privacy sometimes result in sensible
behavior. For instance, the presence of government regulation has been shown to reduce
consumer concern and increase trust; it is a cue that people use to infer the existence of some
degree of privacy protection (46). In other situations, however, cues can be unrelated, or even
negatively related, to normative bases of decision-making. For example, in one online experi-
ment (47) individuals were more likely to reveal personal and even incriminating information on
a website with an unprofessional and casual design with the banner “How Bad R U” than on a
site with a formal interface – even though the site with the formal interface was judged by other
respondents to be much safer (Figure 10.2). Yet in other situations, it is the physical environment
that influences privacy concern and associated behavior (48), sometimes even unconsciously.
For instance, all else being equal, intimacy of self-disclosure is higher in warm, comfortable
rooms, with soft lighting, than in cold rooms with bare cement and overhead fluorescent
lighting (49).
Some of the cues that influence perceptions of privacy are one’s culture and the behavior of
other people, either through the mechanism of descriptive norms (imitation) or via reciprocity
(50). Observing other people reveal information increases the likelihood that one will reveal it
oneself (51). In one study, survey takers were asked a series of sensitive personal questions
regarding their engagement in illegal or ethically questionable behaviors. After answering each
question, participants were provided with information, manipulated unbeknownst to them, about
the percentage of other participants who in the same survey had admitted to having engaged in a
given behavior. Being provided with information that suggested that a majority of survey takers
had admitted a certain questionable behavior increased participants’ willingness to disclose their
engagement in other, also sensitive, behaviors. Other studies have found that the tendency to
reciprocate information disclosure is so ingrained that people will reveal more information even
to a computer agent that provides information about itself (52). Findings such as this may help to
explain the escalating amounts of self disclosure we witness online: If others are doing it, people
seem to reason unconsciously, doing so oneself must be desirable or safe.
Other people’s behavior affects privacy concerns in other ways, too. Sharing personal information with others makes them “co-owners” of that information (53) and, as such, responsible
for its protection. Mismanagement of shared information by one or more co-owners causes
“turbulence” of the privacy boundaries and, consequently, negative reactions, including anger
or mistrust. In a study of undergraduate Facebook users (54), for instance, turbulence of privacy
boundaries, as a result of having one’s profile exposed to unintended audiences, dramatically
increased the odds that a user would restrict profile visibility to friends only.
Likewise, privacy concerns are often a function of past experiences. When something in an
environment changes, such as the introduction of a camera or other monitoring devices, privacy
concern is likely to be activated. For instance, surveillance can produce discomfort (55) and
negatively affect worker productivity (56). However, privacy concern, like other motivations, is
adaptive; people get used to levels of intrusion that do not change over time. In an experiment
conducted in Helsinki (57), the installation of sensing and monitoring technology in households
led family members initially to change their behavior, particularly in relation to conversations,
nudity, and sex. And yet, if they accidentally performed an activity, such as walking naked into
the kitchen in front of the sensors, it seemed to have the effect of “breaking the ice”; participants
then showed less concern about repeating the behavior. More generally, participants became
inured to the presence of the technology over time.
The context-dependence of privacy concern has major implications for the risks associated
with modern information and communication technology (58). With online interactions, we no
longer have a clear sense of the spatial boundaries of our listeners. Who is reading our blog post?
Who is looking at our photos online? Adding complexity to privacy decision-making, boundaries
between public and private become even less defined in the online world (59), where we
become social media friends with our coworkers and post pictures to an indistinct flock of
followers. With different social groups mixing on the Internet, separating online and offline
identities and meeting our and others’ expectations regarding privacy becomes more difficult
and consequential (60).

malleability and influence


Whereas individuals are often unaware of the diverse factors that determine their concern about
privacy in a particular situation, entities whose prosperity depends on the revelation of information by others are much more sophisticated. With the emergence of the information age,
growing institutional and economic interests have developed around the disclosure of personal
information, from online social networks to behavioral advertising. It is not surprising, therefore,
[Figure 10.3 image: the degree of visibility of different fields of Facebook profiles under default settings in 2005 (top) and 2014 (bottom), with fields such as wall, contact data, likes, names, birthday, photos, basic and extended profile data, gender, and picture shown against audiences ranging from the user and friends to networks, all of Facebook, and the entire Internet.]

figure 10.3 Changes in Facebook default profile visibility settings over time (2005–2014). Over
time, Facebook profiles included an increasing amount of fields and, therefore, types of data. In
addition, default visibility settings became more revelatory between 2005 (top) and 2014 (bottom),
disclosing more personal information to larger audiences, unless the user manually overrode the
defaults (fields such as “Likes” and “Extended Profile Data” did not exist in 2005). “Basic profile data”
includes hometown, current city, high school, school (status, concentration, secondary
concentration), interested in, relationship, workplace, about you, and quotes. Examples of “Extended
profile data” include life events such as new job, new school, engagement, expecting a baby, moved,
bought a home, and so forth. “Picture” refers to the main profile image. “Photos” refers to the
additional images that users might have shared in their account. “Names” refers to the real name, the
username, and the user ID. This figure is based on the authors’ data and the original visualization
created by M. McKeon, available at [Link]

that some entities have an interest in, and have developed expertise in, exploiting behavioral and
psychological processes to promote disclosure (61). Such efforts play on the malleability of
privacy preferences, a term we use to refer to the observation that various, sometimes subtle,
factors can be used to activate or suppress privacy concerns, which in turn affect behavior.
Default settings are an important tool used by different entities to affect information disclosure. A large body of research has shown that default settings matter for decisions as important as
organ donation and retirement savings (62). Sticking to default settings is convenient, and people
often interpret default settings as implicit recommendations (63). Thus, it is not surprising that
default settings for one’s profile’s visibility on social networks (64), or the existence of opt-in or
opt-out privacy policies on websites (65), affect individuals’ privacy behavior (Figure 10.3).
In addition to default settings, websites can also use design features that frustrate or even
confuse users into disclosing personal information (66), a practice that has been referred to as
“malicious interface design” (67). Another obvious strategy that commercial entities can use to
avoid raising privacy concerns is not to “ring alarm bells” when it comes to data collection.
When companies do ring them – for example, by using overly fine-tuned personalized adver-
tisements – consumers are alerted (68) and can respond with negative “reactance” (69).
Various so-called “antecedents” (70) affect privacy concerns and can be used to influence
privacy behavior. For instance, trust in the entity receiving one’s personal data soothes concerns.
Moreover, because some interventions that are intended to protect privacy can establish trust,
concerns can be muted by the very interventions intended to protect privacy. Perversely,
62 percent of respondents to a survey believed (incorrectly) that the existence of a privacy policy
implied that a site could not share their personal information without permission (40), which
suggests that simply posting a policy that consumers do not read may lead to misplaced feelings
of being protected.
Control is another feature that can inculcate trust and produce paradoxical effects. Perhaps
because it is not a controversial concept, control has been a capstone of the efforts of
both industry and policy makers to balance privacy needs against the value of
sharing. Control over personal information is often perceived as a critical feature of privacy
protection (39). In principle, it does provide users with the means to manage access to their
personal information. Research, however, shows that control can reduce privacy concern (46),
which in turn can have unintended effects. For instance, one study found that participants who
were provided with greater explicit control over whether and how much of their personal
information researchers could publish ended up sharing more sensitive information with a
broader audience, the opposite of the ostensible purpose of providing such control (71).
Similar to the normative perspective on control, increasing the transparency of firms’ data
practices would seem to be desirable. However, transparency mechanisms can be easily
rendered ineffective. Research has highlighted not only that an overwhelming majority of
Internet users do not read privacy policies (72), but also that few users would benefit from doing
so; nearly half of a sample of online privacy policies were found to be written in language
beyond the grasp of most Internet users (73). Indeed, and somewhat amusingly, it has been
estimated that the aggregate opportunity cost if US consumers actually read the privacy policies
of the sites they visit would be $781 billion per year (74).
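To make the arithmetic behind such an aggregate estimate concrete, the sketch below multiplies an assumed online population by an assumed amount of reading time and an assumed hourly value of that time. The parameter values are placeholders chosen only for illustration; they are not the inputs or the method of the cited study (74).

```python
# Back-of-envelope sketch of an aggregate opportunity-cost estimate.
# All parameter values are illustrative placeholders, not the figures
# used in the cited study (reference 74).

def aggregate_reading_cost(online_population, policies_per_person_per_year,
                           minutes_per_policy, value_of_time_per_hour):
    hours_per_person = policies_per_person_per_year * minutes_per_policy / 60
    return online_population * hours_per_person * value_of_time_per_hour

# Hypothetical inputs: 220 million Internet users, roughly 1,400 privacy
# policies encountered per person per year, 10 minutes to read each one,
# and time valued at $15 per hour.
cost = aggregate_reading_cost(220_000_000, 1_400, 10, 15)
print(f"${cost / 1e9:,.0f} billion per year")  # on the order of $770 billion
```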
Although uncertainty and context-dependence lead naturally to malleability and manipulation, not all malleability is necessarily sinister. Consider monitoring. Although monitoring can
cause discomfort and reduce productivity, the feeling of being observed and accountable can
induce people to engage in prosocial behavior or (for better or for worse) adhere to social norms
(75). Prosocial behavior can be heightened by monitoring cues as simple as three dots in a
stylized face configuration (76). By the same token, the depersonalization induced by computer-mediated interaction (77), either in the form of lack of identifiability or of visual anonymity (78),
can have beneficial effects, such as increasing truthful responses to sensitive surveys (79, 80).
Whether elevating or suppressing privacy concerns is socially beneficial critically depends, yet
again, on context (a meta-analysis of the impact of de-identification on behavior is provided in
(81)). For example, perceptions of anonymity can alternatively lead to dishonest or prosocial
behavior. Illusory anonymity induced by darkness caused participants in an experiment (82) to
cheat in order to gain more money. This can be interpreted as a form of disinhibition effect (83),
by which perceived anonymity licenses people to act in ways that they would otherwise not even
consider. In other circumstances, though, anonymity leads to prosocial behavior: for instance,
higher willingness to share money in a dictator game, when coupled with the priming of
religiosity (84).

conclusions
Norms and behaviors regarding private and public realms greatly differ across cultures (85).
Americans, for example, are reputed to be more open about sexual matters than are the Chinese,
whereas the latter are more open about financial matters (such as income, cost of home, and
possessions). And even within cultures, people differ substantially in how much they care about
privacy and what information they treat as private. And as we have sought to highlight in this
chapter, privacy concerns can vary dramatically for the same individual, and for societies,
over time.
If privacy behaviors are culture and context dependent, however, the dilemma of what to
share and what to keep private is universal across societies and over human history. The task of
navigating those boundaries, and the consequences of mismanaging them, have grown increasingly complex and fateful in the information age, to the point that our natural instincts seem not
nearly adequate.
In this chapter, we used three themes to organize and draw connections between the social
and behavioral science literatures on privacy and behavior. We end with a brief discussion of the
reviewed literature’s relevance to privacy policy. Uncertainty and context-dependence imply that
people cannot always be counted on to navigate the complex trade-offs involving privacy in a
self-interested fashion. People are often unaware of the information they are sharing, unaware of
how it can be used, and even in the rare situations when they have full knowledge of the
consequences of sharing, uncertain about their own preferences. Malleability, in turn, implies
that people are easily influenced in what and how much they disclose. Moreover, what they
share can be used to influence their emotions, thoughts, and behaviors in many aspects of their
lives, as individuals, consumers, and citizens. Although such influence is not always or necessarily malevolent or dangerous, relinquishing control over one’s personal data and over one’s
privacy alters the balance of power between those holding the data and those who are the
subjects of that data.
Insights from the social and behavioral empirical research on privacy reviewed here suggest
that policy approaches that rely exclusively on informing or “empowering” the individual are
unlikely to provide adequate protection against the risks posed by recent information technologies. Consider transparency and control, two principles conceived as necessary conditions for
privacy protection. The research we highlighted shows that they may provide insufficient
protections and even backfire when used apart from other principles of privacy protection.
The research reviewed here suggests that if the goal of policy is to adequately protect privacy
(as we believe it should be), then we need policies that protect individuals with a minimal
requirement of informed and rational decision-making: policies that include a baseline
framework of protection, such as the principles embedded in the so-called fair information
practices (86). People need assistance and even protection to aid in navigating what is otherwise
a very uneven playing field. As highlighted by our discussion, a goal of public policy should be to
achieve an equity of power between individuals, consumers, and citizens on the one hand and,
on the other, the data holders such as governments and corporations that currently have the
upper hand. To be effective, privacy policy should protect real people, who are naïve,
uncertain, and vulnerable, and should be sufficiently flexible to evolve with the emerging
unpredictable complexities of the information age.

acknowledgments
We are deeply grateful to the following individuals: R. Gross and F. Stutzman for data analysis;
V. Marotta, V. Radhakrishnan, and S. Samat for research; W. Harsch for graphic design; and
A. Adams, I. Adjerid, R. Anderson, E. Barr, C. Bennett, R. Boehme, R. Calo, J. Camp, F. Cate,
J. Cohen, D. Cole, M. Culnan, R. De Wolf, J. Donath, S. Egelman, N. Ellison, A. Forget,
U. Gasser, B. Gellman, J. Graves, J. Grimmelmann, J. Grossklags, S. Guerses, J. Hancock,
E. Hargittai, W. Hartzog, J. Hong, C. Hoofnagle, J. P. Hubaux, A. Joinson, J. King,
B. Knijnenburg, A. Kobsa, P. Leon, M. Madden, I. Meeker, D. Mulligan, C. Olivola,
E. Peer, S. Petronio, S. Preibusch, J. Reidenberg, S. Romanosky, M. Rotenberg,
I. Rubinstein, N. Sadeh, A. Sasse, F. Schaub, P. Shah, R. E. Smith, S. Spiekermann,
J. Staddon, L. Strahilevitz, P. Swire, O. Tene, E. VanEpps, J. Vitak, R. Wash, A. Woodruff,
H. Xu, and E. Zeide for enormously valuable comments and suggestions.

references
1. V. Mayer Schönberger, Delete: The Virtue of Forgetting in the Digital Age (Princeton Univ. Press,
Princeton, 2011).
2. L. Sweeney, Int. J. Uncert. Fuzziness Knowl. Based Syst. 10, 557 570 (2002).
3. A. McAfee, E. Brynjolfsson, Harv. Bus. Rev. 90, 60 66, 68, 128 (2012).
4. N. P. Tatonetti, P. P. Ye, R. Daneshjou, R. B. Altman, Sci. Transl. Med. 4, 125ra31 (2012).
5. J. E. Cohen, Stanford Law Rev. 52, 1373 1438 (2000).
6. K. Crawford, K. Miltner, M. L. Gray, Int. J. Commun. 8, 1663 1672 (2014).
7. R. A. Posner, Am. Econ. Rev. 71, 405 409 (1981).
8. D. J. Solove, Harv. Law Rev. 126, 1880 1903 (2013).
9. D. J. Solove, Univ. Penn. L. Rev. 154, 477 564 (2006).
10. F. Schoeman, Ed., Philosophical Dimensions of Privacy: An Anthology (Cambridge Univ. Press, New
York, 1984).
11. B. M. DePaulo, C. Wetzel, R. Weylin Sternglanz, M. J. W. Wilson, J. Soc. Issues 59, 391 410 (2003).
12. S. T. Margulis, J. Soc. Issues 59, 243 261 (2003).
13. E. Goffman, Relations in Public: Microstudies of the Public Order (Harper & Row, New York, 1971).
14. E. Sundstrom, I. Altman, Hum. Ecol. 4, 47 67 (1976).
15. B. Schwartz, Am. J. Sociol. 73, 741 752 (1968).
16. R. S. Laufer, M. Wolfe, J. Soc. Issues 33, 22 42 (1977).
17. J. Y. Tsai, S. Egelman, L. Cranor, A. Acquisti, Inf. Syst. Res. 22, 254 268 (2011).
18. P. Slovic, Am. Psychol. 50, 364 371 (1995).
19. E. Singer, H. Hippler, N. Schwarz, Int. J. Public Opin. Res. 4, 256 268 (1992).
20. V. P. Skotko, D. Langmeyer, Sociometry 40, 178 182 (1977).
21. A. Westin, Harris Louis & Associates, Harris Equifax Consumer Privacy Survey (Tech. rep. 1991).
22. M. J. Culnan, P. K. Armstrong, Organ. Sci. 10, 104 115 (1999).
23. H. J. Smith, S. J. Milberg, S. J. Burke, Manage. Inf. Syst. Q. 20, 167 196 (1996).
24. B. Lubin, R. L. Harrison, Psychol. Rep. 15, 77 78 (1964).
25. S. Spiekermann, J. Grossklags, B. Berendt, E Privacy in 2nd Generation E Commerce: Privacy Prefer
ences versus Actual Behavior (Third ACM Conference on Electronic Commerce, Tampa, 2001),
pp. 38 47.
26. P. A. Norberg, D. R. Horne, D. A. Horne, J. Consum. Aff. 41, 100 126 (2007).
27. I. Ajzen, M. Fishbein, Psychol. Bull. 84, 888 918 (1977).
28. P. H. Klopfer, D. I. Rubenstein, J. Soc. Issues 33, 52 65 (1977).
29. A. Acquisti, R. Gross, in Privacy Enhancing Technologies, G. Danezis, P. Golle Eds. (Springer, New
York, 2006), pp. 36 58.
30. A. Acquisti, Privacy in Electronic Commerce and the Economics of Immediate Gratification (Fifth ACM
Conference on Electronic Commerce, New York, 2004), pp. 21 29.
31. I. Hann, K. Hui, S. T. Lee, I. P. L. Png, J. Manage. Inf. Syst. 24, 13 42 (2007).
32. A. Acquisti, L. K. John, G. Loewenstein, J. Legal Stud. 42, 249 274 (2013).
33. I. Altman, D. Taylor, Social Penetration: The Development of Interpersonal Relationships (Holt,
Rinehart & Winston, New York, 1973).
34. J. Frattaroli, Psychol. Bull. 132, 823 865 (2006).
35. J. W. Pennebaker, Behav. Res. Ther. 31, 539 548 (1993).
36. C. Steinfield, N. B. Ellison, C. Lampe, J. Appl. Dev. Psychol. 29, 434 445 (2008).
37. C. L. Toma, J. T. Hancock, Pers. Soc. Psychol. Bull. 39, 321 331 (2013).
38. D. I. Tamir, J. P. Mitchell, Proc. Natl. Acad. Sci. U.S.A. 109, 8038 8043 (2012).
39. A. Westin, Privacy and Freedom (Athenäum, New York, 1967).
40. C. J. Hoofnagle, J. M. Urban, Wake Forest Law Rev. 49, 261 321 (2014).
41. G. Marx, Ethics Inf. Technol. 3, 157 169 (2001).
42. J. W. Thibaut, H. H. Kelley, The Social Psychology of Groups (Wiley, Oxford, 1959).
43. H. Nissenbaum, Privacy in Context: Technology, Policy, and the Integrity of Social Life (Stanford Univ.
Press, Redwood City, 2009).
44. d. Boyd, It’s Complicated: The Social Lives of Networked Teens (Yale Univ. Press, New Haven, 2014).
45. F. Stutzman, R. Gross, A. Acquisti, J. Priv. Confidential. 4, 7 41 (2013).
46. H. Xu, H. H. Teo, B. C. Tan, R. Agarwal, J. Manage. Inf. Syst. 26, 135 174 (2009).
47. L. K. John, A. Acquisti, G. Loewenstein, J. Consum. Res. 37, 858 873 (2011).
48. I. Altman, The Environment and Social Behavior: Privacy, Personal Space, Territory, and Crowding
(Cole, Monterey, 1975).
49. A. L. Chaikin, V. J. Derlega, S. J. Miller, J. Couns. Psychol. 23, 479 481 (1976).
50. V. J. Derlega, A. L. Chaikin, J. Soc. Issues 33, 102 115 (1977).
51. A. Acquisti, L. K. John, G. Loewenstein, J. Mark. Res. 49, 160 174 (2012).
52. Y. Moon, J. Consum. Res. 26, 323 339 (2000).
53. S. Petronio, Boundaries of Privacy: Dialectics of Disclosure (SUNY Press, Albany, 2002).
54. F. Stutzman, J. Kramer Duffield, Friends Only: Examining a Privacy Enhancing Behavior in Facebook
(SIGCHI Conference on Human Factors in Computing Systems, ACM, Atlanta, 2010), pp. 1553 1562.
55. T. Honess, E. Charman, Closed Circuit Television in Public Places: Its Acceptability and Perceived
Effectiveness (Police Research Group, London, 1992).
56. M. Gagné, E. L. Deci, J. Organ. Behav. 26, 331 362 (2005).
57. A. Oulasvirta et al., Long Term Effects of Ubiquitous Surveillance in the Home (ACM Conference on
Ubiquitous Computing, Pittsburgh, 2012), pp. 41 50.
58. L. Palen, P. Dourish, Unpacking “Privacy” For a Networked World (SIGCHI Conference on Human
Factors in Computing Systems, ACM, Fort Lauderdale, 2003), pp. 129 136.
59. Z. Tufekci, Bull. Sci. Technol. Soc. 28, 20 36 (2008).
60. J. A. Bargh, K. Y. A. McKenna, G. M. Fitzsimons, J. Soc. Issues 58, 33 48 (2002).
61. R. Calo, Geo. Wash. L. Rev. 82, 995 1304 (2014).
62. E. J. Johnson, D. Goldstein, Science 302, 1338 1339 (2003).
63. C. R. McKenzie, M. J. Liersch, S. R. Finkelstein, Psychol. Sci. 17, 414 420 (2006).
64. R. Gross, A. Acquisti, Information Revelation and Privacy in Online Social Networks (ACM Workshop
Privacy in the Electronic Society, New York, 2005), pp. 71 80.
65. E. J. Johnson, S. Bellman, G. L. Lohse, Mark. Lett. 13, 5 15 (2002).
66. W. Hartzog, Am. Univ. L. Rev. 60, 1635 1671 (2010).
67. G. Conti, E. Sobiesk, Malicious Interface Design: Exploiting the User (19th International Conference
on World Wide Web, ACM, Raleigh, 2010), pp. 271 280.
68. A. Goldfarb, C. Tucker, Mark. Sci. 30, 389 404 (2011).
69. T. B. White, D. L. Zahay, H. Thorbjørnsen, S. Shavitt, Mark. Lett. 19, 39 50 (2008).
70. H. J. Smith, T. Dinev, H. Xu, Manage. Inf. Syst. Q. 35, 989 1016 (2011).
71. L. Brandimarte, A. Acquisti, G. Loewenstein, Soc. Psychol. Personal. Sci. 4, 340 347 (2013).
72. C. Jensen, C. Potts, C. Jensen, Int. J. Hum. Comput. Stud. 63, 203 227 (2005).
73. C. Jensen, C. Potts, Privacy Policies as Decision Making Tools: An Evaluation of Online Privacy Notices
(SIGCHI Conference on Human factors in computing systems, ACM, Vienna, 2004), pp. 471 478.
74. A. M. McDonald, L. F. Cranor, I/S: J. L. Policy Inf. Society. 4, 540 565 (2008).
75. C. Wedekind, M. Milinski, Science 288, 850 852 (2000).
76. M. Rigdon, K. Ishii, M. Watabe, S. Kitayama, J. Econ. Psychol. 30, 358 367 (2009).
77. S. Kiesler, J. Siegel, T. W. McGuire, Am. Psychol. 39, 1123 1134 (1984).
78. A. N. Joinson, Eur. J. Soc. Psychol. 31, 177 192 (2001).
79. S. Weisband, S. Kiesler, Self Disclosure on Computer Forms: Meta Analysis and Implications (SIGCHI
Conference Conference on Human Factors in Computing Systems, ACM, Vancouver, 1996),
pp. 3 10.
80. R. Tourangeau, T. Yan, Psychol. Bull. 133, 859 883 (2007).
81. T. Postmes, R. Spears, Psychol. Bull. 123, 238 259 (1998).
82. C. B. Zhong, V. K. Bohns, F. Gino, Psychol. Sci. 21, 311 314 (2010).
83. J. Suler, Cyberpsychol. Behav. 7, 321 326 (2004).
84. A. F. Shariff, A. Norenzayan, Psychol. Sci. 18, 803 809 (2007).
85. B. Moore, Privacy: Studies in Social and Cultural History (Armonk, New York, 1984).
86. Records, Computers and the Rights of Citizens (Secretary’s Advisory Committee, US Dept. of Health,
Education and Welfare, Washington, DC, 1973).
87. E. Hargittai, in Social Stratification, D. Grusky Ed. (Westview, Boulder, 2008), pp. 936 113.
88. P. Ariès, G. Duby (Eds.), A History of Private Life: From Pagan Rome to Byzantium (Harvard Univ.
Press, Cambridge, 1992).
89. R. F. Murphy, Am. Anthropol. 66, 1257 1274 (1964).
90. A. Westin, in Philosophical Dimensions of Privacy: An Anthology, F.D. Schoeman Ed. (Cambridge
Univ. Press, Cambridge, 1984), pp. 56 74.
91. I. Altman, J. Soc. Issues 33, 66 84 (1977).
92. M. A. Hayat, Inf. Comm. Tech. L. 16, 137 148 (2007).
93. A. Enkin, “Privacy,” [Link]/2012/07/privacy (2014).
94. J. Rykwert, Soc. Res. (New York) 68, 29 40 (2001).
95. C. B. Whitman, in Individualism and Holism: Studies in Confucian and Taoist Values, D. J. Munro,
Ed. (Center for Chinese Studies, Univ. Michigan, Ann Arbor, 1985), pp. 85 100.
11

Privacy, Vulnerability, and Affordance

Ryan Calo

A person without privacy is vulnerable. But what is it to be vulnerable? And what role does
privacy or privacy law play in vulnerability?
This chapter, adapted from the Clifford Symposium at DePaul University, begins to unpack
the complex, sometimes contradictory relationship between privacy and vulnerability. I begin by
exploring how the law conceives of vulnerability: essentially, as a binary status meriting special
consideration where present. Recent literature recognizes vulnerability not as a status but as a
state, a dynamic and manipulable condition that everyone experiences to different degrees and
at different times.
I then discuss various ways in which vulnerability and privacy intersect. I introduce an analytic
distinction between vulnerability rendering, i.e., making a person more vulnerable, and the
exploitation of vulnerability, whether manufactured or native. I also describe the relationship
between privacy and vulnerability as a vicious or virtuous circle. The more vulnerable a person
is, the less privacy they tend to enjoy; meanwhile, a lack of privacy opens the door to greater
vulnerability and exploitation.
Privacy can protect against vulnerability but can also be invoked to engender it. I next describe
how privacy supports the creation and exploitation of vulnerability in ways literal, rhetorical, and
conceptual. An abuser may literally use privacy to hide his abuse from law enforcement.
A legislature or group may invoke privacy rhetorically to justify discrimination, for instance,
against the transgender individuals who wish to use the bathroom consistent with their gender
identity.1 And courts obscure vulnerability conceptually when they decide a case on the basis of
privacy instead of the value that is more centrally at stake.
Finally, building on previous work, I offer James Gibson’s theory of affordances as a theoretical lens by which to analyze the complex relationships that privacy mediates. Privacy understood as an affordance permits a more nuanced understanding of privacy and vulnerability and
could perhaps lead to wiser privacy law and policy.