Next month, I’m giving a keynote talk at The Future of the Future: The Ethics and Implications of AI, an event at UC Irvine that features Bruce Sterling, Rose Eveleth, David Kaye, and many others!
Continue reading “Machine learning is innately conservative and wants you to either act like everyone else, or never change”
Tag: how to lie with statistics
The UK unemployment rate is at least three times the official rate
The UK — like most countries — excludes “inactive workers” (students, new parents, people who don’t want a job) from its unemployment figures, but “inactive” is such a slippery concept that it can paper over huge cracks in the labor market.
Automatically generate datasets that teach people how (not) to create statistical mirages
F.J. Anscombe’s classic, oft-cited 1973 paper “Graphs in Statistical Analysis” showed that very different datasets could produce “the same summary statistics (mean, standard deviation, and correlation) while producing vastly different plots.” Anscombe’s point was that you can miss important differences if you look only at tables of data, and that those differences leap out when you graph the same data.
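To make that concrete, here is a minimal Python sketch that computes the mean, sample standard deviation, and Pearson correlation for the four datasets usually called Anscombe's quartet. The data values are the ones widely reproduced from the 1973 paper; the names `QUARTET`, `pearson`, `summarize`, and `SUMMARIES` are my own, made up for illustration, and this is not the dataset generator the linked post describes.

```python
from math import sqrt
from statistics import mean, stdev

# Anscombe's quartet as commonly reproduced from the 1973 paper.
# Datasets I-III share the same x values; dataset IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, written out by hand."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """The summary statistics named in the quote above."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "stdev_x": stdev(xs),  # sample standard deviation
        "stdev_y": stdev(ys),
        "corr": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimal places, the four rows should come out the same, which is the trap: a table of these numbers gives no hint that one dataset is a noisy line, one a smooth curve, one a tight line with a single outlier, and one a vertical stack of points plus one far-off point. Only a scatter plot of each pair shows that.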
Cybercrime, patent-theft numbers are total bullshit
In case there was any doubt in your mind, the alleged $1T cost to America from cyberwar and the $250B cost to America from “cyber-theft of Intellectual property” are both total bullshit. ProPublica breaks it down.
One of the figures Alexander attributed to Symantec — the $250 billion in annual losses from intellectual property theft — was indeed mentioned in a Symantec report, but it is not a Symantec number and its source remains a mystery.
McAfee’s trillion-dollar estimate is questioned even by the three independent researchers from Purdue University whom McAfee credits with analyzing the raw data from which the estimate was derived. “I was really kind of appalled when the number came out in news reports, the trillion dollars, because that was just way, way large,” said Eugene Spafford, a computer science professor at Purdue.
Spafford was a key contributor to McAfee’s 2009 report, “Unsecured Economies: Protecting Vital Information” (PDF). The trillion-dollar estimate was first published in a news release that McAfee issued to announce the report; the number does not appear in the report itself. A McAfee spokesman told ProPublica the estimate was an extrapolation by the company, based on data from the report. McAfee executives have mentioned the trillion-dollar figure on a number of occasions, and in 2011 McAfee published it once more in a new report, “Underground Economies: Intellectual Capital and Sensitive Corporate Data Now the Latest Cybercrime Currency” (PDF).
In addition to the three Purdue researchers who were the report’s key contributors, 17 other researchers and experts were listed as contributors to the original 2009 report, though at least some of them were only interviewed by the Purdue researchers. Among them was Ross Anderson, a security engineering professor at University of Cambridge, who told ProPublica that he did not know about the $1 trillion estimate before it was announced. “I would have objected at the time had I known about it,” he said. “The intellectual quality of this ($1 trillion number) is below abysmal.”
Does Cybercrime Really Cost $1 Trillion?
(via /.)