What Happens When We Let Industry and Government Collect All the Data They Want

Nov. 7 2014 10:10 AM

Big Data and the Underground Railroad

Industry and government say “collect everything.” History suggests this is a bad idea.

Members of the Japanese-American Mochida family awaiting relocation to a camp, Hayward, California.

Photo by Dorothea Lange/Getty Images

The Virginia Gazette, Williamsburg, September 14, 1769.

Virginia Historical Society/Library of Congress.

In the fall of 1769, Thomas Jefferson lost a slave. His name was Sandy, and he was a runaway. Sandy was “about 35 years of age.” He worked as a shoemaker. Jefferson described him as “artful and knavish.” He was also “something of a horse jockey.”

Jefferson criticized slavery. Yet when he signed the Declaration of Independence in 1776, he owned almost 200 human beings. When Sandy went missing, Jefferson owned about 20; losing even one was significant. So Jefferson used the best available technology to find Sandy: the newspaper ad.

Sandy was caught and later sold for 100 pounds. Around the turn of the century, however, things slowly started to change. A secret network was built to help people like Sandy. Over time, tens of thousands of runaway slaves would escape bondage on the Underground Railroad.


How many of them would have made it in the age of big data?

There is a booming debate around what big data means for vulnerable communities. Industry groups argue, in good faith, that it will be a tool for empowering the disadvantaged. Others are skeptical. Algorithms have learned that workers with longer commutes quit their jobs sooner. Is it fair to turn away job applicants with long commutes if that disproportionately hurts blacks and Latinos? Is it legal for a company to assign you a credit score based on the creditworthiness of your neighbors? Are big data algorithms as neutral and accurate as they seem—and if they’re not, are our discrimination laws up to the challenge?

These questions need answers. Most of them, however, focus on how our data should be used. Far less attention has been paid to a growing effort to change how our data is collected.

For years, efforts to protect privacy have focused on giving people the ability to choose what data is collected about them. Now, industry—with the support of some leaders in government—wants to shift that focus. Businesses say that in our data-saturated world, giving consumers meaningful control over data collection is next to impossible. They argue that we should ramp down efforts to give individuals control over the initial collection of their data, and instead let industry collect as much personal information as possible.

Privacy protections? They would come after the fact, through “use restrictions” that would prohibit certain uses of data that society deemed harmful. We used to try to protect people at each stage of data processing—collection, analysis, sharing. Now, it’s collect first and ask questions later.

This isn’t a fringe argument. It was endorsed by the World Economic Forum of Davos as well as the president’s own Council of Advisors on Science and Technology, which issued a report on the subject in May. “The beneficial uses of near-ubiquitous data collection are large, and they fuel an increasingly important set of economic activities,” the president’s council wrote. “[A] policy focus on limiting data collection will not be a broadly applicable or scalable strategy—nor one likely to achieve the right balance between beneficial results and unintended negative consequences (such as inhibiting economic growth).”

As Chris Jay Hoofnagle wisely explains in Slate, industry’s noisy embrace of ubiquitous collection is really an attempt to deregulate data privacy. Unfortunately, deregulation will hurt some much more than others.

Davos and the president’s council are basically saying that it’s OK to vacuum up data, so long as you prohibit certain harmful uses of it. The problem is that harmful uses of data are often recognized as such only long after the fact. Our society has been especially slow to condemn uses of data that hurt racial and ethnic minorities, the LGBT community, and other “undesirables.”

In the spring of 1940, Japanese Americans received visits from census examiners. With war burning across Europe and Asia, and with a growing alliance between Japan and Germany, these visits could not have been comfortable. Yet by and large, Japanese Americans cooperated with the census. After all, by federal law, census data was subject to strict use restrictions: The Census Bureau was required to keep personal information confidential.

Their trust was misplaced. In 1942, Congress lifted the confidentiality provisions of the census, letting the Census Bureau share detailed data with other government agencies “for use in connection with the conduct of the war.” The War Department would go on to use detailed census data to track Japanese Americans and round them up for internment camps.
