Surveillance Nation
From MIT’s Technology Review: April 2003
Route 9 is an old two-lane
highway that cuts across Massachusetts from Boston in the east to Pittsfield in
the west. Near the small city of Northampton, the highway crosses the wide
Connecticut River. The Calvin Coolidge Memorial Bridge, named after the
president who once served as Northampton's mayor, is a major regional traffic
link. When the state began a long-delayed and still-ongoing reconstruction of
the bridge in the summer of 2001, traffic jams stretched for kilometers into the
bucolic New England countryside.
In a project aimed at alleviating
drivers' frustration, the University of Massachusetts Transportation Center,
located in nearby Amherst, installed eight shoe-size digital surveillance
cameras along the roads leading to the bridge. Six are mounted on utility poles
and the roofs of local businesses. Made by Axis Communications in Sweden, they
are connected to dial-up modems and transmit images of the roadway before them
to a Web page, which commuters can check for congestion before tackling the
road. According to Dan Dulaski, the system's technical manager, running the
entire webcam system--power, phone, and Internet fees--costs just $600 a
month.
The other two cameras in the Coolidge Bridge project are a
little less routine. Built by Computer Recognition Systems in Wokingham,
England, with high-quality lenses and fast shutter speeds (1/10,000 second),
they are designed to photograph every car and truck that passes by. Located
eight kilometers apart, at the ends of the zone of maximum traffic congestion,
the two cameras send vehicle images to attached computers, which use special
character-recognition software to decipher vehicle license plates. The license
data go to a server at the company's U.S. office in Cambridge, MA, about 130
kilometers away. As each license plate passes the second camera, the server
ascertains the time difference between the two readings. The average of the
travel durations of all successfully matched vehicles defines the likely travel
time for crossing the bridge at any given moment, and that information is posted
on the traffic watch Web page.
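The travel-time estimate described above can be sketched in a few lines. This is a minimal illustration, not the vendor's actual software: the function name, the dictionary layout, and the sample plates and timestamps are all hypothetical.

```python
from statistics import mean

def average_travel_time(first_camera, second_camera):
    """Average travel time in seconds over plates read at both cameras.

    Each argument maps a plate string to the time (in seconds) at which
    that camera read it. Plates read at only one camera are ignored.
    """
    durations = [
        second_camera[plate] - first_camera[plate]
        for plate in first_camera
        if plate in second_camera and second_camera[plate] > first_camera[plate]
    ]
    # With no matched vehicles there is nothing to average.
    return mean(durations) if durations else None

# Hypothetical readings: plate -> seconds since midnight
upstream = {"123ABC": 28800, "456DEF": 28830, "789GHI": 28845}
downstream = {"123ABC": 29400, "456DEF": 29490}  # 789GHI was never matched

print(average_travel_time(upstream, downstream))
```

Only the plate string and a timestamp are needed for the calculation, which is why the same data could just as easily be kept as a log of who crossed and when.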
To local residents, the traffic data
are helpful, even vital: police use the information to plan emergency routes.
But as the computers calculate traffic flow,
they are also making a record of all cars that cross the bridge--when they do so,
their average speed, and (depending on lighting and weather conditions) how many
people are in each car.
Trying to avoid provoking privacy
fears, Keith Fallon, a Computer Recognition Systems project engineer, says,
"We're not saving any of the information we capture. Everything is deleted
immediately." But the company could change its mind and start saving the data at
any time. No one on the road would know.
Road tools
Web-accessible video cameras installed near Northampton, MA, by the University
of Massachusetts Transportation Center overlook the Calvin Coolidge Memorial
Bridge on Route 9. Two additional cameras photograph individual cars crossing
the bridge and send the images to computers that isolate plates and use machine
vision algorithms to read the plate numbers. Once a plate has passed both
cameras, the car’s travel time is computed.
The Coolidge
Bridge is just one of thousands of locations around the planet where citizens
are crossing--willingly, more often than not--into a world of networked, highly
computerized surveillance. According to a January report by J.P. Freeman, a
security market-research firm in Newtown, CT, 26 million surveillance cameras have already been
installed worldwide, and more than 11 million of them are in the United States.
In heavily monitored London, England, Hull University criminologist Clive Norris
has estimated, the average person is filmed by more than 300 cameras each
day.
The $150 million-a-year remote digital-surveillance-camera
market will grow, according to Freeman, at an annual clip of 40 to 50
percent for the next 10 years. But astonishingly, other, nonvideo forms of
monitoring will increase even faster. In a process that mirrors the unplanned
growth of the Internet itself, thousands of personal, commercial, medical,
police, and government databases and monitoring systems will intersect and
entwine. Ultimately, surveillance will become so ubiquitous, networked, and
searchable that unmonitored public space will effectively cease to
exist.
This prospect--what science fiction writer David Brin calls "the
transparent society"--may sound too distant to be worth thinking about. But even
the farsighted Brin underestimated how quickly technological advances--more
powerful microprocessors, faster network transmissions, larger hard drives,
cheaper electronics, and more sophisticated and powerful software--would make
universal surveillance possible.
It's not all about Big Brother or Big
Business, either. Widespread electronic scrutiny is usually denounced as a
creation of political tyranny or corporate greed. But the rise of omnipresent
surveillance will be driven as much by ordinary citizens' understandable--even
laudable--desires for security, control, and comfort as by the imperatives of
business and government. "Nanny cams," global-positioning locators, police and
home security networks, traffic jam monitors, medical-device radio-frequency
tags, small-business webcams: the list of monitoring devices employed by and for
average Americans is already long, and it will only become longer. Extensive
surveillance, in short, is coming into being because people like and want
it.
"Almost all of the pieces for a surveillance society are already
here," says Gene Spafford, director of Purdue University's Center for Education
and Research in Information Assurance and Security. "It's just a matter of
assembling them." Unfortunately, he says,
ubiquitous surveillance faces intractable social and technological problems that
could well reduce its usefulness or even make it dangerous. As a result, each
type of monitoring may be beneficial in itself, at least for the people who put
it in place, but the
collective result could be calamitous.
To begin with,
surveillance data from multiple sources are being combined into large databases.
For example, businesses track employees' car,
computer, and telephone use to evaluate their job performance; similarly, the
U.S. Defense Department's experimental Total Information Awareness project has
announced plans to sift through information about millions of people to find
data that identify criminals and terrorists.
But many of
these merged pools of data are less reliable than small-scale, localized
monitoring efforts; big databases are harder to comb for bad entries, and their
conclusions are far more difficult to verify. In addition, the inescapable
nature of surveillance can itself create alarm, even among its beneficiaries.
"Your little camera network may seem like a good idea to you," Spafford
says. "Living with everyone else's could be a
nightmare."
THE SURVEILLANCE AD-HOCRACY
Last
October deadly snipers terrorized Washington, DC, and the surrounding suburbs,
killing 10 people. For three long weeks, law enforcement agents seemed helpless
to stop the murderers, who struck at random and then vanished into the area's
snarl of highways. Ultimately, two alleged killers were arrested, but only
because their taunting messages to the authorities had inadvertently provided
clues to their identification.
In the not-too-distant future,
according to advocates of policing technologies, such unstoppable rampages may
become next to impossible, at least in populous areas. By combining police
cameras with private camera networks like that on Route 9, video coverage will
become so complete that any snipers who waged an attack--and all the people near
the crime scene--would be trackable from camera to camera until they could be
stopped and interrogated.
The
unquestionable usefulness and sheer affordability of these extensive
video-surveillance systems suggest that they will propagate rapidly. But
despite the relentlessly increasing capabilities of such systems, video
monitoring is still but a tiny part--less than 1 percent--of surveillance
overall, says Carl Botan, a Purdue center researcher who has studied this
technology for 15 years.
Examples are legion. By 2006, for instance,
law will require that every U.S. cell phone be designed to report its precise
location during a 911 call; wireless carriers plan to use the same technology to
offer 24-hour location-based services, including tracking of people and
vehicles. To prevent children from wittingly or unwittingly calling up porn
sites, the Seattle company N2H2 provides Web filtering and monitoring services
for 2,500 schools serving 16 million students. More than a third of all large
corporations electronically review the computer files used by their employees,
according to a recent American Management Association survey. Seven of the 10
biggest supermarket chains use discount cards to monitor customers' shopping
habits: tailoring product offerings to customers' wishes is key to survival in
that brutally competitive business. And as part of a new, federally mandated
tracking system, the three major U.S. automobile manufacturers plan to put
special radio transponders known as radio frequency identification tags in every
tire sold in the nation. Far exceeding congressional requirements, according to
a leader of the Automotive Industry Action Group, an industry think tank, the
tags can be read on vehicles going as fast as 160 kilometers per hour from a
distance of 4.5 meters.
Many if not most of today's surveillance
networks were set up by government and big business, but in years to come
individuals and small organizations will set the pace of growth. Future sales of
Net-enabled surveillance cameras, in the view of Fredrik Nilsson, Axis
Communications' director of business development, will be driven by
organizations that buy more than eight but fewer than 30 cameras--condo
associations, church groups, convenience store owners, parent-teacher
associations, and anyone else who might like to check what is happening in one
place while he is sitting in another. A dozen companies already help working
parents monitor their children's nannies and day-care centers from the office;
scores more let them watch backyards, school buses, playgrounds, and their own
living rooms. Two new startups--Wherify Wireless in Redwood Shores, CA, and
Peace of Mind at Light Speed in Westport, CT--are introducing bracelets and other
portable devices that continuously beam locating signals to satellites so that
worried moms and dads can always find their children.
As thousands of
ordinary people buy monitoring devices and services, the unplanned result will
be an immense, overlapping grid of surveillance systems, created unintentionally
by the same ad-hocracy that caused the Internet to explode. Meanwhile, the
computer networks on which monitoring data are stored and manipulated continue
to grow faster, cheaper, and smarter, and to store information in greater
volume for longer times. Ubiquitous digital surveillance will marry widespread
computational power--with startling results.
The factors driving the
growth of computing potential are well known. Moore's law--which roughly equates
to the doubling of processor speed every 18 months--seems likely to continue its
famous march. Hard drive capacity is rising even faster. It has doubled every
year for more than a decade, and this should go on "as far as the eye can see,"
according to Robert M. Wise, director of product marketing for the desktop
product group at Maxtor, a hard drive manufacturer. Similarly, according to a
2001 study by a pair of AT&T Labs researchers, network transmission capacity
has more than doubled annually for the last dozen years, a tendency that should
continue for at least another decade and will keep those powerful processors and
hard drives well fed with fresh data.
Today a company or agency with a
$10 million hardware budget can buy processing power equivalent to 2,000
workstations, two petabytes of hard drive space (two million gigabytes, or
50,000 standard 40-gigabyte hard drives like those found on today's PCs), and a
two-gigabit Internet connection (more than 2,000 times the capacity of a typical
home broadband connection). If current trends continue, simple arithmetic
predicts that in 20 years the same purchasing power will buy the processing
capability of 10 million of today's workstations, 200 exabytes (200 million
gigabytes) of storage capacity, and 200 exabits (200 million megabits) of
bandwidth. Another way of saying this is that by 2023 large organizations will
be able to devote the equivalent of a contemporary PC to monitoring every single
one of the 330 million people who will then be living in the United
States.
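The "simple arithmetic" behind such projections is compound doubling. The sketch below is a rough illustration only: the function name is hypothetical, and the doubling periods are assumptions taken from the rates quoted earlier in this article (18 months for processors, 12 months for storage and bandwidth). The result is very sensitive to those periods, so the output is an order-of-magnitude guide and need not match the rounded figures above.

```python
def project(base, years, doubling_period_years):
    """Scale `base` by repeated doubling over `years`."""
    return base * 2 ** (years / doubling_period_years)

YEARS = 20

# Starting points are the $10 million budget figures quoted in the text;
# the doubling periods are assumptions, as noted in the lead-in.
workstations = project(2_000, YEARS, 1.5)   # workstation equivalents
storage_pb = project(2, YEARS, 1.0)         # petabytes of disk
bandwidth_gbit = project(2, YEARS, 1.0)     # gigabits of connection capacity

print(f"workstation equivalents: {workstations:,.0f}")
print(f"storage (petabytes):     {storage_pb:,.0f}")
print(f"bandwidth (gigabits):    {bandwidth_gbit:,.0f}")
```

Changing a doubling period from 12 to 18 months cuts the 20-year multiplier by roughly a factor of 100, which is why projections of this kind are best read as trends rather than forecasts.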
One of the first applications for this combination of
surveillance and computational power, says Raghu Ramakrishnan, a database
researcher at the University of Wisconsin-Madison, will be continuous intensive
monitoring of buildings, offices, and stores: the spaces where middle-class
people spend most of their lives. Surveillance in the workplace is common now:
in 2001, according to the American Management Association survey, 77.7 percent of major U.S.
corporations electronically monitored their employees, and that statistic
had more than doubled since 1997 (see "Eye on Employees," p. 39). But much more
is on the way. Companies like Johnson Controls and Siemens, Ramakrishnan says,
are already "doing simplistic kinds of 'asset tracking,' as they call it." They
use radio frequency identification tags to monitor the locations of people as
well as inventory. In January, Gillette began attaching such tags to 500 million
of its Mach 3 Turbo razors. Special "smart shelves" at Wal-Mart stores will
record the removal of razors by shoppers, thereby alerting stock clerks whenever
shelves need to be refilled--and effectively transforming Gillette customers into
walking radio beacons. In the future, such tags will be used by hospitals to
ensure that patients and staff maintain quarantines, by law offices to keep
visitors from straying into rooms containing clients' confidential papers, and
in kindergartens to track toddlers.
By employing multiple, overlapping
types of monitoring, Ramakrishnan says, managers will be able to "keep track of
people, objects, and environmental levels throughout a whole complex."
Initially, these networks will be installed for "such mundane things as trying
to figure out when to replace the carpets or which areas of lawn get the most
traffic so you need to spread some grass seed preventively." But as computers
and monitoring equipment become cheaper and more powerful, managers will use
surveillance data to construct complex, multidimensional records of how spaces
are used. The models will be analyzed to improve efficiency and security--and
they will be sold to other businesses or governments. Over time, the thousands
of individual monitoring schemes inevitably will merge together and feed their
data into large commercial and state-owned networks. When surveillance databases
can describe or depict what every individual is doing at a particular time,
Ramakrishnan says, they will be providing humankind with the digital equivalent
of an ancient dream: being "present, in effect, almost anywhere and
anytime."
GARBAGE IN, GARBAGE OUT
In 1974 Francis
Ford Coppola wrote and directed The Conversation, which starred Gene Hackman as
Harry Caul, a socially maladroit surveillance expert. In this remarkably
prescient movie, a mysterious organization hires Caul to record a quiet
discussion that will take place in the middle of a crowd in San Francisco's
Union Square. Caul deploys three microphones: one in a bag carried by a
confederate and two directional mikes installed on buildings overlooking the
area. Afterward Caul discovers that each of the three recordings is plagued by
background noise and distortions, but by combining the different sources, he is
able to piece together the conversation. Or, rather, he thinks he has pieced it
together. Later, to his horror, Caul learns that he misinterpreted a crucial
line, a discovery that leads directly to the movie's chilling
denouement.
The Conversation illustrates a central dilemma for
tomorrow's surveillance society. Although much of the explosive growth in
monitoring is being driven by consumer demand, that growth has not yet been
accompanied by solutions to the classic difficulties computer systems have
integrating disparate sources of information and arriving at valid conclusions.
Data quality problems that cause little inconvenience on a local scale--when
Wal-Mart's smart shelves misread a razor's radio frequency identification
tag--have much larger consequences when organizations assemble big databases from
many sources and attempt to draw conclusions about, say, someone's capacity for
criminal action. Such problems, in the long run, will play a large role in
determining both the technical and social impact of surveillance.
The experimental and controversial Total
Information Awareness program of the Defense Advanced Research Projects Agency
exemplifies these issues. By merging records from corporate, medical, retail,
educational, travel, telephone, and even veterinary sources, as well as such
"biometric" data as fingerprints, iris and retina scans, DNA tests, and
facial-characteristic measurements, the program is intended to create an
unprecedented repository of information about both U.S. citizens and foreigners
with U.S. contacts. Program director John M. Poindexter has explained
that analysts will use custom data-mining techniques to sift through the mass of
information, attempting to "detect, classify and identify foreign terrorists" in
order to "preempt and defeat terrorist acts"--a virtual Eye of Sauron, in
critics' view, constructed from telephone bills and shopping preference
cards.
In February Congress required
the Pentagon to obtain its specific approval before implementing Total
Information Awareness in the United States (though certain actions are allowed
on foreign soil). But President George W. Bush had already announced that
he was creating an apparently similar effort, the Terrorist Threat Integration
Center, to be led by the Central Intelligence Agency. Regardless of the fate of
these two programs, other equally sweeping attempts to pool monitoring data are
proceeding apace. Among these initiatives is Regulatory DataCorp, a for-profit
consortium of 19 top financial institutions worldwide. The consortium, which was
formed last July, combines members' customer data in an effort to combat "money
laundering, fraud, terrorist financing, organized crime, and corruption." By
constantly poring through more than 20,000 sources of public information about
potential wrongdoings--from newspaper articles and Interpol warrants to
disciplinary actions by the U.S. Securities and Exchange Commission--the
consortium's Global Regulatory Information Database will, according to its
owner, help clients "know their customers."
Equally important in the
long run are the databases that will be created by the nearly spontaneous
aggregation of scores or hundreds of smaller databases. "What seem to be small
scale, discrete systems end up being combined into large databases," says Marc
Rotenberg, executive director of the Electronic Privacy Information Center, a
nonprofit research organization in Washington, DC. He points to the recent,
voluntary efforts of merchants in Washington's affluent Georgetown district.
They are integrating their in-store closed-circuit television networks and making
the combined results available to city police. In Rotenberg's view, the
collection and consolidation of individual surveillance networks into big
government and industry programs "is a strange mix of public and private, and
it's not something that the legal system has encountered much before."
Managing the sheer size of these aggregate surveillance databases, surprisingly,
will not pose insurmountable technical difficulties. Most personal data are
either very compact or easily compressible. Financial, medical, and shopping
records can be represented as strings of text that are easily stored and
transmitted; as a general rule, the records do not grow substantially over
time.
Even biometric records are no strain on computing systems. To
identify people, genetic testing firms typically need stretches of DNA that can
be represented in just one kilobyte--the size of a short email message.
Fingerprints, iris scans, and other types of biometric data consume little more.
Other forms of data can be preprocessed in much the way that the cameras on
Route 9 transform multimegabyte images of cars into short strings of text with
license plate numbers and times. (For investigators, having a video of suspects
driving down a road usually is not as important as simply knowing that they were
there at a given time.) To create a digital dossier for every individual in the
United States--as programs like Total Information Awareness would require--only
"a couple terabytes of well-defined information" would be needed, says
Jeffrey Ullman, a former Stanford University database researcher. "I don't think
that's really stressing the capacity of [even today's] databases."
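Ullman's estimate can be checked with back-of-the-envelope arithmetic. The sketch below rests on two assumptions that are not in the article: "a couple terabytes" is read as 2 terabytes, and the U.S. population is taken as roughly 290 million.

```python
TOTAL_BYTES = 2 * 10**12     # "a couple terabytes", read here as 2 TB (assumption)
POPULATION = 290_000_000     # assumed approximate U.S. population (not from the article)

# Storage budget available for each person's dossier
bytes_per_person = TOTAL_BYTES / POPULATION

print(f"{bytes_per_person / 1000:.1f} kilobytes per person")
```

Under these assumptions each dossier gets a budget of several kilobytes, enough for a one-kilobyte DNA profile plus a few pages of compact text records.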
Instead, argues Rajeev Motwani, another member of Stanford's database group, the
real challenge for large surveillance databases will be the seemingly simple
task of gathering valid data. Computer scientists use the term GIGO--garbage in,
garbage out--to describe situations in which erroneous input creates erroneous
output. Whether people are building bombs or buying bagels, governments and
corporations try to predict their behavior by integrating data from sources as
disparate as electronic toll-collection sensors, library records, restaurant
credit-card receipts, and grocery store customer cards--to say nothing of the
Internet, surely the world's largest repository of personal information.
Unfortunately, all these sources are full of errors, as are financial and
medical records. Names are misspelled and digits transposed; address and e-mail
records become outdated when people move and switch Internet service providers;
and formatting differences among databases cause information loss and distortion
when they are merged. "It is routine to find in large customer databases
defective records--records with at least one major error or omission--at rates of
at least 20 to 35 percent," says Larry English of Information Impact, a database
consulting company in Brentwood, TN.
Unfortunately, says Motwani,
"data cleaning is a major open problem in the research community. We are still
struggling to get a formal technical definition of the problem." Even when the original data are
correct, he argues, merging them can introduce errors where none had existed
before. Worse, none of these worries about the garbage going into the
system even begin to address the still larger problems with the garbage going
out.
THE DISSOLUTION OF PRIVACY
Almost every
computer-science student takes a course in algorithms. Algorithms are sets of
specified, repeatable rules or procedures for accomplishing tasks such as
sorting numbers; they are, so to speak, the engines that make programs run.
Unfortunately, innovations in algorithms are not subject to Moore's law, and
progress in the field is notoriously sporadic. "There are certain areas in
algorithms we basically can't do better and others where creative work will have
to be done," Ullman says. Sifting through large surveillance databases for
information, he says, will essentially be "a problem in research in algorithms.
We need to exploit some of the stuff that's been done in the data-mining
community recently and do it much, much better."
Working with
databases requires users to have two mental models. One is a model of the data.
Teasing out answers to questions from the popular search engine Google, for
example, is easier if users grasp the varieties and types of data on the
Internet--Web pages with words and pictures, whole documents in a multiplicity
of formats, downloadable software and media files--and how they are stored. In
exactly the same way, extracting information from surveillance databases will
depend on a user's knowledge of the system. "It's a chess game," Ullman says.
"An unusually smart analyst will get things that a not-so-smart one will
not."
Second, and more important according to Spafford, effective use
of big surveillance databases will depend on having a model of what one is
looking for. This factor is especially crucial, he says, when trying to predict
the future, a goal of many commercial and government projects. For this reason,
what might be called reactive searches that scan recorded data for specific
patterns are generally much more likely to obtain useful answers than proactive
searches that seek to get ahead of things. If, for instance, police in the
Washington sniper investigation had been able to tap into a pervasive network of
surveillance cameras, they could have tracked people seen near the crime scenes
until they could be stopped and questioned: a reactive process. But it is
unlikely that police would have been helped by proactively asking surveillance
databases for the names of people in the Washington area with the requisite
characteristics (family difficulties, perhaps, or military training and a recent
penchant for drinking) to become snipers.
In many cases, invalid answers are harmless.
If Victoria's Secret mistakenly mails 1 percent of its spring catalogs to people
with no interest in lingerie, the price paid by all parties is
small. But if a national
terrorist tracking system has the same 1 percent error rate, it will produce
millions of false alarms, wasting huge amounts of investigators' time
and, worse, labeling many
innocent U.S. citizens as suspects. "A 99 percent hit rate is great for
advertising," Spafford says, "but terrible for spotting terrorism."
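The scale of the problem follows from simple base-rate arithmetic. In this sketch, every input other than the 1 percent error rate is a hypothetical chosen for illustration: a screened population of 290 million, 1,000 real threats among them, and a system that catches 99 percent of those threats.

```python
def expected_false_alarms(population, false_positive_rate):
    """Innocent people flagged, assuming nearly everyone screened is innocent."""
    return population * false_positive_rate

def flagged_precision(population, actual_threats, false_positive_rate, detection_rate):
    """Fraction of flagged people who are real threats."""
    true_hits = actual_threats * detection_rate
    false_hits = (population - actual_threats) * false_positive_rate
    return true_hits / (true_hits + false_hits)

POPULATION = 290_000_000   # hypothetical screened population
THREATS = 1_000            # hypothetical number of real threats

alarms = expected_false_alarms(POPULATION, 0.01)
precision = flagged_precision(POPULATION, THREATS, 0.01, 0.99)

print(f"false alarms: {alarms:,.0f}")
print(f"share of flagged people who are real threats: {precision:.4%}")
```

Because real threats are so rare compared with the number of people screened, nearly everyone the system flags is innocent, even though it is right 99 percent of the time.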
Because no system can have a success rate of 100 percent, analysts can try to
decrease the likelihood that surveillance databases will identify blameless
people as possible terrorists. By making the criteria for flagging suspects more
stringent, officials can raise the bar, and fewer ordinary citizens will be
wrongly fingered. Inevitably, however, that will also mean that the "borderline"
terrorists, those who don't match all the search criteria but still have lethal
intentions, might be overlooked as well. For both types of error, the potential
consequences are alarming.
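The trade-off between the two kinds of error can be seen in a toy example. The suspicion scores and labels below are invented for illustration; no real system or data set is implied.

```python
def count_errors(scored_people, threshold):
    """Return (false_positives, false_negatives) for a flagging threshold.

    `scored_people` is a list of (score, is_threat) pairs; anyone whose
    score is at or above `threshold` is flagged as a suspect.
    """
    false_positives = sum(
        1 for score, is_threat in scored_people
        if score >= threshold and not is_threat
    )
    false_negatives = sum(
        1 for score, is_threat in scored_people
        if score < threshold and is_threat
    )
    return false_positives, false_negatives

# Hypothetical (suspicion score, is actually a threat) pairs
people = [(0.2, False), (0.4, False), (0.6, False), (0.7, True),
          (0.8, False), (0.9, True)]

# Raising the bar flags fewer innocents but misses more real threats.
for threshold in (0.5, 0.75, 0.95):
    print(threshold, count_errors(people, threshold))
```

In this toy data, no threshold removes both kinds of error at once: each step that clears an innocent person also lets a borderline threat slip through.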
Yet none of these concerns will stop the
growth of surveillance, says Ben Shneiderman, a computer scientist at the
University of Maryland. Its potential benefits are simply too large. An example
is what Shneiderman, in his recent book Leonardo's Laptop: Human Needs and the
New Computing Technologies, calls the World
Wide Med: a global, unified database that makes every patient's complete medical
history instantly available to doctors through the Internet, replacing today's
scattered sheaves of paper records (see "Paperless Medicine," p. 58). "The
idea," he says, "is that if you're brought to an ER anywhere in the world, your
medical records pop up in 30 seconds." Similar programs are already
coming into existence. Backed by the Centers for Disease Control and Prevention,
a team based at Harvard Medical School is planning to monitor the records of 20
million walk-in hospital patients throughout the United States for clusters of
symptoms associated with bioterror agents. Given the huge number of lost or
confused medical records, the benefits of such plans are clear. But because doctors would be continually adding
information to medical histories, the system would be monitoring patients' most
intimate personal data.
The network, therefore, threatens to violate patient confidentiality on a global
scale.
In Shneiderman's view, such trade-offs are
inherent to surveillance. The collective by-product of thousands of
unexceptionable, even praiseworthy efforts to gather data could be something
nobody wants: the demise of privacy. "These networks are growing much faster
than people realize," he says. "We need to pay attention to what we're doing
right now."
In The Conversation, surveillance expert Harry Caul is
forced to confront the trade-offs of his profession directly. The conversation
in Union Square provides information that he uses to try to stop a murder.
Unfortunately, his faulty interpretation of its meaning prevents him from
averting tragedy. Worse still, we see in scene after scene that even the expert
snoop is unable to avoid being monitored and recorded. At the movie's intense,
almost wordless climax, Caul rips his home apart in a futile effort to find the
electronic bugs that are hounding him.
The Conversation foreshadowed a
view now taken by many experts: surveillance cannot be stopped. There
is no possibility of "opting out." The question instead is how to use
technology, policy, and shared societal values to guide the spread of
surveillance--by the government, by corporations, and perhaps most of all by our
own unwitting and enthusiastic participation--while limiting its downside.
Next month: how surveillance technology is changing our definition of
privacy and why the keys to preserving it may be in the technology
itself.