Surveillance Nation
From MIT’s Technology Review, April 2003
  
  Route 9 is an old two-lane 
highway that cuts across Massachusetts from Boston in the east to Pittsfield in 
the west. Near the small city of Northampton, the highway crosses the wide 
Connecticut River. The Calvin Coolidge Memorial Bridge, named after the 
president who once served as Northampton's mayor, is a major regional traffic 
link. When the state began a long-delayed and still-ongoing reconstruction of 
the bridge in the summer of 2001, traffic jams stretched for kilometers into the 
bucolic New England countryside.
  In a project aimed at alleviating 
drivers' frustration, the University of Massachusetts Transportation Center, 
located in nearby Amherst, installed eight shoe-size digital surveillance 
cameras along the roads leading to the bridge. Six are mounted on utility poles 
and the roofs of local businesses. Made by Axis Communications in Sweden, they 
are connected to dial-up modems and transmit images of the roadway before them 
to a Web page, which commuters can check for congestion before tackling the 
road. According to Dan Dulaski, the system's technical manager, running the 
entire webcam system--power, phone, and Internet fees--costs just $600 a 
month.
  The other two cameras in the Coolidge Bridge project are a 
little less routine. Built by Computer Recognition Systems in Wokingham, 
England, with high-quality lenses and fast shutter speeds (1/10,000 second), 
they are designed to photograph every car and truck that passes by. Located 
eight kilometers apart, at the ends of the zone of maximum traffic congestion, 
the two cameras send vehicle images to attached computers, which use special 
character-recognition software to decipher vehicle license plates. The license 
data go to a server at the company's U.S. office in Cambridge, MA, about 130 
kilometers away. As each license plate passes the second camera, the server 
ascertains the time difference between the two readings. The average of the 
travel durations of all successfully matched vehicles defines the likely travel 
time for crossing the bridge at any given moment, and that information is posted 
on the traffic watch Web page.
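  The matching step described above can be sketched in a few lines of Python. This is only an illustration, not the vendor's software: the function name, the dictionary layout, the sample plates, and the one-hour sanity filter are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from statistics import mean


def average_travel_time(first_reads, second_reads, max_gap=timedelta(hours=1)):
    """Average travel time over plates seen at both cameras.

    first_reads and second_reads map a plate string to the time it was read
    at the first and second camera. Plates seen at only one camera
    (misreads, vehicles that turned off) are ignored.
    """
    durations = []
    for plate, t_second in second_reads.items():
        t_first = first_reads.get(plate)
        if t_first is None:
            continue  # no match at the first camera
        gap = t_second - t_first
        # Discard negative or implausibly long gaps (assumed sanity filter).
        if timedelta(0) < gap <= max_gap:
            durations.append(gap.total_seconds())
    if not durations:
        return None
    return timedelta(seconds=mean(durations))


t0 = datetime(2003, 4, 1, 8, 0, 0)
camera_a = {
    "123ABC": t0,
    "456DEF": t0 + timedelta(minutes=1),
    "789GHI": t0 + timedelta(minutes=2),  # never reaches camera B
}
camera_b = {
    "123ABC": t0 + timedelta(minutes=10),
    "456DEF": t0 + timedelta(minutes=13),
    "000XYZ": t0 + timedelta(minutes=14),  # misread or joined mid-route
}
print(average_travel_time(camera_a, camera_b))  # 0:11:00
```

  Only the two plates read at both ends contribute, so the posted figure is the mean of a 10-minute and a 12-minute trip. Note that nothing in the calculation requires keeping the plate numbers once a match has been made.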
  To local residents, the traffic data 
are helpful, even vital: police use the information to plan emergency routes. 
But as the computers calculate traffic flow, 
they are also making a record of all cars that cross the bridge--when they do so, 
their average speed, and (depending on lighting and weather conditions) how many 
people are in each car.
  Trying to avoid provoking privacy 
fears, Keith Fallon, a Computer Recognition Systems project engineer, says, 
"We're not saving any of the information we capture. Everything is deleted 
immediately." But the company could change its mind and start saving the data at 
any time. No one on the road would know.
Road tools
  
Web-accessible video cameras installed near Northampton, MA, by the University 
of Massachusetts Transportation Center overlook the Calvin Coolidge Memorial 
Bridge on Route 9. Two additional cameras photograph individual cars crossing 
the bridge and send the images to computers that isolate plates and use machine 
vision algorithms to read the plate numbers. Once a plate has passed both 
cameras, the car’s travel time is computed.
 
 The Coolidge 
Bridge is just one of thousands of locations around the planet where citizens 
are crossing--willingly, more often than not--into a world of networked, highly 
computerized surveillance. According to a January report by J.P. Freeman, a 
security market-research firm in Newtown, CT, 26 million surveillance cameras have already been 
installed worldwide, and more than 11 million of them are in the United States. 
In heavily monitored London, England, Hull University criminologist Clive Norris 
has estimated, the average person is filmed by more than 300 cameras each 
day.
  The $150 million-a-year remote digital-surveillance-camera 
market will grow, according to Freeman, at an annual clip of 40 to 50 
percent for the next 10 years. But astonishingly, other, nonvideo forms of 
monitoring will increase even faster. In a process that mirrors the unplanned 
growth of the Internet itself, thousands of personal, commercial, medical, 
police, and government databases and monitoring systems will intersect and 
entwine. Ultimately, surveillance will become so ubiquitous, networked, and 
searchable that unmonitored public space will effectively cease to 
exist.
  This prospect--what science fiction writer David Brin calls "the 
transparent society"--may sound too distant to be worth thinking about. But even 
the farsighted Brin underestimated how quickly technological advances--more 
powerful microprocessors, faster network transmissions, larger hard drives, 
cheaper electronics, and more sophisticated and powerful software--would make 
universal surveillance possible.
  It's not all about Big Brother or Big 
Business, either. Widespread electronic scrutiny is usually denounced as a 
creation of political tyranny or corporate greed. But the rise of omnipresent 
surveillance will be driven as much by ordinary citizens' understandable--even 
laudatory--desires for security, control, and comfort as by the imperatives of 
business and government. "Nanny cams," global-positioning locators, police and 
home security networks, traffic jam monitors, medical-device radio-frequency 
tags, small-business webcams: the list of monitoring devices employed by and for 
average Americans is already long, and it will only become longer. Extensive 
surveillance, in short, is coming into being because people like and want 
it.
  "Almost all of the pieces for a surveillance society are already 
here," says Gene Spafford, director of Purdue University's Center for Education 
and Research in Information Assurance and Security. "It's just a matter of 
assembling them." Unfortunately, he says, 
ubiquitous surveillance faces intractable social and technological problems that 
could well reduce its usefulness or even make it dangerous. As a result, each 
type of monitoring may be beneficial in itself, at least for the people who put 
it in place, but the 
collective result could be calamitous.
  To begin with, 
surveillance data from multiple sources are being combined into large databases. 
For example, businesses track employees' car, 
computer, and telephone use to evaluate their job performance; similarly, the 
U.S. Defense Department's experimental Total Information Awareness project has 
announced plans to sift through information about millions of people to find 
data that identify criminals and terrorists.
  But many of 
these merged pools of data are less reliable than small-scale, localized 
monitoring efforts; big databases are harder to comb for bad entries, and their 
conclusions are far more difficult to verify. In addition, the inescapable 
nature of surveillance can itself create alarm, even among its beneficiaries. 
"Your little camera network may seem like a good idea to you," Spafford 
says. "Living with everyone else's could be a 
nightmare."
THE SURVEILLANCE AD-HOCRACY
  Last 
October deadly snipers terrorized Washington, DC, and the surrounding suburbs, 
killing 10 people. For three long weeks, law enforcement agents seemed helpless 
to stop the murderers, who struck at random and then vanished into the area's 
snarl of highways. Ultimately, two alleged killers were arrested, but only 
because their taunting messages to the authorities had inadvertently provided 
clues to their identification.
  In the not-too-distant future, 
according to advocates of policing technologies, such unstoppable rampages may 
become next to impossible, at least in populous areas. By combining police 
cameras with private camera networks like that on Route 9, video coverage will 
become so complete that any snipers who waged an attack--and all the people near 
the crime scene--would be trackable from camera to camera until they could be 
stopped and interrogated.
  The 
unquestionable usefulness and sheer affordability of these extensive 
video-surveillance systems suggest that they will propagate rapidly. But 
despite the relentlessly increasing capabilities of such systems, video 
monitoring is still but a tiny part--less than 1 percent--of surveillance 
overall, says Carl Botan, a Purdue center researcher who has studied this 
technology for 15 years.
  Examples are legion. By 2006, for instance, 
law will require that every U.S. cell phone be designed to report its precise 
location during a 911 call; wireless carriers plan to use the same technology to 
offer 24-hour location-based services, including tracking of people and 
vehicles. To prevent children from wittingly or unwittingly calling up porn 
sites, the Seattle company N2H2 provides Web filtering and monitoring services 
for 2,500 schools serving 16 million students. More than a third of all large 
corporations electronically review the computer files used by their employees, 
according to a recent American Management Association survey. Seven of the 10 
biggest supermarket chains use discount cards to monitor customers' shopping 
habits: tailoring product offerings to customers' wishes is key to survival in 
that brutally competitive business. And as part of a new, federally mandated 
tracking system, the three major U.S. automobile manufacturers plan to put 
special radio transponders known as radio frequency identification tags in every 
tire sold in the nation. Far exceeding congressional requirements, according to 
a leader of the Automotive Industry Action Group, an industry think tank, the 
tags can be read on vehicles going as fast as 160 kilometers per hour from a 
distance of 4.5 meters.
  Many if not most of today's surveillance 
networks were set up by government and big business, but in years to come 
individuals and small organizations will set the pace of growth. Future sales of 
Net-enabled surveillance cameras, in the view of Fredrik Nilsson, Axis 
Communications' director of business development, will be driven by 
organizations that buy more than eight but fewer than 30 cameras--condo 
associations, church groups, convenience store owners, parent-teacher 
associations, and anyone else who might like to check what is happening in one 
place while he is sitting in another. A dozen companies already help working 
parents monitor their children's nannies and day-care centers from the office; 
scores more let them watch backyards, school buses, playgrounds, and their own 
living rooms. Two new startups--Wherify Wireless in Redwood Shores, CA, and 
Peace of Mind at Light Speed in Westport, CT--are introducing bracelets and other 
portable devices that continuously beam locating signals to satellites so that 
worried moms and dads can always find their children.
  As thousands of 
ordinary people buy monitoring devices and services, the unplanned result will 
be an immense, overlapping grid of surveillance systems, created unintentionally 
by the same ad-hocracy that caused the Internet to explode. Meanwhile, the 
computer networks on which monitoring data are stored and manipulated continue 
to grow faster, cheaper, smarter, and able to store information in greater 
volume for longer times. Ubiquitous digital surveillance will marry widespread 
computational power--with startling results.
  The factors driving the 
growth of computing potential are well known. Moore's law--which roughly equates 
to the doubling of processor speed every 18 months--seems likely to continue its 
famous march. Hard drive capacity is rising even faster. It has doubled every 
year for more than a decade, and this should go on "as far as the eye can see," 
according to Robert M. Wise, director of product marketing for the desktop 
product group at Maxtor, a hard drive manufacturer. Similarly, according to a 
2001 study by a pair of AT&T Labs researchers, network transmission capacity 
has more than doubled annually for the last dozen years, a tendency that should 
continue for at least another decade and will keep those powerful processors and 
hard drives well fed with fresh data.
  Today a company or agency with a 
$10 million hardware budget can buy processing power equivalent to 2,000 
workstations, two petabytes of hard drive space (two million gigabytes, or 
50,000 standard 40-gigabyte hard drives like those found on today's PCs), and a 
two-gigabit Internet connection (more than 2,000 times the capacity of a typical 
home broadband connection). If current trends continue, simple arithmetic 
predicts that in 20 years the same purchasing power will buy the processing 
capability of 10 million of today's workstations, 200 exabytes (200 million 
gigabytes) of storage capacity, and 200 exabits (200 million megabits) of 
bandwidth. Another way of saying this is that by 2023 large organizations will 
be able to devote the equivalent of a contemporary PC to monitoring every single 
one of the 330 million people who will then be living in the United 
States.
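  The "simple arithmetic" is compound doubling. The sketch below uses the doubling periods quoted in the preceding paragraphs (18 months for processors, 12 months for storage and bandwidth). Treating those periods as exact is an assumption, and because a small change in the period shifts the 20-year result by orders of magnitude, the output is illustrative rather than a check on the rounded figures in the text.

```python
def growth_factor(years, doubling_months):
    """Multiplier after `years` if capacity doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


YEARS = 20
# Doubling periods taken from the text; treating them as exact is an assumption.
processor = growth_factor(YEARS, 18)   # Moore's law, roughly
storage = growth_factor(YEARS, 12)     # hard drive capacity
bandwidth = growth_factor(YEARS, 12)   # network capacity

print(f"processing: x{processor:,.0f}")
print(f"storage:    x{storage:,.0f}")
print(f"bandwidth:  x{bandwidth:,.0f}")
```

  Multiplying each factor by today's baseline (2,000 workstations, two petabytes, two gigabits) gives the projected capacity for the same budget under these assumptions.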
  One of the first applications for this combination of 
surveillance and computational power, says Raghu Ramakrishnan, a database 
researcher at the University of Wisconsin-Madison, will be continuous intensive 
monitoring of buildings, offices, and stores: the spaces where middle-class 
people spend most of their lives. Surveillance in the workplace is common now: 
in 2001, according to the American Management Association survey, 77.7 percent of major U.S. 
corporations electronically monitored their employees, and that statistic 
had more than doubled since 1997 (see "Eye on Employees," p. 39). But much more 
is on the way. Companies like Johnson Controls and Siemens, Ramakrishnan says, 
are already "doing simplistic kinds of 'asset tracking,' as they call it." They 
use radio frequency identification tags to monitor the locations of people as 
well as inventory. In January, Gillette began attaching such tags to 500 million 
of its Mach 3 Turbo razors. Special "smart shelves" at Wal-Mart stores will 
record the removal of razors by shoppers, thereby alerting stock clerks whenever 
shelves need to be refilled--and effectively transforming Gillette customers into 
walking radio beacons. In the future, such tags will be used by hospitals to 
ensure that patients and staff maintain quarantines, by law offices to keep 
visitors from straying into rooms containing clients' confidential papers, and 
in kindergartens to track toddlers.
  By employing multiple, overlapping 
types of monitoring, Ramakrishnan says, managers will be able to "keep track of 
people, objects, and environmental levels throughout a whole complex." 
Initially, these networks will be installed for "such mundane things as trying 
to figure out when to replace the carpets or which areas of lawn get the most 
traffic so you need to spread some grass seed preventively." But as computers 
and monitoring equipment become cheaper and more powerful, managers will use 
surveillance data to construct complex, multidimensional records of how spaces 
are used. The models will be analyzed to improve efficiency and security--and 
they will be sold to other businesses or governments. Over time, the thousands 
of individual monitoring schemes inevitably will merge together and feed their 
data into large commercial and state-owned networks. When surveillance databases 
can describe or depict what every individual is doing at a particular time, 
Ramakrishnan says, they will be providing humankind with the digital equivalent 
of an ancient dream: being "present, in effect, almost anywhere and 
anytime."
GARBAGE IN, GARBAGE OUT
  In 1974 Francis 
Ford Coppola wrote and directed The Conversation, which starred Gene Hackman as 
Harry Caul, a socially maladroit surveillance expert. In this remarkably 
prescient movie, a mysterious organization hires Caul to record a quiet 
discussion that will take place in the middle of a crowd in San Francisco's 
Union Square. Caul deploys three microphones: one in a bag carried by a 
confederate and two directional mikes installed on buildings overlooking the 
area. Afterward Caul discovers that each of the three recordings is plagued by 
background noise and distortions, but by combining the different sources, he is 
able to piece together the conversation. Or, rather, he thinks he has pieced it 
together. Later, to his horror, Caul learns that he misinterpreted a crucial 
line, a discovery that leads directly to the movie's chilling 
denouement.
  The Conversation illustrates a central dilemma for 
tomorrow's surveillance society. Although much of the explosive growth in 
monitoring is being driven by consumer demand, that growth has not yet been 
accompanied by solutions to the classic difficulties computer systems have 
integrating disparate sources of information and arriving at valid conclusions. 
Data quality problems that cause little inconvenience on a local scale--when 
Wal-Mart's smart shelves misread a razor's radio frequency identification 
tag--have much larger consequences when organizations assemble big databases from 
many sources and attempt to draw conclusions about, say, someone's capacity for 
criminal action. Such problems, in the long run, will play a large role in 
determining both the technical and social impact of surveillance.
  The experimental and controversial Total 
Information Awareness program of the Defense Advanced Research Projects Agency 
exemplifies these issues. By merging records from corporate, medical, retail, 
educational, travel, telephone, and even veterinary sources, as well as such 
"biometric" data as fingerprints, iris and retina scans, DNA tests, and 
facial-characteristic measurements, the program is intended to create an 
unprecedented repository of information about both U.S. citizens and foreigners 
with U.S. contacts. Program director John M. Poindexter has explained 
that analysts will use custom data-mining techniques to sift through the mass of 
information, attempting to "detect, classify and identify foreign terrorists" in 
order to "preempt and defeat terrorist acts"--a virtual Eye of Sauron, in 
critics' view, constructed from telephone bills and shopping preference 
cards.
  In February Congress required 
the Pentagon to obtain its specific approval before implementing Total 
Information Awareness in the United States (though certain actions are allowed 
on foreign soil). But President George W. Bush had already announced that 
he was creating an apparently similar effort, the Terrorist Threat Integration 
Center, to be led by the Central Intelligence Agency. Regardless of the fate of 
these two programs, other equally sweeping attempts to pool monitoring data are 
proceeding apace. Among these initiatives is Regulatory DataCorp, a for-profit 
consortium of 19 top financial institutions worldwide. The consortium, which was 
formed last July, combines members' customer data in an effort to combat "money 
laundering, fraud, terrorist financing, organized crime, and corruption." By 
constantly poring through more than 20,000 sources of public information about 
potential wrongdoings, from newspaper articles and Interpol warrants to 
disciplinary actions by the U.S. Securities and Exchange Commission, the 
consortium's Global Regulatory Information Database will, according to its 
owner, help clients "know their customers."
  Equally important in the 
long run are the databases that will be created by the nearly spontaneous 
aggregation of scores or hundreds of smaller databases. "What seem to be 
small-scale, discrete systems end up being combined into large databases," says Marc 
Rotenberg, executive director of the Electronic Privacy Information Center, a 
nonprofit research organization in Washington, DC. He points to the recent, 
voluntary efforts of merchants in Washington's affluent Georgetown district. 
They are integrating their in-store closed-circuit television networks and making 
the combined results available to city police. In Rotenberg's view, the 
collection and consolidation of individual surveillance networks into big 
government and industry programs "is a strange mix of public and private, and 
it's not something that the legal system has encountered much before."
  
Managing the sheer size of these aggregate surveillance databases, surprisingly, 
will not pose insurmountable technical difficulties. Most personal data are 
either very compact or easily compressible. Financial, medical, and shopping 
records can be represented as strings of text that are easily stored and 
transmitted; as a general rule, the records do not grow substantially over 
time.
  Even biometric records are no strain on computing systems. To 
identify people, genetic testing firms typically need stretches of DNA that can 
be represented in just one kilobyte--the size of a short email message. 
Fingerprints, iris scans, and other types of biometric data consume little more. 
Other forms of data can be preprocessed in much the way that the cameras on 
Route 9 transform multimegabyte images of cars into short strings of text with 
license plate numbers and times. (For investigators, having a video of suspects 
driving down a road usually is not as important as simply knowing that they were 
there at a given time.) To create a digital dossier for every individual in the 
United States--as programs like Total Information Awareness would require--only 
"a couple terabytes of well-defined information" would be needed, says 
Jeffrey Ullman, a former Stanford University database researcher. "I don't think 
that's really stressing the capacity of [even today's] databases."
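  A back-of-the-envelope calculation shows why "a couple terabytes" is plausible. Both inputs below are assumptions chosen for illustration, not figures from Ullman or the article: a population of roughly 290 million and about 8 kilobytes of compact text and biometric data per person.

```python
def dossier_storage_terabytes(people, bytes_per_person):
    """Total storage in terabytes (10**12 bytes) for one record per person."""
    return people * bytes_per_person / 10**12


# Both inputs are illustrative assumptions, not figures from the article.
US_POPULATION = 290_000_000      # rough 2003 population
BYTES_PER_PERSON = 8_000         # text records plus ~1 KB of biometric data

print(dossier_storage_terabytes(US_POPULATION, BYTES_PER_PERSON))  # 2.32
```

  Even a tenfold larger record per person would stay within the two petabytes that, by the article's earlier estimate, a $10 million hardware budget buys today.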
  
Instead, argues Rajeev Motwani, another member of Stanford's database group, the 
real challenge for large surveillance databases will be the seemingly simple 
task of gathering valid data. Computer scientists use the term GIGO--garbage in, 
garbage out--to describe situations in which erroneous input creates erroneous 
output. Whether people are building bombs or buying bagels, governments and 
corporations try to predict their behavior by integrating data from sources as 
disparate as electronic toll-collection sensors, library records, restaurant 
credit-card receipts, and grocery store customer cards--to say nothing of the 
Internet, surely the world's largest repository of personal information. 
Unfortunately, all these sources are full of errors, as are financial and 
medical records. Names are misspelled and digits transposed; address and e-mail 
records become outdated when people move and switch Internet service providers; 
and formatting differences among databases cause information loss and distortion 
when they are merged. "It is routine to find in large customer databases 
defective records--records with at least one major error or omission--at rates of 
at least 20 to 35 percent," says Larry English of Information Impact, a database 
consulting company in Brentwood, TN.
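  A toy example makes the formatting problem concrete. The records, field names, and the normalize and merge helpers below are hypothetical, written only to show how two sources that describe the same people fail to line up until their name formats are reconciled.

```python
import re


def normalize(name):
    """Lowercase, strip punctuation, and collapse whitespace in a name."""
    name = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(name.split())


def merge(records, key=lambda r: r["name"]):
    """Merge records that share a key; later non-empty fields win."""
    merged = {}
    for rec in records:
        entry = merged.setdefault(key(rec), {})
        entry.update({k: v for k, v in rec.items() if v})
    return list(merged.values())


# Hypothetical records from two sources describing the same two people.
toll_db = [{"name": "Smith, John", "plate": "123ABC", "zip": ""},
           {"name": "Mary  Jones", "plate": "456DEF", "zip": "01060"}]
store_db = [{"name": "smith john", "plate": "", "zip": "01002"},
            {"name": "Mary Jones", "plate": "", "zip": "01060"}]

naive = merge(toll_db + store_db)
cleaned = merge(toll_db + store_db, key=lambda r: normalize(r["name"]))
print(len(naive), len(cleaned))  # 4 2
```

  Matching on the raw name yields four "people" where there are two. Normalizing fixes that here, but it would not catch a misspelling or a transposed digit, and a more aggressive rule could just as easily fuse two different John Smiths into one record.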
  Unfortunately, says Motwani, 
"data cleaning is a major open problem in the research community. We are still 
struggling to get a formal technical definition of the problem." Even when the original data are 
correct, he argues, merging them can introduce errors where none had existed 
before. Worse, none of these worries about the garbage going into the 
system even begin to address the still larger problems with the garbage going 
out.
THE DISSOLUTION OF PRIVACY
  Almost every 
computer-science student takes a course in algorithms. Algorithms are sets of 
specified, repeatable rules or procedures for accomplishing tasks such as 
sorting numbers; they are, so to speak, the engines that make programs run. 
Unfortunately, innovations in algorithms are not subject to Moore's law, and 
progress in the field is notoriously sporadic. "There are certain areas in 
algorithms we basically can't do better and others where creative work will have 
to be done," Ullman says. Sifting through large surveillance databases for 
information, he says, will essentially be "a problem in research in algorithms. 
We need to exploit some of the stuff that's been done in the data-mining 
community recently and do it much, much better."
  Working with 
databases requires users to have two mental models. One is a model of the data. 
Teasing out answers to questions from the popular search engine Google, for 
example, is easier if users grasp the varieties and types of data on the 
Internet--Web pages with words and pictures, whole documents in a multiplicity 
of formats, downloadable software and media files--and how they are stored. In 
exactly the same way, extracting information from surveillance databases will 
depend on a user's knowledge of the system. "It's a chess game," Ullman says. 
"An unusually smart analyst will get things that a not-so-smart one will 
not."
  Second, and more important according to Spafford, effective use 
of big surveillance databases will depend on having a model of what one is 
looking for. This factor is especially crucial, he says, when trying to predict 
the future, a goal of many commercial and government projects. For this reason, 
what might be called reactive searches that scan recorded data for specific 
patterns are generally much more likely to obtain useful answers than proactive 
searches that seek to get ahead of things. If, for instance, police in the 
Washington sniper investigation had been able to tap into a pervasive network of 
surveillance cameras, they could have tracked people seen near the crime scenes 
until they could be stopped and questioned: a reactive process. But it is 
unlikely that police would have been helped by proactively asking surveillance 
databases for the names of people in the Washington area with the requisite 
characteristics (family difficulties, perhaps, or military training and a recent 
penchant for drinking) to become snipers.
  In many cases, invalid answers are harmless. 
If Victoria's Secret mistakenly mails 1 percent of its spring catalogs to people 
with no interest in lingerie, the price paid by all parties is 
small. But if a national 
terrorist tracking system has the same 1 percent error rate, it will produce 
millions of false alarms, wasting huge amounts of investigators' time 
and, worse, labeling many 
innocent U.S. citizens as suspects. "A 99 percent hit rate is great for 
advertising," Spafford says, "but terrible for spotting terrorism."
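  The arithmetic behind Spafford's point can be worked through directly. The population, the number of real targets, and the symmetric 1 percent error rate below are assumptions picked for illustration; the article gives only the error rate.

```python
def screening_outcome(population, true_targets, false_positive_rate, hit_rate):
    """Return (false alarms, true hits, share of flags that are real)."""
    innocent = population - true_targets
    false_alarms = innocent * false_positive_rate
    true_hits = true_targets * hit_rate
    precision = true_hits / (true_hits + false_alarms)
    return false_alarms, true_hits, precision


# Illustrative assumptions: 290 million people, 1,000 actual targets,
# and a system that is wrong about 1 percent of the time in both directions.
false_alarms, true_hits, precision = screening_outcome(
    population=290_000_000,
    true_targets=1_000,
    false_positive_rate=0.01,
    hit_rate=0.99,
)
print(f"{false_alarms:,.0f} false alarms, {true_hits:,.0f} true hits")
print(f"{precision:.4%} of flagged people are real targets")
```

  Under these assumptions the system flags roughly 2.9 million innocent people in order to find fewer than a thousand real targets, so well under one flag in a thousand is correct.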
  
Because no system can have a success rate of 100 percent, analysts can try to 
decrease the likelihood that surveillance databases will identify blameless 
people as possible terrorists. By making the criteria for flagging suspects more 
stringent, officials can raise the bar, and fewer ordinary citizens will be 
wrongly fingered. Inevitably, however, that will mean also that the "borderline" 
terrorists, those who don't match all the search criteria but still have lethal 
intentions, might be overlooked as well. For both types of error, the potential 
consequences are alarming.
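  The trade-off can be seen with a handful of made-up numbers. The suspicion scores below are invented for illustration, and errors_at is a hypothetical helper; the point is only that moving the cutoff trades one kind of error for the other.

```python
# Hypothetical suspicion scores (0 to 1); the labels are invented for illustration.
innocent_scores = [0.05, 0.10, 0.20, 0.35, 0.50, 0.65, 0.80]
threat_scores = [0.45, 0.70, 0.90]


def errors_at(threshold):
    """Count innocents flagged and real threats missed at a given cutoff."""
    false_alarms = sum(score >= threshold for score in innocent_scores)
    missed = sum(score < threshold for score in threat_scores)
    return false_alarms, missed


for threshold in (0.3, 0.6, 0.85):
    false_alarms, missed = errors_at(threshold)
    print(f"cutoff {threshold}: {false_alarms} false alarms, {missed} missed")
```

  Raising the cutoff from 0.3 to 0.85 drops the false alarms from four to zero while the missed threats climb from zero to two. No cutoff on this data eliminates both.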
  Yet none of these concerns will stop the 
growth of surveillance, says Ben Shneiderman, a computer scientist at the 
University of Maryland. Its potential benefits are simply too large. An example 
is what Shneiderman, in his recent book Leonardo's Laptop: Human Needs and the 
New Computing Technologies, calls the World 
Wide Med: a global, unified database that makes every patient's complete medical 
history instantly available to doctors through the Internet, replacing today's 
scattered sheaves of paper records (see "Paperless Medicine," p. 58). "The 
idea," he says, "is that if you're brought to an ER anywhere in the world, your 
medical records pop up in 30 seconds." Similar programs are already 
coming into existence. Backed by the Centers for Disease Control and Prevention, 
a team based at Harvard Medical School is planning to monitor the records of 20 
million walk-in hospital patients throughout the United States for clusters of 
symptoms associated with bioterror agents. Given the huge number of lost or 
confused medical records, the benefits of such plans are clear. But because doctors would be continually adding 
information to medical histories, the system would be monitoring patients' most 
intimate personal data. 
The network, therefore, threatens to violate patient confidentiality on a global 
scale.
  In Shneiderman's view, such trade-offs are 
inherent to surveillance. The collective by-product of thousands of 
unexceptionable, even praiseworthy efforts to gather data could be something 
nobody wants: the demise of privacy. "These networks are growing much faster 
than people realize," he says. "We need to pay attention to what we're doing 
right now."
  In The Conversation, surveillance expert Harry Caul is 
forced to confront the trade-offs of his profession directly. The conversation 
in Union Square provides information that he uses to try to stop a murder. 
Unfortunately, his faulty interpretation of its meaning prevents him from 
averting tragedy. Worse still, we see in scene after scene that even the expert 
snoop is unable to avoid being monitored and recorded. At the movie's intense, 
almost wordless climax, Caul rips his home apart in a futile effort to find the 
electronic bugs that are hounding him.
  The Conversation foreshadowed a 
view now taken by many experts: surveillance cannot be stopped. There 
is no possibility of "opting out." The question instead is how to use 
technology, policy, and shared societal values to guide the spread of 
surveillance--by the government, by corporations, and perhaps most of all by our 
own unwitting and enthusiastic participation--while limiting its downside. 
Next month: how surveillance technology is changing our definition of 
privacy and why the keys to preserving it may be in the technology 
itself.