Database privacy. Maureen Mitchell and her husband discovered
the hard way how un-private databases can be. In March 2000, their
identities
were stolen over the Internet, and the thieves made
large purchases, procured bank loans, and opened credit accounts in
their name. Although they had never conducted financial transactions
on the Web and always carefully guarded their credit data, their data
was available online.
Computerized databases store a great deal of personal and financial
information, including names, addresses, credit card numbers and
Social Security numbers, as well as individuals' employers. In these
days of ubiquitous Web services, a lot of this information is
available to anyone with a Web browser. Because of the Web's
anonymous nature, it is just as easy for a thief to put that information
to fraudulent use, for example by making purchases online with someone
else's credit card number.
Let's take an example of an item widely used, but important
to safeguard: your Social Security number. A Social Security number is often
used as identification, especially for financial transactions over
the phone. Anyone who knows your name and discovers your Social
Security number can impersonate you. Since SSNs are (unfortunately)
used as a key in
so many databases, by giving yours out, you provide access not only to
the requested information, but also to a lot of other information as
well.
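To see why a shared key is so dangerous, consider the following minimal Python sketch. The three "databases," the SSN, and every record in them are invented for illustration; the only point is that anyone who holds the key can join otherwise unrelated records into one profile.

```python
# Three unrelated, made-up databases that all happen to be keyed by SSN.
# The SSN (area number 000 is never issued) and all records are fictitious.
bank = {"000-12-3456": {"name": "Pat Doe", "account": "12-3456"}}
clinic = {"000-12-3456": {"last_visit": "2000-08-02"}}
employer = {"000-12-3456": {"employer": "Acme Corp."}}


def build_dossier(ssn, *databases):
    """Merge every record filed under the same key into one profile."""
    dossier = {}
    for db in databases:
        dossier.update(db.get(ssn, {}))
    return dossier


profile = build_dossier("000-12-3456", bank, clinic, employer)
print(profile)
```

Each database on its own reveals little; it is the common key that turns three small disclosures into one large one.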
Many businesses freely share consumers' Social Security numbers and
other private information. Meanwhile, law enforcement rarely prosecutes
crimes such as cybertheft because the criminal is so hard to locate,
leaving victims to repair the damage themselves. The
Internet has at least made more people aware of privacy issues, as
consumers recognize the danger of companies opening up databases of
private information for others to use.
Social Security numbers are not the only personal information that can
hurt you if it falls into the wrong hands. Enough
personal information is available on the Web to keep an army of
stalkers supplied for years to come. Databases were a problem long before
the Web. In most states, it used to be possible to find out the home
address of the owner of a car, given a license-plate number. In 1989,
actress Rebecca Schaeffer was murdered
by someone who found out her home address from California Department of
Motor Vehicles records. Congress
subsequently banned routine disclosure of this information.
In 1996, Lexis-Nexis announced the
P-TRAK Person Locator File, "a quick, convenient search [that]
provides up to three addresses, as well as aliases, maiden names, and
Social Security numbers." After a barrage of complaints, it
dropped
plans to offer it. This episode mirrored a similar controversy
six years earlier over the abortive plans of Lotus Development to offer
a database called
Marketplace: Households, detailing consumer buying habits for
the benefit of prospective marketers.
Sometimes even medical information may be passed along to others
without consent. Kaiser Permanente accidentally sent
hundreds of emails with sensitive personal medical information to
the wrong members on August 2, 2000. A "technological glitch" while
upgrading the company's Web site was blamed for misdirecting 858
emails from nurses and pharmacists to 19 Kaiser members. Some messages
contained subscribers' names, phone numbers, and medical account
numbers. One of the more sensitive messages was a response to a
subscriber's question about a sexually transmitted disease.
Web privacy. The World-Wide Web has brought a greater
sense of urgency to privacy concerns. Not only can the Web
disseminate information in preexisting databases, it can also
gather
new information on individuals and their habits. If you've ever
authored a Web page, sent e-mail to a Usenet list, or anonymously
ftp'd a file from a server, a record of that is available in a log
file or a search engine somewhere. Would you want a prospective
employer to see what you do privately? That information may be only a
few mouse clicks away.
Some information collected over the Web is fairly innocuous. When
you go to Netscape, for example, and look up the weather for a
particular place, they can make an inference that you live there and
show you banner ads for concerts or other local events. In the
process, they probably pass the information on to DoubleClick, which provides the
ads. Companies "buy" keywords in search engines. When you
look up a term on AltaVista, they pass the search information off to
DoubleClick. When you search for General Motors, that information is
sent to GM. If you're from a university, they probably won't
pay any attention. But if you are from the Attorney General's
office, they will know by the IP address.
Many Web sites collect personal information and pass it to
advertisers. They are generally free to sell customer data at their
own discretion because few rules prevent them from doing so. Among the
Web site owners that sell personal information are marketing
companies, retail stores, and even the federal government. For
example, anyone who registered a domain name until recently had to
provide a contact name, billing address, and phone number to Network Solutions.
Recently acquired by VeriSign,
Network Solutions is now aggressively marketing this data to direct
mailers. Although Network Solutions claims they remove email addresses
from the data and do not permit the information to be used for email
marketing, this data could be used for something other than simply
obtaining an address, privacy advocates say. And not only businesses
are at risk. Since many of the registrants are small "mom and pop"
vendors, a lot of personal data is also being disclosed.
Late in 1999, privacy concerns suddenly became more real to consumers
when DoubleClick purchased the
direct-marketing company Abacus Direct. Abacus owns a database of the
purchasing habits of 90% of American households. Clearly DoubleClick was interested in
correlating this information with consumers' on-line behavior, allowing
advertising to be targeted even better. Once the Internet bubble burst, disclosure of
such data became commonplace, as Web companies felt pressure to
find new sources of revenue.
Even the government is involved in collecting private
information. In June 2000 it was revealed that White
House drug office-operated Web sites were collecting personal data
from site visitors. The Clinton administration took steps to ensure
privacy of Internet users visiting federal government Web
sites. Privacy advocates were enraged that the government was using Web
sites to surreptitiously collect information from its citizens and
urged Congress to hold hearings. It did, and the resulting Congressional
review found 13 agencies that were secretly using technology to
track the Internet habits of visitors, and one of them provided that
information to a private company that compiled reports for the
agency.
Moreover, privacy standards established by various governments are
not strict enough to prevent Web sites from gathering personal data
that is supposed to be protected, says Anna Fielder, director of the Consumers
International Office for Developed and Transition Economies. "Privacy
is recognized as a fundamental human right, yet we've found that too
many companies collect a lot of unnecessary, very personal information
about their customers," notes Fielder, "and because of inadequate
implementation of existing government measures people don't have
control on their data."
A coalition of some of the largest Internet companies, however,
argues otherwise. The Online Privacy Alliance
contends that it is sufficient to enforce
existing laws and that new regulations could cost
90 of the largest financial institutions $17 billion, and raise
costs to consumers by $1 billion.
Putting two and two together. You've probably seen one of those
Web-generated maps showing all restaurants in a particular area. That's
an example of a mashup site.
Mashup sites raise a number of privacy problems, including the fact that
the site does not own the data it delivers, and must rely upon the accuracy
and goodwill of the sites whose information it aggregates. Perhaps more
serious, as more information is placed on the Web, a comprehensive dossier
can be built on almost any individual. Want to know how much someone paid
for their house, how often they vote, whether they have a criminal record?
All that information is available on the Web. Knowledgeable Web users
already have access, but the day is coming when it is available to anyone
who visits a mashup site.
Mashups are an example of what can be done with Web-services
programming. Web
services support computer-to-computer transactions over the Web. Web
services are still in their infancy, and many Web-services programmers do
not appreciate the security problems they raise. Symantec reported
that almost 70% of all security vulnerabilities in the last half of 2005
were associated with Web applications. If this seems a serious problem,
things are only going to get worse. The semantic Web is
designed to allow computers to analyze data on the Web and search for
information much the same way a human would. Effectively, this would turn
the Web into a giant mashup site. Although a working semantic Web is many
years off, privacy researchers are already warning of its
devastating impact on privacy.
"Google hacking." The scariest part of the story, however, may
not be the data that users consciously place on the Web, but the data that
is there by accident. Database management programs such as FileMaker Pro
provide easy ways to Web-enable databases. In some cases, novice users publish
more than they mean to. Passwords and medical information are just two
examples. Hackers are well
aware of these vulnerabilities. They hunt for pages with telltale
words or phrases indicative of private information. To prevent Webmasters
from seeing their tracks, they use the cached copies that Google maintains
to allow access when the main sites are down. This technique has become
known as "Google
hacking."
Every Webmaster should know about these attacks and be prepared to
defend against them. Recommendations include changing
default error messages so that they do not reveal what kind of a Web
server is being run, and using HTTP-vulnerability scanning tools. Include
the NOINDEX
metatag to prevent search engines from indexing a page that is not
intended for public consumption. Use a
robots.txt file to direct Web crawlers to ignore part or all
of your site. If you need to remove content that was inadvertently
placed on the Web, see these
instructions to remove it from Google's index. However, be aware
that Google is not the only search engine that crawls the Web.
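As a rough illustration of the last two recommendations, the Python sketch below checks what a well-behaved crawler would conclude from a robots.txt file and from a NOINDEX metatag. The robots.txt contents, the /private/ path, the sample page, and the helper names are all assumptions made up for this example; only the standard-library modules are real.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Example robots.txt asking all crawlers to stay out of /private/
# (the path is hypothetical).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Example page carrying the NOINDEX metatag.
PAGE = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body>Staff list</body></html>"
)


def crawler_may_fetch(robots_txt, url, agent="*"):
    """Return True if a crawler obeying robots.txt would fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class RobotsMetaFinder(HTMLParser):
    """Notes whether a page has a robots metatag whose content includes noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            if "noindex" in (attr.get("content") or "").lower():
                self.noindex = True


def has_noindex(html):
    """Return True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


print(crawler_may_fetch(ROBOTS_TXT, "http://www.example.com/private/records.html"))
print(crawler_may_fetch(ROBOTS_TXT, "http://www.example.com/index.html"))
print(has_noindex(PAGE))
```

Keep in mind that both mechanisms are requests that cooperative crawlers honor, not access controls. Anything that is truly private should not be reachable from the Web at all.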
Internet connections and privacy. The Web's infrastructure
lets advertisers gather personal data from Internet users. Internet
Protocol (IP) addresses can easily be mined for data about users' online
activities. That data is more valuable if users share personal
information on a Web-based form, letting Web sites tie the data to
user names. Cell-phone users of Sprint
PCS's wireless Internet service expose their cell-phone numbers on
the Web with each new page they visit. Remote host identifiers can
leak users' personal data, including their names and their
employers.
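A small Python sketch suggests how little effort this takes. The suffix table and the sample host names are invented for illustration, and the classification is deliberately crude; the reverse lookup uses the standard socket module and needs network access, so it is defined here but not called.

```python
import socket

# Hypothetical mapping from domain suffix to the kind of organization behind it.
SUFFIX_HINTS = {
    ".edu": "university",
    ".gov": "government agency",
    ".mil": "military",
}


def describe_host(hostname):
    """Guess what a remote host name reveals about the visitor's organization."""
    hostname = hostname.lower().rstrip(".")
    for suffix, kind in SUFFIX_HINTS.items():
        if hostname.endswith(suffix):
            return kind
    return "unknown"


def reverse_lookup(ip_address):
    """Map an IP address back to a host name, as a log analyzer might.

    Requires DNS access; returns None when no reverse record exists.
    """
    try:
        return socket.gethostbyaddr(ip_address)[0]
    except OSError:
        return None


# Made-up host names of the sort that appear in Web server logs.
for host in ("jsmith-pc.physics.example.edu", "gateway.ag.state.example.gov"):
    print(host, "->", describe_host(host))
```

Note that the first sample host name gives away not only the employer but also, quite possibly, the user's own name.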
High-speed Internet connections pose risks too. When you're
online, it's only a matter of minutes or hours until someone probes
you. If you've dialed in via a modem, you're only online for a
short while, and always at a different IP address. Therefore, hackers
have a limited window of time to find your computer and attempt to
gain access. The "always on" feature of broadband leaves an open
door for a hacker to locate your computer, test your system and
try to find a weak point in your security.
Microsoft File and Print Sharing is one of the main ways hackers
can break in. If you're on a public network, you should install a firewall
to shut off Internet access when you're not using the computer. A
firewall can be either hardware or software. A very good software
firewall is ZoneAlarm, which
is free for individual use; however, it is not as configurable as some
others. For one thing, you cannot allow cookies from some Web sites
and not from others.
Cookies. Since about 1996, one of the highest-profile Web
privacy issues has been cookies. Your browser
provides information to a server it communicates with via a cookie, a
small file in a user's filesystem that records certain information
about the user. Written to users' hard drives as they surf the Web,
cookies may hold a user ID and password to enable subsequent logins to
a Web site to proceed immediately. Cookies are frequently used to
target ads at users based on topics they have browsed. Information
from cookies can be recorded in Web server log files, so the user may
lose control over this information. It's even possible to use a
single cookie to record users' visits to many different Web sites,
opening new opportunities for targeted advertising. An example is the
list of book suggestions conveyed by Internet booksellers: "Readers
who bought this book also bought ..."
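The way a single cookie can follow a user across many sites is easier to see in code. The Python sketch below simulates a third-party ad server whose ads appear on several unrelated pages. The AdServer class, the cookie name, and the example URLs are all hypothetical; only the standard http.cookies module is real, and an actual ad network would be far more elaborate.

```python
import itertools
from http.cookies import SimpleCookie


class AdServer:
    """Hypothetical third-party ad server that tags each browser with one cookie."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.profiles = {}  # visitor id -> pages on which an ad was shown

    def serve_ad(self, cookie_header, referer):
        """Handle one ad request.

        Returns the cookie to set for a new visitor, or None if the
        browser already sent one.
        """
        jar = SimpleCookie()
        jar.load(cookie_header or "")
        if "visitor" in jar:
            visitor = jar["visitor"].value
            set_cookie = None
        else:
            visitor = f"v{next(self._ids)}"
            reply = SimpleCookie()
            reply["visitor"] = visitor
            set_cookie = reply["visitor"].OutputString()
        # The Referer header tells the ad server which page embedded the ad.
        self.profiles.setdefault(visitor, []).append(referer)
        return set_cookie


ads = AdServer()
# First ad request: no cookie yet, so the server assigns one.
cookie = ads.serve_ad(None, "http://books.example/thrillers")
# Later requests from other sites carry the same cookie back.
ads.serve_ad(cookie, "http://news.example/health")
ads.serve_ad(cookie, "http://shop.example/cribs")
print(ads.profiles)
```

None of the three sites shares data with the others, yet the ad server ends up with one browsing history for the visitor.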
In May 2000, Microsoft acknowledged an Internet
Explorer flaw that let hackers read contents of cookies on a
victim's system. E-commerce sites use cookies to track customer
purchases. Coding in a Web address can trick the user's browser into
responding to a request for a cookie as if the request came from an
authorized site. Then hackers can get personal information given to
e-commerce sites and access Web-based email accounts and information
about past browsing.
Users can opt out by setting their Web browser either not to accept
cookies or to warn them before accepting any. But this
also turns off the helpful uses of cookies. And it doesn't help Web
"newbies," who are unaware they are being tracked.
"Web Bugs Make Cookies Look Good Enough to Eat" "Web
bugs" are small scripts Web sites can use to copy information from
hard drives and ship it to third-party sites while evading nearly all
firewalls and leaving without footprints. Web bugs, or clear GIFs, are
images
embedded in HTML-enhanced commercial emails or Web page software
code that help transmit data to a remote computer when the page is
viewed. Web bugs monitor who's reading. These stealth tools build
online profiles and count the number of times a page is accessed. In
March 2001 Privacy Council CEO Gary Clayton demonstrated Web bug
technology before congressional lawmakers, who were stunned as Clayton stole
over 1,000 names and addresses from a hard drive's address book. This
is similar to commercial espionage. However, most people have no idea
Web bugs exist.
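The clear-GIF form of a Web bug is simple enough to sketch in a few lines of Python. The tracker host, the query parameter names, and both helper functions are assumptions invented for this example. The sketch covers only the image technique, which reveals who opened a page or message, and not the script-based attacks mentioned above.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical tracking host; not a real service.
TRACKER = "http://tracker.example/clear.gif"


def web_bug_tag(recipient, campaign):
    """Build an invisible 1x1 image tag whose URL identifies the reader."""
    url = TRACKER + "?" + urlencode({"r": recipient, "c": campaign})
    return f'<img src="{escape(url)}" width="1" height="1" alt="">'


def log_hit(requested_url):
    """What the tracker learns when a mail program or browser fetches the image."""
    query = parse_qs(urlsplit(requested_url).query)
    return {"recipient": query["r"][0], "campaign": query["c"][0]}


tag = web_bug_tag("pat@example.com", "spring-sale")
print(tag)
print(log_hit("http://tracker.example/clear.gif?r=pat%40example.com&c=spring-sale"))
```

Because the image is one pixel square and transparent, the reader sees nothing, while the request for it tells the tracker which recipient opened which mailing.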
Toysrus.com
was accused of using Web bugs to compile personal profiles of its
online shoppers for a marketing agency. Soon afterwards, the company
says, it ceased the practice. Recently it was alleged that the
signup page for the National Do-Not-Call Registry contained a Web
bug that reported
traffic on the site to AT&T, which has commercial reasons for being
interested in that information. Guidelines from the Privacy Foundation
call for Internet advertising companies and Web sites that use Web
bugs to display icons that indicate their presence and identify the
company harvesting data. Visitors must be allowed to opt out of Web
bug data collection. These guidelines prohibit using Web bugs to
gather data related to children, sex, medical issues, and financial or
employment matters.
In addition to cookies and web bugs, there are other ways of
tracking your Internet habits. Serial numbers embedded in processors
are one example. In 2000, Intel announced plans to phase them out
because the number could be used to track Web surfers' online
activities.
Java applets, which are run through a Web browser, can't access
files on your system unless you grant permission. However, if you
grant Java permission, Java
applets can invade your privacy by inspecting or changing files on
the client file system. These applets can use network connections to
circumvent file protections or privacy expectations.
Also, software vendors gather information about the computers their
software is running on. The vendor reads your configuration files,
checks for enough memory, and verifies that your computer has hardware
such as a video adapter. When you install software, vendors check for
any previous versions of the software being installed. In fact,
Microsoft even checks for unauthorized software during software
installation.
Proposed legislation. Privacy standards established by
government are not strict enough to stop Web sites from gathering
personal data. Few Web sites
follow standards to protect users' private personal information, and
most laws can't stop violators. More than two-thirds of
consumer-oriented sites collect personal data, and nearly all ask for
information that could easily identify the user, according to the study
of 751 Web sites by Consumers International. Consumers want better
privacy protection.
Based on these concerns, some privacy advocates want the government
to step in and stop companies from tracking users' on-line
visits. As might be expected, Web marketers
do not agree, saying the industry
itself will take steps to protect privacy since privacy worries
already make consumers wary of
buying on-line.
The World Wide Web Consortium's Platform
for Privacy Preferences Project (P3P) may offer relief from the
threat that cookies pose to consumers' online privacy. Several companies have
products supporting P3P, letting Web browsers automatically compare a
Web site's privacy policy with a user's privacy preferences. P3P will
be tested in the summer of 2000, but will be ineffective unless
adopted by many browsers and Web sites. At an interoperability session
in June 2000, client P3P products were presented by the Electronic
Network Consortium, Engage Technologies, Microsoft, and others. IE 6.0
will have this capability turned on.
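The idea of automatically comparing a site's policy with a user's preferences can be sketched as follows. This is not the P3P format: real P3P policies are XML documents with their own vocabulary, and the field names, purposes, and function below are simplified stand-ins invented for illustration.

```python
# Simplified, made-up stand-ins for a site's declared policy and a user's
# preferences. These field names are NOT part of the actual P3P vocabulary.
site_policy = {
    "purposes": {"order-fulfillment", "targeted-ads"},
    "shared_with_third_parties": True,
}
user_prefs = {
    "refused_purposes": {"targeted-ads", "telemarketing"},
    "allow_third_party_sharing": False,
}


def policy_conflicts(policy, prefs):
    """Return the reasons a policy violates the user's preferences (empty if none)."""
    conflicts = [
        f"data used for {purpose}"
        for purpose in sorted(policy["purposes"] & prefs["refused_purposes"])
    ]
    if policy["shared_with_third_parties"] and not prefs["allow_third_party_sharing"]:
        conflicts.append("data shared with third parties")
    return conflicts


problems = policy_conflicts(site_policy, user_prefs)
print(problems if problems else "policy acceptable")
```

A browser built along these lines could warn the user, or refuse the site's cookies, whenever the list of conflicts is not empty.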
Legislation has also been enacted to safeguard information in
databases and on the Web. In late 1999, President Clinton
signed legislation governing financial information sharing between
non-affiliated companies. The federal legislation permits states to go
above and beyond the privacy protections called for in the federal
legislation by passing measures of their own. Washington, California,
Vermont, and Minnesota are among a small group of states that bear
watching as the state-led privacy movement heats up.
In January 2001, the Consumer
Internet Privacy Enhancement Act was introduced, which would make
it unlawful for a commercial Web site operator to collect personally
identifiable information online from a Web site user without
notice. The bill establishes a civil penalty for violations of up to
$500,000. The bill requires all commercial websites that collect
personally identifiable information to define what types of
information are being collected, how the information will be used,
what entities are collecting the information, whether the information
is required to use the site, and the methods used to secure personal
information. "While the Internet has opened up an entirely new world,"
Representative Chris Cannon said, "it has also created problems we
have never before encountered." He added, "Consumers shouldn't
have to reveal their life story every time they surf the web."
However, there are potential downsides of this privacy
legislation. Legislating Internet privacy may give consumers fewer
choices and less privacy. Tech-industry
heavyweights prefer to regulate themselves rather than have
privacy laws imposed on them. Many privacy groups support an "opt-in"
approach to online privacy where consumers must consent before a
company can do anything with personal information. In contrast, the
tech industry wants an "opt-out" approach where consumers must
specifically request that companies stop using their personal
data. The opt-out policy plays right into companies' hands. If a user
clicks on a link to opt out of a spammer's list of email addresses,
the spammer knows this user's email address and can then sell it to
other spammers. An opt-out policy also allows companies to continue
using HTML bugs that tell when a consumer has opened email and give
them access to information on that user's hard drive.
There could be other downsides to privacy laws. The tech industry
says privacy laws will increase companies' marketing costs. Letting
users decide whether to let companies share their personal data with
marketers will add
$1 billion to the $15 billion catalog and Internet apparel
retailer market because consumers will probably not share data,
increasing the cost to collect the marketing databases they now easily
build. In addition, privacy law critics argue regulations will give
consumers too much control, create competing state laws, and eliminate
businesses' ability to gather information to tailor marketing toward
consumers.
If advertising can be better
tuned to consumers' interests, it could become less intrusive, not
more. No longer would users be bothered by mounds of junk mail. But
worries remain. Once all that information is available in one place,
the temptation to misuse it might be overwhelming. Web sites that
store users' personal and financial information raise serious privacy
concerns. Be careful about posting information to Web sites. A safe site
should provide a clear privacy policy and use an encrypted connection
when users transmit sensitive information.
Ethical implications. Ethically speaking, there are good
and bad reasons for privacy. If you're worried about people
discovering what you are doing, is that because you're doing something
unethical? Surely, it's unethical to use privacy to let you "get away
with" unethical activity. But privacy is also a shield against others'
unethical actions. You have a right to keep information private that
could harass, inconvenience or threaten you. You should have the right
to control how information about you is used. Balancing these
interests with the benefits of gathering and using information is
always difficult. There is a need for those who understand the
technology to appreciate its implications.