Lecture 12: Database and Web Privacy

Database privacy. Maureen Mitchell and her husband discovered the hard way how un-private databases can be. In March 2000, their identities were stolen over the Internet, and the thieves made large purchases, procured bank loans, and opened credit accounts in their name. Although they had never conducted financial transactions on the Web and always carefully guarded their credit data, their data was available online.

Computerized databases store a great deal of personal and financial information, including names, addresses, credit card numbers and Social Security numbers, as well as individuals' employers. In these days of ubiquitous Web services, a lot of this information is available to anyone with a Web browser. Because of the Web's anonymous nature, it is easy for a thief to use this information fraudulently, for example by making purchases online with someone else's credit card number.

Let's take an example of an item that is widely used but important to safeguard: your Social Security number. A Social Security number is often used as identification, especially for financial transactions over the phone. Anyone who knows your name and discovers your Social Security number can impersonate you. Since SSNs are (unfortunately) used as a key in so many databases, by giving yours out you provide access not only to the requested information, but to a lot of other information as well.
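To see why a shared key matters, here is a minimal Python sketch. The two record sets, their field names, and the all-zero SSN are invented for illustration; the point is only that one identifier is enough to join otherwise unrelated databases.

```python
# Hypothetical records from two unrelated databases, both keyed by SSN.
# All names and values here are made up for illustration.
bank_records = {
    "000-00-0000": {"name": "J. Doe", "account_balance": 2500},
}
medical_records = {
    "000-00-0000": {"diagnosis": "asthma", "insurer": "ExampleCare"},
}


def build_profile(ssn, *databases):
    """Merge every record filed under the same SSN into one profile."""
    profile = {}
    for db in databases:
        # A database with no record for this SSN contributes nothing.
        profile.update(db.get(ssn, {}))
    return profile


profile = build_profile("000-00-0000", bank_records, medical_records)
print(profile)
```

Someone who learns the SSN from one source can pull the matching row from every other database that uses it as a key.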

Many businesses freely share consumers' Social Security numbers and other private information. Meanwhile, law enforcement rarely prosecutes crimes such as cybertheft because the criminal is so difficult to locate, leaving victims to repair the damage themselves. The Internet has at least made more people aware of privacy issues, as consumers recognize the danger of companies opening up databases of private information for others to use.

Social Security numbers are not the only personal information that can hurt you if it falls into the wrong hands. Enough personal information is available on the Web to keep an army of stalkers supplied for years to come. Databases were a problem long before the Web. In most states, it used to be possible to find out the home address of the owner of a car, given a license-plate number. In 1989, actress Rebecca Schaeffer was murdered by someone who found out her home address from California Department of Motor Vehicles records. Congress subsequently banned routine disclosure of this information.

In 1996, Lexis-Nexis announced the P-TRAK Person Locator File, "a quick, convenient search [that] provides up to three addresses, as well as aliases, maiden names, and Social Security numbers." After a barrage of complaints, it dropped plans to offer it. This episode mirrored a similar controversy six years earlier over the abortive plans of Lotus Development to offer a database called Marketplace: Households, detailing consumer buying habits for the benefit of prospective marketers.

Sometimes even medical information may be passed along to others without consent. Kaiser Permanente accidentally sent hundreds of emails with sensitive personal medical information to the wrong members on August 2, 2000. A "technological glitch" while upgrading the company's Web site was blamed for misdirecting 858 emails from nurses and pharmacists to 19 Kaiser members. Some messages contained subscribers' names, phone numbers, and medical account numbers. One of the more sensitive messages was a response to a subscriber's question about a sexually transmitted disease.

Web privacy. The World-Wide Web has brought a greater sense of urgency to privacy concerns. Not only can the Web disseminate information in preexisting databases, it can also gather new information on individuals and their habits. If you've ever authored a Web page, posted to a Usenet newsgroup, or fetched a file from a server by anonymous FTP, a record of that is available in a log file or a search engine somewhere. Would you want a prospective employer to see what you do privately? That information may be only a few mouse clicks away.

Some information collected over the Web is fairly innocuous. When you go to Netscape, for example, and look up the weather for a particular place, they can make an inference that you live there and show you banner ads for concerts or other local events. In the process, they probably pass the information on to DoubleClick, which provides the ads. Companies "buy" keywords in search engines. When you look up a term on AltaVista, they pass the search information off to DoubleClick. When you search for General Motors, that information is sent to GM. If you're from a university, they probably won't pay any attention. But if you are from the Attorney General's office, they will know by the IP address.

Many Web sites collect personal information and pass it to advertisers. They are generally free to sell customer data at their own discretion because few rules prevent them from doing so. Among the Web site owners that sell personal information are marketing companies, retail stores, and even the federal government. For example, anyone who registered a domain name until recently had to provide a contact name, billing address, and phone number to Network Solutions. Recently acquired by VeriSign, Network Solutions is now aggressively marketing this data to direct mailers. Although Network Solutions claims that it removes email addresses from the data and does not permit the information to be used for email marketing, privacy advocates say the data could be used for something other than simply obtaining an address. And not only businesses are at risk. Since many of the registrants are small "mom and pop" vendors, a lot of personal data is also being disclosed.

Late in 1999, privacy concerns suddenly became more real to consumers when DoubleClick purchased the direct-marketing company Abacus Direct. Abacus owns a database of the purchasing habits of 90% of American households. Clearly DoubleClick was interested in correlating this information with consumers' on-line behavior, allowing advertising to be targeted even better. Once the Internet bubble burst, disclosure of such data became commonplace, as Web companies felt pressure to find new sources of revenue.

Even the government is involved in collecting private information. In June 2000 it was revealed that Web sites operated by the White House drug office were collecting personal data from site visitors. The Clinton administration took steps to ensure the privacy of Internet users visiting federal government Web sites. Privacy advocates were enraged that the government was using Web sites to surreptitiously collect information from its citizens and urged Congress to hold hearings. It did, and the resulting Congressional review found 13 agencies that were secretly using technology to track the Internet habits of visitors. One of them provided that information to a private company that compiled reports for the agency.

Few US Web sites follow standards to protect users' private personal information, and most laws and guidelines can't stop violators. More than two-thirds of consumer-oriented sites collect personal data, and nearly all ask for information that could easily identify the user, according to Consumers International's study of 751 Web sites. Only a few sites let users decide if their names go on a site's mailing list or if their personal data is passed to third parties.

Moreover, privacy standards established by various governments are not strict enough to prevent Web sites from gathering personal data that is supposed to be protected, says Anna Fielder, director of the Consumers International Office for Developed and Transition Economies. "Privacy is recognized as a fundamental human right, yet we've found that too many companies collect a lot of unnecessary, very personal information about their customers," notes Fielder, "and because of inadequate implementation of existing government measures people don't have control on their data."

A coalition of some of the largest Internet companies, however, argues otherwise. The Online Privacy Alliance argues that it is sufficient to enforce existing laws and that new regulations could cost 90 of the largest financial institutions $17 billion, and raise costs to consumers by $1 billion.

Internet connections and privacy. The Web's infrastructure lets advertisers gather personal data from Internet users. Internet protocol (IP) addresses are easily mined for data about users' online activities. That data is more valuable if users share personal information on a Web-based form, letting Web sites tie the data to user names. Cell-phone users of Sprint PCS's wireless Internet service expose their cell-phone numbers on the Web with each new page they visit. Remote host identifiers can leak users' personal data, including their names and their employers.
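As a rough illustration of how a remote host identifier leaks information, the sketch below guesses the kind of organization a visitor comes from, given only a hostname. The hostnames and the `guess_affiliation` helper are hypothetical. A real server would first obtain the name by a reverse lookup of the visitor's IP address (for example with Python's `socket.gethostbyaddr`); that step is left out so the example runs offline.

```python
# Map a few top-level domains to the kind of organization they suggest.
TLD_HINTS = {
    "edu": "university",
    "gov": "government agency",
    "mil": "military",
    "com": "company",
}


def guess_affiliation(hostname):
    """Return (organization_domain, kind) inferred from a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    if len(labels) < 2:
        return (hostname, "unknown")
    # The last two labels usually name the organization, e.g. "example.edu".
    org_domain = ".".join(labels[-2:])
    kind = TLD_HINTS.get(labels[-1], "unknown")
    return (org_domain, kind)


# Hypothetical hostnames as they might appear in a Web server log.
print(guess_affiliation("dialup-42.cs.example.edu"))
print(guess_affiliation("jsmith-pc.example.gov"))
```

Note that the second hostname suggests both an employer and, through the machine name, a person.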

High-speed Internet connections pose risks too. When you're online, it's only a matter of minutes or hours until someone probes your computer. If you've dialed in via a modem, you're only online for a short while, and usually at a different IP address each time. Therefore, hackers have a limited window of time to find your computer and attempt to gain access. The "always on" feature of broadband leaves an open door for a hacker to locate your computer, test your system and try to find a weak point in your security.

Microsoft File and Print Sharing is one of the main ways hackers can break in. If you're on a public network, you should install a firewall to shut off Internet access when you're not using the computer. A firewall can be either hardware or software. A very good software firewall is ZoneAlarm, which is free for individual use; however it is not as configurable as some others. For one thing, you cannot allow cookies from some Web sites and not from others.
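The idea of blocking file-sharing traffic can be sketched as a very simple packet filter. This is a toy model, not how ZoneAlarm or any particular firewall is implemented, and the `allow_inbound` function is hypothetical. The ports listed are the ones commonly associated with Windows file and print sharing (NetBIOS on 137 through 139, and SMB on 445).

```python
# Ports commonly used by Windows file and print sharing.
FILE_SHARING_PORTS = {137, 138, 139, 445}


def allow_inbound(dest_port, source_is_local):
    """Decide whether an inbound connection attempt should be accepted.

    Toy policy: file-sharing ports are reachable only from the local
    network; every other unsolicited inbound connection is refused.
    """
    if dest_port in FILE_SHARING_PORTS:
        return source_is_local
    return False


print(allow_inbound(139, source_is_local=False))  # Internet host probing NetBIOS
print(allow_inbound(139, source_is_local=True))   # machine on the same LAN
```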

Cookies. Since about 1996, one of the highest-profile Web privacy issues has been cookies. A cookie is a small piece of data that a Web server asks your browser to store in a file on your system; the browser sends it back to that server on later visits. Stored on users' hard drives as they surf the Web, cookies may hold a user ID and password so that subsequent logins to a Web site proceed immediately. Cookies are frequently used to target ads at users based on topics they have browsed. Information from cookies can be recorded in Web server log files, so the user may lose control over this information. It's even possible to use a single cookie to record users' visits to many different Web sites, opening new opportunities for targeted advertising. An example of such targeting is the list of book suggestions offered by Internet booksellers: "Readers who bought this book also bought ..."
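The exchange can be sketched with Python's standard `http.cookies` module. The cookie names (`visitor_id`, `last_topic`) and their values are made up for illustration; real sites choose their own.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header sent along with a response.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "abc123"      # hypothetical identifier
outgoing["visitor_id"]["path"] = "/"   # send it back for every page on the site
set_cookie_header = outgoing.output(header="Set-Cookie:")
print(set_cookie_header)

# Browser side: on later requests the browser returns the stored values
# in a Cookie header like this one (a hypothetical example).
cookie_header = "visitor_id=abc123; last_topic=gardening"

# Server side again: parse what the browser sent back.
incoming = SimpleCookie()
incoming.load(cookie_header)
visitor_id = incoming["visitor_id"].value
last_topic = incoming["last_topic"].value
print(visitor_id, last_topic)
```

Because the same identifier comes back on every request, the server can link all of a visitor's page views together, which is what makes ad targeting possible.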

In May 2000, Microsoft acknowledged an Internet Explorer flaw that let hackers read contents of cookies on a victim's system. E-commerce sites use cookies to track customer purchases. Coding in a Web address can trick the user's browser into responding to a request for a cookie as if the request came from an authorized site. Then hackers can get personal information given to e-commerce sites and access Web-based email accounts and information about past browsing.

Users can opt out by setting their Web browser either to reject cookies or to warn them before accepting cookies. But this also turns off the helpful uses of cookies. And it doesn't help Web "newbies," unaware they are being tracked.

"Web Bugs Make Cookies Look Good Enough to Eat" "Web bugs" are small scripts Web sites can use to copy information from hard drives and ship it to third-party sites while evading nearly all firewalls and leaving without footprints. Web bugs, or clear GIFs, are images embedded in HTML-enhanced commercial emails or Web page software code that help transmit data to a remote computer when the page is viewed. Web bugs monitor who's reading. These stealth tools build online profiles and count the number of times a page is accessed. In March 2001 Privacy Council CEO Gary Clayton displayed Web bug technology before Congressionial lawmakers, stunned as Clayton stole over 1,000 names and addresses from a hard drive's address book. This is similar to commercial espionage. However, most people have no idea Web bugs exist.

Toysrus.com was accused of using Web bugs to compile personal profiles of its online shoppers for a marketing agency. Soon afterwards, the company says, it ceased the practice. Guidelines from the Privacy Foundation demand that Internet advertising companies and Web sites using Web bugs use icons to indicate their presence and identify the company harvesting data. Visitors must be allowed to opt out of Web bug data collection. These guidelines prohibit using Web bugs to gather data related to children, sex, medical issues, and financial or employment matters.

In addition to cookies and Web bugs, there are other ways of tracking your Internet habits. Serial numbers embedded in processors can identify a machine and so be used to track Web surfers' online activities. For this reason, in 2000 Intel announced plans to phase out processor serial numbers.

Java applets, which are run through a Web browser, can't access files on your system unless you grant permission. However, if you grant Java permission, Java applets can invade your privacy by inspecting or changing files on the client file system. These applets can use network connections to circumvent file protections or privacy expectations.

Also, software vendors gather information about the computers their software is running on. The vendor's installer reads your configuration files, checks for enough memory, and verifies that your computer has the required hardware, such as a video adapter. When you install software, vendors check for any previous versions of the software being installed. In fact, Microsoft even checks for unauthorized software during software installation.

Proposed legislation. Privacy standards established by government are not strict enough to stop Web sites from gathering personal data. Few Web sites follow standards to protect users' private personal information, and most laws can't stop violators. More than two-thirds of consumer-oriented sites collect personal data, and nearly all ask for information that could easily identify the user, according to the study of 751 Web sites by Consumers International. Consumers want better privacy protection.

Based on these concerns, some privacy advocates want the government to step in and stop companies from tracking users' on-line visits. As is expected, Web marketers do not agree, saying the industry itself will take steps to protect privacy since privacy worries already make consumers wary of buying on-line.

The World Wide Web Consortium's Platform for Privacy Preferences Project (P3P) may offer relief from the threat that cookies pose to consumers' online privacy. Several companies have products supporting P3P, letting Web browsers automatically compare a Web site's privacy policy with a user's privacy preferences. P3P will be tested in the summer of 2000, but will be ineffective unless adopted by many browsers and Web sites. At an interoperability session in June 2000, client P3P products were presented by the Electronic Network Consortium, Engage Technologies, Microsoft, and others. IE 6.0 will have this capability turned on.
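The comparison a P3P-aware browser performs can be sketched roughly as follows. The dictionary keys and values below are a made-up, simplified vocabulary chosen for illustration; actual P3P policies are XML documents with their own standardized terms, and this is not the algorithm of any particular product.

```python
# A site's declared practices and a user's preferences, in a made-up
# vocabulary (not the real P3P syntax).
site_policy = {
    "collects": {"email", "clickstream"},
    "shares_with_third_parties": True,
}
user_preferences = {
    "allowed_data": {"clickstream"},
    "allow_third_party_sharing": False,
}


def policy_conflicts(policy, prefs):
    """List the ways a site's policy exceeds what the user allows."""
    conflicts = []
    # Data the site collects that the user has not agreed to share.
    extra = policy["collects"] - prefs["allowed_data"]
    for item in sorted(extra):
        conflicts.append(f"collects {item}")
    if policy["shares_with_third_parties"] and not prefs["allow_third_party_sharing"]:
        conflicts.append("shares data with third parties")
    return conflicts


conflicts = policy_conflicts(site_policy, user_preferences)
print(conflicts)
```

A browser could use a non-empty list to warn the user or to refuse the site's cookies, without the user having to read the policy.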

Legislation has also been enacted to safeguard information in databases and on the Web. In late 1999, President Clinton signed legislation governing financial information sharing between non-affiliated companies. The federal legislation permits states to go above and beyond the privacy protections called for in the federal legislation by passing measures of their own. Washington, California, Vermont, and Minnesota are among a small group of states that bear watching as the state-led privacy movement heats up.

In January 2001, the Consumer Internet Privacy Enhancement Act was introduced, which would make it unlawful for a commercial Web site operator to collect personally identifiable information online from a Web site user without notice. The bill establishes a civil penalty of up to $500,000 for violations. It requires all commercial Web sites that collect personally identifiable information to state what types of information are being collected, how the information will be used, what entities are collecting the information, whether the information is required to use the site, and the methods used to secure personal information. "While the Internet has opened up an entirely new world," Representative Chris Cannon said, "it has also created problems we have never before encountered." He added, "Consumers shouldn't have to reveal their life story every time they surf the web."

However, there are potential downsides to this privacy legislation. Legislating Internet privacy may give consumers fewer choices and less privacy. Tech-industry heavyweights prefer to regulate themselves rather than have privacy laws imposed on them. Many privacy groups support an "opt-in" approach to online privacy, where consumers must consent before a company can do anything with personal information. In contrast, the tech industry wants an "opt-out" approach, where consumers must specifically request that companies stop using their personal data. The opt-out policy plays right into companies' hands. If a user clicks on a link to opt out of a spammer's list of email addresses, the spammer learns that this email address is in use and can then sell it to other spammers. An opt-out policy also allows companies to continue using Web bugs that tell when a consumer has opened email and give them access to information on that user's hard drive.
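The practical difference between the two approaches lies in what silence means. A minimal sketch, using a hypothetical `may_use_data` function:

```python
def may_use_data(regime, user_choice=None):
    """Return True if a company may use a person's data.

    regime: "opt-in" or "opt-out".
    user_choice: True (consented), False (objected), or None (the user
    was never asked or never answered).
    """
    if regime == "opt-in":
        # Silence means no: use requires explicit consent.
        return user_choice is True
    if regime == "opt-out":
        # Silence means yes: use continues until the user objects.
        return user_choice is not False
    raise ValueError(f"unknown regime: {regime!r}")


# A user who has expressed no choice at all:
for regime in ("opt-in", "opt-out"):
    print(regime, may_use_data(regime))
```

Under opt-out, a user who never responds is treated as having agreed; under opt-in, the same silence protects the data.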

There could be other downsides to privacy laws. The tech industry says privacy laws will increase companies' marketing costs. Letting users decide whether to let companies share their personal data with marketers will add $1 billion to the $15 billion catalog and Internet apparel retailer market because consumers will probably not share data, increasing the cost to collect the marketing databases they now easily build. In addition, privacy law critics argue regulations will give consumers too much control, create competing state laws, and eliminate businesses' ability to gather information to tailor marketing toward consumers.

If advertising can be better tuned to consumers' interests, it could become less intrusive, not more. No longer would users be bothered by mounds of junk mail. But worries remain. Once all that information is available in one place, the temptation to misuse it might be overwhelming. Web sites that store users' personal and financial information raise serious privacy concerns. Be careful about posting information to Web sites. A safe site should provide a clear privacy policy and use an encrypted connection when users transmit sensitive information.

Ethical implications. Ethically speaking, there are good and bad reasons for privacy. If you're worried about people discovering what you are doing, is that because you're doing something unethical? Surely, it's unethical to use privacy to let you "get away with" unethical activity. But privacy is also a shield against others' unethical actions. You have a right to keep private any information that could be used to harass, inconvenience or threaten you. You should have the right to control how information about you is used. Balancing these interests with the benefits of gathering and using information is always difficult. There is a need for those who understand the technology to appreciate its implications.