Tools Used for Collecting Online Data
An Educational Service of the American Library Association
Office for Information Technology Policy
Prepared by Leslie Harris & Associates www.lharris.com in conjunction with
OITP staff www.ala.org/oitp
------------------------------------------------------
Many web sites collect
information online, ranging from merely tracking how many hits a web page
receives in a given day or month to more sophisticated analysis of the user's
Internet connection, computer, and software, and to the use of cookies and
stealth data recording software. Because web sites can collect
personally identifiable information from patrons accessing the Internet from
their libraries, it is useful for librarians to understand the variety of
technologies that web site administrators use to collect information from their
visitors.
As an initial matter, web
site administrators have the ability to collect certain basic information about
the users that visit their web sites.
Administrators may record the IP address (a set of four numbers, each
between zero and 255, separated by periods, that uniquely identifies a computer
or other hardware device on the Internet) of each computer that accesses their
web sites. Additionally, web site
administrators may discern the path a user takes through a web site; in other
words, they can record the site from which the user enters and the site to
which the user exits.
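To make this concrete, the following is a minimal sketch in Python of how an administrator might pull the IP address, the requested page, and the referring site out of a server log. It assumes the server writes an Apache-style "combined" log line; the tutorial does not say which format any particular site uses. The sample log lines, addresses, and the function names parse_line and count_hits are invented for illustration.

```python
import ipaddress
import re
from collections import Counter

# Assumption: Apache-style "combined" log format. Real sites may log differently.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return the visitor's IP address, requested page, and referrer, or None."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    try:
        # Reject anything that is not a well-formed IP address.
        ip = ipaddress.ip_address(match.group("ip"))
    except ValueError:
        return None
    return {
        "ip": str(ip),
        "path": match.group("path"),
        "referrer": match.group("referrer"),
    }


def count_hits(lines):
    """Count how many times each page was requested."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["path"]] += 1
    return counts


# Hypothetical log lines; the addresses come from ranges reserved for examples.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2002:13:55:36 -0700] "GET /catalog.html HTTP/1.0" '
    '200 2326 "http://www.example.org/links.html" "Mozilla/4.08"',
    '198.51.100.23 - - [10/Oct/2002:14:02:11 -0700] "GET /catalog.html HTTP/1.0" '
    '200 2326 "-" "Mozilla/4.08"',
]

if __name__ == "__main__":
    for entry in SAMPLE_LOG:
        print(parse_line(entry))
    print(count_hits(SAMPLE_LOG))
```

Even this small amount of information is enough to count hits per page and to see which outside site sent a visitor, which is the basic tracking described above.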
Many web sites use slightly
more complex technologies such as cookies to analyze traffic and purchase
patterns and to customize users' online experiences. A cookie is a text-only
string of data, unique to the user, that a web site stores in the user's
browser and that the browser sends back to the web server when the user
revisits the site. Cookies contain information such as log-in or registration
information, online "shopping cart" information (the user's buying
patterns at a certain retail site), user preferences, and the last web site
visited. Some web sites use "session" or
"transient" cookies that track users only for a short period of time
(generally, one "session") and are stored in temporary memory files
on the user's computer. Other web sites
use "persistent cookies," which may track a user's Internet habits
over an extended period of time.
Persistent cookies permit a user to be "remembered" by a web
site from one visit to the next, and remain on a user's hard drive until they
are erased or expire.
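The difference between session and persistent cookies can be sketched with Python's standard http.cookies module. This is an illustration only, not how any particular web site works: the cookie names session_id and visitor_id, their values, and the one-year lifetime are assumptions chosen for the example.

```python
from http.cookies import SimpleCookie


def build_set_cookie_headers(session_id, visitor_id, days=365):
    """Build the cookie values a server might send to a browser.

    The first cookie has no lifetime attribute, so a browser treats it as a
    session cookie. The second carries Max-Age, which makes it persistent.
    """
    jar = SimpleCookie()
    jar["session_id"] = session_id  # session cookie: no expiry set
    jar["visitor_id"] = visitor_id  # persistent cookie
    jar["visitor_id"]["max-age"] = days * 24 * 60 * 60  # lifetime in seconds
    jar["visitor_id"]["path"] = "/"
    return [morsel.OutputString() for morsel in jar.values()]


def read_cookie_header(header_value):
    """Parse the Cookie header a browser sends back on a later request."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


if __name__ == "__main__":
    for header in build_set_cookie_headers("abc123", "v1001"):
        print("Set-Cookie:", header)
    print(read_cookie_header("session_id=abc123; visitor_id=v1001"))
```

When the browser later returns visitor_id, the site can recognize the same browser from one visit to the next, which is the "remembering" that persistent cookies make possible.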
Another commonly used technology
is stealth data recording software, which is installed
without the user's knowledge to record personally identifiable information
"behind the scenes" and send it to a third party. Stealth software may either be an independent
program or a program that is embedded within another software application. Generally, a user unknowingly installs
stealth software at the same time he or she installs a third-party application
on a computer (either from a CD-ROM or from the Internet). Sometimes, stealth software is installed on
a computer during an online transaction, such as purchasing software or
clothing from a web site. Once
installed, stealth software tracks personally identifiable information and
periodically sends that information to a third party.
These technologies are of
particular concern when they are used to conduct data mining. Data mining is the practice of aggregating
information about consumers' preferences and interests from a variety of
sources, including cookies, stealth data software, voluntary purchases, and
mailing lists, with the purpose of creating comprehensive profiles. Most often the profiles are used for
targeted advertisements, but federal and local governments are also
increasingly relying on data mining to assemble profiles to investigate criminal
and fraudulent activities.
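The aggregation step can be illustrated with a small Python sketch. It assumes, purely for the example, that every source labels its records with a shared visitor_id; actual data mining systems use far more elaborate matching. All names and records below are invented.

```python
from collections import defaultdict


def build_profiles(*sources):
    """Merge records from several named sources into one profile per visitor.

    Each source is a (source_name, records) pair, and each record is a dict
    with a "visitor_id" key and a list of "interests". Both the record layout
    and the shared identifier are assumptions made for this sketch.
    """
    profiles = defaultdict(lambda: {"sources": set(), "interests": set()})
    for source_name, records in sources:
        for record in records:
            profile = profiles[record["visitor_id"]]
            profile["sources"].add(source_name)
            profile["interests"].update(record["interests"])
    return dict(profiles)


# Hypothetical records from two of the kinds of sources named above.
COOKIE_RECORDS = [
    {"visitor_id": "v1001", "interests": ["gardening"]},
    {"visitor_id": "v2002", "interests": ["travel"]},
]
MAILING_LIST_RECORDS = [
    {"visitor_id": "v1001", "interests": ["cooking", "gardening"]},
]

if __name__ == "__main__":
    merged = build_profiles(
        ("cookies", COOKIE_RECORDS),
        ("mailing lists", MAILING_LIST_RECORDS),
    )
    for visitor, profile in merged.items():
        print(visitor, sorted(profile["sources"]), sorted(profile["interests"]))
```

The point of the sketch is that records which reveal little on their own become a fuller profile once they are joined on a common identifier.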
-----------------------------------------------------
Further information:
CDT Consumer Privacy Guide:
http://www.consumerprivacyguide.org
Webopedia Definition &
Links:
http://www.webopedia.com/TERM/c/cookie.html
NYT Article "Fighting
to Make a City's Cookie Files Public" on a legal battle over whether
"cookie" files are public records. (Site requires registration, and
cookie acceptance)
http://www.nytimes.com/library/cyber/law/121897law.html
Cookie Central - Frequently
Asked Questions About Cookies:
http://www.cookiecentral.com/faq/
Microsoft/Internet Explorer
Information on Cookies:
http://www.microsoft.com/info/cookies.htm
Netscape Tech Support,
"Cookies: What They Are and How They
Work":
http://help.netscape.com/kb/consumer/19970226-2.html
"A recipe for cookie
management: Integrate an easy-to-use library for client-side cookie
handling" (highly technical article on using Java for cookie management)
http://www.javaworld.com/javaworld/jw-04-2002/jw-0426-cookie.html
-----------------------------------------------------
Copyright 2002, American Library Association,
Office for Information Technology Policy
Disclaimer
This Online Privacy Tutorial
is a service of the American Library Association. The content of this tutorial
is primarily the work of Leslie Harris & Associates in Washington, DC. The
views expressed in these messages are not necessarily the views of ALA or
Leslie Harris & Associates. This tutorial is for information only and will
not necessarily provide answers to concerns that arise in any particular
situation. This service is not legal advice and does not include many of the
technical details arising under certain laws. If you are seeking legal advice
to address specific privacy issues, you should consult an attorney licensed to
practice in your state.