Summary
Cookies, those little text files that HTTP servers write to your hard disk, have the potential to reduce the client's overhead while using very little disk space themselves. Some users may be concerned that a foreign server is writing data to their disks. Over time, however, cookies may well end up saving far more overhead than they require. (2,100 words)
Do you remember the old Cookie Monster virus?
The "I wanna cookie!" screen
message was satisfied easily with "chocolate chip" or "oatmeal."
Cookies in the context of HTTP servers are not so straightforward or whimsical. In fact, it's more like "don't ask, don't tell." As a client, you don't ask, and the HTTP server won't tell before it goes ahead and gives you one. So you might want to learn a little more about these Magic Cookies that your browser is using with increasing frequency.
Cookies are pieces of information that an HTTP server, such as Netscape Commerce Server, can store on your machine. A cookie is really nothing more than a name and a value. A server sends cookies to your browser, and your browser stores them by writing them to a file on your disk. The next time you connect to that server, your browser sends back some or all of the cookies that the server originally sent to you.
Why do I want to use cookies?
All that sounds terribly abstract. What are some real-world applications for cookies? A prime example is Netscape's "personalize your Web page" option. When you tell the server what your particular preferences are, the server encodes that information and saves it as cookies on your hard drive. The next time you connect to the Netscape server, it looks for cookies that indicate your preferences. If it finds such cookies, it configures the page accordingly. If not, you get the default page. This is a very easy way to allow a user to customize their environment without forcing the overhead onto the servers.
How can I use cookies?
Once you have a database of user demographics, you can use cookies to transmit those demographics to advertisers when the user clicks through an ad banner. This is a valid use of cookies, although the webmaster has to weigh the benefit to advertisers against the possible offense to the client. One use that is a clear benefit to the client is maintaining a shopping list of items as the client moves through an online catalog. As the client browses each page of the catalog, each selection is encoded as a cookie on the client side. At the end of the session, the server collects the cookies, uses that information to fill out the invoice form, and requests the necessary credit card information.
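The shopping-list idea can be sketched in a few lines of shell. This is only an illustration, not how any particular catalog does it: the ITEM_ naming scheme, the path=/ attribute, and both function names are assumptions made up for the example.

```sh
#!/bin/sh
# Hypothetical scheme: one cookie per selection, named ITEM_<sku>,
# whose value is the quantity ordered.

# Print the response header that records one selection on the client.
add_item_header() {
    # $1 = SKU, $2 = quantity
    printf 'Set-Cookie: ITEM_%s=%s; path=/\n' "$1" "$2"
}

# Print "sku quantity" for each ITEM_ cookie found in a cookie string.
list_items() {
    # $1 = cookie string as the server receives it, e.g. "$HTTP_COOKIE"
    printf '%s\n' "$1" | tr ';' '\n' |
        sed -n 's/^ *ITEM_\([^=]*\)=\(.*\)$/\1 \2/p'
}
```

A catalog page would call add_item_header once per selection while it is still writing headers, and the checkout script would call list_items "$HTTP_COOKIE" to fill out the invoice.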
What's in that cookie?
    # Netscape HTTP Cookie File
    # http://www.netscape.com/newsref/std/cookie_spec.html
    # This is a generated file! Do not edit.

    netscape.com      TRUE   /  FALSE  946684799  NETSCAPE_ID      c98ffb1e,c68818dc
    infoseek.com      TRUE   /  FALSE  859574919  InfoseekUserId   D2679D5862DEA4FE
    cgi.netscape.com  FALSE  /  FALSE  946684799  NETSCAPE_VERIFY  c65ff94b,c6a6abcb
    adobe.com         TRUE   /  FALSE  946684799  INTERSE          123.123.123.1231212183113897
Each line of the file describes one cookie. The exact format of the
file depends on the syntax of the cookie spec. As the example
illustrates, generally the first column is the domain of the server
that gave you the cookie, the next to last column is the name of the
cookie and the last column is the value of the cookie. So, in this
file, there are cookies from Netscape, Adobe, and Infoseek. Each of
them has sent down some sort of code that identifies this particular
machine.
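If you want to pull those three columns out of your own cookie file, a minimal sketch with awk follows. It relies only on the layout described above (domain first, name next to last, value last) and assumes that values contain no whitespace; the function name list_cookies is made up for the example.

```sh
#!/bin/sh
# Print "domain name value" for every cookie in a Netscape-style cookie file.
# Comment lines (starting with #) and blank lines are skipped.
list_cookies() {
    # $1 = path to the cookie file
    awk '!/^#/ && NF >= 3 { print $1, $(NF - 1), $NF }' "$1"
}
```

Run it as list_cookies followed by the path to your cookie file; where that file lives depends on your browser and platform.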
However, the Internet is full of people who want to know what their software is doing. Currently, the user of a browser receives no notification that a server has written information to their hard disk. The concerns this raises are valid and significant. So, to assess the risk involved, we first need to look at the Cookie syntax.
Syntax
Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN; secure
"This is the format a CGI script would use to add to the HTTP headers a new piece of data which is to be stored by the client for later retrieval."
Unsurprisingly, NAME is the name of the cookie and VALUE is the data contained in that cookie. The browser can throw away the cookie anytime after DATE. For security, only servers that match DOMAIN will get this cookie. PATH is another restriction on where the cookie is sent. When the browser is about to request a URL on a server that matches DOMAIN, the browser checks whether the path in that URL matches PATH. If so, the cookie is sent along with the URL request. If not, no cookie is sent. SECURE is a flag that, if present, indicates that the cookie is only to be sent if the connection is secure.
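The DOMAIN, PATH, and SECURE checks can be sketched as three small shell functions. The details are assumptions rather than something spelled out here: DOMAIN is compared against the tail of the host name, PATH against the start of the URL path, and the caller says whether the connection is secure by passing the string "https". All function names are hypothetical.

```sh
#!/bin/sh
# Succeed if the host name falls within the cookie's DOMAIN (tail match).
domain_matches() {
    # $1 = host being requested, $2 = DOMAIN stored with the cookie
    domain=${2#.}                     # tolerate a leading dot
    case "$1" in
        "$domain" | *."$domain") return 0 ;;
    esac
    return 1
}

# Succeed if the URL path starts with the cookie's PATH (prefix match).
path_matches() {
    # $1 = path of the requested URL, $2 = PATH stored with the cookie
    case "$1" in
        "$2"*) return 0 ;;
    esac
    return 1
}

# Decide whether a cookie accompanies a request.
cookie_applies() {
    # $1 = host, $2 = URL path, $3 = DOMAIN, $4 = PATH,
    # $5 = "secure" if the cookie carries the SECURE flag,
    # $6 = "https" if the connection is secure
    domain_matches "$1" "$3" || return 1
    path_matches "$2" "$4" || return 1
    if [ "$5" = "secure" ] && [ "$6" != "https" ]; then
        return 1
    fi
    return 0
}
```

With these definitions, cookie_applies home.netscape.com /index.html netscape.com / "" http succeeds, while the same call for host www.adobe.com fails.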
Each time a browser requests a URL from any server, the browser checks
all its cookies for potential DOMAIN and PATH matches. Those lucky
cookies that match are included with the request in the format:
Cookie: NAME1=VALUE1; NAME2=VALUE2 ...
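Assembling that request header is simple enough to sketch in shell. The layout assumed here, Cookie: followed by NAME=VALUE pairs joined by a semicolon and a space, follows Netscape's cookie specification. The function name cookie_header is made up, and the matching step is assumed to have happened already.

```sh
#!/bin/sh
# Join the matching NAME=VALUE pairs into a single Cookie request header.
cookie_header() {
    # Arguments: zero or more NAME=VALUE pairs that passed the match
    [ "$#" -gt 0 ] || return 0        # nothing matched: send no header
    header="Cookie: $1"
    shift
    for pair in "$@"; do
        header="$header; $pair"
    done
    printf '%s\n' "$header"
}
```

For example, cookie_header Name=Chuck Color=Blue prints a single line holding both pairs, and calling it with no arguments prints nothing.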
How big is the Cookie Jar?
If a browser runs out of cookie jar space, it deletes the least recently used cookies. The expires value is a guideline: browsers can delete cookies earlier if they run out of space. However, the cookie cannot be given out after the expiration date.
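The expiration rule can be sketched as a filter over the cookie file. This assumes that the fifth column of the file holds the expiration time as seconds since January 1, 1970 GMT. The text does not label that column, so treat the assumption and the function name live_cookies as illustrative only.

```sh
#!/bin/sh
# Print only the cookie-file lines whose expiration time has not passed.
live_cookies() {
    # $1 = path to the cookie file
    # $2 = current time in seconds since 1970 (assumed unit of column 5)
    awk -v now="$2" '
        /^#/ || NF != 7 { next }          # skip comments and odd lines
        $5 + 0 >= now + 0 { print }       # keep cookies still in date
    ' "$1"
}
```

On systems whose date command supports the %s format, the current time can be supplied as "$(date +%s)".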
Security Issues
According to Frank Chen, security product manager at Netscape Communications Corp., the information maintained by cookies is no different from the data that could be captured at the server. No new information collection capabilities are added with cookies. He said that as a browser user, he would be more nervous about the potential for server administrators to track his movements through their servers and later mine that data for patterns than about the information trackable via cookies.
For those who remain concerned about cookies and privacy, Netscape Navigator Ver. 3.0 addresses this worry. This version, including the currently available beta 4, has an option (under Options/Network Preferences/Protocols) to show an allow/deny alert whenever a server tries to set a cookie. This gives you full control over server access to your cookie file and, if nothing else, gives you an idea of who is offering you cookies and what's in them.
Example
Setting the Cookies
This script sets the cookies Name and Color.
The expires date format looks a little funky, but it conforms to the
USENET date standards. After invoking this script from your browser,
you will be able to find the cookies in your cookie file. (You may have
to quit the browser first.)
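A minimal sketch of such a script, written as a Unix shell CGI script, might look like the following. The cookie values, the expiration date, the path=/ attribute, and the function name emit_response are all placeholders chosen for illustration.

```sh
#!/bin/sh
# Sketch of a CGI script that sets the cookies Name and Color.
# The values, the date, and the path are placeholders. The date must
# lie in the future, or the browser may discard the cookies at once.

EXPIRES="Thu, 31-Dec-2099 23:59:59 GMT"

emit_response() {
    # Headers come first: one Set-Cookie line per cookie.
    echo "Content-type: text/html"
    echo "Set-Cookie: Name=Chuck; expires=$EXPIRES; path=/"
    echo "Set-Cookie: Color=Blue; expires=$EXPIRES; path=/"
    echo ""                           # a blank line ends the headers
    echo "<p>The cookies Name and Color have been set.</p>"
}

emit_response
```

Each cookie gets its own Set-Cookie header, and all of them must be written before the blank line that separates the headers from the page body.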
Reading the Cookies
When any browser connects to a server and runs a CGI
script, that script can access a set of environment variables. Although
the exact implementation of these variables is platform-dependent, all
the cookies are stored in a variable with a name like HTTP_COOKIE. A
Unix shell script that would display the cookies that were set by the
above script is:
    #!/bin/sh
    echo "Content-type: text/html"
    echo ""
    echo "These are your cookies:"
    echo "$HTTP_COOKIE"
That's sort of the "hello, world" of cookies. A cookie doesn't really do anything itself. But the good news is that it doesn't take too much more to make it useful. On the server side, decide what you need to keep track of, assign appropriate codes to these states, and write a small bit of code to determine what the cookie means when it comes back. And that's all there is to it.
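That last step, working out what a cookie means when it comes back, can be sketched as follows. The cookie name Color, the code Blue, the theme descriptions, and both function names are hypothetical. The lookup also assumes that cookie names contain no characters that are special to sed.

```sh
#!/bin/sh
# Print the value of the named cookie, or nothing if it is absent.
get_cookie() {
    # $1 = cookie name, $2 = cookie string (defaults to $HTTP_COOKIE)
    printf '%s\n' "${2-$HTTP_COOKIE}" | tr ';' '\n' |
        sed -n "s/^ *$1=//p" | head -n 1
}

# Translate the stored code into the state it stands for.
describe_color() {
    # $1 = cookie string, e.g. "$HTTP_COOKIE"
    case "$(get_cookie Color "$1")" in
        Blue) echo "blue theme" ;;
        "")   echo "default theme" ;;
        *)    echo "unrecognized color" ;;
    esac
}
```

In a CGI script, describe_color "$HTTP_COOKIE" would pick the page variant to serve, falling back to the default when the browser sent no Color cookie.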
About the author
Charles Rejonis is a computer scientist in Germantown, Md.
He can be reached at rejonis@cs.stanford.edu.