March 07, 2001, 12:00 AM — "The dirty secret of eBusiness is that despite sophisticated and pricey
traffic analysis software, it is impossible to know even the basics in
precise detail." So writes Brian Caulfield in this month's eCompany Now
magazine, and he is spot on regarding counting hits, page views,
visitors, impressions or whatever you want to call those people coming
to your site. This presents a problem. Without precise counts, you
can't justify advertising rates, revenues, or the resources your
The problem: No two tools agree on the same number of visitors, even
when starting from the same information. Every Web server produces this
information and stores it in a visitor log file. The log tracks everyone
who visits your site by IP address, and lists the time of day and the
pages each IP address viewed.
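As a rough illustration of what an analyzer starts from, here is a minimal Python sketch that reads log lines and tallies hits, distinct IP addresses, and pages requested. It assumes the server writes the NCSA Common Log Format, which the column does not specify. The sample lines, the function names, and the choice to silently skip unparseable lines are all hypothetical.

```python
import re
from collections import Counter

# Assumption: NCSA Common Log Format, for example
# 192.0.2.1 - - [07/Mar/2001:10:15:32 -0500] "GET /index.html HTTP/1.0" 200 2326
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None

def summarize(lines):
    """Count raw hits, distinct IP addresses, and requests per page."""
    hits = 0
    ips = set()
    pages = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # design choice in this sketch: skip malformed lines
        hits += 1
        ips.add(record["ip"])
        pages[record["path"]] += 1
    return {"hits": hits, "distinct_ips": len(ips), "pages": pages}

# Hypothetical sample data using documentation-reserved IP addresses.
SAMPLE = [
    '192.0.2.1 - - [07/Mar/2001:10:15:32 -0500] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [07/Mar/2001:10:15:40 -0500] "GET /about.html HTTP/1.0" 200 1180',
    '198.51.100.7 - - [07/Mar/2001:10:16:02 -0500] "GET /index.html HTTP/1.0" 200 2326',
    'not a log line',
]

if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(summary["hits"], summary["distinct_ips"], summary["pages"].most_common(1))
```

Even in this tiny sketch there are judgment calls, such as whether a malformed line or an image request counts as a hit. Two tools that make those calls differently will report different totals from the same file.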
To demonstrate how much log analyzers vary, I ran one of my own Web
site's access logs through half a dozen different analyzer programs. To
my surprise, the programs showed little agreement on just about any
metric: total number of hits, proportion of Microsoft or Netscape
browser users, and pages most often accessed. Sometimes, the numbers
differed by more than five percent. Why the difference? I don't know.
Log analyzer makers should agree among themselves on a common counting
metric, and indeed a few years ago I informally tried to organize people
from various companies to this end. I was spectacularly unsuccessful.
The log file's records are just the beginning of the problem, though.
Matching the IP address in the log with an actual person presents one
of the biggest issues. There is no one-to-one correspondence between the
two, and sometimes you just have to guess at the overall number of unique
visitors. Search engines indexing your site also present problems.
You don't want to count all those hits as "real" visits to your site,
unless you are interested in inflating your numbers. If a significant
number of visitors originate from services like AOL or MSN, then your
totals will be undercounted -- these services use proxy and caching
servers to save on outbound bandwidth. A visitor requesting a page that
has already been viewed by another AOL user won't view the page from
your site, but rather from AOL's cache. Hence, you don't get to count
that visitor in your tally. Finally, some visitors turn off cookies or
graphics, which can also affect your numbers.
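The guesswork described above can be made concrete with a small Python sketch. It drops requests whose user-agent string looks like a search engine crawler, then estimates visitors by treating each distinct pair of IP address and user agent as one visitor. Everything here is an assumption made for illustration: the marker list, the 30-minute gap that starts a new visit, the function names, and the record layout. It cannot recover visitors served from a proxy cache, since those requests never reach the log.

```python
from datetime import datetime, timedelta

# Assumption: substrings that suggest a crawler. Illustrative, not exhaustive.
BOT_MARKERS = ("googlebot", "slurp", "crawler", "spider")

# Assumption: a pause longer than this starts a new visit.
SESSION_GAP = timedelta(minutes=30)

def is_crawler(user_agent):
    """Guess whether a user-agent string belongs to a search engine crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def estimate_visits(records):
    """Estimate visitors and visits from (ip, user_agent, datetime) tuples.

    A visitor is approximated by the (ip, user_agent) pair, which is only
    a heuristic: several people can share one pair, and one person can
    appear under several.
    """
    last_seen = {}
    visitors = set()
    visits = 0
    for ip, user_agent, when in sorted(records, key=lambda r: r[2]):
        if is_crawler(user_agent):
            continue  # do not count indexing traffic as real visits
        key = (ip, user_agent)
        visitors.add(key)
        previous = last_seen.get(key)
        if previous is None or when - previous > SESSION_GAP:
            visits += 1
        last_seen[key] = when
    return {"estimated_visitors": len(visitors), "visits": visits}

# Hypothetical records: two browsers behind one IP address, plus a crawler.
RECORDS = [
    ("192.0.2.1", "Mozilla/4.0", datetime(2001, 3, 7, 10, 0)),
    ("192.0.2.1", "Mozilla/4.0", datetime(2001, 3, 7, 10, 10)),
    ("192.0.2.1", "Mozilla/4.0", datetime(2001, 3, 7, 11, 0)),
    ("192.0.2.1", "Mozilla/4.7 [en]", datetime(2001, 3, 7, 10, 5)),
    ("198.51.100.7", "Googlebot/2.1", datetime(2001, 3, 7, 10, 1)),
]

if __name__ == "__main__":
    print(estimate_visits(RECORDS))
```

Changing the marker list or the session gap changes the totals, which is one plausible way two analyzers can disagree when reading the same log.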
Compounding the problem, bigger Web service bureaus use completely
different methods for counting your visitors than your log analyzers do.
Media Metrix and NetRatings, for example, use old-world counting
methods similar to TV ratings services that use real people with
diaries to track their viewing habits. Neither of these companies tracks
users at schools, libraries, or other public places