Counting Hits

"The dirty secret of eBusiness is that despite sophisticated and pricey

traffic analysis software, it is impossible to know even the basics in

precise detail." So writes Brian Caulfield in this month's eCompany Now

magazine, and he is spot on regarding counting hits, page views,

visitors, impressions or whatever you want to call those people coming

to your site. This presents a problem. Without precise counts, you

can't justify advertising rates, revenues, or the resources your

storefront needs.

The problem: No two tools agree on the same number of visitors, even

when starting from the same information. Every Web server produces this

information and stores it in a visitor log file. The log tracks everyone

who visits your site by IP address, and lists the time of day and pages

each particular IP address viewed.
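
To make the log file concrete, here is a minimal Python sketch that reads such a log and tallies requests per IP address and per page. It assumes the server writes the NCSA Common Log Format; the sample lines, the LOG_LINE pattern, and the function names are illustrative assumptions, not taken from any particular analyzer.

```python
import re
from collections import Counter

# Assumption: NCSA Common Log Format, one request per line, e.g.
# 192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it is malformed."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def tally(lines):
    """Count requests per IP address and per requested path."""
    hits_by_ip = Counter()
    hits_by_page = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines that do not fit the assumed format
        hits_by_ip[record["ip"]] += 1
        hits_by_page[record["path"]] += 1
    return hits_by_ip, hits_by_page


# Hypothetical sample data using documentation-reserved IP addresses.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/2000:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 200 512',
    '198.51.100.7 - - [10/Oct/2000:14:02:10 -0700] "GET /index.html HTTP/1.0" 200 2326',
    'not a log line',
]

hits_by_ip, hits_by_page = tally(SAMPLE_LOG)
print(sum(hits_by_ip.values()), "requests from", len(hits_by_ip), "IP addresses")
```

Even this toy version has to decide whether the image request counts as a hit and what to do with a malformed line. Choices like these are one plausible place where two tools reading the same file could end up with different totals.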

To demonstrate how much log analyzers vary, I ran one of my own Web

site's access logs through half a dozen different analyzer programs. To

my surprise, the programs achieved little agreement on just about any

metric: total number of hits, proportion of Microsoft or Netscape

browser users, and pages most often accessed. Sometimes, the numbers

differed by more than five percent. Why the difference? I don't know.

Log analyzer makers should agree on a common counting metric, and

indeed a few years ago I informally tried to organize people

from various companies to this end. I was spectacularly unsuccessful.

The log file's records are just the beginning of the problem, though.

Matching the IP address in the log with an actual person presents one

of the biggest issues. There is no one-to-one correspondence, and

sometimes you just have to guess to determine the overall unique number

of visitors. Search engines indexing your site also present problems.

You don't want to count all those hits as "real" visits to your site,

unless you are interested in inflating your numbers. If a significant

number of visitors originate from places like AOL or MSN, then your

totals will be undercounted, because these services use proxy and caching

servers to save on outbound bandwidth. A visitor requesting a page that

has already been viewed by another AOL user won't view the page from

your site, but rather from AOL's cache. Hence, you don't get to count

that visitor in your tally. Finally, some visitors turn off cookies or

graphics, which can also affect your numbers.
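
The guesswork described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each record is assumed to carry a user-agent string (which the server must be configured to log), crawlers are spotted with a short, hypothetical list of substrings, and "unique visitors" is approximated two different ways. Neither approximation is correct in general, which is the point.

```python
# Hypothetical markers; real crawler lists are much longer and change over time.
CRAWLER_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_crawler(user_agent):
    """Guess whether a request came from a search engine crawler."""
    agent = user_agent.lower()
    return any(marker in agent for marker in CRAWLER_MARKERS)


def estimate_visitors(records):
    """Return hit and visitor estimates after dropping likely crawler traffic."""
    human = [r for r in records if not is_crawler(r["user_agent"])]
    unique_ips = {r["ip"] for r in human}
    unique_pairs = {(r["ip"], r["user_agent"]) for r in human}
    return {
        "hits": len(human),
        # Undercounts when many people share one proxy address.
        "unique_ips": len(unique_ips),
        # Splits a shared address by browser; still only a guess.
        "unique_ip_agent_pairs": len(unique_pairs),
    }


# Hypothetical records: two browsers behind one proxy address, one direct
# visitor, and one crawler.
SAMPLE_RECORDS = [
    {"ip": "192.0.2.1", "user_agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"},
    {"ip": "192.0.2.1", "user_agent": "Mozilla/4.7 [en] (Win98; I)"},
    {"ip": "198.51.100.7", "user_agent": "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"},
    {"ip": "203.0.113.9", "user_agent": "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"},
]

print(estimate_visitors(SAMPLE_RECORDS))
```

On this sample the two estimates already disagree (two unique IP addresses versus three address-and-browser pairs), and neither can see a visitor who was served entirely from a proxy's cache, since that request never reaches the log.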

Compounding the problem, bigger Web service bureaus use completely

different methods for counting your visitors than your log analyzers do.

Media Metrix and NetRatings, for example, use old-world counting

methods similar to TV ratings services that use real people with

diaries to track their viewing habits. Neither of these companies tracks

users at schools, libraries, or other public places.
