
Web Analytics Comparisons – Accurate Data

Published: 07 Jul 2009

Web analytics hasn’t come up here in a while, but today I was revisited by the evils of logfile analysis and its over-inflated traffic numbers! Clients often ask me which of their web statistics tools is telling the truth. The answer is that whether you’re looking at logfile analysis reports from WebTrends or a client-side tracker like Google Analytics, both are correct in their own way; you just have to understand their methodologies. It’s important to get your head around them because the gap between the two is often around 30 percent!

The main difference is that server logfiles record all server activity, from browser-based surfing through to Googlebot visits and non-HTML file downloads (such as images viewed via Google Images), so analysing them gives a picture of both real AND automated visits. Client-side scripts only count browsers that actually run the tracking code, which in practice means real people viewing HTML pages, and therefore give the most accurate picture of human traffic.
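To make that difference concrete, here is a rough Python sketch of how raw logfile hits could be split into real page views, crawler hits and non-HTML downloads. It assumes the server writes the common "combined" log format, and the crawler markers and file extensions are my own illustrative lists, not a complete or authoritative set.

```python
import re

# Assumption: one request per line in the common "combined" log format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative lists only; real crawler detection needs a maintained list.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")
ASSET_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".css", ".js", ".ico", ".pdf")


def classify(line):
    """Label one log line as 'page', 'bot', 'asset' or 'unparsed'."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return "unparsed"
    agent = match.group("agent").lower()
    if any(marker in agent for marker in BOT_MARKERS):
        return "bot"
    # Ignore any query string before checking the file extension.
    path = match.group("path").split("?", 1)[0].lower()
    if path.endswith(ASSET_EXTENSIONS):
        return "asset"
    return "page"


def summarise(lines):
    """Count how many log lines fall into each category."""
    counts = {"page": 0, "bot": 0, "asset": 0, "unparsed": 0}
    for line in lines:
        counts[classify(line)] += 1
    return counts
```

Only the "page" count is roughly comparable with what a client-side tracker reports. Even then it will not match exactly, because crawlers that disguise their user agent still slip through a filter like this.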

Generally, most people want to know which is the best tool to quote for online advertising purposes or for analysis of real visitor browsing behaviour. To do that correctly you really have to turn to a client-side tracking application like Google Analytics. This ensures that spiders and robots are discounted, cutting the page view data reported by log files by around 30% (in my experience). It also affects session time, although I’ve not yet figured out whether crawlers inflate or deflate it, as who knows how long a crawler stays on a site: possibly just seconds each day, or 20 minutes once a month.

Digital ad servers like DoubleClick, which themselves use client-side code to serve adverts, produce ad impression data that excludes automated crawlers. This is why Google Analytics data will usually be quite accurate for selling advertising. There’s not much point in giving the outside world a false view of how busy a website is from logfile analysis if advertisers’ expectations on ad delivery will be dashed when a campaign is initiated.

With that in mind it is sensible to take around 30% off logfile-sourced page impression and visitor data. 30% is my average: I’ve seen some sites with a 15% discrepancy and some with 45%, hence I play it safe at around halfway.
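As a quick worked example of that rule of thumb, the sketch below applies the discount to a logfile figure. The function names are hypothetical, and the 15%, 30% and 45% values are simply the figures from my own experience quoted above, not industry constants.

```python
def estimate_human_traffic(logfile_count, discount=0.30):
    """Scale a logfile-sourced count down by the given discount (0.30 = 30%)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be between 0 and 1")
    return round(logfile_count * (1 - discount))


def estimate_range(logfile_count, low=0.15, high=0.45):
    """Return (pessimistic, optimistic) estimates using the observed extremes."""
    return (
        estimate_human_traffic(logfile_count, high),
        estimate_human_traffic(logfile_count, low),
    )


# 100,000 page impressions reported by the logfile tool
print(estimate_human_traffic(100_000))  # 70000
print(estimate_range(100_000))          # (55000, 85000)
```

Quoting the range alongside the single 30% figure is a reasonable way to show how rough the estimate is until you can compare it against a client-side tracker on the same site.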
