Analytics: Why you still need those log files


Ian Lurie Dec 31 2009

You may think that, with Google Analytics, you don’t need your server log files any more. Or, as someone who runs a web site for your business, you may not even know what log files are. They’re redundant, right?

Wrong.

You still need those log files!

What’s a log file?

Whoa there, you say. Ian, you’re going geeky. What the heck is a log file?
It’s your server’s record of who’s come to your site, when, and exactly what they looked at. It’s incredibly detailed, showing:

  • Where folks came from
  • What browser they were using
  • Exactly which files they looked at
  • How long it took to serve each file (if your server is set up to record that)
  • And a whole bunch of other nerdy stuff

If you were to look in a raw log file, you’d see something like this:

[Image: a raw log file]

I know: Blech. But stay with me.
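
If you’re curious what a script sees in that mess, here’s a minimal Python sketch. It assumes your server writes the "combined" log format (the usual Apache layout, and NGINX’s default); yours may differ. The sample line is made up for illustration, and LOG_PATTERN and parse_line are names invented for this example.

```python
import re

# Assumption: Combined Log Format. Adjust the pattern if your server logs differently.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of named fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up sample line, for illustration only.
SAMPLE = (
    '203.0.113.7 - - [31/Dec/2009:10:15:32 -0800] '
    '"GET /blog/log-files.html HTTP/1.1" 200 5120 '
    '"http://www.google.com/search?q=server+log+files" '
    '"Mozilla/5.0 (Windows; U; Windows NT 6.1) Firefox/3.5"'
)

if __name__ == "__main__":
    for field, value in parse_line(SAMPLE).items():
        print(f"{field}: {value}")
```

Each named group maps to one of the items in the list above: the referrer is where folks came from, the user agent is their browser, and the path is the file they looked at. Load time isn’t part of the combined format, so it only shows up if the server has been configured to add it.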

There’s gold in them thar files

Say you want to find every page on your site that received zero visitors from organic search in the past 2 years. That’s a useful statistic: Those pages may be invisible to search engines, or they may be really poorly optimized. A list of organic search outcasts could keep you busy for quite a while.

You could use Google Analytics to find those pages – I’ll be writing about that next week. But only if you’ve been using Google Analytics for 2 years, and only if you had it configured correctly, and only if you can rely on Google Analytics’ organic click detection (which isn’t always 100% accurate).

Or, you can find a geek like me, have them write a script, and zip through all of your files in a few minutes. Then you get a nice, neat list like this:

[Image: list of pages with no organic results]

That’s every page and image that hasn’t received a click from an organic search result.
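
A script for that can be short. Here’s a rough Python sketch, not the actual script behind that list. It assumes combined-format logs, and it treats any referrer whose host contains google., bing., or yahoo. as organic search. That’s a crude guess: it can’t tell paid clicks from organic ones. All names and sample lines are made up.

```python
import re
from urllib.parse import urlparse

# Assumption: Combined Log Format; only the path and referrer are captured.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)

# Assumption: a referrer host containing one of these counts as a search engine.
SEARCH_ENGINES = ("google.", "bing.", "yahoo.")


def is_organic(referrer):
    """Rough check: does the referrer's host look like a search engine?"""
    host = urlparse(referrer).netloc.lower()
    return any(name in host for name in SEARCH_ENGINES)


def pages_without_organic_visits(lines):
    """Return requested paths that never had a search-engine referrer."""
    seen, organic = set(), set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that aren't in the expected format
        path = match.group("path").split("?")[0]  # ignore query strings
        seen.add(path)
        if is_organic(match.group("referrer")):
            organic.add(path)
    return sorted(seen - organic)


# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    '203.0.113.7 - - [31/Dec/2009:10:15:32 -0800] "GET /widgets.html HTTP/1.1" '
    '200 5120 "http://www.google.com/search?q=widgets" "Mozilla/5.0"',
    '203.0.113.8 - - [31/Dec/2009:10:16:01 -0800] "GET /about.html HTTP/1.1" '
    '200 2048 "http://www.example.com/widgets.html" "Mozilla/5.0"',
    '203.0.113.9 - - [31/Dec/2009:10:17:45 -0800] "GET /widgets.html?ref=nav HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(pages_without_organic_visits(SAMPLE_LINES))
```

To run it on real logs, pass an open file instead of the sample list. Keep in mind it can only report pages that show up in the logs at least once. A page nobody ever requested won’t be listed, so compare the result against a full list of your pages.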

If you have this data, the cool scenarios are endless:

  • You need a quick list of all of the bots that are hitting your site. That could be key if your bandwidth usage is spiking and you don’t know why. Write a script to grab all of the spiders that hit your site, see if any of them are, say, crawling every MP3 file you’ve got, and then block ’em.
  • Check to see what product images folks viewed most often before making a purchase. Yep, you can do that.
  • Trace one user’s path through your site.
  • Trace the path of all users who visited a certain page. Possible in some analytics packages, but it can get pretty klutzy.

And so on.
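
The bot list, for instance, might look something like this. It’s a sketch under a couple of assumptions: combined-format logs, and bots that admit to being bots somewhere in their user agent. The hint words below are a guess, not a complete list, and sneaky crawlers won’t match. The function name and sample lines are invented.

```python
import re
from collections import Counter

# Assumption: Combined Log Format; only response size and user agent are captured.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\S+) '
    r'"[^"]*" "(?P<user_agent>[^"]*)"'
)

# Assumption: self-identifying crawlers use one of these words in their user agent.
BOT_HINTS = ("bot", "spider", "crawl", "slurp")


def bandwidth_by_bot(lines):
    """Return (user_agent, total_bytes) pairs for likely bots, biggest first."""
    totals = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue
        agent = match.group("user_agent")
        if any(hint in agent.lower() for hint in BOT_HINTS):
            size = match.group("size")
            # The size field is "-" when no body was sent; count that as zero.
            totals[agent] += int(size) if size.isdigit() else 0
    return totals.most_common()


# Made-up sample lines, for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LINES = [
    f'198.51.100.1 - - [31/Dec/2009:09:00:00 -0800] "GET /song1.mp3 HTTP/1.1" '
    f'200 4000000 "-" "{GOOGLEBOT}"',
    f'198.51.100.1 - - [31/Dec/2009:09:00:05 -0800] "GET /song2.mp3 HTTP/1.1" '
    f'200 3500000 "-" "{GOOGLEBOT}"',
    '203.0.113.7 - - [31/Dec/2009:09:01:00 -0800] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1) Firefox/3.5"',
]

if __name__ == "__main__":
    for agent, total in bandwidth_by_bot(SAMPLE_LINES):
        print(f"{total:>12,} bytes  {agent}")
```

Anything near the top of that list that you don’t recognize is a candidate for a closer look before you block it.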

Log files are your idiot insurance

If someone, say, deletes your Google Analytics account, your log files are your fallback.

Log files are constant

If you switch from one analytics platform to another, you can see a big rise or fall in traffic or pageviews. Different platforms use different measurement methods, and that makes for inconsistencies.

But that’s really hard to explain to your boss when she asks you why the new site lost 30% of the old site’s traffic.

Analyze the log files instead, and you get a consistent view of traffic that spans both the old and the new analytics packages. Problem solved.
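
Here’s one way that could look: count page requests per month straight from the logs, using a single definition of a "pageview" for the whole period. A sketch only. It assumes combined-format logs and that your pages end in / or .html (adjust both for your site), and it makes no attempt to filter out bots. The function name and sample lines are invented.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: Combined Log Format; timestamp, path, and status are captured.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

# Assumption: anything ending in one of these is a "page" rather than an asset.
PAGE_SUFFIXES = ("/", ".html")


def pageviews_by_month(lines):
    """Count successful page requests per calendar month, e.g. {'2009-12': 1}."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or match.group("status") != "200":
            continue
        path = match.group("path").split("?")[0]
        if not path.endswith(PAGE_SUFFIXES):
            continue  # images, scripts, and so on are not pageviews
        stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        counts[stamp.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    '203.0.113.7 - - [12/Nov/2009:10:15:32 -0800] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [12/Nov/2009:10:16:02 -0800] "GET /about.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '203.0.113.8 - - [31/Dec/2009:08:00:00 -0800] "GET /about.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '203.0.113.8 - - [31/Dec/2009:08:00:01 -0800] "GET /logo.png HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [31/Dec/2009:08:05:00 -0800] "GET /missing.html HTTP/1.1" 404 300 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for month, views in pageviews_by_month(SAMPLE_LINES).items():
        print(month, views)
```

The numbers won’t match either analytics package exactly. The point is that the same rule is applied before and after the switch, so the trend is comparable.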

The short version…

…make sure you’re keeping those log files around. And make sure they’re recording every available variable, including:

  • Referrer
  • User agent
  • Referring keyword
  • File name
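
Pulling the keyword out of a referrer takes a little parsing, since it rides inside the referrer URL as a query parameter. A small Python sketch, assuming the q parameter for Google and Bing and p for Yahoo. The function name is invented, and a referrer that doesn’t carry a search phrase simply returns nothing.

```python
from urllib.parse import parse_qs, urlparse

# Assumption: the query parameter each engine uses for the search phrase.
KEYWORD_PARAMS = {"google.": "q", "bing.": "q", "yahoo.": "p"}


def referring_keyword(referrer):
    """Return the search phrase carried in a referrer URL, or None."""
    parts = urlparse(referrer)
    host = parts.netloc.lower()
    for engine, param in KEYWORD_PARAMS.items():
        if engine in host:
            values = parse_qs(parts.query).get(param)
            return values[0] if values else None
    return None


if __name__ == "__main__":
    # Made-up referrer, for illustration only.
    print(referring_keyword("http://www.google.com/search?q=server+log+files"))
```

That’s why the referrer needs to be logged in full: truncate it and the keyword goes with it.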

Most important, don’t let your webmaster delete old log files. Lots of hosting companies do this to ‘save space’. But it’s pretty easy to compress old log files to a far smaller size. Even if your log files are huge, you can store them on a humungous hard drive or use a service like Amazon S3.
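
Compressing is a few lines of work. Here’s a minimal Python sketch using the standard gzip module. The function name and file path are made up, and it leaves the original file alone so you can check the copy before anyone tidies up.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(path):
    """Write a gzip copy next to the original (name + '.gz') and return its path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams the file, so large logs are fine
    return target


# Hypothetical usage:
#   compress_log("access.log.2009-12")
```

Log files are repetitive text, so they tend to shrink a lot. Python can also read the compressed copy directly with gzip.open, so an analysis script doesn’t need the uncompressed file back.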

Don’t. Delete. Your. Log. Files.

If this is all total gibberish to you, don’t worry about it. Just make sure you keep your log files around, and that they’re complete. Wave at your IT guy and say ‘make it so’.

You’ll thank me later.

tags : conversation marketing


2 Comments

  1. Kim

    Thanks for another helpful post, Ian.
    I’ve read other bloggers who say it’s important to keep the log files, but no one seems to say ‘why’ or ‘how’ or shows what they look like. It helps to know what I’m looking for.
    So now that you have convinced me, I’ve got to go figure out where (and IF) GoDaddy keeps my logs. Wish me luck.
    Happy New Year!

  2. The problem is, HTTP being what it is, without a means of tracking the session you won’t get useful info relating to individual user journeys after the fact. You don’t get /great/ journey info. from GA, but it’s better than nothing.
