Analytics: Why you still need those log files
Ian Lurie Dec 31 2009
You may think that, with Google Analytics, you don’t need your server log files any more. Or, as someone who runs a web site for your business, you may not even know what log files are. They’re redundant, right?
You still need those log files!
What’s a log file?
Whoa there, you say. Ian, you’re going geeky. What the heck is a log file?
It’s your server’s record of who’s come to your site, when, and exactly what they looked at. It’s incredibly detailed, showing:
- Where folks came from
- What browser they were using
- Exactly which files they looked at
- How long it took to load each file
- And a whole bunch of other nerdy stuff
If you were to look in a raw log file, you’d see line after line of dense, cryptic text, one line for every file your server handed out.
I know: Blech. But stay with me.
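For the curious, here’s a rough sketch of how a script might pick one of those lines apart. It assumes your server writes the common Apache "combined" format, which may not match your setup. The sample line is made up for illustration, and `parse_line` is just a name I picked. Note that load time isn’t part of that default format; your server has to be configured to record it.

```python
import re

# Apache "combined" log format. This is an assumption: your server may be
# configured to log different fields, or the same fields in another order.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# A made-up sample line, for illustration only.
SAMPLE = (
    '203.0.113.5 - - [31/Dec/2009:10:15:32 -0800] '
    '"GET /blog/log-files.html HTTP/1.1" 200 5120 '
    '"http://www.google.com/search?q=server+log+files" '
    '"Mozilla/5.0 (Windows; U; Windows NT 6.1) Firefox/3.5"'
)

if __name__ == "__main__":
    # Print each field on its own line: where they came from (referrer),
    # what browser they used (agent), which file they looked at (path).
    for field, value in parse_line(SAMPLE).items():
        print(f"{field}: {value}")
```

Every field in the bulleted list above maps to one named group in the pattern, which is what makes the rest of the tricks in this post possible.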
There’s gold in them thar files
Say you want to find every page on your site that received zero visitors from organic search in the past 2 years. That’s a useful statistic: Those pages may be invisible to search engines, or they may be really poorly optimized. A list of organic search outcasts could keep you busy for quite a while.
You could use Google Analytics to find those pages (I’ll be writing about that next week). But only if you’ve been using Google Analytics for 2 years, and only if you had it configured correctly, and only if you can rely on Google Analytics’ organic click detection (which isn’t always 100% accurate).
Or, you can find a geek like me, have them write a script, and zip through all of your files in a few minutes. Then you get a nice, neat list of every page and image that hasn’t received a click from an organic search result.
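Here’s a minimal sketch of what that kind of script might look like. It is not the exact script I’d run for a client. It assumes Apache "combined" format logs, and the `SEARCH_ENGINES` list and the function names are my own placeholders. Two caveats: a referrer alone can’t tell a paid click from an organic one, and a page that was never requested at all won’t show up in the logs in the first place.

```python
import re
from urllib.parse import urlparse

# Assumes Apache "combined" format; adjust the pattern if your logs differ.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)

# Assumed list of search engine host fragments; extend it for your audience.
SEARCH_ENGINES = ("google.", "bing.", "yahoo.")


def is_search_referrer(referrer):
    """True if the referrer's host name looks like a search engine."""
    host = urlparse(referrer).netloc.lower()
    return any(engine in host for engine in SEARCH_ENGINES)


def organic_outcasts(lines):
    """Return requested files that never had a search engine referrer."""
    seen, from_search = set(), set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that aren't in the expected format
        path = match["path"].split("?")[0]  # ignore query strings
        seen.add(path)
        if is_search_referrer(match["referrer"]):
            from_search.add(path)
    return sorted(seen - from_search)
```

Because an open file is just a sequence of lines, you could feed it a log directly, for example `organic_outcasts(open("access.log"))`, where `access.log` is a hypothetical file name.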
If you have this data, the cool scenarios are endless:
- You need a quick list of all of the bots that are hitting your site. That could be key if your bandwidth usage is spiking and you don’t know why. Write a script to grab all of the spiders that hit your site, see if any of them are, say, crawling every MP3 file you’ve got, and then block ’em.
- Check to see what product images folks viewed most often before making a purchase. Yep, you can do that.
- Trace one user’s path through your site.
- Trace the path of all users who visited a certain page. Possible in some analytics packages, but it can get pretty klutzy.
And so on.
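To make a couple of those scenarios concrete, here’s a small sketch that lists bot-like user agents by hit count and traces the files one visitor requested. It assumes Apache "combined" format again. The `BOT_HINTS` list is a guess on my part (plenty of bots don’t identify themselves), and treating one IP address as one user is a simplification.

```python
import re
from collections import Counter

# Assumes Apache "combined" format; adjust the pattern if your logs differ.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "\S+ (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumed hints that a user agent is a bot; real bots may not announce it.
BOT_HINTS = ("bot", "spider", "crawl", "slurp")


def parse(lines):
    """Yield a dict of fields for every line that matches the pattern."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield match.groupdict()


def bot_hits(lines):
    """Return (user agent, request count) pairs for bot-like agents, busiest first."""
    counts = Counter()
    for rec in parse(lines):
        if any(hint in rec["agent"].lower() for hint in BOT_HINTS):
            counts[rec["agent"]] += 1
    return counts.most_common()


def path_for_ip(lines, ip):
    """Return the files one IP address requested, in the order they were logged."""
    return [rec["path"] for rec in parse(lines) if rec["ip"] == ip]
```

If one spider turns out to be hammering your MP3s, filtering `parse(lines)` on that agent and on paths ending in `.mp3` would confirm it before you block anything.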
Log files are your idiot insurance
If someone, say, deletes your Google Analytics account, your log files are your fallback.
Log files are constant
If you switch from one analytics platform to another, you can see a big rise or fall in traffic or pageviews. Different platforms use different measurement methods, and that makes for inconsistencies.
But that’s really hard to explain to your boss when she asks you why the new site lost 30% of the old site’s traffic.
Analyze the log files, instead, and you can get a consistent view of traffic that spans both the old and the new analytics packages. Problem solved.
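As a rough illustration, a script along these lines could count page views per month straight from the logs, using the same yardstick before and after the switch. It assumes Apache "combined" format timestamps, and the `PAGE_EXTENSIONS` rule for what counts as a "page" is my own assumption; pick whatever definition you like, as long as you keep it the same across both periods.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes Apache-style log lines; adjust the pattern if your logs differ.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

# Assumed definition of a "page": these endings count, images and scripts don't.
PAGE_EXTENSIONS = (".html", ".htm", ".php", "/")


def monthly_pageviews(lines):
    """Return {"YYYY-MM": count} of successful page requests, oldest month first."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match or match["status"] != "200":
            continue  # skip unparseable lines, errors, and redirects
        path = match["path"].split("?")[0]
        if not path.endswith(PAGE_EXTENSIONS):
            continue  # skip images, stylesheets, and other non-page files
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        counts[when.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))
```

The numbers won’t match either analytics package exactly, but they’ll be measured the same way on both sides of the migration, which is the point.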
The short version…
…make sure you’re keeping those log files around. And make sure they’re recording every available variable, including:
- User agent
- Referrer (that’s where the search keyword shows up)
- File name
Most important, don’t let your webmaster delete old log files. Lots of hosting companies do this to ‘save space’. But it’s pretty easy to compress old log files to a far smaller size. Even if your log files are huge, you can store them on a humungous hard drive or use a service like Amazon S3.
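Compressing an old log file takes only a few lines. Here’s a minimal sketch using Python’s standard `gzip` module; `compress_log` is a name I made up. It deliberately leaves the original file in place, so removing the uncompressed copy stays a separate, conscious step once you’ve checked the compressed one.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(path):
    """Gzip one log file next to the original and return the new file's path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    # Stream the file through gzip so even huge logs don't have to fit in memory.
    with open(source, "rb") as raw, gzip.open(target, "wb") as packed:
        shutil.copyfileobj(raw, packed)
    return target
```

Log files are repetitive text, so they usually shrink a lot, and `gzip.open(target, "rt")` reads the compressed file back line by line when you need it again.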
Don’t. Delete. Your. Log. Files.
If this is all total gibberish to you, don’t worry about it. Just make sure you keep your log files around, and that they’re complete. Wave at your IT guy and say ‘make it so’.
You’ll thank me later.