You may think that, with Google Analytics, you don’t need your server log files any more. Or, as someone who runs a web site for your business, you may not even know what log files are. They’re redundant, right?
You still need those log files!
If you don’t know what a log file is, read this.
There’s gold in them thar files
Say you want to find every page on your site that received zero visitors from organic search in the past 2 years. That’s a useful statistic: Those pages may be invisible to search engines, or they may be really poorly optimized. A list of organic search outcasts could keep you busy for quite a while.
You could use Google Analytics to find those pages (I’ll be writing about that next week). But only if you’ve been using Google Analytics for 2 years, and only if you had it configured correctly, and only if you can rely on Google Analytics’ organic click detection (which isn’t always 100% accurate).
Or, you can find a geek like me, have them write a script, and zip through all of your files in a few minutes. Then you get a nice, neat list like this:
That’s every page and image that hasn’t received a click from an organic search result.
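If you're curious what that kind of script looks like, here's a minimal sketch in Python. It is not the script I used, just an illustration. It assumes your server writes the common "combined" log format (request, status, bytes, referrer, user agent), and the `SEARCH_ENGINES` list and the function names are my own placeholders, so adjust them for your setup.

```python
import re
from urllib.parse import urlparse

# Assumption: "combined" log format, where the quoted request is followed by
# the status code, the byte count, and the quoted referrer.
LOG_LINE = re.compile(r'"\S+ (\S+)[^"]*" \d{3} \S+ "([^"]*)"')

# Assumption: a referrer whose hostname contains one of these counts as
# organic search. This is a rough, hypothetical list, not a complete one.
SEARCH_ENGINES = ("google.", "bing.", "yahoo.", "duckduckgo.")


def is_search_referrer(referrer):
    """Return True if the referring URL looks like a search engine."""
    host = urlparse(referrer).netloc.lower()
    return any(name in host for name in SEARCH_ENGINES)


def pages_without_organic_visits(lines):
    """List every requested path that never had a search engine referrer."""
    seen = set()
    from_search = set()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that don't fit the assumed format
        path = match.group(1).split("?", 1)[0]  # ignore query strings
        seen.add(path)
        if is_search_referrer(match.group(2)):
            from_search.add(path)
    return sorted(seen - from_search)
```

To run it against a real file, open the log (the name `access.log` is just an example) and pass the file object in: `pages_without_organic_visits(open("access.log"))`. One caveat: the script only knows about pages that show up in the logs at all, and it can't tell a paid click from an organic one by hostname alone. To catch pages that got no requests whatsoever, you'd compare the result against a full list of your URLs.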
If you have this data, the cool scenarios are endless:
- You need a quick list of all of the bots that are hitting your site. That could be key if your bandwidth usage is spiking and you don’t know why. Write a script to grab all of the spiders that hit your site, see if any of them are, say, crawling every MP3 file you’ve got, and then block ’em.
- Check to see what product images folks viewed most often before making a purchase. Yep, you can do that.
- Trace one user’s path through your site.
- Trace the path of all users who visited a certain page. Possible in some analytics packages, but it can get pretty klutzy.
And so on.
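The bot list from the first scenario is probably the easiest of these to sketch. The Python below assumes the user agent is the last quoted field on each log line (true for the "combined" format) and treats any agent containing one of a few hint words as a bot. The `BOT_HINTS` list and the function name are my own assumptions, and well-disguised bots won't match.

```python
import re
from collections import Counter

# Assumption: the user agent is the last quoted field on the line.
UA_AT_END = re.compile(r'"([^"]*)"\s*$')

# Hypothetical hint words; polite crawlers usually include one of them.
BOT_HINTS = ("bot", "spider", "crawl", "slurp")


def bot_hits(lines):
    """Count requests per bot-like user agent, busiest first."""
    counts = Counter()
    for line in lines:
        match = UA_AT_END.search(line)
        if not match:
            continue  # no quoted field at the end of this line
        agent = match.group(1)
        if any(hint in agent.lower() for hint in BOT_HINTS):
            counts[agent] += 1
    return counts.most_common()
```

The result is a list of (user agent, hit count) pairs. To see whether one of them is hammering your MP3s, you could filter the lines down to requests for `.mp3` files before passing them in.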
Log files are your idiot insurance
If someone, say, deletes your Google Analytics account, your log files are your fallback.
Log files are constant
If you switch from one analytics platform to another, you can see a big rise or fall in traffic or pageviews. Different platforms use different measurement methods, and that makes for inconsistencies.
But that’s really hard to explain to your boss when she asks you why the new site lost 30% of the old site’s traffic.
Analyze the log files, instead, and you can get a consistent view of traffic that spans both the old and the new analytics packages. Problem solved.
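As a rough illustration of that consistent view, this Python sketch counts requests per month straight from the logs. It assumes timestamps in the usual Apache style, such as `[10/Oct/2010:13:55:36 -0700]`, and the function name is hypothetical. Note that it counts every request, including images and bots, so for something closer to pageviews you'd filter the lines first.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: the timestamp is the first bracketed field on each line,
# formatted like 10/Oct/2010:13:55:36 -0700.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def requests_per_month(lines):
    """Return a dict mapping 'YYYY-MM' to the number of logged requests."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue
        try:
            when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # timestamp isn't in the assumed format
        counts[when.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))
```

Because the same counting rule runs over the months before and after the switch, any jump in these numbers is a real change in traffic, not a change in how it was measured.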
The short version…
…make sure you’re keeping those log files around. And make sure they’re recording every available variable, including:
- User agent
- Referrer (the referring URL, which carries the search keyword when the search engine passes it along)
- File name
Most important, don’t let your webmaster delete old log files. Lots of hosting companies do this to ‘save space’. But it’s pretty easy to compress old log files to a far smaller size. Even if your log files are huge, you can store them on a humungous hard drive or use a service like Amazon S3.
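Compressing an old log takes only a few lines. Here's a minimal sketch using Python's standard `gzip` module. The function name is my own, and it deliberately leaves the original file in place so you can check the compressed copy before removing anything.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(path):
    """Write a gzip copy of the log next to the original and return its path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as raw, gzip.open(target, "wb") as packed:
        shutil.copyfileobj(raw, packed)  # stream it, so huge files are fine
    return target
```

Log files are repetitive plain text, so they tend to shrink a lot. Most servers can also do this automatically as part of log rotation, which is worth asking your host about.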
Don’t. Delete. Your. Log. Files.
If this is all total gibberish to you, don’t worry about it. Just make sure you keep your log files around, and that they’re complete. Wave at your IT guy and say ‘make it so’.
You’ll thank me later.
Thanks for another helpful post, Ian.
I’ve read other bloggers who say it’s important to keep the log files, but no one seems to say ‘why’ or ‘how’ or show what they look like. It helps to know what I’m looking for.
So now that you have convinced me, I’ve got to go figure out where (and IF) GoDaddy keeps my logs. Wish me luck.
Happy New Year!
The problem is, HTTP being what it is, without a means of tracking the session you won’t get useful info relating to individual user journeys after the fact. You don’t get /great/ journey info. from GA, but it’s better than nothing.