Ian Lurie // Mar 3 2010
Yesterday I explained why, in internet marketing, attribution is a mind-sucking pain in the ass. Not very satisfying, I know, because all I said was “This is really difficult” without providing any solutions. I was channeling Republicans. Tonight I’ll channel Democrats and provide a hopelessly complex yet plausible solution upon which no one will ever agree.
You can provide accurate attribution data. It’s hard, but it’s doable. All you need are:
I’ll be talking about this at SMX West tomorrow. I’m on the Analytics Action Plans for SEO and PPC panel. I’ll have 11 minutes to cover this as part of my presentation on analytics-driven SEO. I recommend recording it and playing it back in slow motion.
Your server logs are a fantastic data store. There are lots of great things you can learn from them. For our purposes here, though, we’re going to focus on attribution.
First, you need to figure out what counts as a conversion on your site. I wrote about that many moons ago here. A conversion might be an e-mail newsletter signup, or a purchase, or something as simple as a visitor reading one specific page.
When a visitor completes a purchase, they land on a ‘thank you’ confirmation page. If they sign up for your newsletter, they land on a different page that says ‘thanks’. Those pages are goal pages. Every time a visitor views one, they’ve done something you wanted them to do.
I’m going to use one particular page on my blog – the Speaking page – as my goal page. The page filename is ‘/colloquium’. I just want folks to read it, nothing fancy.
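If you end up scripting any of this instead of eyeballing it, the goal page check is the first thing you'll need. Here's a minimal sketch, assuming the requested URL shows up in your logs as a path with an optional query string. The `GOAL_PAGES` set and the `is_goal_request` name are mine, made up for illustration. Swap in your own goal pages.

```python
from urllib.parse import urlsplit

# Hypothetical set of goal pages. '/colloquium' is the one used in this post.
GOAL_PAGES = {"/colloquium"}


def is_goal_request(requested_url: str) -> bool:
    """Return True if the requested URL's path is one of the goal pages."""
    path = urlsplit(requested_url).path
    # Treat '/colloquium' and '/colloquium/' as the same page.
    normalized = path.rstrip("/") or "/"
    return normalized in GOAL_PAGES
```

The query string gets ignored on purpose, so a tagged URL like '/colloquium?utm_source=x' still counts as a goal page view.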
Get your server log files. You don’t need all of them! On a site that gets more than 10,000 visits a day, a year’s worth of log files could clog every tube on the interwebs. Instead, choose a time period you want to survey, and get a limited set of logs.
If you manage your own server, this part’s easy – you should already know where those log files are.
If you’re using a service like Media Temple, then you can download your log files via your control panel.
And, if you’re at a humungous company with an IT department, you can ask them to give you access to the logs.
I’m going to use Conversation Marketing’s log files: 2 days’ worth. On to step 4.
The raw log files are an ugly blob of text. You’ll need some kind of tool to help you sift through the data. My favorites include:
For now, I’m going to go with Excel. It’s a hell of a lot easier, and my log file is really small.
I’m also going to assume you know what the log file is, and how to interpret it. If you don’t, do some homework first.
There’s a lot of stuff in the log file that you don’t need. I usually remove:
All you need is the visitor IP address, the referring page, the requested page, the cookie, and the date/time. The more you clean up, the easier it’ll be to work with.
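If your log file is too big for Excel, you can pull those five fields out with a short script. This is a rough sketch, not gospel: it assumes an Apache-style combined log line with the cookie tacked on as one extra quoted field at the end. Your server's log format may well be different, so treat `LOG_PATTERN` and `parse_line` as hypothetical and adjust the pattern to match what you actually have.

```python
import re
from datetime import datetime

# Assumed layout (combined log format plus a trailing quoted cookie field):
# ip - - [time] "METHOD page PROTOCOL" status bytes "referrer" "agent" "cookie"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" '
    r'"(?P<agent>[^"]*)" '
    r'"(?P<cookie>[^"]*)"'
)


def parse_line(line):
    """Return the five fields needed for attribution, or None if no match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "ip": match.group("ip"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "referrer": match.group("referrer"),
        "page": match.group("page"),
        "cookie": match.group("cookie"),
    }
```

Everything else in the line (status code, bytes, user agent) is matched but thrown away, which is the scripted version of deleting columns you don't need.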
In Excel, I sort by cookie and then IP address, and then date. Then I use conditional formatting to highlight any page request for the goal page. The result looks like this:
See that? Each bolded list is one visitor’s many visits to my site. Note how these visitors came from many different places over time: The same person came to my site from ‘Facepunch.com’ (?!!), Facebook.com, Cracked Magazine, and then a combination of PPC and organic search results. This is a very complete picture of their interaction with my site.
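The sort-and-highlight step translates to code pretty directly. Here's a sketch, assuming each log line has already been reduced to a dictionary with 'cookie', 'ip', 'time' and 'page' keys. Those key names and the `sort_and_flag` function are my own invention for this example, and the 'is_goal' flag stands in for Excel's conditional formatting.

```python
GOAL_PAGE = "/colloquium"  # the goal page used in this post


def sort_and_flag(records):
    """Sort by cookie, then IP address, then date/time, and flag goal page hits.

    Returns new dictionaries, leaving the input records untouched.
    """
    ordered = sorted(records, key=lambda r: (r["cookie"], r["ip"], r["time"]))
    return [dict(r, is_goal=(r["page"] == GOAL_PAGE)) for r in ordered]
```

After sorting, each visitor's requests sit together in time order, so a flagged row marks the end of one conversion path.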
Now you have a true picture of how each conversion unfolded. Look at this sample – the stuff outlined in red is the referring history for a single visitor, ending in a conversion (since /colloquium is my conversion page):
If I were using a typical analytics tool and applied the default last- or first-click attribution, the 2nd conversion would’ve been attributed to Facepunch.com or a Google Organic search on ‘how to write better headlines’.
With my quick-and-dirty log file analysis, I can see the attribution picture is much more complex. This person came to my site from lots of different referrers and keywords before they converted. Every one of those referrers/keywords played a part, so it’s important that I not rule them out.
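To pull that referring history out with a script, something like the following would work. It's a sketch under a few assumptions: each record is a dictionary with 'cookie', 'time', 'referrer' and 'page' keys, the cookie identifies the visitor, and `OWN_HOST` is a placeholder for your own hostname so internal clicks get skipped. All of those names are hypothetical.

```python
from urllib.parse import urlsplit

GOAL_PAGE = "/colloquium"     # the goal page used in this post
OWN_HOST = "www.example.com"  # placeholder: replace with your own hostname


def conversion_paths(records):
    """Map each converting visitor to the external referrers seen before
    (and including) their first goal page view, in chronological order."""
    histories = {}
    converted = {}
    for r in sorted(records, key=lambda r: r["time"]):
        visitor = r["cookie"]
        if visitor in converted:
            continue  # only track the path to the first conversion
        host = urlsplit(r["referrer"]).netloc
        # Skip empty referrers ('-') and clicks from within the site itself.
        if host and host != OWN_HOST:
            histories.setdefault(visitor, []).append(host)
        if r["page"] == GOAL_PAGE:
            converted[visitor] = histories.get(visitor, [])
    return converted
```

Visitors who never reach the goal page are left out of the result, so what comes back is one referrer list per conversion.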
It’s ugly, but it’s also full attribution: If I see that 4 visitors came to my site from a search for ‘ustream tutorial’ at some point, and then later converted off a different search, I’ll still note ‘ustream tutorial’ as a valuable phrase. If I were using a tool like Google Analytics, I would’ve missed that, because ‘ustream tutorial’ wasn’t the first or last click before conversion.
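Here's one way to put numbers on that difference. This sketch assumes you already have one chronological list of referrers or keywords per conversion. The `attribution_counts` function is my own illustration, not something built into any analytics tool, and the sample paths in the usage note are made up.

```python
from collections import Counter


def attribution_counts(paths):
    """Count conversions per source under three attribution models.

    paths: one chronological list of referrers/keywords per conversion.
    Returns (first_click, last_click, full) Counters.
    """
    first_click, last_click, full = Counter(), Counter(), Counter()
    for path in paths:
        if not path:
            continue  # conversion with no recorded referrer
        first_click[path[0]] += 1
        last_click[path[-1]] += 1
        # Full attribution: credit every source once per conversion.
        full.update(set(path))
    return first_click, last_click, full
```

Feed it two invented paths, say ['facepunch.com', 'ustream tutorial', 'how to write better headlines'] and ['facebook.com', 'ustream tutorial', 'internet marketing'], and 'ustream tutorial' gets zero credit under first- or last-click but shows up in both conversions under full attribution.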
If I process entire log files this way, I can start to get a more accurate picture of which pages truly contribute to conversions. I can collate the data and find patterns like:
Those are great insights. If you just go with typical analytics tools and first- or last-click attribution, you won’t learn any of that.
So, learn to love your log files. It’s not the easiest thing in the world, but it pays off in the long run.
I hope this was helpful. It’s super, super geeky stuff. If you liked this piece and found it useful, let me know in a comment. If there’s a part of this that needs more explanation, let me know that in a comment, too. If you know of another tool that will do all of this work in 2-3 clicks, don’t tell me – it’ll just make me sad.
Ian Lurie is founder and CEO of Portent Inc., an internet marketing agency that has provided internet marketing, including PPC, SEO, social and analytics services, since 1995.