Google Webmaster Tools Query Data is Worthless


Ian Lurie Jun 12 2013

The short version, if you want to skip me ranting like a lunatic: Google Webmaster Tools query data is, as far as I can tell, completely, 100% useless. It’s not a good ‘relative comparison.’ It’s so wrong that it might actually be a bad idea to use it at all.

Now, here’s the whole, tragic story:

What happened

I’m a pretty empirical guy. I like my numbers. They’re comforting.

So, when the realization dawns that data we all depend on is a big, fat chunk of steaming camel manure, I get a little…

Perturbed.

This week, Google Webmaster Tools had me reaching for the Valium, when a little bit of math proved GWT query data is said pile of recycled camel snacks.

How it happened

I built a tool that downloads Google Webmaster Tools web queries every night: A little bit of geeky goodness that made me smile, and let us archive query data for more than 90, 60 or 30 days, or 12 hours or whatever the next policy change brings.
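
The archiving half of the tool is nothing fancy. Here's a rough sketch of the idea in Python (the CSV columns and function name are invented for illustration; GWT's actual export format differs):

```python
import csv
import io

def archive_queries(csv_text, fetch_date, archive_rows):
    """Append one day's GWT query export to a running archive,
    stamping every row with the date it was fetched."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["fetch_date"] = fetch_date
        archive_rows.append(row)
    return archive_rows

# Two days of (invented) exports accumulate into one archive,
# so the data survives whatever retention window Google picks next.
day1 = "Query,Clicks\nblue widgets,120\nred widgets,<10\n"
day2 = "Query,Clicks\nblue widgets,95\n"
archive = []
archive_queries(day1, "2013-06-10", archive)
archive_queries(day2, "2013-06-11", archive)
```

Run that against each night's export and you keep every day's numbers long after Google's retention window drops them.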

Then I pulled all that data together and, in what may be the single stupidest idea ever, decided to compare the Webmaster Tools query data to analytics query data.

Here’s what I did:

  1. Picked 5 clients with overall organic traffic ranging from 3,000 visits/month up to 2,000,000 visits/month.
  2. Imported the Google Webmaster Tools (GWT – I’m sick of typing it) data into Excel.
  3. Dumped all terms with ‘<10’ clicks.
  4. Did the same with the analytics search data.
  5. Calculated the average ‘not provided’ impact for each client.
  6. Used that impact to attempt to adjust the GWT click numbers.
  7. Measured the percentage error by keyword.
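
As a sketch, with invented numbers (and note that shrinking the GWT clicks by the (not provided) share is my reading of step 6; the real spreadsheet work was messier):

```python
def percent_errors(gwt_clicks, ga_clicks, not_provided_share):
    """Per-keyword % error between (not provided)-adjusted GWT clicks
    and analytics clicks. Keywords under 10 clicks are dropped (step 3)."""
    errors = {}
    for kw, gwt in gwt_clicks.items():
        if gwt < 10 or kw not in ga_clicks:
            continue
        adjusted = gwt * (1 - not_provided_share)  # step 6: adjust GWT down
        errors[kw] = abs(adjusted - ga_clicks[kw]) / ga_clicks[kw] * 100  # step 7
    return errors

# Hypothetical site where 30% of organic visits are (not provided).
gwt = {"blue widgets": 100, "red widgets": 5, "green widgets": 50}
ga = {"blue widgets": 40, "green widgets": 34}
errs = percent_errors(gwt, ga, 0.30)
```

In that toy example, "blue widgets" adjusts to 70 GWT clicks against 40 analytics clicks: a 75% error.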

The result

Oh. My. Gods. The best result was an average 40% error. The worst? A client using Omniture who had a whopping 149% average error. Here’s a histogram. It’s not pretty:

gwt error histogram

When I saw this result, I tried a few different things:

I pulled GWT query data that includes all queries, instead of just web queries. That made the result even worse. The average percent error rose, with the best result at 45%, and the worst at over 170%.

Next, I checked data accuracy by date for a single domain. It turns out that GWT query data isn’t even consistently wrong. Over time, data accuracy for a single domain fluctuates wildly:

gwt errors across a single domain

That’s one domain, measured every few days.

Random points

  1. There were roughly equal numbers of instances where GWT was too high and too low. So this isn’t about GWT or the analytics tools consistently over- or under-counting. It’s random.
  2. I used clients in industries ranging from publishing to e-commerce.
  3. I was utterly sober while doing this math.

My questions

OK, Google, or Avinash, or someone, answer me this:

  1. Am I just doing it wrong? Please tell me I’m just doing it wrong. Please?
  2. If I’m right, why even show this data to us? It has zero value as a relative measure. As an absolute measure it’s worth a bit less than an NSA privacy agreement.
  3. Again, if I’m right, where does this data come from? Inebriated gnomes? A bunch of Atari 2600s managed by chimps? Or are you rolling a pile of dice?
  4. If I measure a single keyword over time, am I going to see the same kind of randomness? I’m not sure I could take it, so I’ll refrain. Just curious.

With that, I’ll return to my room, sit down on the floor, grab my knees and rock gently to and fro while humming tunelessly in a minor key.

61 Comments

  1. Gareth

    I had the same issues. It’s nearly always wrong, and when it was right it was actually counting the rank of images (as best I could tell?). I’m not sure what any of it relates to other than that, because as you say, it’s just wrong.

  2. Thanks for this post Ian. We have done some similar comparisons at Moz and reached the same conclusion. It’s disturbing how different the GWT and GA data are.

    • Omid S

      With all due respect gentlemen, you must be all TRIPPING!

      GWT has a default filter that NEVER gets saved (i.e., move to any other section from that queries page and it is reset)… FILTER: WEB… you want “all”. It matches GA data 100%.

      Also watch out for the same reset on any date ranges you set; they won’t stick once you navigate away.

  3. What was the purpose of adjusting for ‘not provided’? Is this because you compared on a keyword level?

    What happens when you compare broader metrics, meaning total organic clicks, GWMT mobile and GWMT web (via the queries chart, not the queries table), vs. total Google Analytics organic visits?

    If you dumped all terms with fewer than 10 clicks, you could potentially be dumping a large amount of long tail traffic.

  4. Yes, that’s true, WMT shows dramatic changes over time. Sometimes I saw huge spikes and sometimes huge drops, but the traffic and keywords stayed the same. Don’t know how or from where they populate this :S

  5. gied

    Have you considered evaluating visits to a particular page rather than keywords? In my market I find it’s the norm that particular keywords have very different “not provided” percentages.

  6. I agree 100%!! The data in WMT is messed with purposely, just like it is in Google Trends or the keyword tool.
    This is a way Google is giving you data but giving you nothing! No different than not provided.
    Google is a monopoly and they will only give you info if you pay for it (PPC). Just remember it is a free service; they don’t need to give you anything. :(

  7. “It’s not even wrong!”

  8. David

    I have always had my doubts about WMT data…. I’m not sure if it is comforting or disheartening to see your results confirming my fears.

  9. Roald

    Don’t quite understand how you calculated %error. Could you explain that in some more detail?

    BTW, if Google decides to change the archive query data to 12 hours, you’ll only have half the data if you run your script every night ;-)

    • I took keyword clicks as measured in the analytics tool, then used the (not provided) number to adjust the GWT data. That SHOULD have gotten me at least in the right neighborhood.

      Even if GWT caps the numbers, so high-traffic sites would always be inaccurate, low-traffic sites should’ve been more accurate. Or the numbers should have been consistently wrong over time for one site.

      No dice.

      • Can you share the method you used to adjust the GWT data? Perhaps there’s something to be learned there.

        • Nothing super-scientific: I figured out (not provided) impact based on analytics data, then used that to reduce visit counts.

          I figured it’d be inaccurate, of course, but it’d be at least consistent.

  10. Great post, thanks for putting all this together and sharing.

    I have a question: how did you calculate the average not provided?

    • Hi Kevin,

      I calculated the impact of not provided by dividing the (not provided) visits by total organic visits.
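
      In code, for the avoidance of doubt, that’s just:

```python
def not_provided_impact(not_provided_visits, total_organic_visits):
    """Share of organic visits hidden behind (not provided)."""
    return not_provided_visits / total_organic_visits

# e.g. 300 of 1,000 organic visits hidden: an impact of 0.3
impact = not_provided_impact(300, 1000)
```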

  11. Colin

    Very interesting!

    One thought – how can you be sure it’s GWT that’s worthless? My logic would be that not-provided data makes Google Analytics data so shockingly incomplete that it’s hard to use as a comparison…

    • I tried to account for that, too, but there’s just no consistency.

  12. The data’s crazy useless. When we last looked at this with a client, GWMT appeared to be reporting around <10% of the keywords, traffic and impressions. That's going to introduce a crazy sampling error!

  13. It’s called The Google Dance. Some data is spot on, other data is completely wrong. What is what? No-one knows. Confusion. Frustration. The less we know the better – for Google. It’s like The Ranking Dance. Some phrases behave normally, others bounce around, some pages rank a bit higher, others rank a bit lower.

    I always looked at that GWT query data and for some strange reason never really used it. Never liked the data presentation. Why? Don’t know, sometimes you do things your way, huh?

    Over the years I saw a lot of ‘wrong’ data, especially data from professional SEO tools, who use wrong[ish] Google data. Ranking figures, traffic figures.. some kinda right, some kinda wrong. Lots of manual double-checking. I never rely 100% on figures from tools. Yesterday I had negative [-] organic search traffic figures in seoMOZ pro. Go figure ; )

  14. I have been ranting about this for a while. I have a client that has not ranked for a keyword for 2 years but webmaster tools shows him in position 3 for said keyword.

    The data used to be ok but I think like a few of Google’s other products (Google alerts comes to mind) if it is not making them any discernible direct profit they allocate less and less resource to it until it does.

    I believe that webmaster tools is going to go the way of pagerank (updated sporadically) and then eventually the way of iGoogle/dodo.

    Anyway it’s nice to see someone has some real proof that this is happening.

  15. Thank you for going all netmeg on this before I had to. Drives me up a frickin’ wall.

    • Here in Seattle they call it “Going all Ian.”

      :)

  16. Of course it’s bogus. But knowing it and proving it are 2 different things. Thanks for doing the work and providing us with another club to beat the annoying know it alls over the head with next time they want to explain (to us, to our clients, to themselves) how easy our jobs are. Talk about meaningless, I’m still trying to figure out how local search can be 75% as big as global search.

  17. Ian -> I normally try to leave more substantive comments on people’s blogs, but this time I just want to say that I really enjoyed the post. I’m still chuckling. Thank you.

    Yehoshua

  18. Ryan

    Found this article as a tweet from Rand Fish and I am glad I checked it out. I’m glad I wasn’t the only one noticing this problem with the query data.

    “A bunch of Atari 2600s managed by chimps” – made me LOL!

  19. Steve

    Props for going to that length with your research. I always assumed it was just an average over time. I was surprised to hear an old hat using it for measurement only yesterday.

  20. Never went to this extent (glad you did), but I definitely have noticed these inconsistencies and I personally have not trusted anything GWT has provided for a very long time. Agree with Meg, it’s infuriating.

  21. Satish

    Great work!
    GWT only tracks what happens on Google universal search. How exactly did you create a similar segment in GA? I doubt you compared apples with apples :-)

  22. The organic figures annoy me less, since I think Google wants you to create a website that just works and not worry about X keyword having 1,000 more searches/month than Y and over-optimizing based on that.

    What bothers me is the bagillion dollars people spend on AdWords, and G still gives incredibly vague figures for those campaigns. They have the data and it would help everybody, why not share it?

    I think this is one area where Bing and Duane Forrester can win over Google. They’ve put a lot of effort into making BWT super helpful, and with the new partnership with Apple, they might finally get in front of some more eyes and convert long-time “Googlers”. Here’s to hoping it’ll happen soon!

  23. Yup. I get the same thing. Understanding that Webmaster Tools is reporting on the searches and clicks and Analytics is reporting on the actual site visits and traffic through the site, I think there still should not be that much discrepancy.

    Thank you Ian!

  24. Thanks for addressing this issue Ian. I’ve been wondering the same thing for quite some time now and haven’t been able to get a straight answer as to why the clicks differ from visits for the same query. The best advice I was given was to stick with GA data because it’s typically more accurate. Hopefully we’ll see some response from either Avinash or Google that adds some clarity to the issue. Cheers!

  25. And then there’s the click-through-rate stat in GWT. I’ve never seen anyone use this stat to measure anything but if someone has that would be really sad.

  26. Yep. I first noticed this when I connected my GWT to GA and was shocked to see that none of my numbers matched up between my Search Engine Optimization > Queries report and my visits from organic search. How can I rely on data when the data I get is so far from accurate?!

  27. I totally agree, I’ve done some tests on a few domains – but never every day (no software skills or laziness).

    People may say your dataset is not big enough to draw conclusions, but for me it shows exactly what I see on my domains, and I agree: this part (and most) of GWT data is crap.

  28. Nice rant Ian.

    So, I’ve always thought that Google treated the ‘query’ in GWT as not an exact match query as we think of a keyword in Analytics, but more of a broad match as you would think in Adwords.

    And by ‘broad match’ I really mean….we just pick numbers then scramble them up.

    I think GWT is a classic case of where the way Google silos departments (Adwords very separate from GWT, which is separate from Analytics) really hurts the products. GWT could be a really great tool. Sadly it just ends up being mostly frustrating, but with a few gems inside.

  29. Ian, I’m not sure which surprised me more…the huge disparity in the data or the fact that people are just now realizing it. I am hugely grateful someone has finally done an empirical study of this with hard data and numbers. I’ve been saying this for at least the past year as I’ve noticed the GWT query data showing wildly incorrect numbers. Never put it all together though and now that you have I have a resource to refer people who keep telling me I’m wrong on something because GWT query data says so.

  30. Hi Ian,

    I have found some different explanations – both in Googles own product descriptions and a couple of my own:

    From Google – http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35252:

    “Clicks: […] (These numbers can be rounded, and may not be exact.) ”

    and

    “Some processing of our source data—for example, to eliminate duplicates and visits from robots—may cause your stats to differ from stats listed in other sources. However, these changes should not be significant.

    Some tools, such as Google Analytics, track traffic only from users who have enabled JavaScript in their browser.

    Some tools define “keywords” differently. For example, the Keywords tool in Google AdWords displays the total number of Google searches for that keyword across the web. The Webmaster Tools Search Queries page, however, shows how many of those keyword searches returned your pages in Google search results, and this is a smaller number.

    There may be a lag between when the numbers are calculated and when they are visible to webmasters—although data gets published in intervals, we are continually collecting it. Normally, however, data should be available in 2-3 days.

    Time zones matter. Search Queries tracks daily data according to Pacific Daylight Time (PDT). If your other systems use different time zones, your daily views may not match exactly.

    To protect user privacy, Google doesn’t aggregate all data. For example, we may not track some queries that are made a very small number of times or that contain personal or sensitive information.”

    From my own experience:

    Did you activate any filters in GWT by any chance?

    The (not provided) information, and the way you calculate its effect compared to how GWT is handling these same queries, must have a huge impact on the difference?

    Above the Search Queries data table a total number of clicks is displayed – below that number there is a faded number of displayed clicks. On the sites I manage the displayed number of clicks is about 40% of the total number of clicks, which means that the click data in the table is 60% off, but we don’t know which 60%… Or am I interpreting this the wrong way?

    Best regards
    Per Jacobsen

  31. Jeff

    It’s a Google tool. Not sure if you’ve noticed, but with most of their tools the data is “off”.

    For example, have you also noticed how far off the Google Keyword tool data is? Tells you a particular term is searched for X number of times, yet when running an Adwords campaign, we get Y number of impressions – and this is a consistent thing, across multiple client campaigns, across multiple markets.

    It’s got to the point whenever talking to PPC clients I’ll tell them not to trust the numbers from Google’s own tool…..

  32. My question is, like your conclusion, “so now what?” From your audience comments, it seems your calculations were fairly trustworthy, or at least decent ways to compare the data of GA and GWT. (I’m still learning effective use of both, but what you did makes sense to me.) I’ve long heard how GA is not to be trusted for the competitiveness of keywords, and that the average number of queries is typically pretty off, too.

    Your post leaves me feeling like I want to punch Google in the face. I wish a Google rep could shed some light. These questions deserve a response. Or a confession!

    • I don’t recommend the punching thing :)

      My gut instinct is to stick to marketing: Build a great site. Have great stuff on it. Watch sales or other KPIs as the best indicator of success. The problem is, will clients make that transition? Not easily, I’m afraid. So it’ll take time. There’s no easy solution here – no other data source, very little chance the data will improve, etc.

  33. JC

    Thank you!!!! If I had a dime for every client who said they looked at “Search Engine Optimisation” in GA and saw these GW figures, and others who asked why I don’t go by this GW data….. I’ll tell them to keep the dime and send them to your post.

    Yeah I discovered this a while ago, so I feel your pain.

  34. Tad

    I’ve done the same kind of analysis myself on a much smaller scale and came to the same conclusions in terms of search traffic reported.

    The ridiculous thing is people trust that it’s at least in the ballpark of accuracy. What’s even more ridiculous is that a company down the street from me likely spent a pretty penny on a tool that compiles the same kind of WMT data: http://www.rimmkaufman.com/blog/vanessa-fox-nine-by-blue-have-joined-rkg/22052013/ and is now touting it as the greatest thing since sliced bread. A tool like this is only as good as the data feeding it.

    I love Webmaster Tools, and I’m of the opinion that the only ranking reports worth using someday are going to be their Average Position reports, but that search volume data is completely bogus.

  35. Laurie

    Interesting post Ian – I’m glad you find the same solace in crunching numbers that I do ;-)

    We did some similar research into GA and GWT ranking data last summer and found plenty of anomalies.

    I also happened to get my hands on a huge volume of real users’ mobile search volume data from a UK mobile operator with circa 20% market share. I extrapolated that data and compared it to AdWords mobile search volumes…. and the data was miles apart!

    I guess it all goes to prove you have to take the data Google is feeding us all with a pinch of salt.

    We did our GWT:GA comparison last summer and my memory is a little hazy, but am I right in thinking that there is a geographical difference between the data sets and you have to filter the data to compare data from users in the same region?

  36. The query data in GWT is per impression.
    The query data in the Organic Search Traffic report is per visit and uses last touch attribution.

    They aren’t the same thing, so you will see an “error”.

  37. c

    what part of the Google Webmaster Tools Query Data was wrong? CTR, average position?

      • Tyler

        So is it just clicks, or do you feel the “average position” data is way off too? I’m most concerned with how we use GWT as our main source of tracking our term rankings in google…

        • In my experience the average position data works OK.

  38. I’ve done a bit of noodling with this in the past (both distant and recent) and found a few things that tripped me up.

    I don’t think we’re looking at, or getting, all of the data in GWT. So if you’re matching GWT to all of your GA data you’ll always see huge error bars.

    I think this gets magnified based on the size of the site and number of keywords driving traffic.

    Here’s the key. In GWT you see Queries, Impressions and Clicks. Those are in bold black. Then underneath Impressions and Clicks you see Displaying x number.

    https://www.evernote.com/shard/s242/sh/e139fdb9-56d3-441b-99fa-5e0cd96ba54b/56e8aca7ccc21d0899459fa8928b37b1

    The problem here is that, visually, the number of queries maps to the ‘Displaying’ numbers. They aren’t equal from a visual hierarchy so you expect that you’re looking at an entire data set. Or at least I thought that’s how it worked and that each page was showing a certain number.

    But here I can download this and change the impressions <10 to 5 and then clicks <10 to 1 and come up with numbers very close to the Displaying numbers.

    However, the same client, same time period shows 110K rows for keywords and that's with (not provided). So GWT is cherry picking the top ~7K queries to show out of that 110K+.

    Because GWT shows the top queries, the impact of the missing data becomes less dramatic, particularly for smaller sites. At one point I was trying to figure out if Google was using a certain % to calculate 'top' but I got busy and never finished that research. (Though I do think it would be interesting to find out.)

    Once you match keyword to keyword it gets reasonable if you use (not provided) as a fuzz factor. In addition, Google's clearly using some sort of rounding and grouping here so that you get these round numbers.

    So for web keyword data on the keyword level it's … ballpark. I don't really trust it but I'm not as crazed about it as I was before my due diligence. I'd still like to build a massive VLOOKUP between the two groups but … time is not on my side.

    However, for images, things are pretty off which is why it looks worse when you included everything. I did this research for my post on tracking image search in GA (http://www.blindfiveyearold.com/tracking-image-search-in-google-analytics).

    In short, I think GWT tracks clicks on a result and not visits to your site which creates a much larger delta between the data sets.

    Of course, I might have screwed something up in doing all of this or am not seeing it the right way either but hopefully my contribution helps move the conversation and investigation along.

    • Great analysis, AJ – much deeper than I went.

      The problem is the inconsistency of the error %. If one web site always showed 20-30% error in GWT, we could at least say “OK, this data is off, but it’s consistently off, so we can use it as a reference.” But that’s not the case. The inconsistency is what drives me nuts.

      • Well you actually DID the massive VLOOKUP. So you’re looking at the keyword by keyword variance and finding continuing and inconsistent variance. So I’d argue you’ve gone deeper. But let’s not Chip n’ Dale here.

        The fundamental problem here is that between (not provided), the rounding issue (http://googlewebmastercentral.blogspot.com/2011/02/update-to-webmaster-tools-search.html) and the sampling issue (http://googlewebmastercentral.blogspot.com/2012/04/even-more-top-search-queries-data.html) you just don’t quite know what you’re looking at.

        The former link was maddening! To make it simpler for people we made it less accurate. Uh, what? This quote in particular was a winner.

        “Generally, a difference of less than 10% between the numbers you see now and those you saw prior to this change should not be considered significant.”

        So the numbers are ballpark. But when you’re playing ball, you need to know whether that big fly is going to be fair (home run) or foul (strike 2).

        TL;DR. I’ve never trusted this data and instead do as much as I can with the referrer and filters within Google Analytics.

  39. Jeevan

    Are you accounting for the keywords that are not showing up in organic because of the whole iOS direct traffic issue?

  40. Garethjax

    Dice. Roll for initiative.

  41. Oh, thank god.

    I ignore GWT stats because they made no sense and had seemingly no correlation even based on a casual comparison.
    I thought it was because I deal with low-traffic sites (ie <1000 visits).
    Now I can say I'm not lazy or stoopid, it's all someone else's fault.

    Hurrah!

  42. Howdy Ian,

    Per Jacobsen’s comment above could be the explanation for your data inconsistencies. Plus the fact that the GWT data fluctuated randomly about the Analytics data hints that there may be correlation between the two that’s not apparent from your analysis.

    It seems to me that sampling as you are, you have time series data. The best way to look at that time series information would be to plot, for each keyword phrase, the GWT and Analytics data with time on the x-axis and clicks as the y-axis.

    An analogy to your keyword clicks time series might be measuring a particular stock at noon in Chicago and New York and plotting the Chicago and NY measurements against time each day. Because of the hour time difference, the stock prices would rarely be the same, but they would be highly correlated and would generally move up or down together. Again, because of the time differences, you’d probably find that Chicago’s noon price might vary randomly around New York’s noon price.

    If you plot the GWT data and the GA data against time and there’s no correlation between the two plots, then you may be right in your conclusions. But until you look at the information as a time series, you don’t really know.

    Time series analysis is a topic in statistics that’s beyond the scope of a comment post, but a simple visual comparison of the GA and GWT graphs over time, for single keyword phrases, would show whether GA and GWT are correlated or not.
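
    As a concrete sketch of that check (invented numbers; plain Pearson correlation instead of a plot, no libraries needed):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length time series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented daily clicks for one keyword: if GA roughly tracks GWT,
# r comes out near 1 even though the absolute numbers disagree.
gwt_series = [120, 95, 130, 80, 110]
ga_series = [60, 50, 64, 41, 57]
r = pearson(gwt_series, ga_series)
```

    If r stays near 1 for a keyword, the two sources disagree on scale but move together; if it hovers near 0, they really are unrelated.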

    Thanks for a very interesting discussion.

  43. Good thoughts and comments here. Though what I am missing in your analysis is the fact that many – if not most – mobile clients use Google’s encrypted search these days. These are tracked in Analytics as direct traffic, you don’t even see them as “(not provided)”. See http://webkruscht.com/?p=139 for details.

    Could be interesting to see your visitor statistics by Browser / OS and compare this with the error% over time.

  44. Dear Google, I hope you’re reading this post and then the comments all of the way to the bitter end. You encourage everyone to be analytics junkies and then give them random data. I don’t get it. Is this meant to be humour?

  45. Dear Ian,
    Thanks for this. I’m not at all surprised.
    However, I feel this is just the tip of the iceberg.

    There is little doubt nowadays that authors are considering keywords when they write an article for a blog or for the web. What’s the point in writing if no-one is going to read it?

    So let’s say that your data above also extends to keyword analysis overall. This means that authors who are using keyword analysis (and who isn’t) as a basis for their articles are using screwed data. How “biased towards Google” does that make the entire Internet?

  46. Haha, man, Ian, you got me at camel manure! Love your writing style and yes, GWT (I won’t even bother to write it in full as well) is a piece of you know what that comes from where it hurts most… But I have never seen somebody word it so funny and yet pretty much fully accurate… This story made my day!

  47. My worst experience with this is across continents… showing data for a .com when settings are for a .co.za? Honestly? It does it constantly… I’ve given up on it. The only thing it’s good for is sitemap submissions and un-indexing URLs…

Comments are closed.