The Evolution of Advanced Keyword Research


Keyword research used to be so easy. You picked terms that the client wanted to rank high for, stuffed them onto the page, shake, bake and voila, the site was ranking in the top 3 search results for the query.

Bad SEOs stuffed keywords in all of the wrong places, all over the page, and in the same font color as the page background so that only the search spiders saw them. Good SEOs stuffed them into anchor text, headings and subheadings, and <title> tags.

And then things began to change: first in a good way, then in a not-so-good way, and now maybe in a good way again.

A Brief History of Search Engines

Keyword research was easy because search engines were so dumb – I mean “fleece your little brother for his paper route money” dumb.

All they had going for them was the flimsy Term Frequency/Inverse Document Frequency formula that rewarded documents that had the most instances of the query terms in body text, along with some lightweight suppression clause so that long boring documents did not always get the top spots.  Yay early information retrieval.
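To see why stuffing worked, here is a minimal sketch of that kind of scoring. It is an illustration only, not any engine's real formula: the function name `tfidf_scores`, the sample pages, and the choice to divide term counts by document length (standing in for the "suppression clause") are all my assumptions.

```python
import math
from collections import Counter

def tfidf_scores(query, docs):
    """Score each document for a query with a basic TF-IDF sum.

    Term frequency is divided by document length, a crude stand-in for
    the clause that keeps long documents from always winning.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term at all.
            df = sum(1 for other in tokenized if term in other)
            if df == 0 or not tokens:
                continue
            tf = counts[term] / len(tokens)
            # The +1 keeps a term found in every document from scoring zero.
            idf = math.log(n_docs / df) + 1.0
            score += tf * idf
        scores.append(score)
    return scores

# Hypothetical pages: one stuffed, one long and rambling, one off-topic.
docs = [
    "cheap tires cheap tires cheap tires",
    "a long rambling page about cars that mentions tires once among many other words",
    "gardening tips for spring",
]
scores = tfidf_scores("cheap tires", docs)
```

Under these assumptions the stuffed page outscores the rambling one, and the off-topic page scores zero, which is exactly the behavior the early SEOs were exploiting.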

The good days of gaming search engines came to a screeching halt with Google’s buzzkill PageRank algorithm. PageRank was based on the academic model that stipulated papers cited by other papers had to be better than those not so cited. Applied to the decidedly non-academic public Web, pages that had a lot of links pointing to them had to be more relevant than the others, right?
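The citation idea can be sketched in a few lines. This is a simplified power-iteration version of the published PageRank idea, not Google's production code; the damping value of 0.85, the iteration count, and the four-page link graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeated redistribution of rank along links.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical graph: a, b and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

In this toy graph the page with the most inbound links, c, ends up on top, and a does well simply because c points to it. Nothing in the calculation looks at the query, which is the weakness the link buyers went after.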

Peace and relevant search results lasted until the SEO community was able to figure out how to game this system with begging, borrowing and buying links. Google was shocked! Evidently, this did not happen in the hallowed halls of academia, at least not in the Stanford Graduate Computer Science program of the 1990s.


Waaaaaahlll Pilgrim, Google was not going to take that sitting down. (I have it on good authority that they take nothing sitting down because those stand-up treadmill desks are standard issue at the Googleplex.) They fired off the Hilltop Algorithm. Hilltop was one of the first algorithms to introduce the concept of machine-mediated “authority” to combat the human manipulation of results for commercial gain.

With Hilltop:

  • Pages are ranked according to the number of non-affiliated “experts” that point to them, i.e. experts not in the same site or directory
  • Authorities have lots of unaffiliated expert documents on the same subject pointing to them
  • Affiliation is transitive [if A=B and B=C then A=C]
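The three rules above can be sketched as follows. This is a simplified illustration, not the actual Hilltop implementation: the `Affiliations` class, the `unaffiliated_expert_count` function, and the example hosts are hypothetical, and affiliation pairs are simply handed in rather than detected.

```python
class Affiliations:
    """Groups hosts so that affiliation is transitive (a union-find structure)."""

    def __init__(self):
        self.parent = {}

    def find(self, host):
        # Return the representative host for this host's affiliation group.
        self.parent.setdefault(host, host)
        while self.parent[host] != host:
            self.parent[host] = self.parent[self.parent[host]]  # shorten the chain
            host = self.parent[host]
        return host

    def link(self, a, b):
        # Declare a and b affiliated; their whole groups merge.
        self.parent[self.find(a)] = self.find(b)


def unaffiliated_expert_count(target, experts_linking, affiliations):
    """Count expert groups pointing at target, ignoring the target's own group."""
    target_group = affiliations.find(target)
    groups = {affiliations.find(expert) for expert in experts_linking}
    groups.discard(target_group)
    return len(groups)


aff = Affiliations()
aff.link("a.example", "b.example")
aff.link("b.example", "c.example")           # so a and c are affiliated too
aff.link("shop.example", "shop-blog.example")

count = unaffiliated_expert_count(
    "shop.example",
    ["a.example", "c.example", "d.example", "shop-blog.example"],
    aff,
)
```

Here four experts link to shop.example, but a.example and c.example collapse into one group and shop-blog.example is the target's own affiliate, so only two independent votes are counted.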

The beauty of Hilltop is that unlike PageRank, it is query-specific and reinforces the relationship between the page and the user’s query. You don’t have to be big or have a thousand links from auto parts sites to be an “authority” and float to the top. And, to the rejoicing of searchers, it was soon adopted by the other search engines across the land.

Giddy with the success of contextual mapping, the search engines followed up with Topic-Sensitive PageRank.  This involved the geekiest of information retrieval methods, use of predictive analytics, and vector space modeling on a subset of the Web to analyze the context of query phrase term use in a document, in the history of all queries, and in the history of the user who submits the query.

As if Christmas in July was not enough for searchers, the search engines also laid down an ontology (an organized schema of subject categories) supposedly derived from the Open Directory Project. I don’t know about you but that looks a lot like the Semantic Web to me.

Ironically, as search engines got smarter, searchers got dumber. They started to construct poor queries (56%), select irrelevant results (55%), and become disoriented and overwhelmed by the amount of information in search results (38%). Hmmm…maybe it is time to start taking a user-centered approach to optimizing websites for users?

User-Centered Keyword Research

User-centered keyword research lives up to its name by starting with what prospective customers would likely use to find the site. And the best place to find that information is your client.  Familiarize yourself with the client’s product space and vocabulary, ask questions, and look at their competitors.

Then, turn to Google Analytics to find out which queries are sending traffic and how that traffic is performing. I look at what page visitors land on, whether they engage or bounce, and if they convert. If there is one tail, long or short, in SEO that is supported by data, “longer query = more likely to convert” is it.
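One way to check the “longer query = more likely to convert” claim against your own client's numbers is to bucket queries by word count. The sketch below assumes you have exported rows of (query, visits, conversions) from your analytics tool; the function name and the sample rows are made up for illustration and are not real data.

```python
from collections import defaultdict

def conversion_rate_by_query_length(rows):
    """Return {number of words in query: conversions / visits}."""
    totals = defaultdict(lambda: [0, 0])  # word count -> [visits, conversions]
    for query, visits, conversions in rows:
        bucket = totals[len(query.split())]
        bucket[0] += visits
        bucket[1] += conversions
    return {
        words: conversions / visits
        for words, (visits, conversions) in sorted(totals.items())
        if visits
    }

# Made-up sample rows: (query, visits, conversions).
rows = [
    ("tires", 1000, 10),
    ("winter tires", 400, 8),
    ("snow tires", 100, 2),
    ("studded winter tires seattle", 50, 4),
]
rates = conversion_rate_by_query_length(rows)
```

If the rate climbs as the word count climbs, the long tail is earning its keep for that site. If it does not, that is worth knowing before you build a keyword list around it.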

After that, swing by Google Webmaster Tools and see how the search engine currently views site relevance by studying queries, impressions, AVERAGE (important distinction there) position, and click-through rate.

Next, compare actual site behavior with general search behavior using any one or all of my favorite tools:

Google Trends is not nearly a big enough return for the egregious and persistent invasion of our privacy, but it comes pretty close. Google Trends is a view into the long, dark, deep Google data mine of search behavior with the capacity to filter by geography or time.

The true delight lies in seeing the Top and Rising search queries for the phrases being compared. They show actual user search behavior. In the comparison below, the phrase “user experience” is more popular than “information architecture.” Note that, based on the Top Searches information, a significant portion of the search around the general phrase is job-related.


Chart: Google Trends comparison of “user experience” and “information architecture”

Yahoo! Clues offers many of the same data points as Google Trends with demographic information (age, gender) thrown in for good measure. The data is extracted from Yahoo! Search and is aggregated and anonymized. A Yahoo! Clues-specific feature is the Search Flow data that reveals what users searched for before the term phrase being compared and what they searched for after.


Chart: Yahoo! Clues

We’ve all experienced the mostly annoying yet occasionally helpful search suggest, the list of query suggestions that appears as you start typing and updates as your query changes.

Ubersuggest provides an easy-to-navigate, much less annoying, more useful aggregation of search suggestions from Google and other “suggest services.” In looking at the Ubersuggest results for the query “keyword research,” I’d say that most folks want someone or something else to do the work for them.

Chart: Ubersuggest results for the query “keyword research”


As you can see from Ubersuggest, if you are looking for a keyword research tool, you are not alone in your quest. Which you choose, however, will be up to you. Some perennial sites in the top search results for “keyword tool” are:

Magical Thinking with Psychographics

At a search conference in July 2012, Marty Weintraub from aimClear delivered a groundbreaking presentation on using Facebook psychographics to develop a new type of user persona that can assist with remarketing.

On the aimClear blog, psychographics are defined as: “…a means of identifying users by interests, occupations, roles in life, predilections, and other personal characteristics.” This involves mining social outlets for personal preference data, e.g. a political reporting website that targets individuals who listen to Rachel Maddow, Stephen Colbert and Al Jazeera, like the Muppets, and work for a middle-of-the-road or left-leaning online or print publication.

These preferences are often articulated with term phrases developed by users and potentially reveal what they would use when looking for the client product while facing a search box.

In his book “How Buildings Learn”, author Stewart Brand recommends waiting a few weeks after a building is finished before putting in the walkways. His reasoning? The footprints in the grass will tell you where people are walking to get in and out. So it is with smarter keyword research. Before stuffing a bunch of term phrases on a page, start with what searchers are using to find your client’s product or service. Then keyword research will be as smart as the search engines, or even smarter.

What keyword research tool do you find most helpful?  Let us know in the comments below.



  1. Great article! A question re Google Trends: do you get the same message that I do after a certain length of time – I don’t remember the exact phrase but it runs something like, “You have used your quota, try back later”? How is the quota determined for somebody who is signed into their Google account, and is there a way to get a larger quota? Working around that can be such a hassle!

    1. Hello Janet
      And many thanks for the kind words. I have not gotten the error message that you mention from Google. My speculation is that too many people may be trying to access the tool and the system has an automatic trigger to prevent meltdown. There may also be a difference if you are logged into a Google account or not.

  2. Hi Marianne,
    What a well written piece. I thoroughly enjoyed reading it. There is another cool tool that I use for keyword research. It basically aggregates keyword ideas from several sources, it’s pretty neat when you want to start mapping keywords based on user intent.
    Looking forward to more posts from you.
