Ian Lurie // May 20 2010
One of the easiest ways to improve your link popularity is to fix old, busted links from other sites.
For example: Say I have an online store. One product page is www.mysite.com/bike-tire/tubeless. A nice webmaster decides to link to that page, but uses www.mysite.com/bike/tubless. Because that link is incorrect, anyone going to the ‘tubless’ page gets a 404 error. And, I don’t get any link authority from the other site. Bummer.
I contact the linking site’s webmaster and ask them to fix it, but it turns out their webmaster is on a 10-month sabbatical and no one knows how to edit the site. Bummer, again.
I can get at least some of the authority back, though, by setting up a 301 redirect from the ‘tubless’ URL to the correct, ‘tubeless’ one. That’s building links by looking between the cushions, aka out-executing your competition.
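If you've never set one up, here's a minimal sketch of what that redirect boils down to, written in Python purely for illustration. The `REDIRECTS` table and the `respond` function are hypothetical names, not part of any real server; on a live site you'd normally do this in your web server or CMS configuration.

```python
# Hypothetical lookup table: misspelled incoming path -> the page it meant.
REDIRECTS = {
    "/bike/tubless": "/bike-tire/tubeless",
}


def respond(path):
    """Return a (status, headers) pair for a requested path."""
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 means "moved permanently": send the visitor to the right page.
        return 301, {"Location": target}
    # No known fix for this path, so it stays a 404.
    return 404, {}
```

Calling `respond("/bike/tubless")` hands back a 301 pointing at the correct page, while any path missing from the table still gets a 404.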
It’s a great solution, but it doesn’t scale. If you have a reasonably-large database-driven site – say an online store with 1500 products – you could have 5000+ pages on your site. And you may have 1000+ broken incoming links. You’ve got to go through those lists, matching up broken links with relevant redirection targets. Egads.
I could just redirect every broken link to my home page. But that’s not ideal:
This process screams for automation: Why not compare each broken link to a list of good ones, and then pick the best match, all using, I dunno, a search tool?
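To make the idea concrete, here's a rough sketch of "compare each broken link to a list of good ones and pick the best match." It uses Python's standard `difflib` rather than a real search tool, so treat it as a stand-in for the concept, not the tool itself. The function name `best_match` and the sample URLs are made up for the example.

```python
from difflib import SequenceMatcher


def best_match(bad_url, good_urls):
    """Return (good_url, score) for the good URL most similar to bad_url.

    The score is difflib's similarity ratio, from 0.0 to 1.0.
    Returns (None, 0.0) when there are no good URLs to compare against.
    """
    if not good_urls:
        return None, 0.0
    scored = [
        (SequenceMatcher(None, bad_url.lower(), good.lower()).ratio(), good)
        for good in good_urls
    ]
    score, url = max(scored)  # highest ratio wins
    return url, score


good = ["/bike-tire/tubeless", "/bike-tire/tubed", "/helmets/road"]
target, score = best_match("/bike/tubless", good)
```

Character-by-character comparison like this gets slow on big lists, which is one reason to reach for a proper search index instead.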
I tried to come up with some great name, like Linkinator, or Link Baby Link, but so far I’m at a loss, so…
aka TAFLRFTG. Sigh.
In plain language, this soon-to-be-renamed tool works like this:
In this case, ‘fuzzy’ has nothing to do with ‘cuddly’. It means ‘a computerized guess’. I don’t make these terms up, so don’t ask me where this came from.
If you’re a bit nerdier, try this:
The result is a nice text file that lists each bad link, plus the URL to which I should redirect.
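The exact layout of that file isn't important. As a sketch, assuming one tab-separated line per bad link (my assumption here, not necessarily the tool's format), the output step could look like this. `format_report`, `write_report` and the `redirects.txt` file name are hypothetical.

```python
def format_report(matches):
    """Turn (bad_url, target_url) pairs into one tab-separated line per pair."""
    return "".join(f"{bad}\t{target}\n" for bad, target in matches)


def write_report(matches, path="redirects.txt"):
    """Write the report to a text file (the file name is only an example)."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(format_report(matches))
```

A file like that is easy to eyeball before you turn it into real redirect rules.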
Lucene works amazingly well, matching up stuff like “/2009/09/seo-101-canonicalization-1.htm)” with “/2009/09/seo-101-canonicalization-1.htm”.
It also finds more subtle stuff. Someone linked to: “/2010/03/get-geeky-grep-Search%20Engine%20Optimisation-tool.htm” and the thingy found “/2010/03/get-geeky-grep-seo-tool.htm” as the best 301 redirect target.
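Why does a search index catch these? Roughly, because it breaks each URL into words and compares the words. The sketch below imitates that with plain Python: decode the `%20`s, split on punctuation, and score by the share of words two URLs have in common. This is a simplified illustration and an assumption on my part, not Lucene's actual scoring. The names `tokens` and `overlap` are hypothetical.

```python
import re
from urllib.parse import unquote


def tokens(url):
    """Decode percent-escapes, lowercase, and split a URL into a set of words."""
    return set(re.findall(r"[a-z0-9]+", unquote(url).lower()))


def overlap(bad_url, good_url):
    """Share of words the two URLs have in common (Jaccard similarity)."""
    a, b = tokens(bad_url), tokens(good_url)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


bad = "/2010/03/get-geeky-grep-Search%20Engine%20Optimisation-tool.htm"
good = [
    "/2010/03/get-geeky-grep-seo-tool.htm",
    "/2009/09/seo-101-canonicalization-1.htm",
]
best = max(good, key=lambda g: overlap(bad, g))
```

With these two candidates, `best` comes out as the grep post, because it shares the date, "get", "geeky", "grep" and "tool" with the bad link. The stray `)` in the canonicalization example disappears at the tokenizing step, so that pair scores as an exact match.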
It did this with no human intervention: I fed it a list of bad urls, and a list of good ones, and clicked ‘go’.
With Lucene in place, though, the thingy runs an analysis of 1000 broken links and 20000 good ones in under 20 seconds.
HUGE props to Raymond Camden, whose example of using Lucene on ColdFusion got me started, and to my CTO, Branden Root, whose Java knowledge kept me from going even more insane.
I’ll definitely be putting this tool out for public use, once I make it prettier, and figure out how to keep it from melting servers to slag. In the meantime, if you’re a ColdFusion developer, you can download my sample code. All of the necessary plumbing is in this one file:
Since ColdFusion is designed so that even programming morons like me can use it, you should be able to interpret the code and convert it to your own language, too.
Enjoy, and let me know if you have suggestions.
Ian Lurie is founder and CEO of Portent Inc., an internet marketing agency that has provided internet marketing, including PPC, SEO, social and analytics services, since 1995.