Ian Lurie // Apr 30 2012
We’re all talking a lot about content and social media these days. But visibility boils down to technical SEO: Ensuring that search engines can easily find and categorize every page of your site. These are the four things I look at first when I’m auditing a site:
You can use our server response code checker for starters. Just make sure your server is delivering a 404 for a broken link, a 200 for a page that’s just fine, and a 301 for a redirect. You can learn about server response codes in this post I wrote a couple years ago.
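If you'd rather script the check, here is a minimal Python sketch. It is not our checker, just an illustration: `fetch_status` and `audit_status_codes` are names made up for this example, and the URLs you feed in are whatever you expect to be live, broken, or redirected on your own site. It refuses to follow redirects so that a 301 shows up as a 301 rather than as the 200 of the page it points to.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx code itself is visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url, timeout=10):
    """Return the raw HTTP status code for url without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 3xx (unfollowed), 4xx and 5xx responses all arrive here.
        return err.code


def audit_status_codes(expected, fetch=fetch_status):
    """Compare actual status codes against expected ones.

    expected maps URL -> the status code you think it should return.
    Returns a list of (url, expected_code, actual_code) for every mismatch.
    """
    mismatches = []
    for url, want in expected.items():
        got = fetch(url)
        if got != want:
            mismatches.append((url, want, got))
    return mismatches
```

The mismatch worth watching for is a missing page that answers 200 instead of 404. Passing your own `fetch` function lets you try the logic without touching the network.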
Duplicate content hurts site quality and crawl efficiency. You need to get rid of it. We have our own crawler for testing this, but you can use Screaming Frog SEO Spider. Distilled has a fantastic article that includes detailed instructions on using Screaming Frog to find duplicates.
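To make the idea concrete, here is a rough Python sketch of exact-duplicate detection. It is not how our crawler or Screaming Frog works internally; it assumes you already have the extracted body text for each URL, and the function names are invented for the example. It normalizes whitespace and case, hashes the result, and groups URLs that share a hash.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text):
    """Hash page text after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages):
    """Group URLs whose text is identical after normalization.

    pages maps URL -> extracted body text.
    Returns a list of URL groups, each containing two or more URLs.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

This only catches pages that match exactly once whitespace and case are ignored, which is typical of the same page reachable under several URLs. Near-duplicates need a fuzzier comparison.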
Site owners find all sorts of ways to make pages on their site vanish. They orphan pages; they break links; they remove all possible ways of reaching a specific page. You need to find those.
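One way to picture the problem, sketched in Python under some assumptions: you have a list of every URL that should exist (from a sitemap or a CMS export, say) and a map of which pages link to which, gathered by a crawl. Both inputs and the `find_orphans` name are hypothetical. Any known URL that no other page links to is an orphan.

```python
def find_orphans(known_urls, links, entry_points=("/",)):
    """Return known URLs that no other page links to.

    known_urls   -- every URL that should exist on the site
    links        -- maps a source URL to the URLs it links to
    entry_points -- URLs visitors reach directly, never counted as orphans
    """
    linked_to = set(entry_points)
    for source, targets in links.items():
        for target in targets:
            if target != source:  # a page linking to itself doesn't count
                linked_to.add(target)
    return sorted(set(known_urls) - linked_to)
```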
There’s no one fantastic way I know to do this. But some of the tricks I use are:
Content management systems like WordPress have lots of extra little snippets of code they use to schedule tasks, deliver content via AJAX, handle searches and generate navigation. That’s fine, but if a search bot starts beating the poop out of some of these snippets, they can suck the life out of your server.
Here’s an example: A few weeks ago, RandFish was kind enough to tweet about a post we’d written on this very blog. That same day, I had an article go live on TechCrunch. As a result, we got about 5x our normal traffic. No big deal.
Unless, of course, you’ve already got GoogleBot rattling around between a WordPress AJAX script and a database scheduler every 15 seconds or so. Then your server coughs, sputters and flips over on its back, waiting for a tummy rub. It also locks up so badly that no amount of cursing or talking nice will get it to let you log in and fix it, by the way. In case you were wondering. And I know you were.
When I looked at our log files for the last month, I found two URLs that GoogleBot kept hitting: wp-cron.php and admin-ajax.php.
‘Kept hitting’ means ‘latched onto like a leech at a blood bank’. GoogleBot hit these files 4-5 times per minute.
We disallowed them, and voila: No more crashing server.
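For illustration, rules along these lines would do it. This is a sketch, not our actual robots.txt: the paths assume a default WordPress install at the site root, and `example.com` is a placeholder. The snippet uses Python's standard `urllib.robotparser` to confirm the rules block what you intend before you deploy them.

```python
from urllib import robotparser

# Assumed paths for a default WordPress install at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-cron.php
Disallow: /wp-admin/admin-ajax.php
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Check a page you do want crawled as well as the ones you don't; a typo in a Disallow line can block far more than you meant.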
That was a classic spider trap: Pages or scripts no bot should find, but did.
Check your log file BEFORE your site crashes and you can avoid our embarrassment.
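Here is a small Python sketch of that kind of log check. It assumes your server writes the common "combined" access log format and that the bot identifies itself in the user-agent string; the sample lines are invented for the example, not taken from our logs. It counts hits per URL path for one bot so the worst offenders float to the top.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user-agent fields
# of a combined-format access log line.
LOG_LINE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    f'66.249.66.1 - - [30/Apr/2012:10:00:00 -0700] "GET /wp-cron.php?doing_wp_cron HTTP/1.1" 200 0 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [30/Apr/2012:10:00:15 -0700] "GET /wp-cron.php?doing_wp_cron HTTP/1.1" 200 0 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [30/Apr/2012:10:00:20 -0700] "POST /wp-admin/admin-ajax.php HTTP/1.1" 200 12 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [30/Apr/2012:10:00:21 -0700] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]


def bot_hits(log_lines, bot_token="Googlebot"):
    """Count requests per URL path made by user agents containing bot_token."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and bot_token.lower() in match.group("agent").lower():
            # Drop the query string so variants of one script count together.
            path = match.group("path").split("?", 1)[0]
            hits[path] += 1
    return hits
```

Calling `bot_hits(your_lines).most_common(10)` on a real log gives you a short list to eyeball. Anything on it that isn't a page a visitor would read deserves a closer look.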
These four tips are just for starters. You can check for broken links, work on site speed and clean up your code, for example. But once the really easy stuff is addressed, the four ideas above should keep you busy for a while.
What do you all look for in a technical site audit?
Ian Lurie is founder and CEO of Portent Inc., an internet marketing agency that has provided internet marketing, including PPC, SEO, social and analytics services, since 1995.