SEO Obviousness: Duplicate content sucks

Saying there’s no longer a duplicate content penalty is a semantic weasel routine that ignores one hard fact: Duplicate content will kill your rankings.
I’ve written a lot about canonicalization and the content duplication toilet bowl of death. Those are causes of duplication.
Read the two articles I linked to above first if you don’t know what I mean by ‘duplicate content’ or ‘canonicalization’. It’ll make this article make sense.
It really helps to know why duplicate content is bad, though. So here are the SEO reasons you need to avoid duplicate content:
1: Wasted crawl budget
Search engines ‘crawl’ your web site using programs called spiders. When Google or Bing send a spider to your site, they set a crawl budget – a certain amount of time the spider will spend on your site, scampering from page to page.
The spider crawls every page it finds, regardless of duplication. Even if you use rel=canonical to tell visiting spiders to use a different URL, the spider still has to fetch the initial URL before it can see that instruction.
That’s wasted crawl budget, any way you slice it. If Googlebot (Google’s spider) spends 1 minute on your site, loading 1 page every 15 seconds, it crawls 4 pages. But if 2 of those 4 pages are duplicates, it only crawled 2 ‘real’ pages. The other two are at best wasted, at worst an even bigger SEO problem (see #2 for why).
Duplicate content burns crawl budget. Search engines crawl fewer unique pages of your site, leaving you with fewer indexed pages, and fewer shots at good rankings.
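Here’s the arithmetic from that example as a quick sketch. The function name and the fixed seconds-per-page model are my own simplifications for illustration; real crawlers don’t publish a budget formula, so treat this as back-of-the-napkin math, not how Googlebot actually works.

```python
def unique_pages_crawled(budget_seconds, seconds_per_page, duplicate_pages):
    """Return (pages_fetched, unique_pages) for one crawl visit.

    Assumes a fixed fetch time per page, which is a simplification.
    """
    pages_fetched = budget_seconds // seconds_per_page
    # A spider can't fetch more duplicates than pages it had time to load.
    duplicates = min(duplicate_pages, pages_fetched)
    return pages_fetched, pages_fetched - duplicates


# 1 minute on the site, 1 page every 15 seconds, 2 duplicate pages.
fetched, unique = unique_pages_crawled(60, 15, 2)
print(fetched, unique)  # prints: 4 2
```

Half the visit went to pages the search engine already had.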
2: Link dilution
Duplicate content is the Ralph Nader (or Ross Perot) of search engine optimization. In simplest terms, every link to your web site is a vote. Some votes are worth more than others, but they’re still votes.
If the same page lives at 4 different URLs on your site, four different webmasters could each link to it at a different URL. You just split the vote 4 ways, and rel=canonical isn’t going to fix it. Not 100%, anyway.
Duplicate content dilutes link authority.
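To see the vote split, here’s a minimal sketch that tallies inbound links per URL. The example.com URLs are made up, and counting every link as exactly one vote is a simplification (as noted above, some votes are worth more than others).

```python
from collections import Counter

# Hypothetical inbound links: four webmasters, four URLs, one page.
inbound_links = [
    "http://example.com/widgets",
    "http://example.com/widgets/",
    "http://www.example.com/widgets",
    "http://example.com/widgets?ref=home",
]

# Tally votes the way a search engine sees them: per URL, not per page.
votes = Counter(inbound_links)
strongest = max(votes.values())

for url, count in votes.items():
    print(f"{count} vote(s): {url}")
print(f"Best any single URL gets: {strongest} of {len(inbound_links)} votes")
```

One page earned four links, but no single URL gets credit for more than one.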
3: Indexing flip-flop
Finally, duplicate content leaves it up to search engines to guess which page to index and rank. Combine 3-4 copies of the same page at 3-4 different URLs with link dilution (see #2, above) and you’re kinda screwed. Search engines have no good way to know which of those pages should really show up in a search result.
If they rank the wrong page, and you then remove it, you lose your ranking. Or, as webmasters link to different copies of the page, various copies pop in and out of the rankings, never moving up.
Duplicate content sucks
Think carefully about how you’re linking within your web site. Don’t depend on SEO workarounds like rel=canonical to fix the problem later on. Address duplication pre-launch, pre-redesign. If it’s too late for that, address it right now. It’s the best SEO favor you can do for your site.
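If you want a starting point for auditing your own internal links, here’s a rough sketch that groups URLs likely to be the same page. The normalization rules (lowercase the host, drop a leading www., strip trailing slashes, ignore the query string and fragment) are assumptions that fit a lot of sites but not all of them: if your query strings select genuinely different content, don’t drop them. The function names and example.com URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce a URL to one assumed 'canonical' form for comparison only."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Treat /widgets and /widgets/ as the same path; keep "/" for the home page.
    path = parts.path.rstrip("/") or "/"
    # Assumption: query strings and fragments don't change the content.
    return urlunsplit((parts.scheme.lower(), host, path, "", ""))


def find_variants(urls):
    """Return {normalized_url: [variants]} for pages linked at 2+ distinct URLs."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalize(url)].add(url)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


internal_links = [
    "http://example.com/widgets",
    "http://www.example.com/widgets/",
    "http://example.com/widgets?ref=home",
    "http://example.com/about",
]

for page, variants in find_variants(internal_links).items():
    print(page, "is linked as:", variants)
```

Anything this flags is a candidate for cleanup: pick one URL per page and link to it consistently.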

Portent's Founder & CEO
Ian Lurie is founder and CEO of Portent Inc., an internet marketing agency that has provided internet marketing, including PPC, SEO, social and analytics services, since 1995.


I’ve heard the idea of canonicalization bandied about but have never really heard any strong views on it – very much a case of ‘perhaps it’s true, perhaps it’s not’. It’s good to hear someone say it’s actually true, since I’ve suspected it might be for a while but haven’t had any serious motivation to do anything about it. The idea of a crawl budget is curious too. At least it’s correctable with the appropriate redirect codes!
Great post! You’ve explained this so well. And it all makes such simple sense! I didn’t know about the crawl budget of the spiders and I’ll definitely watch out for that on our own site.
Excellent posting, Ian. Thank you for making SEO an easier concept to understand!!