How Site Pagination and Click Depth Affect SEO – An Experiment

Matthew Henry, SEO Fellow

We’ve all seen pagination links — those little numbered links at the top and bottom of multi-page content. They are used on blogs, e-commerce sites, webcomics, gallery pages, SERPs, and multi-page articles.

Simple pagination examples: 31 flavors of pagination

From the human visitor’s point of view, pagination is pretty simple. If you are on page one, and you want to see page two, you click “2” (or “next”, or whatever). You don’t really have to think about it. From a web crawler’s point of view, however, it’s a bit more complicated.

A search engine crawler’s perception of a small website

Web crawlers find new pages on a site by following links from pages they have already crawled. The crawler doesn’t know about a page until it has found at least one other page that links to it. (This is an over-simplification, but is generally true. There are some exceptions, like XML sitemaps and direct submission, but we’ll ignore those here.)

For sites with a simple tree-like structure, this works pretty well. The crawler reads the home page. Then it follows links from the home page to (for example) each of the top-level category pages. Then it follows links from these to the secondary category pages, and then from these to the content pages. In this simplistic example, the crawler can get from the home page to anywhere else on the site by following at most three links.

An even smaller website crawl tree

But now let’s look at an example with paginated content:

Simple “Next” Link

Suppose you have a website that contains a sequence of 200 numbered pages. For purposes of this example, it doesn’t really matter what kind of pages they are. They could be product listings, or blog posts, or even a single article split into 200 pages. (Please don’t actually do that.) What matters is that there are 200 of these pages, and they are numbered sequentially from page 1 to page 200.

For this first example, let’s assume these pages are connected by the simplest pagination possible: a single “next page” link at the bottom of each page:

next page

This scheme is as simple as it gets. If you are on page 1 and you click this link, you will be taken to page 2. If you click again, you will be taken to page 3, and so on. If you keep clicking for a very long time, you will eventually get to page 200. This scheme is fairly common in the real world, mostly on blogs (typically with the link text “« Older Posts”), though it is not as popular as it used to be, for reasons that will become apparent below.
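
Before looking at the crawl chart, it is worth putting a number on this. Here is a small simulation sketch (my own, not the tool behind the charts in this post): a pagination scheme is modelled as a function that returns the page numbers a given page links to, and a breadth-first walk from page 1 reports how deep the deepest numbered page is. The later schemes in this post can all be expressed as different scheme functions and plugged into the same helper.

```python
from collections import deque

def max_click_depth(scheme, n_pages, start=1):
    """Breadth-first walk over pages 1..n_pages, where scheme(page, n_pages)
    returns the page numbers that `page` links to (out-of-range numbers are
    ignored). Returns the depth of the deepest page, in clicks from `start`."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in scheme(page, n_pages):
            if 1 <= target <= n_pages and target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return max(depths.values())

def next_only(page, n_pages):
    """A single "next page" link at the bottom of every page."""
    return [page + 1] if page < n_pages else []

print(max_click_depth(next_only, 200))  # 199: page 200 is 199 clicks from page 1
```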

From the crawler’s point of view, this site looks like this:

Crawl Tree: Simple “Next” Link Pagination

This chart shows the discovery path that was followed by a web crawler as it crawled a simulated website. In this case, the simulated website had 200 numbered pages connected with a simple “next” link on each page. (There were also some other, non-numbered pages on this site, but the numbered pages are what matter here.)

Each colored dot represents one page. A connection between two dots means the downstream page (the smaller dot) was discovered on the upstream page (the larger dot).

That long squiggly tail is a “tunnel”: a long connected sequence of pages that the crawler has to walk through one at a time.

The main thing to take away from this chart is that this form of pagination is extremely inefficient because it creates a very long pagination tunnel. This is a problem, because:

  • When content is buried hundreds of links deep, it sends a strong message to the search engines that you don’t think the content is important. The pages will probably be crawled less often, if at all, and they probably will not rank very well.
  • If just one page in that long chain returns an error (e.g. because of a temporary server hiccup), the crawler won’t be able to discover any of the other pages downstream.
  • Sequential chains can’t be crawled in parallel. In other words, the crawler can’t request more than one page at a time, because each page can only be discovered after the previous page has loaded. This may slow the crawler down, and may lead to incomplete crawling.
  • Human visitors will probably never reach the deepest pages at all. Even a visitor who genuinely wants to see them (e.g. because they want to read your first blog post) is likely to give up in frustration long before getting there, unless they are extraordinarily patient and persistent.

So, how can we improve this? How about…

Adding “Last” and “Previous” Links

Let’s make a few seemingly minor changes to the pagination links:

first previous next last

The important changes here are the “last” and “previous” links. Together, they give the crawler (or human) the option to step through the pages backwards as well as forwards. This scheme is also fairly common on real websites, especially blogs. To the crawler, this new site looks like this:

Crawl Tree: Added “Last” Link to Pagination

This is somewhat better. There are now two tunnels, but each is only half as long: one starts at page 1 and counts up to page 101, and the other starts at page 200 and counts down to page 102. Cutting the maximum depth in half is a real improvement, but it’s still not great.
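
Expressed as a scheme function for the max_click_depth helper sketched earlier (the name and shape are mine), the scheme looks like this, and plugging it in confirms the halving:

```python
def first_prev_next_last(page, n_pages):
    """ "first previous next last" links on every page."""
    return [1, page - 1, page + 1, n_pages]

# max_click_depth(first_prev_next_last, 200) -> 100
# Page 101 is still 100 clicks from page 1, whichever end you start from.
```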

Stepping by Two Pages

Let’s try something different. In this test, there is no “last” link, but there is a way to skip ahead (or back) by two pages. For page 1, the pagination would look like this:

1 2 3

For a deeper page, it would look like this:

23 24 25 26 27

If you start on page 1, you can jump to page 3, then to page 5, then to 7, and so on. There is no way to skip to the last page. I have seen this scheme on a couple of real-world websites, both huge online stores. (I’m guessing they chose to omit the “last” link for database performance reasons.) This site looks like this to the crawler:

Crawl Tree: Pagination Stepping by Two Pages

The most interesting thing about this chart is that it looks almost the same as the previous chart, even though the pagination schemes for the two are quite different. As before, numbered pages are split into two tunnels, each around 100 pages long.

The difference is that the tunnels are now split into even-numbered pages and odd-numbered pages, whereas in the previous chart they were split into an ascending tunnel and a descending one. This raises the question: if each of these schemes cuts the maximum depth in half, what happens if we combine them?

Which brings us to:

Step by Two, plus “Last” Link

In this scheme, the pagination for page 1 looks like this:

1 2 3 200

And for deeper pages, it looks like this:

1 23 24 25 26 27 200

This is just the last two schemes combined. It allows you to skip ahead two pages at a time, and it allows you to jump to the end and then work backwards. Most real-world websites use something similar to this (like the site you are reading right now, for example).
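
Here are the last two schemes written as functions for the max_click_depth helper from the earlier sketch (again, my own approximation of the link patterns shown above):

```python
def window_two(page, n_pages):
    """±2 window with no "first" or "last" link; the pagination shown on
    page 25 would be 23 24 25 26 27."""
    return [p for p in range(page - 2, page + 3) if p != page]

def window_two_plus_ends(page, n_pages):
    """±2 window plus "first" and "last" links (e.g. 1 23 24 25 26 27 200)."""
    return [1, n_pages] + window_two(page, n_pages)

# max_click_depth(window_two, 200)           -> 100 (even/odd tunnels)
# max_click_depth(window_two_plus_ends, 200) -> 50  (a quarter of the original)
```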

This produces the following chart:

Crawl Tree: Pagination Stepping by Two, plus “Last” Link

This has cut the maximum depth down to a fourth of what it was originally. This is a significant improvement. But why stop there? What happens if we go crazy?

Extreme-Skip Nav (Crazy Idea #1)

We’ve seen above that being able to skip ahead by just two pages can cut the maximum crawl depth in half. So why not take this to extremes? Why not allow skipping by, say, eighteen pages?

In this scheme, the pagination for page 1 looks like this:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 200

And for deeper pages, it looks like this:

1 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 200

This allows the crawler (or human) to skip ahead by as many as eighteen pages at a time. It also still allows the crawler to jump to the end and work backwards, as before. This should reduce the maximum depth by quite a bit.

Yes, all those numbered links are kind of ugly, and they add a lot of clutter. You probably wouldn’t use this on a real website for that reason. That’s OK though, because this is just an experiment. Let’s just try it and see what happens.
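
For completeness, here is the extreme-skip scheme as a function for the same max_click_depth helper (my approximation of the link pattern shown above):

```python
def window_eighteen_plus_ends(page, n_pages):
    """±18 window plus "first" and "last" links: cluttered, but very shallow."""
    return [1, n_pages] + [p for p in range(page - 18, page + 19) if p != page]

# max_click_depth(window_eighteen_plus_ends, 200) -> 6
# (The chart below reports a maximum depth of 7, presumably because the
# simulated crawl starts from a home page rather than from page 1 itself.)
```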

The above scheme produces the following chart:

Crawl Tree: Extreme-Skip Nav Pagination

This brings the maximum depth down to just seven, which is a huge improvement. Unfortunately, this scheme is probably too ugly and visually cluttered for users to be a good choice for most real-world applications.

We need some way to achieve the same improvement, but with a pagination scheme that is more compact and easy to read. Such as…

Adding Midpoint Link (Crazy Idea #2)

In this scheme, the pagination for page 1 looks like this:

1 2 3 101 200

And for deeper pages, it looks like this:

1 12 23 24 25 26 27 113 200

Note that this is exactly the same as the “Step by Two, plus ‘Last’ Link” scheme above, except with two additional links inserted.

The “101” in the above example was added because it is the midpoint between 3 and 200, and the “113” because it is the midpoint between 27 and 200. In other words, each new link is the number you get by averaging (and rounding down) the page numbers on either side of the gap in the old scheme, where the “…” would normally sit. These midpoint links make it possible for a crawler to get from any page to any other page in just a few steps.
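
Here is one way the midpoint rule could be implemented, written as another scheme function for the max_click_depth helper from earlier (my sketch, reverse-engineered from the two examples above, so treat the exact rounding as an assumption):

```python
def midpoint_scheme(page, n_pages):
    """±2 window, "first" and "last" links, plus a midpoint link on each side
    of the window; the pagination shown on page 25 of 200 would be
    1 12 23 24 25 26 27 113 200."""
    lo, hi = max(1, page - 2), min(n_pages, page + 2)
    links = [1, n_pages] + [p for p in range(lo, hi + 1) if p != page]
    links.append((1 + lo) // 2)        # midpoint between "first" and the window
    links.append((hi + n_pages) // 2)  # midpoint between the window and "last"
    return links

# midpoint_scheme(25, 200) contains 1, 12, 23, 24, 26, 27, 113 and 200, matching
# the example above. max_click_depth(midpoint_scheme, 200) should stay in the
# single digits, in line with the extreme-skip chart.
```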

This scheme produces the following chart:

Crawl Tree: Adding Midpoint Link to Pagination

This shows the same level of crawlability improvement as the previous chart, but now with a scheme that is much easier to read (if a bit counterintuitive).

But How Do These Pagination Structures Scale?

So far, all of the examples have had a mere 200 numbered pages. This creates simple easy-to-understand charts, but a real website can easily have tens of thousands of pages. What happens when we scale things up?

Let’s run the last two crawls, with the same two pagination schemes, but with a hundred times as many pages.

Extreme-Skip Nav, with 20,000 Pages

This is Crazy Idea #1, but with a much bigger crawl:

Crawl Tree: Extreme-Skip Nav Pagination, with 20,000 Pages

Yes, it looks kind of pretty, but it’s terrible in terms of crawlability and click depth.

The deepest page is at level 557. This scheme does not scale very well at all. The relationship between page count and maximum depth is more-or-less linear. In other words, if you double the number of pages, you double the maximum depth.

Midpoint Link, with 20,000 Pages

This is Crazy Idea #2, again with a much bigger crawl:

Crawl Tree: Midpoint Link Pagination, with 20,000 Pages

The deepest page is now at level 14. This is a dramatic improvement, meaning this scheme scales extremely well.

The relationship between page count and maximum depth is (nearly) logarithmic. In other words, if you double the number of pages, the maximum depth only increases by one.

In general, if a chart is mostly made of long squiggly tentacle-like structures then the relationship will be linear (which is bad), and if the chart has a finely-branched tree-like structure, then the relationship will be logarithmic (which is good).
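
As a rough way to see the linear-versus-logarithmic difference for yourself, the two scheme functions sketched earlier can be run through the same max_click_depth helper at both sizes (my sketch; the exact numbers will differ slightly from the charts, which come from a full crawl that starts at a home page):

```python
# Reuses max_click_depth, window_eighteen_plus_ends and midpoint_scheme
# from the sketches above.
for n_pages in (200, 20_000):
    skip_depth = max_click_depth(window_eighteen_plus_ends, n_pages)
    mid_depth = max_click_depth(midpoint_scheme, n_pages)
    print(f"{n_pages:>6} pages: extreme-skip depth {skip_depth}, "
          f"midpoint depth {mid_depth}")

# Expectation: the extreme-skip depth grows roughly linearly with the page
# count (the crawl described above found a deepest page at level 557), while
# the midpoint depth grows roughly logarithmically (level 14 in that crawl).
```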

All of which raises the million dollar, win-both-showcases, takeaway question: is this midpoint link pagination worth using?

The answer: a resounding “Possibly. It depends.”

If you’re looking for conclusive advice on the structure of your site, the Portent team will almost always favor user experience over pleasing the search engine overlords. But getting the right, highly specific content in front of a searcher as quickly as possible is absolutely part of user experience. If that long-tail content is hundreds, or even thousands of clicks away from your homepage today, taking proactive steps to reduce click depth could well be worth it.

If you have many tens of thousands (or even hundreds of thousands) of numbered pages, this midpoint link pagination scheme may help to get your content crawled more thoroughly, and may help the deeper pages to rank better.

On the other hand, it may be a bit confusing to the user and will add some clutter. For smaller sites, the “Step by Two, plus ‘Last’ Link” scheme may be a better choice.

In the end, the point of this experiment and exploration was to shine some light on an often-neglected part of most websites, and to show that small changes to your pagination can have a surprisingly large impact on how a crawler sees your site and all that wonderful content.

Matthew Henry, SEO Fellow

As an SEO fellow, Matthew is Portent's resident SEO tools developer and math wizard.


Comments

  1. Thanks for bringing attention to this topic and helping me visualize it, Matthew! I’ll have to revisit a couple of large sites with this new perspective.

    1. The charts were all generated with a proprietary tool developed in-house for use by our SEO specialists. Sorry, the tool is not directly available to the public. The simulated websites were generated by a script that I originally created for crawler testing and debugging.

  2. Personally I like pagers that have just first, prev, next and last buttons, as well as a number field so you can switch to any arbitrary page. But I guess crawlers don’t understand how to fill in those fields?

    1. Correct. From the crawler’s point of view, that would give you the second chart (the one with two long arms).

  3. This is massive Matthew. I have been battling click depth on a huge real estate site, and this is totally the pill I needed. I’d thought of just about every other way to flatten out the site BUT this one, so I am massively thankful here!

  4. Thank you so much for taking the time to elaborate this so well. Very helpful. I guess interlinking from one blog post to another will also do a lot of good, right?
    Especially if any blog post receives at least one link.

    1. Meaningful, relevant links between related posts are definitely valuable. They give the search engines useful clues about which pages are the most important, and which pages are related to each other. Pagination is primarily about making pages reachable. To make pages rank well, you need to do a lot more than just have a good pagination scheme.

  5. Thanks for the information. Been thinking about this a lot recently. Would welcome any suggestions to improve the pagination indexing of these 33 pages.
    Would direct links that can be crawled on the pages instead of buttons that are processed on the server be better?

  6. Amazing. Thanks a ton for the hard work Matthew!
    It’s pure gold!
    These are some of the most untouched questions and YOU dared to answer it!
    Thank you,

  7. As a small business owner I don’t work a lot with anything related to IT/computers in my day-to-day, but I did study computer science a couple of years back… and I simply LOVE this entire article. The analytic approach to an often overlooked problem that a lot of bigger sites have (even though the owners are probably unaware of it) is phenomenal and the algorithmic solution is such an elegant way to solve this problem.

  8. Hi Matthew – I don’t have anything too meaningful (or anything, really) to add here, and I realize “great post” does not contribute to the conversation, but I simply have to say thanks for sharing your knowledge and experience on this topic here. I was privileged to have been able to learn from you directly and miss that frequently. Please keep writing and sharing your extensive experience and knowledge – you have a lot to share with the SEO community that a lot of those “well known thought leaders” don’t. Keep it coming!

  9. Very interesting test. As far as I have tested it on many e-commerce sites, the best way is to show, for example, every 5th or 10th page number – something the user (UX) and Google (SEO) can get to directly. Of course, it depends on how many pages the pagination has.
    The problem is pagination in JavaScript – some frameworks generate the last page in a loop – i.e. when you click on number 46, the script generates a number 47 with the same content as on 46…

  10. Great post and an important reminder of an issue that is too often overlooked. Highlights the importance of category, and sub category pages to break down number of steps from the home page as well as intelligent IA. Nice work!

  11. Find it really interesting. As a massively visual person, these trees are a big help! Is that a tool you guys built in house, or an external one? Curious minds want to know…. 🙂

    1. I strongly recommend against this strategy for a couple of reasons. First, putting thousands of deliberately hidden links on every page is just begging the search engines to slap you with a penalty. All of the search engines say this is strictly against the rules. Also, it’s a bad idea to have thousands of links pointing to every page on the site, because it makes it impossible for search engines to tell which pages you think are important. In a well-structured site, the most important pages have a large number of incoming links, and the least important pages have just a few.

  12. I love it when I read an article where someone puts WAY more thought into something seemingly innocuous (like, you know, site pagination) and ends up showing how it’s way deeper and more interesting than I ever could have imagined. Awesome post.

  13. Really good article. It’s not easy to find something full of technical stuff and yet interesting and revealing in SEO topics. Thanks!

  14. Ngl, some of these crawl trees are terrifying… But man, what a great article! Very interesting topic to write about – I’m wondering if some of these websites are built this way because they simply didn’t know better when they were building them… Well, now I know to be careful about pagination 😉

  15. This article seems primarily concerned with search engine crawlability for single product pages from their respective paginated categories.

    What if you’re not too concerned with single product page SEO, but rather the product categories are what’s really important for ranking? For instance, you allow user-submitted products, so product titles, descriptions, etc. cannot be controlled. Additionally, you have a lot of fairly specifically termed product categories, and so there are likely to be a lot of single products that are repeats, with very similarly worded titles.

    We can think of this with the likes of the Amazon/Walmart marketplaces, where there are several individual listings for off-brand cotton swabs (essentially repeats of the exact same product without distinctions between them). In such a case, the product category “cotton swabs” would be more important to rank than poorly-optimized, virtually identical single product pages, with “cotton swabs” in the title repeating ad nauseam with each listing. And so, if brand name or other distinctions were irrelevant for these cotton swabs, probably only the first page (say 24 products per page) of the product category would be sufficient to give the user all of the cotton swabs they need to see, with pages beyond this being repeat listings of the same products over and over again.

    In a case such as this, where you really don’t care too much about single product page seo, and the product categories are important, would you advise no-indexing every paginated page beyond the first page of the product category? Or is simply optimizing the first page of the product category (mostly with user-centric copy) enough, and to then hope Google will understand and simply prioritize the first page above all the other pages?

    1. The main focus of this article is to explore ways to make paginated pages easier for crawlers to traverse. I’ve deliberately kept this as general as possible, so it can be applied to any type of pages that use pagination: product lists, blog lists, multi-page articles, lists of events, etc.

      For your example, I’d say if you have many big sets of product pages that all sell the same functionally identical commodity, and you know the customer will choose between them based solely on price, then the conventional approach (a big paginated list of links pointing to separate product pages) might not be the best fit for that business model. Alternatives do exist. See StockX.com for one example of a site that is built around shopping by price alone with multiple sellers.

      That said, if you are stuck with this site structure, I would focus on optimizing the first page in the category as much as possible, and let the search engines deal with the rest. I’m always hesitant to block or noindex large groups of pages unless the situation is extreme, because it can do more harm than good if any of my assumptions turn out to be wrong. In extreme cases, where you have tens of thousands of useless pages and a few hundred good pages, and you have actually observed that the search engines are wasting most of their crawl budget on the useless pages, and you have also observed that this is preventing some of the good pages from getting indexed right away, then I would block the useless pages with robots.txt. Even then, I would treat this as a short-term emergency measure. Ideally, you should find a way to get rid of any useless pages.

  16. Great explanation – I hadn’t thought about this before. Do you know of a WooCommerce plugin which adds “last” and “first” page buttons?

  17. Very helpful information on pagination. And I have one question. Let’s say we have a lot of paginated pages, and it is hard for the crawler to reach the deeper ones. But what if we have the single products or articles included in the sitemap? In this way, search engines will still find all the products/articles that are linked from the deeper paginated pages, and it does not matter whether the crawler reaches them through the pagination, as long as we have them included in the sitemap.
    Thank you!

    1. You raise a good point. If you have every destination page listed in a sitemap, this provides an alternate route for the crawlers to find all of the pages, and can greatly reduce the crawlability problems associated with pagination. Having a site map is definitely a good idea if your platform supports it. However, even with a sitemap, there are still some reasons to think about how efficient your pagination is. It’s generally a bad thing if the sitemap is the only route by which search engines can find a page. It sends the message that you don’t think the pages are important, which is likely to affect rank. Inefficient pagination can also be a major source of frustration for human users. Two frustrating scenarios that I have occasionally run into are: 1. a blog that expects me to click hundreds of links to see the oldest post, and 2. a list of products sorted by price that require me to click many times just to get to the products that are in my desired price range.
