noindex and nofollow: Comprehensive SEO Implementation Guide

This article was written by our AI author Page, but reviewed by a human SEO, José Fausto Martinez.

In the constantly evolving world of SEO, understanding and correctly implementing noindex and nofollow directives is crucial. These tools are powerful when used appropriately, offering control over how search engines interact with your website.

What Are noindex Tags?

The noindex tag is a directive, placed in a page's HTML or sent via an HTTP header, that instructs search engines not to include the page in search results. For the tag to be effective, the page must be crawlable and not blocked by a robots.txt file.

noindex Tag Best Practices

When implementing noindex, make sure the pages are not disallowed in your robots.txt file. Disallowing a page prevents search engines from ever seeing the noindex directive, rendering it ineffective. The page must remain crawlable so the noindex tag can be recognized.
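As a hypothetical illustration, a robots.txt rule like the one below would defeat a noindex tag on any page under a /private/ directory (the path is a placeholder), because blocked crawlers never fetch those pages and so never see the tag:

    User-agent: *
    Disallow: /private/

With this rule in place, a noindex meta tag on a page such as /private/thank-you.html would go unread, and the URL could still end up indexed through external links.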

Implementing noindex Tags

How to Add noindex Tags to HTML

To use noindex in HTML, add a <meta> tag in the <head> section of your page, like this: <meta name="robots" content="noindex">. Search engines can still crawl the page, but they won't include it in their search index.
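As a minimal sketch, here is where the tag sits within a page (the title and body content are placeholders):

    <!DOCTYPE html>
    <html>
    <head>
      <title>Example page</title>
      <!-- Keep this page out of the search index -->
      <meta name="robots" content="noindex">
    </head>
    <body>
      <p>Page content goes here.</p>
    </body>
    </html>

If you only want to address a specific crawler, you can replace robots with its name, such as <meta name="googlebot" content="noindex"> for Google alone.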

How to Add noindex Tags to HTTP Headers with X-Robots Tags

When handling non-HTML resources like PDFs or images, or if a server-side solution suits your needs better, leveraging the X-Robots-Tag in the HTTP header is a savvy strategy. This approach is particularly useful when you cannot edit the HTML to add a meta tag, or when you need to apply the directive across many pages at once. To implement the noindex directive, add X-Robots-Tag: noindex to the HTTP response header.
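As an illustrative sketch, on an Apache server with mod_headers enabled you could add rules like the following to your configuration or .htaccess file to keep every PDF on the site out of the index:

    <FilesMatch "\.pdf$">
      # Attach X-Robots-Tag: noindex to every PDF response
      Header set X-Robots-Tag "noindex"
    </FilesMatch>

On Nginx, the equivalent is an add_header X-Robots-Tag "noindex"; directive inside the matching location block.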

Troubleshooting the ‘Indexed, though Blocked by robots.txt’ Issue in Google Search Console

[Screenshot: Google Search Console status message reading “Indexed, though blocked by robots.txt”]

One common issue encountered in Google Search Console is the “Indexed, though blocked by robots.txt” notification. It appears when Google has indexed pages even though the robots.txt file blocks it from crawling them.

The Root of the Issue

This situation typically occurs because external links or other signals have led Google to deem these pages worthy of indexing. Google’s algorithm, aiming to provide the most comprehensive index possible, sometimes includes pages in its index even if they are blocked from crawling. The search engine recognizes their potential relevance through other external signals.

Resolving the Issue

A key factor in this scenario is the noindex tag. Ordinarily, a noindex tag tells search engines not to index a page. But if the page is blocked by robots.txt, search engines like Google cannot crawl it to discover the noindex directive. Even when the tag is present, it is ineffective because crawlers are barred from accessing the page to detect it.

To fix this issue, remove the robots.txt disallow rules that are preventing Google from crawling the pages you want noindexed. Once you've confirmed those URLs have dropped out of the index, typically after a month or two, you can add the disallow rules back to robots.txt.
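For example, if a hypothetical robots.txt rule is keeping Googlebot away from the affected section, comment it out (or delete it) until the cleanup is complete:

    User-agent: *
    # Disallow: /old-campaigns/   (restore this line after the URLs drop out of the index)

Once Search Console reports the pages as “Excluded by ‘noindex’ tag,” the rule can safely go back.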

What Are nofollow Tags?

The nofollow tag, on the other hand, advises search engines not to follow the links on a page, preventing link equity transfer and the crawling of linked content.

nofollow Tag Best Practices

It is important to use nofollow judiciously. Pages that include links in comments, forum posts, or other user-generated content should carry the nofollow tag, because you may not have complete control over the links being added. This helps maintain the quality of your site's link profile.

Implementing nofollow Tags

How to Add nofollow Tags to HTML

For nofollow, you can add a <meta> tag in the <head> section: <meta name="robots" content="nofollow">. This directive is particularly useful for controlling the flow of link equity and preventing the crawling of untrusted or less important links.

How to Add nofollow Tags to HTTP Headers with X-Robots Tags

In addition to the HTML implementation, nofollow can also be applied via HTTP headers by giving the X-Robots-Tag a nofollow value: add X-Robots-Tag: nofollow to the HTTP response header.

You can combine noindex and nofollow directives in a single HTTP header, such as X-Robots-Tag: noindex, nofollow, to apply both directives simultaneously.
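A quick sketch of both forms side by side, first as a response header and then as the equivalent meta tag in the page's <head>:

    X-Robots-Tag: noindex, nofollow

    <meta name="robots" content="noindex, nofollow">

Either line tells crawlers to drop the page from the index and to ignore every link on it.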

Applying nofollow to Individual Links in HTML

While a meta tag or HTTP header applies the nofollow directive to an entire page, you can also apply it to individual links by adding the rel="nofollow" attribute to an anchor tag, as shown below. This selective approach lets you control the flow of link equity on a link-by-link basis, which can be crucial for managing how search engines perceive and interact with your site's link structure.
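A minimal sketch of a single nofollowed link (the URL is a placeholder):

    <!-- No link equity is passed to the destination -->
    <a href="https://example.com/untrusted-page" rel="nofollow">Visit this site</a>

For links in user-generated content, Google also recognizes rel="ugc", and rel="sponsored" for paid placements; values can be combined, as in rel="nofollow ugc".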

Conclusion

Mastering noindex and nofollow tags is a vital skill in SEO. Understanding and correctly implementing these directives can significantly enhance your website’s interaction with search engines, improving your overall SEO strategy.

See how Portent can help you own your piece of the web.