Essential for well-controlled SEO indexing, meta robots tags provide guidelines to search engine crawlers. So what are these guidelines, and how do they affect the ranking of your web pages in Google?
META robots tags are one of the fundamentals of any sustainable visibility strategy. To boost your organic SEO, take advantage of the expertise of our digital agency by requesting your free SEO audit.
What is a robots META tag?
Placed in the header of a page's code (the HEAD section), the robots META tag provides information to search engines.
It has a strong impact on SEO indexing and on the visibility of each page, thanks to a set of guidelines for crawlers, and more particularly for Googlebot and Bingbot.
While the presence of robots META tags is therefore essential, they should be used with full awareness of what they do, since some of them are specifically designed to keep content out of the SERPs.
Simply right-click on a web page, select “View Source,” then search for “robots” to locate the meta robots tag.
It will resemble the following:
<meta name="robots" content="noindex" />
<meta name="googlebot" content="noindex" />
<meta name="googlebot-news" content="noindex" />
<meta name="slurp" content="noindex" />
<meta name="msnbot" content="noindex" />
The robots meta tag is important because it adds an extra layer of protection on top of the robots.txt file. A crawler that reaches one of your pages by following an external link can still end up crawling and indexing it, since robots.txt only restricts crawling, not the indexing of URLs discovered elsewhere.
The robots meta tag prevents this crawling and indexing.
Meta SEO robot: the different directives
The META tag can differ from one page of a site to another, and it can contain one or more directives aimed at all search engines or only at some of them.
Noindex and index
The Noindex directive tells bots that the page should be skipped and not appear in the SERPs.
It is widely used to block the indexing of a URL whose content may be duplicated, thus avoiding any risk of Google penalties (in particular from Google Panda).
Conversely, the index directive tells search engines that indexing of the content is desired.
In practice, it is of little use, since in the absence of a contrary instruction (noindex), the spider will crawl and index the page anyway.
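These two directives take the following form in the page header (the index version is shown only for comparison, since it merely restates the default behavior):

```html
<!-- Ask crawlers not to show this page in search results -->
<meta name="robots" content="noindex" />

<!-- Explicitly request indexing; this is already the default -->
<meta name="robots" content="index" />
```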
Nofollow and follow
The nofollow directive is probably the one that divides the SEO community the most.
In theory, it tells Google's bots that we do not want the links inside the page to be followed, in particular because they carry no weight or authority.
However, one can also argue that this does not necessarily send a positive signal to Google, which, in any case, treats the information as it sees fit.
Conversely, by specifying the follow directive in a robots META tag, crawling and indexing of the linked pages is expressly requested.
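As a sketch, the nofollow directive is written like the other values, and can be combined with noindex in a single tag:

```html
<!-- Ask crawlers not to follow any link on this page -->
<meta name="robots" content="nofollow" />

<!-- Directives can be combined, separated by commas -->
<meta name="robots" content="noindex, nofollow" />
```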
Noarchive and archive
Noarchive is a directive telling Google not to cache the page.
As a result, users cannot access an earlier version of the URL, which is eminently practical for a shop that has to revise its prices regularly, for example according to exchange rates.
The archive directive tells Google that the site owner wants the cached version of their URL to remain accessible from the search results.
For Bing, the nocache and cache directives are used instead of noarchive and archive, respectively.
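A minimal sketch of both variants:

```html
<!-- Ask Google not to keep a cached copy of the page -->
<meta name="robots" content="noarchive" />

<!-- The equivalent directive for Bing -->
<meta name="robots" content="nocache" />
```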
None and all
The none directive tells SEO robots to ignore the page completely.
As a result, the page is not indexed in the SERPs and the links it contains are not followed.
Conversely, by specifying all in the tag, we ask for the page to be indexed and its links followed.
In practice, there is little reason to add it since, in the absence of a contrary directive, the crawler will perform these two actions anyway.
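For illustration:

```html
<!-- Equivalent to "noindex, nofollow" -->
<meta name="robots" content="none" />

<!-- Equivalent to "index, follow"; already the default behavior -->
<meta name="robots" content="all" />
```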
Nosnippet and notranslate
By adding the nosnippet directive to a robots META tag, you ask that no text snippet (and no cached link) be shown for the page in the search results.
With the notranslate directive, you ask that no automatic translation be offered for the page.
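Both directives follow the same pattern as the other values:

```html
<!-- No text snippet (or cached link) shown for this page in the results -->
<meta name="robots" content="nosnippet" />

<!-- No automatic translation offered for this page -->
<meta name="robots" content="notranslate" />
```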
Unavailable_after
Unavailable_after: [date] is a somewhat special directive, since it tells SEO robots that the content can be indexed in the SERPs, but only until a given date.
Beyond the specified date, the page is no longer shown in the search results.
It can be very useful in the case of a flash sale for example.
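As a sketch, with an arbitrary example date (Google accepts widely used date formats such as RFC 850 or ISO 8601):

```html
<!-- Stop showing this page in search results after the given date -->
<meta name="robots" content="unavailable_after: 2025-12-31" />
```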
Note also that it is possible to provide different directives to Google and Bing:
META robot for all engines;
META robot for the attention of Google only;
META robot for Bing only.
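In practice, this targeting is done through the name attribute of the tag, for example:

```html
<!-- All engines -->
<meta name="robots" content="noindex" />

<!-- Google's crawler only -->
<meta name="googlebot" content="noindex" />

<!-- Bing's crawler only -->
<meta name="bingbot" content="noindex" />
```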
The noydir directive instructs the search engine not to include the page description from the Yahoo! Directory in the search snippet.
The noodp directive prevents search engines from displaying the DMOZ page description in the search snippet. The DMOZ directory was administered and maintained by the Open Directory Project (ODP).
Matters are further complicated by the fact that not all search engines support all values.
What is NoIndex?
NoIndex is an SEO instruction that instructs search engine spiders not to index the web page that contains it.
Certain pages simply have no business being indexed.
The implementation of this instruction is done directly in the HTML code of the page, and more precisely in the “robots” meta tag.
Webmasters use NoIndex to prevent indexing of pages
There are several reasons why some web pages should not appear in search results pages (SERPs) and therefore should not be indexed: they may be PDF pages, they may be affected by duplicate content, or they may lack content and need to avoid search engine penalties. The NoIndex directive was created so that webmasters can tell search engine robots not to index certain URLs.
It should be placed in the <head> part of the source code and added to the other metadata.
It takes the form <meta name="robots" content="X, Y"> in the robots meta tag of the page header, where X is "index" or "noindex" and Y is "follow" or "nofollow".
In other words, noindex first tells the robot not to index the page, and it must then be specified whether or not to follow the links contained in the page, when there are any.
The follow value sends bots on to follow the links, while the nofollow value stops them at the page itself.
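The most common combination for a page that should stay out of the index while still letting robots follow its links is therefore:

```html
<meta name="robots" content="noindex, follow" />
```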
The NoIndex in SEO – Also useful against duplicate content!
If the NoIndex directive was created, it is because it serves not only to guide search engine robots but also to avoid their penalizing actions. Take the example of duplicate content. When a search engine finds in its index pages, or parts of pages, that appear at different URLs, it penalizes some of them in order to discourage content plagiarism. There are, however, cases where duplicating content is necessary, for example to advertise a product on several pages or several sites.
The webmaster in charge can then tell the robots that certain pages are duplicate content and should not be indexed, and can also point to the original page, whose address is known as the canonical URL.
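Note that the canonical URL is declared with a separate link tag rather than with the robots meta tag (the address below is a placeholder):

```html
<link rel="canonical" href="https://www.example.com/original-page" />
```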
NoIndex is also used not to index a website's internal search results pages, pagination pages, copyrighted content and duplicate category pages.
It is also used to prevent indexing during the phase when the webmaster uploads a page just to test its functionality.
It is important to distinguish between NoIndex and Disallow.
The latter appears in the robots.txt file and is a true crawling ban.
Unlike NoIndex, which lets robots read the content of the page, Disallow completely blocks access to it.
It is mainly used to protect sensitive parts of the site or content that should not be viewed.
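Unlike the meta tag, a Disallow rule is written in the site's robots.txt file; the paths below are placeholder examples:

```
User-agent: *
Disallow: /admin/
Disallow: /private-page.html
```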