Recently, one of our readers asked us for advice on how to optimize the robots.txt file to improve SEO. The robots.txt file tells search engines how to crawl your website, which makes it an incredibly powerful SEO tool. In this article, we will show you how to create a perfect robots.txt file for SEO.
What is the robots.txt file?
Robots.txt is a text file that website owners can create to tell search engine bots how to crawl and index the pages of their site. It is usually stored in the root directory, also called the main folder, of your website. The basic format of a robots.txt file looks like this:
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]

User-agent: [user-agent name]
Allow: [URL string to be crawled]

Sitemap: [URL of your XML Sitemap]
You can have multiple lines of instructions to allow or disallow specific URLs, and you can add multiple sitemaps. If you do not disallow a URL, search engine bots assume that they are allowed to crawl it.
Here is what an example robots.txt file can look like:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml
In the example robots.txt above, we have allowed search engines to crawl and index files in our WordPress uploads folder.
After that, we disallowed search bots from crawling and indexing the plugins folder and the WordPress admin folder. Finally, we provided the URL of our XML sitemap.
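To see how a crawler might read these rules, here is a minimal Python sketch that feeds the example file to the standard library's urllib.robotparser module. The example.com URLs below are placeholders chosen for illustration. Note that this parser applies the first matching rule in file order, which may differ from how a particular search engine resolves overlapping rules.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from this article, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs, used only to illustrate each rule.
urls = [
    "https://example.com/wp-content/uploads/photo.jpg",
    "https://example.com/wp-content/plugins/some-plugin/script.js",
    "https://example.com/wp-admin/",
    "https://example.com/blog/hello-world/",
]

# Map each URL to True (crawlable) or False (blocked) for any user agent.
results = {url: parser.can_fetch("*", url) for url in urls}
for url, allowed in results.items():
    print("allowed" if allowed else "blocked", url)
```

The last URL matches no rule at all, so it is reported as allowed. This is the default behavior described earlier: anything you do not disallow is treated as crawlable.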
Why use a robots.txt file for your WordPress site?
If you do not have a robots.txt file, search engines will still crawl and index your website. However, you will not be able to tell search engines which pages or folders they should not crawl.
This will not have much impact when you first start a blog and do not have a lot of content. However, as your website grows and you have a lot of content, you will probably want better control over how your website is crawled and indexed.
Search bots have a crawl quota for each website. This means that they crawl a certain number of pages during a crawl session. If they do not finish crawling all the pages on your site, they will come back and resume crawling in the next session.
This can slow down your website's indexing rate. You can fix this by disallowing search bots from attempting to crawl unnecessary pages such as your WordPress admin pages, plugin files, and themes folder.
By disallowing unnecessary pages, you save your crawl quota. This helps search engines crawl even more pages on your site and index them as quickly as possible. Another good reason to use a robots.txt file is when you want to stop search engines from indexing a post or page on your website.
It is not the safest way to hide content from the general public, but it will help you prevent it from appearing in search results.
What is an optimized robots.txt file?
Many popular blogs use a very simple robots.txt file. Its content may vary depending on the needs of the specific site:
User-agent: *
Disallow:

Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
This robots.txt file allows all bots to index all content and provides them with links to the website’s XML sitemaps.
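The two points above (an empty Disallow blocks nothing, and the Sitemap lines are exposed to bots) can be checked with a short Python sketch. It assumes Python 3.8 or later for the site_maps() method, and the page URL being tested is a made-up placeholder.

```python
from urllib.robotparser import RobotFileParser

# The simple robots.txt shown above.
SIMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
"""

simple_parser = RobotFileParser()
simple_parser.parse(SIMPLE_ROBOTS_TXT.splitlines())

# An empty Disallow value blocks nothing, so any path is crawlable.
# The URL here is a hypothetical page used for illustration.
everything_allowed = simple_parser.can_fetch("*", "http://www.example.com/any/page/")

# site_maps() returns the listed Sitemap URLs, or None if the file has none.
sitemaps = simple_parser.site_maps()

print(everything_allowed)
print(sitemaps)
```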
For WordPress sites, we recommend the following rules in the robots.txt file:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/

Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
This tells search bots to index all WordPress images and files. It disallows search bots from indexing WordPress plugin files, the WordPress admin area, the WordPress readme file, and affiliate links.
By adding sitemaps to the robots.txt file, you make it easy for Google's bots to find all the pages on your site. Now that you know what an ideal robots.txt file looks like, let’s see how you can create a robots.txt file in WordPress.
How to create a robots.txt file in WordPress?
There are two ways to create a robots.txt file in WordPress. You can choose the method that suits you best.
Method 1: Editing the robots.txt file using All in One SEO
All in One SEO, also known as AIOSEO, is the best WordPress SEO plugin on the market, used by more than 2 million websites. It is easy to use and comes with a robots.txt file generator.
If you have not already installed the AIOSEO plugin, you can see our step-by-step guide on how to install a WordPress plugin. A free version of AIOSEO is also available and includes this feature.
Once the plugin is installed and activated, you can use it to create and edit your robots.txt file directly from your WordPress admin area. Simply go to All in One SEO » Tools to edit your robots.txt file.
First, you will need to turn on the editing option by clicking the “Enable Custom Robots.txt” toggle so that it turns blue. With this option turned on, you can create a custom robots.txt file in WordPress.
All in One SEO will display your existing robots.txt file in the ‘Robots.txt Preview’ section at the bottom of your screen. This version will show the default rules that were added by WordPress.
These default rules tell search engines not to crawl your core WordPress files, allow bots to index all of the content, and provide them with a link to your site’s XML sitemaps.
Now you can add your own custom rules to improve your robots.txt for SEO.
To add a rule, enter a user agent in the “User Agent” field. Using a * will apply the rule to all user agents. Then select whether you want to “Allow” or “Disallow” search engines to crawl. Next, enter the file name or directory path in the “Directory Path” field.
The rule will automatically be applied to your robots.txt file. To add another rule, click the “Add Rule” button. We recommend adding rules until you have recreated the ideal robots.txt format that we shared above.
Your custom rules will look like this.
Once you are done, do not forget to click the “Save Changes” button to store your changes.
Method 2: Editing the robots.txt file manually using FTP
For this method, you will need to use an FTP client to edit the robots.txt file. Simply connect to your WordPress hosting account using an FTP client. Once inside, you will be able to see the robots.txt file in the root folder of your website. If you don’t see one, you probably don’t have a robots.txt file. In that case, you can simply create one.
Robots.txt is a plain text file, which means you can download it to your computer and edit it using any plain text editor such as Notepad or TextEdit. After saving your changes, you can upload the file back to the root folder of your website.
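If you would rather generate the file than type it by hand, the following Python sketch builds the recommended rules shared earlier as plain text. The function names build_robots_txt and save_robots_txt are hypothetical helpers written for this example, and the sitemap URLs are the placeholder ones used in this article.

```python
from pathlib import Path


def build_robots_txt(allow, disallow, sitemaps, user_agent="*"):
    """Return robots.txt text for one user-agent group plus sitemap lines."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Allow: {path}" for path in allow]
    lines += [f"Disallow: {path}" for path in disallow]
    lines.append("")  # blank line separates the rule group from the sitemaps
    lines += [f"Sitemap: {url}" for url in sitemaps]
    return "\n".join(lines) + "\n"


def save_robots_txt(text, path):
    """Write the text to a local file, ready to upload to the site root."""
    Path(path).write_text(text, encoding="utf-8")


robots_txt = build_robots_txt(
    allow=["/wp-content/uploads/"],
    disallow=["/wp-content/plugins/", "/wp-admin/", "/readme.html", "/refer/"],
    sitemaps=[
        "http://www.example.com/post-sitemap.xml",
        "http://www.example.com/page-sitemap.xml",
    ],
)
print(robots_txt)
```

Calling save_robots_txt(robots_txt, "robots.txt") writes the file to the current folder. You would then upload it to your website's root folder with your FTP client as described above.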
How to test your robots.txt file?
Once you have created your robots.txt file, it is always a good idea to test it using a robots.txt testing tool. There are many robots.txt testing tools, but we recommend using the one in Google Search Console.
First, you need to link your website to Google Search Console. If you haven’t done so yet, see our guide on how to add your WordPress site to Google Search Console. Then you can use the Google Search Console robots testing tool.
Simply select your property from the drop-down list. The tool automatically fetches your website’s robots.txt file and highlights any errors and warnings it finds.
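Before uploading, you can also run a rough local check. The sketch below is a deliberately small Python linter, not a substitute for Google's tool: it only knows the four directives used in this article and flags a few common slips. The lint_robots_txt function and its messages are assumptions made for this example.

```python
# Only the directives used in this article; real files may use others.
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap"}


def lint_robots_txt(text):
    """Return a list of (line_number, message) for suspicious lines."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        directive = name.lower()
        if directive not in KNOWN_DIRECTIVES:
            problems.append((number, f"unknown directive '{name}'"))
        elif directive == "user-agent":
            seen_user_agent = True
        elif directive in ("allow", "disallow"):
            if not seen_user_agent:
                problems.append((number, "rule appears before any User-agent line"))
            if value and not value.startswith(("/", "*")):
                problems.append((number, "path should start with '/'"))
            if " " in value:
                problems.append((number, "path contains a space"))
    return problems


# A deliberately flawed file showing the kinds of mistakes this check catches.
sample = """\
Disallow: /wp-admin/
User-agent: *
Disalow: /readme.html
Disallow: / wp-content / plugins /
Allow: wp-content/uploads/
"""

problems = lint_robots_txt(sample)
for number, message in problems:
    print(f"line {number}: {message}")
```

A file that passes this check can still contain rules that do not behave as you intend, so it is worth confirming the result in Google Search Console as well.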
The goal of optimizing your robots.txt file is to prevent search engines from crawling pages that are not publicly available, such as pages in your wp-plugins folder or pages in your WordPress admin folder.
A common myth among SEO experts is that blocking WordPress category, tag, and archive pages will improve the crawl rate and lead to faster indexing and higher rankings. This is not true. It is also contrary to Google’s webmaster guidelines.
We recommend that you follow the robots.txt format above to create a robots.txt file for your website.
Optimizing your WordPress robots.txt for SEO: our conclusion
We hope this article has helped you learn how to optimize your WordPress robots.txt file for SEO. The robots.txt file is too often overlooked when optimizing a website, yet it can improve your organic search rankings. Want to learn a little more about WordPress and its inner workings? In dedicated articles, we explain how to optimize your footer and how to add new users and authors to your blog!