Create a plain text file named exactly "robots.txt". Upload this file to the root directory of your site (i.e. http://www.yoursite.com/ but NOT http://www.yoursite.com/inner-directory/), so that it is reachable at http://www.yoursite.com/robots.txt.
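As a rough illustration of the location rule, the following Python sketch derives the expected robots.txt address from any page URL on a site. The helper name robots_url is hypothetical (it is not part of the original instructions) and it relies only on the standard urllib.parse module.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page inside a subdirectory still maps to the file in the site root.
print(robots_url("http://www.yoursite.com/inner-directory/page.html"))
```

Whatever page you start from, the result points at the site root, which is the only place the file should be uploaded.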
Now that you know how to create the robots.txt file and where to upload it, let's look at its format (the exact text to put in the file).
Examples of robots.txt:
Example: 1
Disallow all robots from your entire website. This is probably not what you want, but it shows the basic format.
User-agent: *
Disallow: /
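The block-everything case (a wildcard User-agent combined with Disallow: /) can be checked locally before uploading. This is a minimal sketch using Python's standard urllib.robotparser module; the crawler name ExampleBot and the page URL are made up for the check.

```python
from urllib.robotparser import RobotFileParser

# The "block everything" rules: every user agent, every path.
RULES = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# "ExampleBot" is a made-up crawler name used only for this check.
blocked = not parser.can_fetch("ExampleBot", "http://www.yoursite.com/any/page.html")
print("all crawling blocked:", blocked)
```

Note that this only shows how one parser reads the rules; each crawler decides for itself whether to honor them.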
Example: 2
You may not want Google's image bot crawling your site's images and making them searchable online. To block it, add the following declaration to your robots.txt file.
User-agent: Googlebot-Image
Disallow: /
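To see that this declaration affects only the named crawler, here is a small sketch with Python's standard urllib.robotparser. The image URL and the second crawler name, ExampleBot, are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

image_rules = RobotFileParser()
image_rules.parse([
    "User-agent: Googlebot-Image",
    "Disallow: /",
])

photo = "http://www.yoursite.com/images/photo.jpg"  # hypothetical image URL
image_bot_allowed = image_rules.can_fetch("Googlebot-Image", photo)
other_bot_allowed = image_rules.can_fetch("ExampleBot", photo)
print("Googlebot-Image allowed:", image_bot_allowed)
print("ExampleBot allowed:", other_bot_allowed)
```

Crawlers that are not named, and for which no wildcard group exists, are treated as unrestricted.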
Example: 3
To disallow all search engines and robots from crawling selected directories or pages, use the following declaration.
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /admin/test.html
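Each Disallow line is matched as a prefix of the requested path, so a directory rule covers everything under it while a page rule covers just that page. The sketch below, using Python's standard urllib.robotparser, tries a few sample paths; the paths and the crawler name ExampleBot are illustrative assumptions, only the three rules come from the example.

```python
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse("""\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /admin/test.html
""".splitlines())

# Sample paths are illustrative; only the three rules come from the example.
paths = ["/cgi-bin/form.cgi", "/images/logo.png", "/admin/test.html",
         "/admin/index.html", "/index.html"]
allowed = {p: rules.can_fetch("ExampleBot", "http://www.yoursite.com" + p)
           for p in paths}
for path, ok in allowed.items():
    print(f"{path}: {'allowed' if ok else 'blocked'}")
```

Here /admin/index.html stays crawlable because only the single page /admin/test.html was disallowed, not the whole /admin/ directory.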
Example: 4
Targeting multiple robots
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/
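In the above example we declare that crawlers in general should not crawl any part of our site. Then we allow Google to crawl the entire site apart from the /cgi-bin/ directory. A crawler follows the group that names it and falls back to the * group only when no such group exists.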
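The precedence between the two groups can be sketched with Python's standard urllib.robotparser, which checks named groups before the wildcard group. This shows how that one parser reads the file; real crawlers use their own parsers, and ExampleBot and the sample paths are assumptions for the demonstration.

```python
from urllib.robotparser import RobotFileParser

groups = RobotFileParser()
groups.parse("""\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/
""".splitlines())

site = "http://www.yoursite.com"
checks = [("Googlebot", "/index.html"),
          ("Googlebot", "/cgi-bin/form.cgi"),
          ("ExampleBot", "/index.html")]
# Map each (crawler, path) pair to whether the parser permits the fetch.
results = {(bot, path): groups.can_fetch(bot, site + path) for bot, path in checks}
for (bot, path), ok in results.items():
    print(f"{bot} {path}: {'allowed' if ok else 'blocked'}")
```

Googlebot is governed only by its own group, so the blanket Disallow: / in the wildcard group does not apply to it.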
Example: 5
Per Google's FAQs for webmasters, the following is the preferred way to disallow all crawlers from your site EXCEPT Google:
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
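If you want to generate this kind of file rather than type it by hand, one possible approach is sketched below. The function name allow_only is hypothetical, and the result is verified with Python's standard urllib.robotparser; ExampleBot is a made-up crawler name standing in for "everyone else".

```python
from urllib.robotparser import RobotFileParser


def allow_only(agent: str) -> str:
    """Build robots.txt text that blocks every crawler except `agent`."""
    return f"User-agent: *\nDisallow: /\n\nUser-agent: {agent}\nAllow: /\n"


text = allow_only("Googlebot")
check = RobotFileParser()
check.parse(text.splitlines())

page = "http://www.yoursite.com/index.html"
googlebot_ok = check.can_fetch("Googlebot", page)
others_ok = check.can_fetch("ExampleBot", page)
print(text)
print("Googlebot allowed:", googlebot_ok, "| others allowed:", others_ok)
```

The generated text can then be saved as robots.txt and uploaded to the site root.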