 

Introduction to Robots.txt

Robots.txt is a plain text file used to tell search engine crawlers and other robots which parts of your site they may crawl.

You are probably aware of search engine crawlers, such as those run by Google, Yahoo and MSN, which try to index the pages of your website. These crawlers, or robots, are useful: they help you get listed in the search results. There are also hundreds of other robots, some of which collect email addresses from your site (spam bots). These unwanted robots eat up your site's bandwidth. As a site owner, you need to decide which robots are allowed to crawl your site and what they are allowed to do.
"robots.txt" is a regular text file in which you add a set of rules instructing robots not to crawl and index certain files and directories within your site. Keep in mind that these rules are requests: well-behaved crawlers follow them, but a badly behaved robot can simply ignore them.
 
 

Creating your "robots.txt" file

 

Create a regular text file and make sure it is named exactly "robots.txt". This file must be uploaded to the root directory of your site
(i.e. http://www.yoursite.com/robots.txt but NOT http://www.yoursite.com/inner-directory/robots.txt).
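Because the file always lives at the root, its address can be derived from any page URL on the same site. The following is a minimal sketch using Python's standard urllib.parse module; the function name robots_url is a hypothetical helper introduced here for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; replace the path and drop query/fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page inside a subdirectory still maps to the file in the site root.
location = robots_url("http://www.yoursite.com/inner-directory/page.html")
print(location)  # http://www.yoursite.com/robots.txt
```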

Now that you know how to create the robots.txt file and where to upload it, let's look at its format (the exact text to be added to the robots.txt file).

Examples of robots.txt:

Example: 1

Disallow all robots from crawling your website. This is probably not what you want, but it illustrates the format.

User-agent: *
Disallow: /
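One way to sanity-check a rule set before uploading it is to run it through a parser. The sketch below uses Python's standard urllib.robotparser module. The bot name "AnyBot" and the page URL are placeholders chosen for illustration, and real crawlers may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rules from Example 1, as they would appear in robots.txt.
RULES = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# "AnyBot" is a made-up crawler name; every crawler falls under "*".
allowed = parser.can_fetch("AnyBot", "http://www.yoursite.com/index.html")
print(allowed)  # False
```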

Example: 2

You may not want Google's image bot crawling your site's images and making them searchable online. To restrict Google's image bot, add the following declaration to your robots.txt file.

User-agent: Googlebot-Image
Disallow: /

Example: 3

To disallow all search engines and robots from crawling selected directories or pages, use the following declaration.

User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /admin/test.html
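Each Disallow value is matched against the start of the requested path, so a directory rule covers everything beneath it while a file rule covers only that file. The following sketch checks a handful of paths against the rules above using Python's urllib.robotparser. The helper check_paths, the bot name "AnyBot" and the sample paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /admin/test.html
"""


def check_paths(rules, agent, paths):
    """Return a dict mapping each path to True (allowed) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    base = "http://www.yoursite.com"  # placeholder host
    return {path: parser.can_fetch(agent, base + path) for path in paths}


results = check_paths(
    RULES,
    "AnyBot",
    ["/index.html", "/images/logo.png", "/admin/test.html", "/admin/other.html"],
)
for path, ok in results.items():
    print(path, "allowed" if ok else "blocked")
```

Under this parser, /index.html and /admin/other.html come back allowed, while /images/logo.png and /admin/test.html come back blocked.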

Example: 4

Targeting multiple robots

User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/

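In the above example, we first declare that crawlers in general should not crawl any part of our site. Then we allow Google to crawl the entire site apart from the /cgi-bin/ directory. This works because a crawler follows the group that names it specifically, and falls back to the "*" group only when no such group exists.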
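The way a crawler picks its group can be observed with Python's standard urllib.robotparser module. This is only a sketch of how one parser reads the Example 4 rules: "OtherBot" and the sample URLs are invented for illustration, and a given search engine's own parser may differ in details.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())
site = "http://www.yoursite.com"  # placeholder host

# Googlebot has its own group, so only the /cgi-bin/ rule applies to it.
googlebot_home = parser.can_fetch("Googlebot", site + "/index.html")
googlebot_cgi = parser.can_fetch("Googlebot", site + "/cgi-bin/form.cgi")
# Any crawler without a group of its own falls back to the "*" rules.
other_home = parser.can_fetch("OtherBot", site + "/index.html")

print(googlebot_home, googlebot_cgi, other_home)  # True False False
```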

Example: 5

According to Google's FAQs for webmasters, the following is the preferred way to disallow all crawlers from your site EXCEPT Google:

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
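If you maintain this kind of "block everyone except a few" file for more than one crawler, it can be generated instead of typed by hand. The sketch below is one possible approach: build_allow_only is a hypothetical helper, and the result is checked with Python's urllib.robotparser rather than against any search engine's own parser.

```python
from urllib.robotparser import RobotFileParser


def build_allow_only(allowed_bots):
    """Build robots.txt text that blocks all crawlers except the named ones."""
    groups = ["User-agent: *\nDisallow: /"]
    for bot in allowed_bots:
        groups.append(f"User-agent: {bot}\nAllow: /")
    # Separate the groups with a blank line and end with a newline.
    return "\n\n".join(groups) + "\n"


text = build_allow_only(["Googlebot"])
print(text)

# Verify the generated rules: Googlebot may crawl, "OtherBot" may not.
parser = RobotFileParser()
parser.parse(text.splitlines())
page = "http://www.yoursite.com/index.html"  # placeholder URL
google_ok = parser.can_fetch("Googlebot", page)
other_ok = parser.can_fetch("OtherBot", page)
```

Note that Allow was not part of the original robots.txt convention, so it is worth confirming that the crawlers you care about support it.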

 
 
 
 
© LinkExchange4seo.com. All rights reserved