robots.txt is a file that tells web crawlers which parts of your site they may visit and which parts they should stay out of.

How does it work?

A well-behaved web crawler first goes to the root of a domain and looks for a robots.txt file.

For example, if a robot wants to visit www.example.com/welcome.html, it will first check whether www.example.com/robots.txt exists.
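As a rough sketch of that first step, here is how a crawler could work out which robots.txt to request for a given page. The function name robots_url is made up for this illustration, and the http:// scheme is an assumption, since the example above shows no scheme.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url.

    Hypothetical helper for illustration; assumes page_url includes a scheme.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.example.com/welcome.html"))
# prints: http://www.example.com/robots.txt
```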

Suppose it finds the following:

robots.txt:

# No robots, Please
User-agent: *
Disallow: /

In the file above:

User-agent: * means the section applies to all robots, and
Disallow: / tells the robot not to visit any page on the site.
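To see how a crawler might act on these two lines, here is a small sketch using Python's standard urllib.robotparser module. The crawler name ExampleBot is made up, and the file is parsed from a string rather than downloaded, so nothing here touches the network.

```python
from urllib.robotparser import RobotFileParser

# The example file from above, as a crawler would receive it.
ROBOTS_TXT = """\
# No robots, Please
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse() takes an iterable of lines

# "ExampleBot" is a hypothetical crawler name; the * section covers any name.
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/welcome.html")
print(allowed)  # False: Disallow: / blocks every path on the site
```

A real crawler would normally call set_url() and read() to download the file instead of parsing a string.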

Note: Robots can ignore your robots.txt, and the file itself is publicly available, so anyone can read it.

The first point is the important one, since the robots that ignore the instructions are usually the malicious ones.

What to put inside?

robots.txt is a plain text file. Here are a few examples:

To allow all robots to visit all files:
User-agent: *
Disallow:

And the opposite, to keep all robots out:
User-agent: *
Disallow: /

To keep a specific robot out of a specific folder:
User-agent: SpecificBot # replace the 'SpecificBot' with the actual user-agent of the bot
Disallow: /notimportant/

The example above also shows how to put comments in the file: everything after a # is ignored.
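Here is a quick way to check what such a rule does, again sketched with Python's standard urllib.robotparser. SpecificBot is the placeholder name from the example, OtherBot and the page paths are made up for the test, and the http:// scheme is an assumption.

```python
from urllib.robotparser import RobotFileParser

# The per-robot example from above, including its trailing comment.
ROBOTS_TXT = """\
User-agent: SpecificBot # replace the 'SpecificBot' with the actual user-agent of the bot
Disallow: /notimportant/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "http://www.example.com"  # assumed scheme and host, for illustration
# SpecificBot is blocked from the folder but may visit the rest of the site.
print(parser.can_fetch("SpecificBot", base + "/notimportant/page.html"))  # False
print(parser.can_fetch("SpecificBot", base + "/welcome.html"))            # True
# A robot with a different name (hypothetical) is not affected by the rule.
print(parser.can_fetch("OtherBot", base + "/notimportant/page.html"))     # True
```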

In addition, you can tell robots where your sitemap is located. The Sitemap line does not belong to any User-agent section, so it can stand on its own:
Sitemap: http://www.example.com/sitemaps/sitemap.xml
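For the curious, this is roughly how a crawler could read the sitemap location back out of the file. The sketch uses site_maps() from Python's standard urllib.robotparser, which needs Python 3.8 or newer; the file contents are assembled from the examples in this post.

```python
from urllib.robotparser import RobotFileParser

# An allow-all section plus the Sitemap line from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemaps/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of sitemap URLs, or None if the file has none.
sitemaps = parser.site_maps()
print(sitemaps)  # ['http://www.example.com/sitemaps/sitemap.xml']
```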

Where to put it?

The short answer: in the top-level directory of your web server.

A bit longer: the file must sit directly after your domain name, for example www.example.com/robots.txt and not www.example.com/robot_file/robots.txt
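If you want to double-check a location, a tiny sketch like the one below will do. The function name is_valid_robots_location is made up for this post, and the URLs are assumed to start with http:// so that the path can be split off reliably.

```python
from urllib.parse import urlsplit


def is_valid_robots_location(url: str) -> bool:
    """Return True only if the file sits directly under the domain root.

    Hypothetical helper for illustration; expects a URL with a scheme.
    """
    return urlsplit(url).path == "/robots.txt"


print(is_valid_robots_location("http://www.example.com/robots.txt"))             # True
print(is_valid_robots_location("http://www.example.com/robot_file/robots.txt"))  # False
```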
