
Robots.txt – An Introduction Part VI

How to set up a Robots.txt file?

Ans :: Setting up a Robots.txt file can be tricky if you don't know the basic commands, so make sure you have studied them well before you proceed.

  • Step 1. Open a new text document on your machine.
  • Step 2. In it, type the following text exactly, with each directive on its own line:
    User-agent: *
    Disallow:
    (This means that all user agents are allowed to crawl your entire website.)
    Save it as "robots.txt" (all lowercase, since crawlers request the file by that exact name).
  • Step 3. Go to your server by accessing the file manager or FTP, and open the root folder (normally public_html or httpdocs; if you are unsure, find out yours from your host).
  • Step 4. Upload the "robots.txt" file to the root folder.
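
The steps above can also be sketched in Python, as a way to create the file and check the rules before uploading. This is a minimal sketch, not part of the original instructions: the helper names write_robots_txt and is_allowed are hypothetical, the file is written under the lowercase name robots.txt that crawlers request, and the check relies on Python's standard urllib.robotparser module, whose matching may differ in details from any particular search engine.

```python
from pathlib import Path
from urllib.robotparser import RobotFileParser

# The allow-all rules from Step 2, one directive per line.
ALLOW_ALL_RULES = "User-agent: *\nDisallow:\n"


def write_robots_txt(folder: str, rules: str = ALLOW_ALL_RULES) -> Path:
    """Write the rules to <folder>/robots.txt and return the file path.

    Hypothetical helper: it only creates the local file. Uploading it to
    the server's root folder (Steps 3 and 4) is still done by hand.
    """
    path = Path(folder) / "robots.txt"
    path.write_text(rules, encoding="utf-8")
    return path


def is_allowed(rules: str, user_agent: str, url_path: str) -> bool:
    """Return True if the given rules let user_agent fetch url_path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url_path)
```

For example, write_robots_txt(".") creates the file in the current folder, and is_allowed(ALLOW_ALL_RULES, "AnyBot", "/any/page.html") should confirm that the allow-all rules block nothing ("AnyBot" is a made-up crawler name).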

Your Robots.txt file is now set up. Note that the commands above allow all search engine robots to crawl the entire site without any restriction. If you would like to selectively block certain files or folders from being crawled, use the commands below:

  • Exclude a file from an individual search engine

    User-agent: Googlebot
    Disallow: /thepathtoyourfile.html
    Replace "Googlebot" with the user-agent name of the search engine robot you want to target, and replace "/thepathtoyourfile.html" with the actual path to your file. To block more than one file, repeat the Disallow line once for each file.
    Ex: Disallow: /file1.html
        Disallow: /file2.html

  • Exclude a section of your site from all spiders and bots

    User-agent: *
    Disallow: /1/2/dir-to-be-blocked/
    Replace "/1/2/dir-to-be-blocked/" with the actual path to the directory that is to be blocked.

  • Allow all spiders to index everything

    User-agent: *
    Disallow:
    OR
    Leave the Robots.txt file blank, without any commands.

  • Allow no spiders to index any part of your site

    User-agent: *
    Disallow: /
    This tells every spider that obeys Robots.txt not to crawl any part of your site.
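
The four rule sets above can be tried out locally before they go live. The sketch below is only an illustration under stated assumptions: it uses Python's standard urllib.robotparser module, whose matching may differ in details from a given search engine's own crawler; the helper can_fetch and the constant names are hypothetical; "Googlebot" stands in for whichever robot you target; and "Bingbot" and "AnyBot" are just sample names used for the checks.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(rules: str, user_agent: str, url_path: str) -> bool:
    """Parse robots.txt rules and report whether user_agent may fetch url_path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url_path)


# Exclude two files from one search engine robot only.
BLOCK_FILES_FOR_ONE_BOT = """\
User-agent: Googlebot
Disallow: /file1.html
Disallow: /file2.html
"""

# Exclude one directory from all spiders and bots.
BLOCK_DIRECTORY = """\
User-agent: *
Disallow: /1/2/dir-to-be-blocked/
"""

# Allow all spiders everywhere (empty Disallow value).
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# Allow no spiders anywhere.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    checks = [
        (BLOCK_FILES_FOR_ONE_BOT, "Googlebot", "/file1.html"),
        (BLOCK_FILES_FOR_ONE_BOT, "Bingbot", "/file1.html"),
        (BLOCK_DIRECTORY, "AnyBot", "/1/2/dir-to-be-blocked/page.html"),
        (BLOCK_DIRECTORY, "AnyBot", "/1/2/other/page.html"),
        (ALLOW_ALL, "AnyBot", "/any/page.html"),
        (BLOCK_ALL, "AnyBot", "/any/page.html"),
    ]
    for rules, agent, path in checks:
        verdict = "allowed" if can_fetch(rules, agent, path) else "blocked"
        print(f"{agent} -> {path}: {verdict}")
```

Running the script prints one line per check, which makes it easy to see that a rule written for one robot does not affect the others, and that a directory rule covers only paths that start with that directory.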
