Robots.txt – An Introduction Part III

What do Robots do?

Ans :: Robots mainly perform the following types of tasks.

  • Site Indexing - The robot scans the documents on a new website it finds and stores a copy of them on the search engine's servers.
  • Code Validation - The robot compares the website's code against W3C standards and grades it for accuracy.
  • Link Checks - The robot traces all the links it can find (incoming and outgoing) for indexed websites and calculates the site's grading factors, such as authority and relevance.
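
As a rough illustration of the link-tracing task, here is a minimal Python sketch that collects the links on a single page and separates those that stay on the site from those that leave it. The sample HTML, the URLs and the names LinkCollector and split_links are made up for this example. Real search engine robots are far more elaborate, and finding incoming links requires crawling other websites, which this sketch does not attempt.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they were found on.
                self.links.append(urljoin(self.base_url, value))


def split_links(base_url, html):
    """Return (internal, external) link lists for one page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    site = urlsplit(base_url).netloc
    internal = [u for u in collector.links if urlsplit(u).netloc == site]
    external = [u for u in collector.links if urlsplit(u).netloc != site]
    return internal, external


# Hypothetical page content, used only for illustration.
SAMPLE_HTML = (
    '<a href="/about.html">About</a> '
    '<a href="https://other.example.org/">Partner</a>'
)

internal, external = split_links("https://www.example.com/index.html", SAMPLE_HTML)
print(internal)  # ['https://www.example.com/about.html']
print(external)  # ['https://other.example.org/']
```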

Not all search engine robots are the same. Some advanced ones, such as Google's, are known to do more complex tasks, such as categorizing websites and analyzing their search engine metrics and popularity ratio, but in general all robots perform the tasks above.

What does a Robots.txt file do?

Ans :: The Robots.txt file gives commands to the robots visiting the website, to help them index it and collect relevant information about it. It is like the helpdesk at an event, which gives visitors information and guidance on how to reach the venue, the important places, the time schedule, the map and so on.

The commands in the robots.txt file are completely configurable by the webmaster.

Using the right commands, a webmaster can decide which search engines are allowed into the website, what information is available to them and which documents are off limits. The file can even pass along information such as how often pages are added to the website and how often robots should visit them. Keep in mind that following these commands is voluntary: well-behaved robots respect them, but the file cannot force a robot to comply.
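
To make this more concrete, here is a small Python sketch that reads a sample robots.txt file with the standard library's urllib.robotparser module and asks what a given robot may do. The file content, the robot names ExampleBot and BadBot and the example.com URLs are all made up for illustration. Note also that Crawl-delay is not honoured by every search engine, and that site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paths, robot names and URLs are invented.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10

User-agent: BadBot
Disallow: /

Sitemap: https://www.example.com/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt text and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS_TXT)

# Any robot may read public pages, but not the /private/ folder.
print(rules.can_fetch("ExampleBot", "https://www.example.com/index.html"))           # True
print(rules.can_fetch("ExampleBot", "https://www.example.com/private/report.html"))  # False

# BadBot is shut out of the whole website.
print(rules.can_fetch("BadBot", "https://www.example.com/index.html"))  # False

# Requested pause between visits (in seconds) and the advertised sitemap.
print(rules.crawl_delay("ExampleBot"))  # 10
print(rules.site_maps())                # ['https://www.example.com/sitemap.xml']
```

This is the same check a polite robot would run before requesting a page.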

Where is the Robots.txt file located?

Ans :: The Robots.txt file is located in the root folder of your website. This is most often the public_html or the httpdocs folder. The root folder is the topmost directory on the website that is accessible to the public. It is critical to place the Robots.txt file in the root folder. Robots look for it only there, so a file placed anywhere else has no effect.
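
Because the file always sits at the root, its address can be worked out from the address of any page on the website. The short Python sketch below shows one way to do that. The function name robots_txt_url and the example.com address are assumptions made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Build the robots.txt address for the website that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/blog/2010/post.html?ref=1"))
# https://www.example.com/robots.txt
```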
