
As you may already know, search engines crawl your website using programs called spiders or robots, which scan its pages and follow its links. As part of an SEO project, experts will often advise you to block certain pages so that search engines do not crawl them. You can do this by adding a robots.txt file to the root of your site.
A robots.txt file specifies which pages a search engine spider can or can't crawl, and also which search engines the rules apply to. For example, the file below tells every spider that it may crawl all of the pages except the Contact Us page:
User-agent: *
Disallow: /contactus.aspx
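If you want to check how a well-behaved spider would read rules like these, Python's standard urllib.robotparser module can parse them for you. The snippet below is only a rough sketch: the ExampleBot group, the www.example.com domain and the is_allowed helper are made up for illustration and are not part of the file above.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, plus a hypothetical "ExampleBot"
# group (an assumption for illustration) that is blocked from the whole site.
RULES = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /contactus.aspx
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())  # parse the text directly, no network access


def is_allowed(path, agent="Googlebot"):
    """Return True if `agent` may crawl `path` under RULES."""
    # www.example.com is a placeholder domain, not a real site setting.
    return parser.can_fetch(agent, "https://www.example.com" + path)


print(is_allowed("/contactus.aspx"))            # False: blocked for every spider
print(is_allowed("/index.aspx"))                # True: not listed, so crawlable
print(is_allowed("/index.aspx", "ExampleBot"))  # False: this bot is blocked everywhere
```

This only tells you what a compliant robot should do. It cannot tell you whether a particular robot actually obeys the file.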
This is similar in purpose to adding robots meta tags such as noindex or nofollow to a page's source code. Both limit what search engines do with a page, but in different ways: robots.txt asks spiders not to crawl the page at all, whereas the meta tags tell a spider that has already fetched the page not to index it or not to follow its links.
There are two important points to note about robots.txt. Firstly, the file is only a request: some robots ignore it and carry on crawling, which is why you may also want to use robots meta tags on pages that you really don't want to be found. Secondly, the file is publicly readable, so visitors can see which pages you have hidden from robots and may question why pages that they can view are not available to search engines, although any search engine optimisation specialist would be able to explain this to you.
Please contact us if you would like some more information on the techniques we use when optimising a website.