How to Create and Use a Robots.txt File
Search engine robots normally look for a file called robots.txt before they index your website. The robots.txt file exists specifically to give directions to search engine crawlers (also called robots or spiders). To allow search engines to crawl everything on your site, put the following two lines in your robots.txt file:
User-agent: *
Disallow:
The “*” in the first line means that the directions apply to all search engine robots. The second line, with its empty Disallow value, indicates that nothing is disallowed.
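One way to confirm that these two lines really allow everything is to parse them with Python's standard urllib.robotparser module. This is only a sketch: the bot name ExampleBot, the example.com URLs, and the helper is_allowed are placeholders invented for illustration, and Python's parser is just one interpretation of the rules, not how any particular search engine works.

```python
from urllib.robotparser import RobotFileParser

# The two-line "allow everything" robots.txt discussed above.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # ExampleBot and example.com are made-up names for this sketch.
    for url in ("http://www.example.com/", "http://www.example.com/images/logo.png"):
        print(url, is_allowed(ALLOW_ALL, "ExampleBot", url))
```

Changing the second line to "Disallow: /" and running the same check should report every URL as blocked, which is a quick way to see the difference a single slash makes.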
Once you have created your robots.txt file, you need to upload it to the root directory of your site, so that it can be reached at an address such as http://www.example.com/robots.txt.
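Because crawlers only look for the file at the root of the site, it can help to work out the exact address they will request. The following is a minimal sketch; the function name robots_txt_url and the example.com addresses are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://www.example.com/blog/2010/post.html"))
```

A robots.txt file uploaded into a subdirectory such as /blog/ would not be found at this address, which is why the root directory matters.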
Some Examples of Robots.txt:
Below are a few common examples of how you can use a robots.txt file to control how different search engine robots access your website.
The following example allows all crawlers to access every file except those in your images directory.
User-agent: *
Disallow: /images/
The following example allows all robots to crawl every file except those in the cgi-bin and images directories.
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
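To see which paths rules like these would actually block, the same standard-library parser can be pointed at a list of candidate paths. This is a sketch under assumptions: blocked_paths, ExampleBot, and the sample paths are all hypothetical, and real crawlers may match paths slightly differently from Python's simple prefix comparison.

```python
from urllib.robotparser import RobotFileParser

# The cgi-bin and images rules from the example above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
"""


def blocked_paths(robots_txt, user_agent, paths):
    """Return the subset of paths that user_agent may not fetch (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


if __name__ == "__main__":
    sample = ["/index.html", "/images/logo.png", "/cgi-bin/form.cgi", "/images"]
    print(blocked_paths(RULES, "ExampleBot", sample))
```

Note that each rule is a prefix match: in this parser "/images/logo.png" is blocked, while the bare path "/images" (no trailing slash) is not, because it does not start with "/images/".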
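The "Crawl-delay" directive in the robots.txt file specifies the number of seconds a crawler should wait between requests. It is a non-standard extension that not every crawler honors (Googlebot, for instance, ignores it), so treat the values below as a request rather than a guarantee.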
User-agent: Googlebot
Crawl-delay: 20
User-agent: msnbot
Crawl-delay: 20
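For anyone writing their own crawler, Python's urllib.robotparser can read the Crawl-delay value for a given robot name (the crawl_delay method is available from Python 3.6). The sketch below only shows how a Python script might read the value; it says nothing about how Googlebot or msnbot themselves behave, and delay_for and ExampleBot are names made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The Crawl-delay example from above.
CRAWL_DELAY_RULES = """\
User-agent: Googlebot
Crawl-delay: 20
User-agent: msnbot
Crawl-delay: 20
"""


def delay_for(robots_txt, user_agent):
    """Return the Crawl-delay in seconds for user_agent, or None if none applies."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    for bot in ("msnbot", "ExampleBot"):
        # A polite crawler could pass this value to time.sleep() between requests.
        print(bot, delay_for(CRAWL_DELAY_RULES, bot))
```

A robot that is not named in the file, and for which no "*" group exists, gets no delay at all, so the function returns None for it.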
Bad Robots and Email Harvesters
Below are several robots, crawlers, and spiders that you might want to block.
User-agent: Titan
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: ExtractorPro
User-agent: WebZip
User-agent: larbin
User-agent: b2w/0.1
User-agent: htdig/3.1.5
User-agent: teleport
User-agent: NPBot
User-agent: TurnitinBot
User-agent: dloader(NaverRobot)
User-agent: dloader(Speedy Spider)
User-agent: FunWebProducts
User-agent: WebStripper
User-agent: WebSauger
User-agent: WebCopier
Disallow: /
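A group of User-agent lines blocks nothing until a rule such as "Disallow: /" follows it, and this is easy to get wrong. The sketch below compares a shortened block list with and without that rule, using Python's standard parser. The helper can_crawl and the visitor name ExampleBot are hypothetical, and since robots.txt is only advisory, a badly behaved robot may ignore the file altogether.

```python
from urllib.robotparser import RobotFileParser

# A shortened block list that ends with a rule covering the whole site.
WITH_RULE = """\
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: WebZip
Disallow: /
"""

# The same names with no rule after them.
WITHOUT_RULE = """\
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: WebZip
"""


def can_crawl(robots_txt, user_agent, path="/"):
    """Return True if user_agent may fetch path (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for label, text in (("with rule", WITH_RULE), ("without rule", WITHOUT_RULE)):
        print(label, "EmailSiphon allowed:", can_crawl(text, "EmailSiphon"))
    print("ExampleBot allowed:", can_crawl(WITH_RULE, "ExampleBot", "/page.html"))
```

Robots that are not named in the group, such as the made-up ExampleBot here, remain free to crawl the site.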
Author:
asghar paracha SEO
Flash ActionScript programmer - iPhone Application Programmer