What is a robots.txt file?
The robots.txt file tells search engine spiders which pages on your website should be indexed and which should be ignored. You can use a simple text editor to create a robots.txt file. The content of a robots.txt file consists of so-called “records”. A record contains the instructions for a specific search engine. Each record consists of two fields: the User-agent line and one or more Disallow lines.
For example:
User-agent: googlebot
Disallow: /cgi-bin/
Disallow: /uploads/
This robots.txt file would allow the “googlebot”, which is the search engine spider of Google, to retrieve every page from your website except for files from the “cgi-bin” and “uploads” directories. All files in these two directories will be ignored by googlebot.
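To see how a spider would read the record above, it can be fed to a parser. The following is a minimal sketch using Python's standard `urllib.robotparser` module. The helper name `is_allowed` and the sample paths are made up for illustration, and Google's own parser may differ in details.

```python
from urllib.robotparser import RobotFileParser

# The example record from above, as it would appear in robots.txt.
RULES = """\
User-agent: googlebot
Disallow: /cgi-bin/
Disallow: /uploads/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(agent, path):
    """Return True if the given spider may retrieve the given path."""
    return parser.can_fetch(agent, path)


print(is_allowed("googlebot", "/index.html"))         # True
print(is_allowed("googlebot", "/cgi-bin/script.pl"))  # False
print(is_allowed("otherbot", "/cgi-bin/script.pl"))   # True
```

The last call returns True because the file contains no record for any spider other than googlebot, so other spiders are not restricted by it.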
Is Google testing new commands?
Webmasters have found out that Google seems to be experimenting with a Noindex command for the robots.txt file. It basically seems to do the same as the Disallow command, so it’s not clear why Google is using this command. Other commands that might be tested by Google are Noarchive and Nofollow. However, none of these commands is official yet.
Can this affect your rankings on Google?
If you accidentally use the wrong commands then you might tell Google to go away although you want them to index your pages. For that reason, it is important that you check the content of your robots.txt file.
Check your robots.txt file
Open your web browser and enter www.yourdomain.com/robots.txt to view the contents of your robots.txt file. Here are the most important tips for a correct robots.txt file:
1. Don’t change the order of the commands. Start with the User-agent line and then add the Disallow commands:
User-agent: *
Disallow: /cgi-bin/
Disallow: /uploads/
2. There are only two official commands for the robots.txt file: User-agent and Disallow. Do not use more commands than these.
3. Be sure to use the right case. The file names on your server are case sensitive. If the name of your directory is “Support”, don’t write “support” in the robots.txt file. You can find user agent names in your log files by checking for requests to robots.txt. Usually, all search engine spiders should be given the same rights. To do that, use User-agent: * in your robots.txt file.
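In the following file, the first Disallow line appears before any User-agent line, which breaks the order described in tip 1. Only the two Disallow lines after User-agent: * belong to a record:
Disallow: /cgi-bin/
User-agent: *
Disallow: /images/
Disallow: /support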
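The first two tips above can also be checked mechanically. The following is a minimal sketch of such a check, not an official validator. The function name `lint_robots` and its messages are hypothetical, and it assumes that `#` starts a comment and that blank lines can be skipped.

```python
def lint_robots(text):
    """Return (line_number, message) pairs for lines that break the tips."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop comments and surrounding whitespace; skip empty lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        field, separator, _value = line.partition(":")
        field = field.strip().lower()
        if not separator:
            problems.append((number, "missing ':' separator"))
        elif field == "user-agent":
            seen_user_agent = True
        elif field == "disallow":
            # Tip 1: a Disallow line must follow a User-agent line.
            if not seen_user_agent:
                problems.append((number, "Disallow before User-agent"))
        else:
            # Tip 2: only User-agent and Disallow are official commands.
            problems.append((number, "unofficial command: " + field))
    return problems


sample = "Disallow: /cgi-bin/\nUser-agent: *\nDisallow: /images/\nDisallow: /support"
print(lint_robots(sample))  # [(1, 'Disallow before User-agent')]
```

The check for the right case (tip 3) is left out on purpose, because it needs the real directory names on your server to compare against.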
Is something wrong if you don’t have a robots.txt file?
If your website doesn’t have a robots.txt file (you can check this by entering www.yourdomain.com/robots.txt in your web browser) then search engines will automatically index everything they can find on your site. Checking your robots.txt file is important if you want search engines to index your web pages. However, indexing alone is not enough. You also have to make sure that search engines find what they’re looking for when they index your pages.
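The browser check described above can also be scripted. The following is a rough sketch using only Python's standard library. The function names `robots_txt_url` and `has_robots_txt` are made up for illustration, and the code assumes plain HTTP when no scheme is given and treats an HTTP 404 response as "no robots.txt file".

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_txt_url(site):
    """Build the robots.txt URL for a site such as 'www.yourdomain.com'."""
    if "://" not in site:
        site = "http://" + site  # assumption: plain HTTP if no scheme given
    parts = urlsplit(site)
    # robots.txt always lives at the root of the host, whatever page was given.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def has_robots_txt(site, timeout=10):
    """Return True if the site serves a robots.txt file, False on HTTP 404."""
    try:
        with urlopen(robots_txt_url(site), timeout=timeout) as response:
            return response.status == 200
    except HTTPError as error:
        if error.code == 404:
            return False
        raise  # other errors (for example 500) say nothing about the file


print(robots_txt_url("www.yourdomain.com"))  # http://www.yourdomain.com/robots.txt
```

Calling `has_robots_txt("www.yourdomain.com")` with your own domain performs the actual request, so it needs a working network connection.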







