One of the most boring topics in technical
SEO is robots.txt. Rarely is there an interesting problem needing to be solved
in the file, and most errors come from not understanding the directives or from
typos. The general purpose of a robots.txt file is simply to suggest to
crawlers where they can and cannot go.
Basic parts of the robots.txt file
- User-agent — specifies which robot.
- Disallow — suggests the robots not crawl this area.
- Allow — allows robots to crawl this area.
- Crawl-delay — tells robots to wait a certain number of seconds before continuing the crawl.
- Sitemap — specifies the sitemap location.
- Noindex — an unofficial directive that asked Google to remove pages from the index. Google stopped supporting it in robots.txt in September 2019.
- # — comments out a line so it will not be read.
- * — matches any text.
- $ — the URL must end here.
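To make the two wildcard characters concrete, here is a minimal Python sketch. It assumes the commonly documented behavior: a rule is a prefix match on the URL path, `*` stands for any run of characters, and a trailing `$` pins the match to the end of the URL. The names `rule_to_regex` and `rule_matches` are hypothetical helpers written for this illustration, not part of any crawler or library.

```python
import re


def rule_to_regex(rule_path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path rule into a compiled regular expression."""
    # A trailing "$" means the URL must end exactly where the rule ends.
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Every "*" becomes ".*"; all other characters are matched literally.
    body = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rule_matches(rule_path: str, url_path: str) -> bool:
    """Return True if url_path falls under the given rule."""
    return rule_to_regex(rule_path).match(url_path) is not None


# Illustrative paths only.
print(rule_matches("/private", "/private/page.html"))         # plain prefix match
print(rule_matches("/*.pdf$", "/files/report.pdf"))           # ends in .pdf
print(rule_matches("/*.pdf$", "/files/report.pdf?download"))  # does not end in .pdf
```

This sketch only covers whether a single rule matches a path. How a crawler chooses between competing Allow and Disallow rules is a separate question and varies by crawler.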
Other things you should know about robots.txt
- Robots.txt must be in the main folder, i.e., domain.com/robots.txt.
- Each subdomain needs its own robots.txt — www.domain.com/robots.txt is not the same as domain.com/robots.txt.
- Crawlers can ignore robots.txt.
- URLs and the robots.txt file are case-sensitive.
- Disallow simply suggests crawlers not go to a location. Many people use this to try to de-index pages, but it won’t work: if someone links to a page externally, it can still be shown in the SERPs.
- Crawl-delay is not honored by Google, but you can manage crawl settings in Google Search Console.
- Allow CSS and JS, according to Google’s Gary Illyes:
    User-Agent: Googlebot
    Allow: .js
    Allow: .css
- Validate your robots.txt file in Google Search Console and Bing Webmaster Tools.
- Noindex in robots.txt was reported to work by Eric Enge of Stone Temple Consulting, but Google Webmaster Trends Analyst John Mueller recommended against using it, and Google stopped supporting it in September 2019. It’s better to noindex via meta robots or the X-Robots-Tag HTTP header.
- Don’t block crawling to avoid duplicate content. Read more about how Google consolidates signals around duplicate content.
- Don’t disallow pages which are redirected. The spiders won’t be able to follow the redirect.
- Disallowing pages prevents previous versions from being shown in archive.org.
- You can search archive.org for older versions of robots.txt — just type in the URL, i.e., domain.com/robots.txt.
- Google enforces a maximum robots.txt size of 500 KB; content beyond that limit is ignored.
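A few of the points above (one robots.txt per host, case-sensitive paths, Crawl-delay and Sitemap lines) can be checked with Python's standard `urllib.robotparser` module. This is only a sketch: `robots_url_for`, the `example.com` URLs, the `ExampleBot` user agent, and the sample rules are assumptions made up for the illustration. The standard-library parser is also not Google's parser, so its results may differ from what Googlebot does, especially for wildcard rules.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL for the scheme and host of page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical rules, used only for this illustration.
SAMPLE_RULES = """\
# Lines starting with "#" are comments.
User-agent: *
Disallow: /Private/
Crawl-delay: 10
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE_RULES.splitlines())

# Paths are compared case-sensitively: only the capitalized folder is blocked.
upper_ok = parser.can_fetch("ExampleBot", "https://www.example.com/Private/a.html")
lower_ok = parser.can_fetch("ExampleBot", "https://www.example.com/private/a.html")

print(robots_url_for("https://www.example.com/blog/post?id=1"))
print(robots_url_for("https://example.com/blog/post?id=1"))
print(upper_ok, lower_ok)
print(parser.crawl_delay("ExampleBot"), parser.site_maps())
```

The two `robots_url_for` calls return different URLs because `www.example.com` and `example.com` are different hosts, each needing its own file.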
Now for the fun stuff!
Many companies have done creative things with
their robots.txt files. Take a look at the following examples!
ASCII art and job openings
For example:
Nike.com has a nice take on their slogan inside their robots.txt, “just crawl it,” and they also included their logo.
Seer also uses art and has a recruitment
message.
TripAdvisor has a recruitment message right
in the robots.txt file.
Fun robots
Yelp likes to remind the robots that Asimov’s
Three Laws are in effect.
As does last.fm.
According to YouTube, we already lost the war
to robots.
Page One Power has a nice “Star Wars”
reference in their robots.txt.
Google wants to make sure Larry Page and
Sergey Brin are safe from Terminators in their killer-robots.txt file.
Who can ignore the front page of the
internet? Reddit references Bender from “Futurama” and Gort from “The Day The
Earth Stood Still.”
Humans.txt?
Humans.txt describes itself as “an
initiative for knowing the people behind a website. It’s a TXT file that
contains information about the different people who have contributed to
building the website.” When I checked a few domains, I was surprised to find it
more often than I expected. Check out https://www.google.com/humans.txt.
Just using robots.txt to mess with people at this point
One of my favorite examples is from Oliver
Mason, who disallows everything and bids his blog farewell, only to then allow
every individual file again farther down in the file. As he comments at the
bottom, he knows this is a bad idea. (Don’t just read the robots.txt here,
seriously, go read this guy’s whole website.)
On my personal website, I have a robots.txt
file to mess with people as well. The file validates fine, even though at first
glance it would look like I’m blocking all crawlers.
The reason is that I saved the file with a
BOM (byte order mark) character at the beginning, which makes my first line
invalid — as you can see when I go to verify in Google Search Console. With the
first line invalid, the Disallow has no User-Agent reference, so it is also
invalid.
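The BOM effect described above can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the file contents are made up, and `first_directive` is a hypothetical helper that reads the directive name the way a simple line-based parser might. It shows why a parser that does not strip the BOM fails to recognize the first line as a User-agent line. It does not claim to model how any particular search engine handles the file.

```python
import codecs


def first_directive(raw: bytes, encoding: str = "utf-8") -> str:
    """Return the lowercased directive name on the first non-empty line."""
    for line in raw.decode(encoding).splitlines():
        # Drop any trailing comment, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if line:
            return line.split(":", 1)[0].strip().lower()
    return ""


# Hypothetical file contents: a UTF-8 BOM followed by a block-everything rule.
with_bom = codecs.BOM_UTF8 + b"User-agent: *\nDisallow: /\n"

has_bom = with_bom.startswith(codecs.BOM_UTF8)
naive_name = first_directive(with_bom)               # BOM stays attached to the name
clean_name = first_directive(with_bom, "utf-8-sig")  # BOM removed while decoding

print(has_bom, repr(naive_name), repr(clean_name))
```

Decoding with `utf-8-sig` is one way to tolerate a stray BOM when reading such a file yourself; saving the file without a BOM avoids the question entirely.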
---------------------------------------------------------------------------------------------------------------------------------------------------------------
This article is originally copyrighted by Search Engine Land.