Robots.txt is another fancy, techy word that may confuse a lot of people new to the digital marketing game. Never fear. We’re going to make this very simple. Robots.txt (the file behind what is otherwise known as the robots exclusion standard or robots exclusion protocol) is a plain text file that can be placed in the top-level directory of your web server. This file can:

  • Instruct search engine robots how to crawl files
  • Show search engine robots how to index files and directories on your domain
  • Tell specific search engines to ignore or pay attention to a directory
  • Block web spiders from accessing certain content on your website
  • Point to your XML sitemap, redirecting search engine robots that way
  • Ensure that important elements like JavaScript and CSS aren’t blocked on your site

Overall, a robots.txt file allows you to give search engine robots specific instructions about what to do with content on your website (what should be ignored, processed, scanned, etc.).

What Does It Look Like?

Accessing your site’s robots.txt is very simple. All you have to do is type in your site’s URL followed by /robots.txt. For example:

http://www.example.com/robots.txt

This works for any website that has a robots.txt file. The file is publicly available, so anyone can view it for any site just by entering the URL as shown above.
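If you want to work out that address in a script rather than by hand, here is a minimal Python sketch. The function name robots_txt_url is made up for illustration, and the example page URL is hypothetical; the only rule it relies on is the one above, that the file sits at /robots.txt on the site's host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host, and replace everything after the host
    # (path, query string, fragment) with /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/blog/post?id=7"))
# http://www.example.com/robots.txt
```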

Common Instructions You Might Use Or See

In many cases, a robots.txt file will begin with something like this:

  • User-agent: *

What that signifies is that this section applies to all search engine robots. Past that, here are some common instructions you might use or see in a robots.txt file.

  • User-agent: *
  • Disallow:

Allows all robots to have total access.

  • User-agent: *
  • Disallow: /

Excludes all robots from visiting any pages on the site/server.

  • User-agent: *
  • Disallow: /junk/

Excludes all robots from one specific part of the server: the directory named between the slashes (here, /junk/).

  • User-agent: Googlebot
  • Disallow:
  • User-agent: *
  • Disallow: /

Allows a single specified robot (the one named after User-agent) and excludes all others.

  • User-agent: Badbot76
  • Disallow: /

Excludes one specific robot (the one named after User-agent) and leaves all others unaffected.
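To see how a well-behaved robot might read the rules above, you can feed them to the robots.txt parser that ships with Python's standard library (urllib.robotparser). This is only a sketch: the helper parse_rules and the robot name SomeOtherBot are made up for illustration, Googlebot is used because that is the token Google's main crawler identifies itself with, and real search engines may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Load robots.txt rules from a string (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Rule set 1: keep every robot out of /junk/.
block_junk = parse_rules("User-agent: *\nDisallow: /junk/\n")

# Rule set 2: let Googlebot in, keep everyone else out.
only_googlebot = parse_rules(
    "User-agent: Googlebot\nDisallow:\n\nUser-agent: *\nDisallow: /\n"
)

# Rule set 3: keep out one specific robot.
block_badbot = parse_rules("User-agent: Badbot76\nDisallow: /\n")

site = "http://www.example.com"
print(block_junk.can_fetch("SomeOtherBot", site + "/junk/file.html"))  # False
print(block_junk.can_fetch("SomeOtherBot", site + "/about.html"))      # True
print(only_googlebot.can_fetch("Googlebot", site + "/about.html"))     # True
print(only_googlebot.can_fetch("SomeOtherBot", site + "/about.html"))  # False
print(block_badbot.can_fetch("Badbot76", site + "/about.html"))        # False
print(block_badbot.can_fetch("Googlebot", site + "/about.html"))       # True
```

Keep in mind that these rules are requests, not locks: a robot that chooses to ignore robots.txt can still request the pages.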

What You Should Do

If you have a website, it’s a good idea to take a look at your robots.txt file and make sure it is set up correctly. If there is nothing you want to disallow or hide from search engines right now, you don’t have to take any action; but it is nice to be aware that you can make such alterations in the future. Something you definitely want to add to your robots.txt right out of the gate, though, is a Sitemap line that points search engine robots to your site’s XML sitemap.
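A sitemap instruction is a single line that starts with Sitemap: and gives the full address of the sitemap. The sketch below assumes a hypothetical sitemap at http://www.example.com/sitemap.xml (your own file name and location may differ) and uses Python's standard-library parser, in Python 3.8 or later, to confirm the line is picked up.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks nothing and points to a sitemap.
# The sitemap address is an assumption made for this example.
rules = """\
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# site_maps() returns a list of sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
# ['http://www.example.com/sitemap.xml']
```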

Overall, when it comes to robots.txt files, first and foremost make sure your site has one. If it doesn’t, you can set one up in the top-level directory of your web server. If your robots.txt is already up and running, use it to make sure search engine robots are seeing what you want them to see on your site, and not what you don’t want them to see!
