Robots.txt for WordPress Blog



As you may already be aware, Search Engine Optimization (SEO) is never-ending work for a blogger. There is always something that we can, and should, optimize so that our blog gets more exposure in search engines. One important part of SEO is telling search engines what to do with our blog content when they crawl it. Sure, search engines still know what to do with our blog even if we do not tell them, but the result will not be optimal. We can improve the result by telling search engines exactly which content they are allowed to crawl and which they are not. Having a robots.txt file on our blog is the solution for this.

How Does Robots.txt Work?

We can tell search engines clearly how to treat our blog content by using a robots.txt file: which content they should crawl and index, and which content they should not. In short, robots.txt tells search engine robots what is allowed and what is not. The robots.txt syntax is quite simple. There are only three directives that we use to optimize it.

Here are the three directives that we use in robots.txt:

User-agent: the robot that the following rules apply to
Disallow: the URL that we want to block
Allow: the URL that we want to allow

Here is an example of how we can use the directives above in robots.txt, for better understanding:

# disable google image bot
User-agent: Googlebot-Image
Disallow: /

# allow adsense bot on entire site
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

As you can see, the first example tells the Google Image bot that it is not allowed to crawl the site. In the second example, robots.txt tells the Google AdSense bot that it may crawl the entire site.
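To check how a parser reads rules like these, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name may_crawl, the example.com URL, and the SomeOtherBot agent are made up for illustration. The rules are a simplified copy of the example: the trailing * on the AdSense agent name and the Allow: /* line are left out, because as far as I know Python's parser compares agent names and paths literally and does not interpret wildcards.

```python
from urllib.robotparser import RobotFileParser

# Simplified copy of the example rules (assumption: wildcards removed,
# since urllib.robotparser treats "*" inside names and paths literally).
RULES = """\
# disable google image bot
User-agent: Googlebot-Image
Disallow: /

# allow adsense bot on entire site
User-agent: Mediapartners-Google
Disallow:
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def may_crawl(agent, url="http://example.com/some-post/"):
    """Return True if the given user agent may fetch the URL (hypothetical helper)."""
    return parser.can_fetch(agent, url)


for agent in ("Googlebot-Image", "Mediapartners-Google", "SomeOtherBot"):
    print(agent, "->", "allowed" if may_crawl(agent) else "blocked")
```

With these rules, Googlebot-Image is reported as blocked, while the AdSense bot and any bot that has no matching section are reported as allowed.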

From the example above, we should now understand how robots.txt works. Please note that robots.txt is only needed if we want to keep some of our content away from search engines. If we want to allow everything, we do not need a robots.txt file at all.

Optimize Robots.txt for WordPress Blog

In my opinion, the most important reason to use robots.txt is to avoid duplicate content. Duplicate content is really bad for a blog's SEO; it can even get a blog penalized by search engines. That's why we need to optimize the content of our blog's robots.txt file so that there is no room for duplicate content. Another use of robots.txt is for security reasons. We can hide some blog directories from search results by disallowing search engines from indexing them. Keep in mind that this only asks well-behaved crawlers to stay away; the directories themselves remain reachable by anyone who knows the URL.

Here is the robots.txt file I use on this blog:

# Disallow some paths for all search engine bots
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/upgrade
Disallow: /wp-content/backup-*
Disallow: /wp-content/themes
Disallow: /images/
Disallow: /feed
Disallow: /aff/
Disallow: /category/*/*
Disallow: */feed
Disallow: /search/*/*
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads

# Allow Google Image in entire site
User-agent: Googlebot-Image
Disallow:
Allow: /*

# Allow Google AdSense in entire site
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

# Disable Internet Archiver Wayback Machine
User-agent: ia_archiver
Disallow: /

# Disable digg mirror
User-agent: duggmirror
Disallow: /

Sitemap: http://letupdate.com/sitemap.xml.gz

I use the file above based on my own needs: to avoid duplicate content, to make Google hacking harder, and to help search engines understand this blog better. It may need some changes in the future for better results. And of course, you can customize it freely based on your own needs.
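Several of the rules above use the wildcards * and $, so it helps to see what they actually match. Below is a rough sketch in Python, not the matcher of any particular search engine. The names pattern_to_regex, is_allowed, and RULES are invented for this example, and RULES holds only a few lines from the file above. The precedence used here (the longest matching pattern wins, and Allow wins a tie) is an assumption based on how Google documents its behavior and on RFC 9309; other crawlers may decide differently.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Decide whether a path may be crawled under a list of (directive, pattern) rules.

    Assumption: the longest matching pattern wins and Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern restricts nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# A subset of the "User-agent: *" rules from the file above.
RULES = [
    ("Disallow", "/wp-admin"),
    ("Disallow", "/wp-content/plugins"),
    ("Disallow", "/category/*/*"),
    ("Disallow", "/*.js$"),
    ("Disallow", "/*.css$"),
    ("Disallow", "/*?"),
    ("Allow", "/wp-content/uploads"),
]

for path in ("/hello-world/", "/wp-admin/options.php", "/?p=123",
             "/category/seo/page/2/", "/wp-content/uploads/script.js"):
    print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

Under these assumptions, a normal post URL stays allowed, while the admin path, the query-string URL, and the paged category URL are blocked. The script under /wp-content/uploads is allowed even though it ends in .js, because the Allow pattern is longer than /*.js$.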

Conclusions

As bloggers, we should be aware of robots.txt and what we can do with it to optimize our blog for search engines (SEO). Duplicate content is one of the many things that must be avoided for SEO, and that is exactly what we do by optimizing the robots.txt file. There may be other ways to avoid duplicate content, but I especially suggest using a robots.txt file for it. Robots.txt is also useful for security purposes: we can use it to hide some of our blog directories from search results by disallowing search engines from indexing them. Robots.txt is also useful for making search engines index our blog according to our needs, and it can help against Google hacking by hiding some files from the eyes of search engines (Google).

NOTE: I choose to optimize my blog's SEO with a robots.txt file even though I already use the “All in One SEO Pack” WordPress plugin, which already has tools (a virtual robots.txt) to restrict (no-index) category, archive, and tag archive pages to avoid duplicate content. I think it is better to hard-code the instructions than to rely on virtual instructions like those in the “All in One SEO Pack” WordPress plugin. A robots.txt file also gives more flexibility in what to restrict.

That’s all for this update. Wait for the next one!


Author: dana (305)

Dana is the founder of LetUpdate dot Com, a blog where he shares a lot about blogging and a little about the internet and IT. You will find plenty of shared knowledge about blogging here, and sometimes about the internet and IT as well. You can subscribe to the LetUpdate dot Com feed or follow @danalingga on Twitter to stay updated.
