Recently I wrote about BigBlueBall slipping in search results. In that article I mentioned how I was using the robots.txt file to restrict what Google’s spider would index. Then it occurred to me — could canonical issues also trigger a duplicate content penalty?
A little explanation is in order for the uninitiated. Canonical issues, in this context, refer to www.bigblueball.com and bigblueball.com (no www.) both serving the same page. People are smart enough to figure out that they are one and the same and not really duplicate content, but a search engine cannot make that assumption. Why? Because the two hostnames could actually point to different content.
So what can you do? I did a little searching and found a great resource on doing 301 redirects. Basically, it involves using the .htaccess file (assuming you’re running on an Apache web server with mod_rewrite available) with the following code:
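# Let mod_rewrite follow symbolic links (required for rewriting in .htaccess)
Options +FollowSymLinks
RewriteEngine on
# Match requests for the bare domain; the dot is escaped, [NC] ignores case
RewriteCond %{HTTP_HOST} ^yoursite\.com [NC]
# Permanently redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [L,R=301]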
If a search engine (or a visitor) requests the site without the “www.”, the server answers with a 301 status code, which tells it that the page has moved permanently to the new URL (with www.). That eliminates any possibility of a duplicate content penalty (from that cause, at least). It also ensures that any non-www links people have made to the site will continue to work just fine.
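To make the logic of that rule a bit more concrete, here is a minimal Python sketch of the decision it makes. This is only a model for illustration, not how Apache works internally: the function name canonical_redirect is hypothetical, yoursite.com is the same placeholder used in the rule, and the path is assumed to include its leading slash.

```python
import re

# Same placeholder host as in the .htaccess rule; IGNORECASE mirrors [NC].
BARE_HOST = re.compile(r"^yoursite\.com", re.IGNORECASE)
CANONICAL_BASE = "http://www.yoursite.com"


def canonical_redirect(host, path):
    """Return (status, location) if the request should be redirected, else None.

    `path` is assumed to start with "/", e.g. "/forums/index.php".
    """
    if BARE_HOST.match(host):
        # Bare domain requested: send a permanent redirect to the www host.
        return 301, CANONICAL_BASE + path
    # Any other host (including www.yoursite.com) is served as usual.
    return None


print(canonical_redirect("yoursite.com", "/forums/index.php"))
print(canonical_redirect("www.yoursite.com", "/forums/index.php"))
```

The first call models a request to the bare domain and yields a 301 with the www address; the second models a request that is already on the www host, so no redirect applies.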
Oh, and an update on search relevance: in my original post I mentioned that when searching on “msn messenger”, BigBlueBall was listed in the 318th position. I’m happy to report that it’s moved up to 189. Still not great, but better.
Jeff says
Barbara, Microsoft does have a lot of locale-specific MSN Messenger pages, but within the first ten results there are several non-MSN websites, including Mess with Messenger, e-Messenger and MsgPlus!.
Also, looking further down, there are a lot of sites that lack content and value, but are still ranked higher than BigBlueBall.
Yes, the keyword “msn messenger” is not narrowly targeted and it will be difficult to get to the top, but I’d like to at least be above the crappy sites.
Iwo says
I am using this but when I type in http://techroam.com it doesn’t go to http://www.techroam.com … is it only for search engines or am I doing something wrong?
Jeff Hester says
Iwo, you have to be running Apache (typically on Linux), and the mod_rewrite module needs to be available. If you’re not sure about any of these things, contact your web host and they should be able to help you out.
Iwo says
Thanks for the reply. I am running Apache and I have mod_rewrite on. I made a stupid typo in the .htaccess file and it didn’t work. I just rewrote it and everything works great. Thank you for the article.
Iwo says
One more question: I am considering putting a robots.txt into my root and I am thinking about disallowing /category/ since it shows the same thing as the normal blog. Tags also show the same posts, and the same goes for the archives, but I think I read somewhere that archives should not be disallowed. What do you think about all of this? Is it a good idea to write a disallow for categories in robots.txt?
Jeff Hester says
Iwo, should you block /category/ or /tags/? I cannot say conclusively, but it does make sense. As long as there’s a way for the search engines to crawl your site and find all of the articles, you should be fine.
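For anyone who wants to try rules like these before publishing them, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the is_crawlable helper and the www.example.com URLs are all assumptions made up for illustration; only the /category/ and /tags/ paths come from the discussion above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents blocking the two listing paths discussed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /tags/
"""


def is_crawlable(url, agent="*"):
    """Return True if the rules in ROBOTS_TXT let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# An individual article should stay reachable; a category listing should not.
print(is_crawlable("http://www.example.com/2007/05/some-post/"))
print(is_crawlable("http://www.example.com/category/news/"))
```

The point of the check is the one made above: the category and tag listings can be blocked as long as the individual articles themselves remain crawlable.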
Iwo says
Thanks for the quick response.
Stephanie Sullivan says
BTW, you can also log into Google’s Web Master tools and check a box that says you want to use www or not — or have both point to the same thing (that’s from memory — so wording may be different). 😉