Canonical Tag – A Solution to Duplicate Content

by on February 24, 2009

After years of listening to webmasters’ tales of duplicate content woe, the search engines have united in what may be the biggest announcement since the release of sitemaps.

The Background

The canonical tag lets you specify your preferred version of a URL. If your site has a lot of similar content accessible through multiple URLs, the new tag allows you to tell the search engines which URL to use. It will also ensure that all your PageRank is retained by that URL.

The best examples of where this tag can be used are ecommerce sites and other large CMS sites that produce a lot of dynamic content. Taking an ecommerce site as an example, you may have a URL such as:

http://www.example.com/product.php?item=1

Due to things such as tracking, session IDs or sorting the content in certain ways, you may end up with the same data accessible through a URL similar to:

http://www.example.com/product.php?item=1&category=123

This has now created duplicate content. For a large site this effect is multiplied.
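To make the idea concrete, here is a minimal Python sketch of how the two URLs above collapse to one preferred version. It assumes that only the `item` parameter decides which product is shown and that everything else (category, session IDs, sort order) can be dropped. The names `CONTENT_PARAMS` and `canonical_url` are made up for this illustration and are not part of any search engine's tooling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption for this sketch: only "item" changes the page content.
# Any other query parameter (category, session ID, sorting) is ignored.
CONTENT_PARAMS = {"item"}


def canonical_url(url: str) -> str:
    """Return the URL with only the content-defining query parameters kept."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in CONTENT_PARAMS]
    kept.sort()  # stable ordering so equivalent URLs compare equal
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


variants = [
    "http://www.example.com/product.php?item=1",
    "http://www.example.com/product.php?item=1&category=123",
]

for variant in variants:
    print(variant, "->", canonical_url(variant))
```

Both variants map to http://www.example.com/product.php?item=1, which is the URL you would want the search engines to treat as the master copy.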

Duplicate Content Problems

Although the notion of a duplicate content penalty has been bandied about by many an SEO, it just isn’t true. There is a great post about this topic over at the Google Webmaster Blog:

Demystifying Duplicate Content Penalty

But Google do explain some of the problems caused by duplicate content such as:

1. Search engines don’t know which version of a page to include in or exclude from their index.
2. Search engines don’t know whether to direct the link metrics (trust, authority, anchor text, link juice etc.) to one page or keep them separated across multiple versions. This can result in your PR being spread across multiple copies of the same page.
3. Search engines don’t know which version of your page to rank for your keywords.

Another problem to keep in mind is crawl time. A search bot may spend only enough time on your site to index 100 pages. If 50 of those are duplicate content, that has an adverse effect on how much of your good content gets indexed.

The Solution

The canonical tag goes in the HTML head of a web page, in the same place you would find the title element and meta description. From our example above, pages that are duplicates of the master URL would carry the following tag:

<link rel="canonical" href="http://www.example.com/product.php?item=1">
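On a dynamic site you would normally generate this tag from a template instead of typing it by hand. The following is a rough Python sketch of that step. The function name `canonical_link_tag` is hypothetical, and the check for an absolute URL is this sketch's own choice, made so the search engines are never left to guess the host name.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Build a <link rel="canonical"> element for the page head."""
    # Assumption for this sketch: insist on an absolute URL with a scheme.
    if not url.startswith(("http://", "https://")):
        raise ValueError("canonical URL should be absolute, including the scheme")
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(url, quote=True))


print(canonical_link_tag("http://www.example.com/product.php?item=1"))
```

Every duplicate version of the product page would print the same tag, pointing back at the master URL.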

There are already several plugins available for the canonical tag:

Canonical Tag Plug-Ins

Also there is some great info from Matt Cutts here:

Matt Cutts Speaks on Canonical Tag


  • http://searchireland.blogspot.com Jonathan Darling

    Thanks Kieran – like Nurofen to a cold, this is a quick fix for a huge headache.
