By now digital marketers focused on natural search are well aware of the “canonical tag” and its power to eliminate duplicate content. In fact, it may well be one of the most important features in today’s world of SEO. The following case study highlights the need, implementation, and impact of the canonical tag.
About the Canonical Tag
Using the rel="canonical" tag improves link and ranking signals for content available through multiple URL structures or via syndication. It helps search engines consolidate the information and search equity they have on a single, preferred URL.
Why Duplicate Content is a Problem
Duplicate content occurs when a website serves the same content from two or more URLs. Search algorithms tend to exclude duplicate documents from their indexes, and they can place less value on the preferred page when its signals are spread across the duplicates.
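To make the idea concrete, here is a minimal sketch of how exact duplicates could be spotted in a crawl export. It assumes you already have the text of each page keyed by URL; the function name `group_duplicates` and the example.com URLs are hypothetical, and search engines use far more sophisticated near-duplicate detection than a simple hash.

```python
import hashlib
from collections import defaultdict


def group_duplicates(pages):
    """Group URLs whose text is identical after whitespace and case normalization.

    `pages` maps URL -> page text. Returns a list of sorted URL lists, one
    per group of two or more URLs that share the same content.
    """
    by_digest = defaultdict(list)
    for url, text in pages.items():
        # Collapse whitespace and case so trivial formatting differences
        # do not hide an otherwise identical document.
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        by_digest[digest].append(url)
    return [sorted(urls) for urls in by_digest.values() if len(urls) > 1]


# Hypothetical example data, not taken from the client site.
pages = {
    "http://www.example.com/lens.html": "Refurbished 24-120mm lens",
    "http://www.example.com/lens.html?ref=home": "Refurbished  24-120mm lens",
    "http://www.example.com/about.html": "About us",
}
duplicate_groups = group_duplicates(pages)
```

Each group returned is a set of URLs that would ideally all point to one preferred URL through the canonical tag.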
Since Google introduced the canonical tag in early 2009 as a way to resolve duplicate content issues on a website, we’ve seen and heard of many websites that experienced disastrous results because of implementation mistakes. Some were due to inexperience, while others came from unanticipated technical breakdowns. In most cases, such problems go unnoticed until it’s too late, as happened with one of our ecommerce clients.
In the case of this electronics marketer, we observed over several months that the duplicate content issue did not improve. Instead, it continued to grow even after the company implemented a canonical tag.
To monitor the issue, we used the HTML Improvements section of Google Webmaster Tools (GWMT) to see how much duplicate content was being reported. For many months, GWMT reported a significant amount of duplicate content. Since the canonical tag had been in place for almost a year, we couldn’t understand why no decrease was being reported.
After further investigation, we found that the canonical tag may have been implemented incorrectly. The initial implementation was formatted like this:
One would be inclined to think that this implementation is correct because it clearly identifies “rel” and “href”. It also opens the element with “link”, so the order of the attributes should not matter. Additionally, the spidering tools we used detected the canonical tag when crawling the website and reported the link within each one.
However, when we checked pages in W3C’s Markup Validation Service, it flagged the pages that included this canonical tag with an error message indicating that the element link is missing the required attribute property.
Our initial assessment after seeing this was that search engines were very likely ignoring the tag as well.
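A quick automated check can catch some of these problems before a validator or search engine does. The sketch below uses Python’s standard `html.parser` module; the class and function names are our own, and the specific checks (missing tag, more than one tag, tag outside the head, empty href) are assumptions about common mistakes rather than a diagnosis of what was wrong on the client’s site.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Collect every <link rel="canonical"> and note whether it sits in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, inside_head) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr = dict(attrs)
            # rel may hold several space-separated tokens.
            rel_tokens = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_tokens:
                self.found.append((attr.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit_canonical(html):
    """Return a list of human-readable problems found in one HTML document."""
    parser = CanonicalAudit()
    parser.feed(html)
    problems = []
    if not parser.found:
        problems.append("no canonical link found")
    if len(parser.found) > 1:
        problems.append("more than one canonical link")
    for href, inside_head in parser.found:
        if not inside_head:
            problems.append("canonical link is outside <head>")
        if not href:
            problems.append("canonical href is missing or empty")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that the markup validates or that search engines will honor the tag.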
Lastly, in a 2011 post on his blog, Google’s Matt Cutts mentions that Google takes rel=canonical URLs as a strong hint, but that it won’t use them in every case. In his words, if they “think you’re shooting yourself in the foot by accident (pointing a rel=canonical toward a non-existent/404 page), we’d reserve the right not to use the destination url you specify with rel=canonical.”
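The 404 scenario in that quote is easy to test for. Here is a rough sketch, assuming you have already extracted a mapping of page URL to canonical URL from a crawl. The helper names `http_status` and `broken_canonicals` are hypothetical, and the status lookup is passed in as a parameter so it can be swapped for a stub or a different HTTP client.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for a HEAD request to `url`.

    urllib follows redirects, so a redirecting URL reports the status of
    its final destination. Network failures (URLError) are not handled here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return error.code


def broken_canonicals(canonical_by_page, get_status=http_status):
    """Return {page: (canonical, status)} for canonical targets not answering 200."""
    broken = {}
    for page, canonical in canonical_by_page.items():
        status = get_status(canonical)
        if status != 200:
            broken[page] = (canonical, status)
    return broken
```

For a dry run without network access, pass a lookup such as `{"http://www.example.com/gone.html": 404}.get` as `get_status`.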
After realizing this might be the reason the duplicate content issue was not improving, we moved quickly to recommend that our client implement the revised canonical tag, in the format that Google and Bing reference in their canonical tag implementation guidelines:
<link rel="canonical" href="http://www.client.com/en/client-Products/Product/Camera-Lenses/AF-S-NIKKOR-24-120mm-f%252F4G-ED-VR-Refurbished.html" />
A short while after implementation, we noticed a significant reduction in reported duplicate content. As you can see from the chart below, our client experienced an 80 percent decrease in duplicate content in a five-week period.
The canonical tag is a very powerful solution for solving duplicate content issues. That said, if it is implemented incorrectly or disrupted during a major site release, it can cause serious problems for your website and its search results.