Canonical tags enable webmasters and SEOs to avoid duplicate content easily. However, certain aspects need to be taken into account so that the benefit of the tag is not lost.
This guide contains everything you need to know about the canonical tag, when it can be used, and what you should watch out for.
In February 2009, Google, Microsoft, and Yahoo jointly introduced support for the canonical tag, and Google has supported it ever since. Google announced it on its corporate blog as a feature for webmasters.
The canonical tag is a link element that can be placed in the <head> of an HTML document or sent in the HTTP header. It uses rel=canonical to point search engines to the original version when two identical pages exist under different URLs. In this way, the duplicate of the canonical URL is ignored and only the canonical URL is indexed.
The HTML element is known not only as the canonical tag, but also as the canonical link or simply the "canonical". When used correctly, the canonical tag avoids duplicate content and helps you with search engine optimization.
If you include a specific canonical URL as a tag in the source code, you will assist search engines in handling your website. This is because search engines will learn that the currently crawled URL is not to be indexed, but rather the referenced URL. If links are already pointing to the duplicates, their link strength is transferred to the canonical URL, thanks to the canonical tag. At the same time, you can avoid a situation where Google selects a URL you may not prefer for indexing, such as a URL with http instead of https.
You can specify a canonical URL in two different ways: in the <head> element of the HTML document or in the HTTP header of the server response. In most cases, implementation in the HTML head is recommended because it is technically very easy. In addition, many content management systems let you set canonicals in the HTML head via plugins.
Note, however, that you cannot place the canonical tag in the HTML of certain documents. PDF documents, for example, are not HTML, so with those a canonical tag cannot be placed in the <head> element. The HTTP header must be used in this case.
Example:
Let's assume that there are two different URLs on your site that have almost exactly the same content. Perhaps they differ only in terms of a different menu or button. There are already links to both pages. However, to make sure that Google does not have to choose which of these two URLs should be displayed in the SERPs, you must now select a canonical URL.
Let's assume that the two URLs look like this:
1. http://mypage.de/seotips
2. http://mypage.de/seotips-up-to-date
We now decide that the first URL should be the canonical URL. In practice, you should select the more relevant page, the page with the shorter URL, or the page with the most links as the canonical URL.
Next, you add rel=canonical to the <head> element of the duplicate, that is, the second URL. The tag looks like this:
<link rel="canonical" href="http://mypage.de/seotips">
Google and other search engines now "know" that there is a canonical version of this page that they should consider when indexing. Links to both URLs then count for the canonical URL. Thus, you have implemented a kind of "soft forwarding" without redirecting the user.
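To check that a duplicate really points to the intended URL, the tag can be read back from the page source. The following is a minimal sketch in Python using only the standard library. The names CanonicalParser and find_canonical are made up for this illustration, and the sample page is an assumed version of the duplicate from the example above.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens; compare case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return the canonical URL, or None if it is missing or ambiguous."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals[0] if len(parser.canonicals) == 1 else None


# Assumed markup of http://mypage.de/seotips-up-to-date (illustration only).
duplicate_page = """
<html><head>
<title>SEO tips</title>
<link rel="canonical" href="http://mypage.de/seotips">
</head><body>...</body></html>
"""

print(find_canonical(duplicate_page))
```

The sketch deliberately returns None when a page carries more than one canonical, since conflicting tags give search engines no clear signal.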
If your resource is a PDF document or another non-HTML document type supported by Google, the canonical is set in the HTTP header instead. This requires a different syntax.
Let's take the example from above, only this time the second document is a PDF file:
1. http://mypage.de/seotips
2. http://mypage.de/seotips-up-to-date.pdf
The following line is added to the HTTP header of the response that delivers the PDF document:
Link: <http://mypage.de/seotips>; rel="canonical"
Google currently supports canonical tags in the HTTP header only for web search.
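The header syntax can be sketched in a few lines of Python. This is an illustration under assumptions, not a full Link header parser: the function names are hypothetical, and the parsing helper only recognizes the simple case in which rel is the first parameter after the URL.

```python
import re


def canonical_link_header(canonical_url):
    """Return the value of a Link header that names canonical_url as canonical."""
    return f'<{canonical_url}>; rel="canonical"'


def canonical_from_link_header(value):
    """Return the canonical URL from a Link header value, or None.

    Simplification: rel must be the first parameter after the URL.
    """
    match = re.search(r'<([^>]+)>\s*;\s*rel="?canonical"?', value, re.IGNORECASE)
    return match.group(1) if match else None


# Headers a server could send along with the PDF from the example above.
pdf_response_headers = {
    "Content-Type": "application/pdf",
    "Link": canonical_link_header("http://mypage.de/seotips"),
}

print(pdf_response_headers["Link"])
```

How the header is actually attached depends on the web server or application framework in use, which this sketch leaves open.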
It is advisable for each page to refer to itself with a canonical tag even when no duplicate of it exists. This can, for example, keep search engine bots from adding a campaign URL to the index. Google also recommends this approach.
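A self-referencing canonical for campaign URLs could be derived roughly as follows. The sketch assumes that campaign URLs differ from the clean URL only by tracking parameters whose names start with utm_; that prefix, the function names, and the sample URL are assumptions for illustration.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: campaign parameters are the ones starting with "utm_".
TRACKING_PREFIXES = ("utm_",)


def self_canonical(url):
    """Return url without tracking parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Render the <link> element for the <head> of the page."""
    return f'<link rel="canonical" href="{escape(self_canonical(url), quote=True)}">'


campaign_url = "http://mypage.de/seotips?utm_source=newsletter&utm_campaign=spring"
print(canonical_tag(campaign_url))
```

Because the tag is computed from the requested URL, the clean page ends up pointing to itself, while every campaign variant points to the clean page.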
Online stores are often confronted with problems relating to duplicate content. These problems usually arise when the same product is available in different categories via different URLs.
Let us take an example of an online store for sneakers. The model in red is available for men as well as ladies and is also placed in the "leisure shoes" category. This results in four URLs with the same content:
1. http://www.shoes.com/sneaker-red
2. http://www.shoes.com/men/sneaker-red
3. http://www.shoes.com/women/sneaker-red
4. http://www.shoes.com/leisure-shoes/sneaker-red
In order to distribute the "link juice" optimally to a URL and at the same time prevent duplicate content, you select the first URL as a canonical URL. In this case, you add the canonical into the <head> element of the other three URLs.
<link rel="canonical" href="http://www.shoes.com/sneaker-red">
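In a store template, the canonical for all category variants can be computed instead of maintained by hand. The sketch below assumes that the last path segment is the product slug and that the main product page lives directly under the domain, as in the sneaker example; the function name product_canonical is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def product_canonical(url):
    """Map any category variant of a product URL to the main product URL.

    Assumption: the last path segment identifies the product.
    """
    parts = urlsplit(url)
    slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return urlunsplit((parts.scheme, parts.netloc, "/" + slug, "", ""))


variants = [
    "http://www.shoes.com/sneaker-red",
    "http://www.shoes.com/men/sneaker-red",
    "http://www.shoes.com/women/sneaker-red",
    "http://www.shoes.com/leisure-shoes/sneaker-red",
]

for variant in variants:
    print(variant, "->", product_canonical(variant))
```

All four URLs resolve to the same canonical, so the first URL simply references itself.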
Many online store systems already feature automation of canonicals. Typically, the canonical tag is placed on the main page of the product.
Content can often be converted into a print version, not only in online stores, but also on news sites or on large corporate websites. Some content management systems can create such a print version with a single click. This conversion usually creates a separate URL, which, in the worst-case scenario, ends up in the search engines' index. This generates duplicate content and can also limit usability. For example, a user who lands on an indexed PDF file cannot navigate onward or return to your site.
In this case, a canonical tag in the HTTP header of the print file or the PDF can help. The tag then points to the original URL, which is also indexed.
Tip: If you make a change to the URL structure of your website, you should check not only redirects, but also the existing canonical tags. This also applies to a change of the protocol, for example, when switching from http to https.
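A check like this can be automated after a migration. The following sketch flags canonicals that still use the old protocol; the function name canonical_issues and the wording of the messages are assumptions, and treating a non-absolute canonical as an issue is a conservative choice of this sketch.

```python
from urllib.parse import urlsplit


def canonical_issues(canonical_url, preferred_scheme="https"):
    """Return a list of problems found in a page's canonical URL."""
    if not canonical_url:
        return ["missing canonical"]
    parts = urlsplit(canonical_url)
    if not parts.scheme or not parts.netloc:
        return ["canonical is not an absolute URL"]
    if parts.scheme != preferred_scheme:
        return [f"canonical uses {parts.scheme}, expected {preferred_scheme}"]
    return []


# Canonicals as they might be found on a site after switching to https.
for found in ["https://mypage.de/seotips", "http://mypage.de/seotips", None]:
    print(found, canonical_issues(found))
```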
A canonical tag can also be used to refer to the original source of a republished article. For example, a contribution from OnPage magazine may be republished on a different website. So that Google and other search engines refer to the original, the re-publisher sets a canonical link to the article in OnPage magazine.
In this case, users can read the article on other websites. However, only the original article will appear in the search engine index.
If you use hreflang to mark the country and language versions of your website for search engines, you should also use rel=canonical. Each language version refers to itself with a canonical.
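The combination of both annotations can be sketched as a small generator for the <head> of each language version. The URLs under /de/ and /en/ and the function name head_tags are hypothetical; the point is only that the canonical always names the current version, while the hreflang links list all versions.

```python
from html import escape


def head_tags(current_lang, versions):
    """Return the <link> tags for one language version.

    versions maps an hreflang value to the URL of that version.
    """
    own_url = escape(versions[current_lang], quote=True)
    tags = [f'<link rel="canonical" href="{own_url}">']
    for lang, url in versions.items():
        tags.append(
            f'<link rel="alternate" hreflang="{escape(lang, quote=True)}" '
            f'href="{escape(url, quote=True)}">'
        )
    return tags


# Hypothetical language versions of the same article.
versions = {
    "de": "http://mypage.de/de/seotips",
    "en": "http://mypage.de/en/seotips",
}

for tag in head_tags("en", versions):
    print(tag)
```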
There are many cases where the incorrect use of rel=canonical can lead to serious problems. For example, an incorrect canonical may cause a URL to disappear from the Google index and no longer be ranked.
For this reason, you should avoid the following errors.
Whether a canonical tag or a 301 redirect is the better choice is difficult to answer universally, because it always depends on the individual case. As a rule of thumb, if you have the choice between a 301 redirect and a canonical, opt for the 301 redirect. If you cannot use a 301 redirect for technical reasons or because it would limit usability, the canonical tag is a sensible alternative.
For example, if you have multiple URLs in your online store that point to the same product, a canonical tag makes sense. For one thing, the solution is easier technically; for another, your users should be able to actually call up the relevant URL.
If you want to redirect a URL directly, only the redirect is possible because the canonical tag does not actively redirect URLs, but is merely a reference for search engines.
Facebook and Twitter can also read and implement rel=canonical. If you now share a URL with a canonical in these social networks, the information on the canonical URL is gleaned. If the post with this URL is liked, the likes count for the canonical URL.
If you can't use rel=canonical at all, you still have the option of storing only indexable URLs that output the status code "200 OK" in an XML sitemap. You can submit this sitemap to Google via the Google Search Console. In this way, you obtain a small chance that Google will only index the pages submitted through the sitemap. However, this is only an option for newly-created web pages, since the indexing of existing web pages can barely be influenced. If you genuinely cannot set a canonical tag, you should at least attempt to show search engines which URL is the most important via the internal links. If, for example, you have a news article and a print version of that article, all internal links should go to the article itself, and only a single link should go from the article to the print version.
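The sitemap approach can be sketched as a filter over crawl results. The input format, a list of (URL, status code, indexable) tuples, and the sample print URL are assumptions; a real crawl would supply these values.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(crawl_results):
    """Build sitemap XML from (url, status_code, indexable) tuples.

    Only indexable URLs that returned 200 OK are included.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url, status, indexable in crawl_results:
        if status == 200 and indexable:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical crawl results: the print version is kept out of the sitemap.
crawl_results = [
    ("http://mypage.de/seotips", 200, True),
    ("http://mypage.de/seotips/print", 200, False),
    ("http://mypage.de/old-page", 301, True),
]

print(build_sitemap(crawl_results))
```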
If the URLs of your website work with many variables or parameters and it is not always easy to work with a canonical, you can also use the Search Console. There, you specify how Google should deal with certain parameters in URLs.
Canonical tags are a powerful tool for SEOs and webmasters to avoid duplicate content and to distribute link juice more effectively. Even without plugins, implementation is technically very simple.
However, there are some pitfalls, as shown here. If the canonical is implemented incorrectly, it can also do damage and, for example, exclude important sub-pages from the ranking. On the other hand, your page can benefit greatly from the tag if you keep the above rules for setting up rel=canonical in mind.
Published on 06/06/2017 by Eva Wagner.
Eva is an experienced content marketer. Until May 2018 she was a member of the online marketing team at Ryte. Using her creativity and knowledge of current topics, she was responsible for the German Ryte Magazine and the Ryte Wiki. She also organized Ryte’s presence at major trade fairs such as dmexco in Cologne.