In SEO terms, canonicalization is the establishment of a single true (or canonical) web page that takes precedence over all other versions of the page in the eyes of the search engines.
For example, an ecommerce site may sell a winter coat in several different colors. Each color might have its own separate page that is identical to the others except for the color; the pricing, description, material, and brand would all be the same. If there were 5 different coat colors, there would be 5 pages with essentially identical content. Now suppose you sell 20 types of winter coats, each in 5 colors: that is 100 pages, of which only 20 are meaningfully distinct. Duplicate pages could easily flood your site.
Canonicalization is important for SEO because search engines like Google do not like duplicate content. Google states that in some cases, where duplicate content appears to be intentionally deceptive, it may lower your site’s rankings or even remove your site from its search results entirely. This would obviously be catastrophic for a website or business reliant on drawing digital traffic.
However, even non-malicious examples of duplicate content can have negative effects on SEO. For one, search engines have to spend a considerable share of your site’s crawl budget crawling and indexing duplicate pages. As a result, search engine robots may have a hard time crawling your entire website, which could lead to certain pages on your site not being indexed at all, and thus not being eligible to show up in the SERPs.
Additionally, if there are multiple versions of a page, other websites that link to you may link to different versions of it. Let’s go back to the winter coat example: say 10 websites link to the red winter coat page, and 10 sites link to the black winter coat page. Rather than having 20 backlinks to a single page, the value of these backlinks has been split in half. Instead of having one page rank highly for the keyword ‘winter coat’, you now have 2 pages competing for the term, and neither is strong enough to rank high enough to actually bring in any traffic.
You can set up canonical pages with the rel=canonical link element. This is easy to do, though it can take a long time to do manually if you have a lot of duplicate pages. Simply select which version of the page you want to be the canonical page, and then add a link element of the form <link rel="canonical" href="https://example.com/winter-coat"> to the <head> section of each duplicate page, as well as to the canonical page itself.
Make sure you use the absolute URL of the canonical page and not just the relative URL (i.e. include the https://example.com and not just /winter-coat). Check out this article from Google about common issues with the rel=canonical tag.
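As a rough illustration of the absolute-URL rule, the Python sketch below builds the canonical link element for a handful of color-variant pages. The helper name canonical_link_tag, the site root https://example.com, and the variant paths are all assumptions made up for this example; they are not part of any particular platform or plugin.

```python
from html import escape
from urllib.parse import urljoin, urlsplit

SITE_ROOT = "https://example.com"  # assumed site root, for illustration only


def canonical_link_tag(canonical_url: str, site_root: str = SITE_ROOT) -> str:
    """Return a rel=canonical link element whose href is an absolute URL."""
    # urljoin leaves an already-absolute URL untouched and resolves a
    # relative path such as "/winter-coat" against the site root.
    absolute = urljoin(site_root + "/", canonical_url)
    parts = urlsplit(absolute)
    if not (parts.scheme and parts.netloc):
        raise ValueError(f"could not build an absolute URL from {canonical_url!r}")
    return f'<link rel="canonical" href="{escape(absolute, quote=True)}">'


# Hypothetical color-variant pages that should all point at one canonical page.
variant_paths = ["/winter-coat", "/winter-coat-red", "/winter-coat-black"]
tags = {path: canonical_link_tag("/winter-coat") for path in variant_paths}

for path, tag in tags.items():
    print(path, "->", tag)
```

Every variant, including the canonical page itself, ends up with the same element, which is the behavior described above.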
Once this is all done, you’ll want to double check your site’s XML sitemap. Make sure that the URLs in the XML sitemap are the same ones you are specifying as canonical on the web pages themselves.
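One way to spot-check this is with a short script. The Python sketch below, which uses only the standard library, compares the URLs in a sitemap with the canonical URLs declared in a set of pages and reports any sitemap URL that no page declares as canonical. The function names, the sample sitemap, and the sample pages are hypothetical; a real audit would fetch the sitemap and pages from your own site.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by the sitemaps.org XML sitemap format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in an XML sitemap string."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)}


def non_canonical_sitemap_urls(sitemap_xml, pages):
    """Return sitemap URLs that none of the given pages declares as canonical.

    pages maps each page URL to that page's HTML source.
    """
    declared = {find_canonical(html) for html in pages.values()}
    return sitemap_urls(sitemap_xml) - declared


# Hypothetical sample data: the red variant should not be in the sitemap.
sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/winter-coat</loc></url>
  <url><loc>https://example.com/winter-coat-red</loc></url>
</urlset>"""

head = '<html><head><link rel="canonical" href="https://example.com/winter-coat"></head></html>'
pages = {
    "https://example.com/winter-coat": head,
    "https://example.com/winter-coat-red": head,
}

print(non_canonical_sitemap_urls(sitemap, pages))
```

Any URL the script reports is a candidate for removal from the sitemap, or a sign that a page’s canonical element points somewhere unexpected.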
Depending on your website’s platform, there are plugins and extensions that can set your canonical tags for you. For WordPress sites, we strongly recommend installing the Yoast SEO plugin (there is both a free and a premium version; either is fine for canonicals). Yoast is great for many reasons, but most relevant here is that it sets canonical tags automatically. You can also set your canonicals manually via Yoast, so you won’t have to directly edit your HTML files. For more information, please see this article from Yoast about canonicals.
For Magento sites, consider purchasing the Mageworx plugin with the SEO Suite Pro extension. Besides all the other benefits of Mageworx, the extension allows you to easily set up canonical URLs. For more information on Mageworx, click here.