Web Hosting

Clarifying Duplicate Content on WordPress – Again

The old chestnut that WordPress ‘creates duplicate content issues’ keeps coming up. Someone even raised it in an email to me recently.

Let’s be clear: WordPress does not create duplicate content problems.

What WordPress does is allow the same content, on the same site, to be accessed via a number of different URLs (or permalinks): tags, categories, day archives, month archives, year archives and author archives.

This is not duplicated content. It’s the same content on the same website accessed via different URLs.

Duplicate content is the same content on different websites.

Hear it from Google

At the recent Google Site Clinic held in London this point was addressed square on – here’s a snippet of the report on this specific topic from the Google Webmaster Central Blog:

Duplicate content within a website is generally not a problem, but can make it more difficult for search engines to properly index your content and serve the right version to users.

There are two common ways to signal what your preferred versions of your content are: By using 301 redirects to point to your preferred versions, or by using the rel="canonical" link element.

You can read the full version here.

As I’ve written before, Google understands how platforms like WordPress operate and does not see the same content on the same website accessed via different URLs as duplicated.

And you have a range of ways to ensure that the URL you want to be seen as the primary version of your page is seen as such:

The Canonical tag

WordPress introduced the canonical tag function as a default in version 2.9. The All in One SEO Pack plugin, Platinum SEO and the other SEO plugins also feature the canonical tag and let you decide whether or not to use it.

With the plugins, all you have to do is check the option and your posts will be canonicalised.
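To make it concrete, here is a minimal sketch of what canonicalising a post amounts to: a single link element in the page's head that names the preferred URL. It is written in Python purely for illustration (WordPress itself is PHP), and the function name and the example.com permalink are my own assumptions, not anything taken from WordPress or the plugins.

```python
from html import escape


def canonical_link(permalink: str) -> str:
    """Build the <link rel="canonical"> element for a post's preferred URL.

    Hypothetical helper for illustration only; WordPress and the SEO
    plugins generate this element for you.
    """
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(permalink, quote=True)}" />'


if __name__ == "__main__":
    # The same element is emitted no matter which archive URL led to the post.
    print(canonical_link("https://example.com/2012/02/sample-post/"))
```

Whichever URL a search engine used to reach the article, the element points it back to the one permalink you want treated as the primary version.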

So duplicate content is not created by WordPress.

Preserving Link Juice

The more valid concern is that you can lose link juice as a result of the same content being indexed via multiple URLs.

So the way to fix this is to noindex each of your archives – or, at least, those that you don’t consider that important.

I noindex and nofollow all the archives on my WordPress sites with the exception of the category and tag archives.

That’s because I’ve set up my categories and tags carefully to ensure that related content can be found easily (more details here).

So I use the canonical tag on each article, set index,follow on my category and tag pages, and noindex all the other archives.

This enables me to focus the search engines on the primary version of each article, and it gives them a structured pathway along which to crawl the rest of the site so they can index it both easily and fully.

And it minimizes lost link juice.
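The settings described above can be summarised as a small lookup from page type to robots meta tag. This is only a sketch in Python of the policy I use, not how WordPress or any plugin implements it; the page-type names and the function name are assumptions made for the example.

```python
# Assumed page-type names, chosen for this example only.
ROBOTS_BY_PAGE_TYPE = {
    "post": "index,follow",        # individual articles (these also carry the canonical tag)
    "category": "index,follow",    # carefully organised, so worth indexing
    "tag": "index,follow",
    "date": "noindex,nofollow",    # day, month and year archives
    "author": "noindex,nofollow",
}


def robots_meta(page_type: str) -> str:
    """Return the robots meta element for a given page type."""
    try:
        content = ROBOTS_BY_PAGE_TYPE[page_type]
    except KeyError:
        # Refuse to guess a policy for page types not listed above.
        raise ValueError(f"unknown page type: {page_type!r}") from None
    return f'<meta name="robots" content="{content}" />'


if __name__ == "__main__":
    for page_type in ROBOTS_BY_PAGE_TYPE:
        print(page_type, robots_meta(page_type))
```

Keeping the policy in one table makes it easy to change a single entry later, for example switching a page type from index to noindex, without touching anything else.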

Cheers,

Martin Malden

 


Comments on this entry are closed.

  • Cervix Webdesign 9 February, 2012, 6:41 pm

    Finally someone who seems to know what he is talking about!

    I have read about and researched this matter over and over, only to find conflicting points of view even from renowned SEO experts, and most differ from yours. Yours, however, is the only one that explains this without making assumptions about what Google thinks.

    Would you advise though not to use XML sitemaps for both categories and tags?

    • Martin 9 February, 2012, 7:37 pm

      The XML sitemap covers the entire site, not specific elements of it, but I recently noindexed my tag pages. Category pages are still indexed and both category and tag pages are set to follow.

      The reason for noindexing tag pages is that every article on the site is already covered (indexed) via both the category pages and the individual posts and pages, so there is no need to index tag pages as well. Some of my category pages rank extremely well for highly competitive terms.

      The reason for setting ‘follow’ on both tag and category pages is to make sure the search engines are able to get to every article on the site.

      Cheers,

      Martin.

  • Cervix Webdesign 10 February, 2012, 3:36 am

    I like consistency, but whatever works. If you changed your method, I am interested to know what the effects will turn out to be.

    By the way, I use different XML sitemaps, in which case it seems better to me to choose only one, for either tags or categories. Still, whether you use tags or categories, the problem of duplicate content in WordPress, if it is indeed a problem, still exists. So some go even further and noindex even subcategories, while others claim this is pointless.

    If everyone else has a different opinion on the matter, let me be the first not to take a definitive point of view for now 🙂

    • Martin 10 February, 2012, 10:34 am

      To me there are two points:

      1. I want all my articles to be indexed
      2. I want to keep my SEO focused

      That’s the logic that drives the approach I’ve used, and why I subsequently noindexed my tag pages.

      As I said, some of my category pages rank in the top 6 for extremely competitive keywords and that brings me traffic that tends to explore the site a bit, which is nice.

      So, for me, indexing my category pages has worked well.

      But a large part of that is because, as I mentioned in the article, I’ve taken a disciplined and focused approach to how I use categories (and tags).

      If my use of categories was not disciplined I very much doubt they would rank as well as they do.

      For example, a site that has 20 – 30 categories, each with just a couple of articles in them, is not likely to have its category pages rank well. A site like that would want to noindex its category pages.

      So looking at index and follow settings in isolation is only part of the story.

      You first need to be sure that you organise the underlying content in a disciplined and focused way, and then use settings that will enable the search engines to reach and, therefore, index all of it.

      Cheers,

      Martin.

  • Cervix Webdesign 10 February, 2012, 11:58 pm

    Hi Martin,

    Thanks for replying. It is crystal clear. I will keep it in mind.