Official Google Webmaster Central Blog: Demystifying the "duplicate content penalty"

Demystifying the "duplicate content penalty"

Friday, September 12, 2008 at 8:30 AM

Duplicate content. There's just something about it. We keep writing about it, and people keep asking about it. In particular, I still hear a lot of webmasters worrying about whether they may have a "duplicate content penalty."
Let's put this to bed once and for all, folks: There's no such thing as a "duplicate content penalty." At least, not in the way most people mean when they say that.
There are some penalties that are related to the idea of having the same content as another site—for example, if you're scraping content from other sites and republishing it, or if you republish content without adding any additional value. These tactics are clearly outlined (and discouraged) in our Webmaster Guidelines:

Don't create multiple pages, subdomains, or domains with substantially duplicate content.

Avoid... "cookie cutter" approaches such as affiliate programs with little or no original content.

If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.

(Note that while scraping content from others is discouraged, having others scrape you is a different story; check out this post if you're worried about being scraped.)
But most site owners whom I hear worrying about duplicate content aren't talking about scraping or domain farms; they're talking about things like having multiple URLs on the same domain that point to the same content. Like www.example.com/skates.asp?color=black&brand=riedell and www.example.com/skates.asp?brand=riedell&color=black. Having this type of duplicate content on your site can potentially affect your site's performance, but it doesn't cause penalties. From our article on duplicate content:

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.

This type of non-malicious duplication is fairly common, especially since many CMSs don't handle this well by default. So when people say that having this type of duplicate content can affect your site, it's not because you're likely to be penalized; it's simply due to the way that web sites and search engines work.
Most search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content. To this end, Google tries to filter out duplicate documents so that users experience less redundancy. You can find details in this blog post, which states:

When we detect duplicate content, such as through variations caused by URL parameters, we group the duplicate URLs into one cluster.

We select what we think is the "best" URL to represent the cluster in search results.

We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL.

Here's how this could affect you as a webmaster:

In step 2, Google's idea of what the "best" URL is might not be the same as your idea. If you want to have control over whether www.example.com/skates.asp?color=black&brand=riedell or www.example.com/skates.asp?brand=riedell&color=black gets shown in our search results, you may want to take action to mitigate your duplication. One way of letting us know which URL you prefer is by including the preferred URL in your Sitemap.
In step 3, if we aren't able to detect all the duplicates of a particular page, we won't be able to consolidate all of their properties. This may dilute the strength of that content's ranking signals by splitting them across multiple URLs.

In most cases Google does a good job of handling this type of duplication. However, you may also want to consider content that's being duplicated across domains. In particular, deciding to build a site whose purpose inherently involves content duplication is something you should think twice about if your business model is going to rely on search traffic, unless you can add a lot of additional value for users. For example, we sometimes hear from Amazon.com affiliates who are having a hard time ranking for content that originates solely from Amazon. Is this because Google wants to stop them from trying to sell Everyone Poops? No; it's because how the heck are they going to outrank Amazon if they're providing the exact same listing? Amazon has a lot of online business authority (most likely more than a typical Amazon affiliate site does), and the average Google search user probably wants the original information on Amazon, unless the affiliate site has added a significant amount of additional value.
Lastly, consider the effect that duplication can have on your site's bandwidth. Duplicated content can lead to inefficient crawling: when Googlebot discovers ten URLs on your site, it has to crawl each of those URLs before it knows whether they contain the same content (and thus before we can group them as described above). The more time and resources that Googlebot spends crawling duplicate content across multiple URLs, the less time it has to get to the rest of your content.
In summary: Having duplicate content can affect your site in a variety of ways; but unless you've been duplicating deliberately, it's unlikely that one of those ways will be a penalty. This means that:

You typically don't need to submit a reconsideration request when you're cleaning up innocently duplicated content.
If you're a webmaster of beginner-to-intermediate savviness, you probably don't need to put too much energy into worrying about duplicate content, since most search engines have ways of handling it.
You can help your fellow webmasters by not perpetuating the myth of duplicate content penalties! The remedies for duplicate content are entirely within your control. Here are some good places to start.

Posted by Susan Moskwa, Webmaster Trends Analyst

The comments you read here belong only to the person who posted them. We do, however, reserve the right to remove off-topic comments.

135 comments:

Robert J Miller said...: I blog about Colorado Hiking, Biking, and Camping and wrote some Knols with basically the same content. Do my Knols put me at risk for getting my blog penalized for duplicate content?

Thanks.; September 12, 2008 9:48 AM
Jennifer Mathews Somogyi said...: I ran into the similar content issue when Classmates wanted to expose the user profiles, but didn't want the user's information exposed on the page. So how do you make over 40 million pages unique when all you have to work with is a name and a school they attended?

I talked to an SEO expert about it and she suggested writing unique content for each page "just block out x amount of pages each week to be written" - seriously? on 40 million pages?

I ran a test to see what was considered unique content to see if we could pull the information from the database into the content on all the pages in the way that would have them recognized as unique enough not to get filtered out.

You can read about it
here; September 12, 2008 10:20 AM
J. Max Wilson said...: What about a blog portal website that aggregate RSS content from multiple blogs or websites on a similar theme?

For instance, what if I have website that aggregates and republishes blog posts from blogs by Amish farmers, and each aggregated post receives its own permanent url on the aggregating website that adds information about related posts from other aggregated blogs, and cross blog tags?

Some of the blogs post full-content feeds and so their post is substantially duplicated on the portal.

The portal version clearly attributes the source of the post to the originating blog and includes a link back to it.

No advertising is displayed on content from other blogs.

Will the portal get flagged for duplicating content?

This is a real world scenario. I currently operate such a portal--though not for Amish Farmer blogs :), if there are any.; September 12, 2008 10:23 AM
aroedl said...: What about scraped content from for example wikipedia.org? I've seen some sites having 1:1 copies of wikipedia articles. Is there a penalty in those cases?; September 12, 2008 10:57 AM
paisley said...: Question: what about stuff like this..

http://googlewebmastercentral.blogspot.com.tynted.net/2008/09/demystifying-duplicate-content-penalty.html

what happens when this shows up in search results like...

http://www.google.com/search?hl=en&q=site%3Atynted.net; September 12, 2008 11:32 AM
Trisha said...: So is it ok to use services like blogburst that use my content on other sites?; September 12, 2008 1:30 PM
Steve said...: Ok... Identify if I have multiple domains, and have the same content on both URLS... I guess that a redirect would be a better idea?

I'm assuming it's bad if I own both domains (for instance, not a real example)
bikeridingsomething.com
and
bikeridingsomethingelse.com

and they have basically the same data? Ideas on best practice? Redirect? Leave it as is? Muh...; September 12, 2008 6:27 PM
jasmin said...: 1. Duplicate content is often necessary due to geolocalisation of SERP (for instance, information posted in French in a Canadian site cannot be found by people searching in France within France pages)

2. Google directory is duplicate content of DMOZ directory.
No penalty. Why? Why?; September 12, 2008 7:27 PM
lubokduit said...: hai, i'm newbloggers.oh my..now i knew if duplicate contant can make penalty. thank for information.
mesotheloma; September 13, 2008 7:12 AM
ristvin said...: I often re-write ads that I'm an affiliate to. I do this because the ads are often poorly written and I can't get clicks through them. Sometimes my ads do have a testimonial that is the same or I include one of the good sentences that are unique to the original ad. Is this considered duplicate content?; September 13, 2008 3:22 PM
Satsaid said...: Sorry! i don't talk about technique
but I request from you

Please added some help or tips in first page about Hurricane like a
- Catarinas
- Google Logo same Olympic
- The most popular search about its (this can help what happen and maybe forecast events..)

thank; September 13, 2008 6:22 PM
Goedkoopste lening said...: What about Hostheaders? I hear these are also condisidered to be duplicate content.; September 15, 2008 12:25 AM
Dijkstra said...: Excellent article, but it missed an important source of site duplication. As @jasmin pointed out, some duplication comes for geolocating domains. I mean, "domain.com/french" may have the same content as "domain.fr". The issue here is that we couldn't do a redirection from .fr to .com because geolocation: the .fr domain is intended to people living in france, so they have diferent products. BUT although they have SOME diferent products, 90% of the site is the same.; September 15, 2008 3:16 AM
SearchMasters said...: Thanks for writing this post. But I am still very concerned about how sites are **still** disappearing from Google for pages main search phrases, when other websites pages copy either meta descriptions, or content including the search phrase on the pages.

You did not write a response to my post in popular picks http://groups.google.com/group/Google_Webmaster_Help-Indexing/browse_thread/thread/e213eec10610a481, but I hope that this post of yours is a try at a response. So thanks.

BUT - You say that we should not be concerned about scraper websites copying meta descriptions and content around search phrase on our pages. How then can I have a client website where people have copied my clients content, then I make it unique again, and the homepage reappears on Google??? How can it happen a second time, and even when I make the content unique again, the homepage is still not appearing for the main search phrase.

This seems to be firstly a simple dup content penalty that I was able to fix by making content unique, then a more draconian penalty that is keeping the homepage ranking for the phrase despite the now unique content.

Full case study on my blog:
http://www.searchmasters.co.nz/articles/160/sites-disappearing-from-google/

I have filed a reinclusion request for the clients site to try and get it back. However, I would rather the issue did not exist in the first place.

Google - please get your heads out of the sand. The issue STILL exists, despite your protestations to the contrary.; September 15, 2008 4:16 PM
Norman said...: my site showed a lot of dup title and discription meta tags. a lot of coping pages using them as a template even though the content was very diferent on each page.
i have bee updating and correcting these errors but in the dashboard diag it still shows the old tags. i was wondering hoe often the site is crawled and when would the changed tags show up as changed instead of the old tags.
in the dashboard it does not show when the site was last crawled or when it will be crawled next? just wondering; September 15, 2008 6:38 PM
Stuey said...: Amen to all of that but I've noticed that the scrapers, if they are from a well established site, can take the blog content I've written re-produce it verbatim and rank instead of me.
In some cases I've noticed plagurists have taken articles from me and put them on their sites and then embedded links to the images from the article too. Not only are they stealing my content, they are also stealing my bandwidth and they still rank instead of me.; September 16, 2008 1:24 AM
Guru said...: Nicely put. Well, i still have few questions arising in my mind. First of all what is Google's perspective on sites which have to use almost similar content e.g cheat sites as cheat codes cannot be altered to make them unique as they totally loose their meaning plus a lot of cheats are submitted by users who submit their content on several sites which makes it impossible to make every page unique.

Secondly some times when i publish content on my site it is instantly copied on few blogs which then get indexed way before me.

Now since i have thousands of pages of content and i wonder how to locate each page which is being copied and each site copying my content and filing DMCA against them.

I will highly appreciate if you could at least reply to first part of my question as my site has lost its serps and your answer might put me in right direction to fix my rankings.; September 16, 2008 4:43 AM
SearchMasters said...: The general definition of "duplicate content" that effects rankings for a certain phrase:

- the first instance of a search phrase in text should have unique content in the two words before and after that search phrase (its my educated guess at "two". One might do, might be "three", but I do it for two and it seems to work.)

ie if the search phrase is "blue widgets"

then you need to make sure that your phrase:
"the best blue widgets in America" is not on any other website.

so its easy enough to make even your cheat code pages unique - just have some machine generated text before and after the exact search phrase:
ie "bestcheatcodes.com warcraft3 cheat codes for free" - and make sure that phrase is unique.

To make sure a phrase is unique - Google for it with quotes around it.

And the great thing about machine generating the before and after text is that when other people copy your sites content, you make just the one change in your template, and your thousands of pages are unique again.

Simple - apart from when Google bans your page because its been copied too many times - or that is my take based on some case studies.; September 16, 2008 5:00 AM
Samuel said...: Spanish translation of this article here

Traducción al español de este artículo; September 16, 2008 5:02 AM
editor said...: Hi: A fraudster has set up a website which duplicates all our pages. As soon as we upload a new item to our site it appears also on the fraudulent website. The website uses our names to make it look genuine. What can we do? How can we inform Google? Any advice would be appreciated. Thanks; September 16, 2008 10:46 AM
Vincent Vega said...: I have duplicate content on my new blog (www . webmasterinter . net) and on my blogspot.
It's because I haven't got good domain when I decided to start.
I;m happy to see that I'm not gonna be penalized much, cause new blog has updated version of that old post.

Anyway, when do we read about google sandbox?; September 16, 2008 11:07 AM
Andrew said...: This is very useful information, thank you. I don't think I'll worry about getting these 'penalties' so much anymore :); September 16, 2008 12:28 PM
Angela said...: You see people "warning" about duplicate content all the time when talking about articles submitted to article directories and the like. I've been reminding people that if "duplicate content" the way they are saying is a problem then the article directories would have died out long ago. And what about the sites that syndicate the articles? Wouldn't they have disappeared as well?; September 16, 2008 8:46 PM
khurram said...: My query relates to my site “http://youpark.com” and feedurmobile.com

Problem is my site URL were query based and now I have made the SEO friendly and wrote rule for redirection. In normal browse all links are valid. But on webmaster tool I saw that following type of links are being reported. I don’t know how crawlers are crawling and generating these from my application.

http://www.feedurmobile.com/o2-xda-phone&manufacturerName=o2/MobileApplications/WL/147/2022/portalbrowsebycategorycb?categoryName=synchronization&devicename=o2-xda-phone&manufacturername=o2-software&portalID=147

whereas above is the combination of two urls that webmaster reported. It should be
Current URL – SEO friendly

http://www.feedurmobile.com/o2-xda-phone&manufacturerName=o2/MobileApplications/WL/147/2022/

and OLD URL

http://www.feedurmobile.com/portalbrowsebycategorycb?categoryName=synchronization&devicename=o2-xda-phone&manufacturername=o2-software&portalID=147

or

http://www.youpark.com/Symbian/IM for Skype for Symbian S60 v.3/25800/Product/

it should be
http://www.youpark.com/Symbian/IM for Skype for Symbian S60 v.3/25800/ without word "product" at the end.

Please help me to dig this.; September 16, 2008 9:02 PM
Kalvalastir said...: My site http://zaxarius.altervista.org/ seems to be considered a duplicate of other websites that actually uses the same CMS (actually phpNuke). You can spot them in here: http://www.google.com/search?q=5+clone at the 1st and 2nd place. The first title is taken from the ODP, in this category.

How is this possible?

The three sites actually DO NOT resemble the same at first glance. I guessed that the similar header and table structure generated by the CMS would make this possible, but looking from user-side I think this should not happen, because pagese ARE NOT similar! They are different websites at all! Can anyone explain this?; September 17, 2008 12:44 AM
Chris said...: I have several sites on different medical topics, however I license syndicated content from several sources. I know this content is not "unique" per se, but it is not scraped or "duplicated"... it is content I pay a lot of money for... I know this content appears in many other sites and in many cases I cannot alter the text as that would be a breach of contract. Do you have any ideas on how I can avoid the "duplicate" content penalties for this. Thanks; September 17, 2008 7:34 AM
suvra said...: I have uploaded a website which is http://www.utsav-collections.com
BUT, I haven't found into google search results. Which is i have uploaded last 20-35 days. I ahve uploaded also sitemap. but still it is not found. what is the reson?; September 18, 2008 12:44 AM
ricdik said...: I work for a company that is creating over 200+ local pages & we've
stressed to them the importance of creating unique copy for each
page. As we know large companies don't want to put as much time into
creating that many unique pages. The company & services are all the
same in all the areas so to them all the pages should be the same. So
the pages will be under a root domain & sub folders, such as "www.root-
domain.com/DMA-city/product", where the DMA would be the only thing
that would change.

Also, the meta data & titles would be different only by the DMA & then
the content on the site lets say its 500 words that speak to the
product & DMA (which would be the only thing changed).

These are all under the same root domain & are duplicate content how
can they build all 200+ pgs & make them indexable while avoiding
duplicate content. Would changing the DMA in the 500 words, meta data
& title be sufficient if the DMA is listed at least 10-15 times on the
page.

It seems to me that they are willing to create 200+ unique pgs when
the products are the same for every market, so to have individual
local pages for each market would cause them under the current method
to have duplicate content issues. I know it says large blocks of
content the same or similar, if these were individual URLs or sub-
domains would they resolve the issue, e.g. "DMA.root-domain.com, or
www.root-domain-DMA.com; September 18, 2008 12:13 PM
gem-online said...: hello, google!

i have these 2 blogsites, i pray google crawls and ranks them good -but i am not a web lady so if anyone can help to make http://gifts-parasayo.spaces.live.com and http:www.linkedin.com/in/philippineslifestyle

i thank you,
regards!
gem; September 21, 2008 10:07 PM
8P said...: would laborpains.org/, server1.laborpains.org and http://www.unionfacts.com/blog/ which are all carbon copies of the same pages be good examples of bad ways to duplicate your site?; September 22, 2008 5:39 PM
Susan Moskwa said...: @j. max wilson:
If your aggregation site only republishes the feed content and doesn't add significant value, it's possible that it will be filtered out in search results (we'll may show the original articles instead of your aggregation site, depending on what a user is searching for). However, if your site adds some type of significant value, you may be able to rank for that aggregated content. This article may shed more light on it (the tone is a bit wrong for your situation, since you say you're not scraping, but the recommendations at the bottom of the article may be helpful).

@aroedl:
That kind of scraping could be eligible for a penalty.

@paisley:
That type of stolen content may still appear in a site: search, but you'll notice that if you search for any portion of the actual article, our blog (the originator of the content) shows up first and all other results are hidden with the message "We have omitted some entries very similar to the 1 already displayed." That's what we mean when we talk about grouping and filtering: we may index duplicate content, but we try to group it and only show 1 version when we actually serve search results. Searching for text will give different results than a site: search.

@steve:
If you own more than one domain and they all have the same content, I'd definitely recommend doing a 301 redirect from each page on domain A to the corresponding page on domain B. This makes sure that people will end up at the right place no matter which URL they enter, but only one version will be indexed.; September 23, 2008 1:18 PM
Steve said...: @susan moskwa

Thanks... I appreciate the input. I was thinking that I was going to have to do something of that nature. Again, thanks.; September 23, 2008 1:23 PM
J. Max Wilson said...: @Susan Moskwa

Thanks for your response. I have carefully read every page that I could find related to duplicate content in the webmaster knowledge base, and it is still unclear.

It seems to me that determining whether or not a site "adds some type of significant value" can be pretty subjective. Is that determination made by an automated algorithm or does it involve human evaluation?

Would showing potentially related posts from other blogs by analyzing links and keyword count as significant added value?

I think that it should.

In either case, I appreciate your input and will see what I can do to make sure that my aggregation website adds valuable additional information such as community ratings.

Thanks!; September 23, 2008 1:32 PM
Susan Moskwa said...: @dijkstra:
In the example you give (example.com/french vs. example.fr), instead of serving the same content on both of those domains, you can redirect one to the other and then use Google's geographic targeting tool to target example.com/french to France.

@searchmasters:
I made an open call here for people to send in examples where they still feel we're not handling duplicate content correctly. I'll pass your case study along to the right folks.

@chris:
If you're displaying the same content that many other sites do (through a syndication deal), then that content is duplicated; not maliciously, in your case, but it is still being duplicated (the same content appears in multiple places). As I mentioned in my article, if you want to rank for that content, you'll need to add a lot of additional value for users that isn't available on any of those other syndicating sites. It's also fine to syndicate content without adding value if your site's business model doesn't rely on search engine traffic.

For those of you concerned about scrapers, you can read this post and/or file a DMCA takedown request.; September 23, 2008 1:36 PM
Susan Moskwa said...: @j. max wilson:
I agree it's a bit subjective, but it's hard to give specific examples because what "adds value" will be different for every site. Usually it means having something on your site that can't be found anywhere else. Ask yourself: why would someone come to my site rather than to the site that originated this content? What kinds of content or features would make them want to bookmark my site and share it with friends?

You may want to try experiment with a website testing/optimization tool to see what users respond well to.; September 23, 2008 1:42 PM
SearchMasters said...: @Susan Moskwa - thanks
The clients website has still been penalised (now a month later) based on being copied twice within a month.

I have other clients sites that have been similarly penalised, so I am very keen for your engineers to look at this with all urgency.

Appreciated.; September 23, 2008 3:14 PM
J. Max Wilson said...: @ Susan Moskwa

Thanks again for your helpful feedback.

Not to be a pest, but I am still curious about whether the "adds value" determination is made by algorithm, a person, or a combination of the two?

Thanks for all your work!; September 23, 2008 3:46 PM
SearchMasters said...: @J. Max Wilson
Please refer to my post - http://googlewebmastercentral.blogspot.com/2008/09/demystifying-duplicate-content-penalty.html?showComment=1221566400000#c5859499633564154704

Get unique words around your search phrase for starters. Then experiment with how much additional unique content per page you need, to be able to get the page ranking for that search phrase.

Yes, my post was a little simplistic, since Google is able to see past the opening paragraph, and see that the remainder of the page is the same. ie Google can see that many article submission websites have the same article, even when there are unique words around search phrases in opening paragraphs.

I have often used what I call "semi random" - Creating say 5 variations of an opening paragraph that I plug search phrases into for database generated pages. Then based on say the record number, I always use say the 4th variation for a certain page, and the 3rd variation for another page.

I have very successfully used this method for directories say of the local outlets of a retailer on their website, or jobs/categories on a jobs website.

The "adding value" is an algorithmic check, so get your algorithm working and you can get pages ranked.; September 23, 2008 5:01 PM
NanoMan said...: OMG Google is revealing information the common man can understand. After years of trying to explain this thankfully your company came out of the closet. Not bad only took a decade!; September 24, 2008 4:02 PM
NanoMan said...: @ J. Max Wilson said...

Not to be a pest, but I am still curious about whether the "adds value" determination is made by algorithm, a person, or a combination of the two?

Thanks for all your work!

With billions of webpages do you think there is much human hand work mucking about in the writing of the search algorithm??

I would say, most everything is done by algorithm, to enable Google to secure their environment as much as possible, with the max productivity and without the need for continual human intervention.

;->; September 24, 2008 4:06 PM
Salentino said...: Italian Traslation
http://blog.imevolution.it/91/cose-veramente-la-sanzione-del-contenuto-duplicato/; September 25, 2008 12:24 AM
Grim D. Reaper said...: I duplicate content to areas where I know I have viewers who strictly stay inside their own "walled community," like Blogger, for example (which you are required to have a log on in order to comment) or Live Journal. Will I be penalized for duplicate content when my intent is getting to people who refuse to visit my site rather than aggregate in their own community?; October 3, 2008 3:47 PM
rynophiliac said...: Wow this is the most helpful tool I have ever found on duplicate content so far!

I have a small service company in Phoenix Az and I am pretty sure my site has been punished for duplicate content.

I created a site windowcleaneraz.com, however I am new to this whole web development thing and have learned about how domains with your keywords are better for search results and since acquired a better domain windowcleaningscottsdale.com, I developed the new site and all of sudden one day my traffic for both sites went from about 3 or 4 a day to zero and has been zero ever since, I found both sites on the 6 page results when they used to be #1 and #2 for my search of "window cleaning scottsdale"

While this was happening I had just finalized a purchase on a new domain windowcleaningphoenix.com for $600 which i think is the best I could have for my service and area and now have finished the design of that site.

I really dont even want the other two domains, windowcleaneraz.com and windowcleaningscottsdale.com but I own them for a year. I have since done a little research and confirgured a 301 redirect from both the old domains to the new windowcleaningphoenix.com but I fear it may be too late. The damage has already been done.

Is there anything else I can do to help the matter and get back in the results again?

I have done extensive research and all I seem to find is poeple's opinions, this blog my be just what I need.

I really wish I could just remove the two old sites from googles index altogether and let the domain registration run out.

Thanks for any advice/help/ideas
it is much appreciated
Ryan; October 5, 2008 10:23 PM
tekgems said...: Where does eBay fall into all this? They list duplicate content all the time that also appear on other eCommerce sites. They also publish content based off of SKUs.; October 7, 2008 11:31 AM
God's Princess said...: I was wondering what would happened if someone stole your content and re-published it on their own site. Would you be penalized for this?; October 25, 2008 6:16 AM
Susan Moskwa said...: @Gods Princess:
Check out this article.; October 27, 2008 9:20 AM
My Digital Partner said...: A couple of previous comments have pointed out a similar problem to mine. I would like to have 2 similar sites running along each other for the .com and .co.uk top-domains. Most of the content would be the same except for some portfolio examples and the contact details. Google Searches within the UK would probably rank www.mydigitalpartner.co.uk higher up, while the "generic" mydigitalpartner.com does quite well when not selecting "Pages within UK".
So what do I do?
Redirect the .co.uk domain to .com, which would impact negatively on UK searches.
Upload the same content to .co.uk domain, taking the risk of being penalised?
How do other companies create country specific pages when the content is written in English and over 90% is the same?
Many thanks in advance,
Regards,

Greg; November 4, 2008 1:32 AM
Praveen said...: Let suppose i am publising pages as it originally posted at somewhere else but i add some remarks/ comment on the site. Is this a case of penalization by google; November 6, 2008 2:26 AM
UpScale Pet Accessories! said...: Very nice post. I wish there could be a little more info coming from google about how they are interacting with your site. The newest updates to google webmasters is great, although not updates as much as I'd like. I'm on there everyday! We have content that is coming from our supplier for our products and it appears we got dinged for it but I can't be sure. Since then we've re-written all unique product descriptions for about 600 products. I wish I could know for certain if this was the cause. I'll keep reading this blog. :)

- John
Dog Supplies; November 11, 2008 2:08 AM
info said...: RSS Aggregators such as Bloggapedia or Blogcatalog will take the legitimate site out of the search results, inluding quoted text (I think in the range of the first 200+ letters). This is a fact, not fiction. It is clearly a duplicate (aggregated) content but their site will be the only one in the Search Results.

So whatever you do, do not give your feeds to aggragators. I am resorted to putting IP bans on their bots.; November 11, 2008 11:18 AM
Rob:-] said...: All my web sites are set up to deliver the same page for the following URLs:

http://www.mydomain.com
http://www.mydomain.com/index.html
http://mydomain.com
http://mydomain.com/index.html

I'm guessing from this post that there is no penalty for any of these. Is that correct?

I only publish the mydomain.com on printed materials but some people still put the www. on the front out of habit.

Why does anybody use the www. on the front anymore? It seems like a waste of four keystrokes and bandwidth.; November 14, 2008 11:42 AM
info said...: www. is actually a subdomain and could have a different content. usually it doesn't.; November 14, 2008 11:49 AM
info said...: to "new laptop battery"

the issue of content is a separate story. in regards of why sites "disappear" from the google index has to do with algorithm. not only they are not present in the "ommitted" results, but they are not indexed at all on the "direct quote" if let's say some feed aggregator with higher ranking has an exact copy of your "original" but lower ranked site. i run many tests on many sites and i think i nailed this problem. my advice is not to submit your feed to aggregators as some of them will replace you in the search results.; December 2, 2008 6:29 PM
Greg said...: I run an image gallery and have a severe duplicate content issue. I used to rank #1 for all types of keywords, but now my site doesn't even rank.

I have not done anything to deceive anybody and feel very frustrated that no one can find my pages even when they do a detailed search for my photos.

I can't help that photos look like duplicate content to google. I do include a small description along with each photo but apparently that's not enough.

My pagerank on my homepage is 6 and on some of my photo category pages, they are 4.

I don't understand why Google has penalized my entire site. I can understand if individual photo pages are not in the index, but my category pages should be.; December 12, 2008 7:33 AM
Arthur Asker said...: Susan, Thank you for a well written post.

A. If i have two domain names "financialadvisors.com" and "financialadvisers.com" and i want to achieve the highest SERP for both spellings then should I create two web sites with duplicate content except for the spelling of advisor - this keyword will be on hundreds of pages so it would be clumsy to put both spellings in every article?

This would involve a lot more work in terms of linking strategy etc., but if you have one site with the consistent English spelling on hundreds of pages, then you would not have a high enough SERP to attract the American traffic?; January 21, 2009 6:32 AM
eRIZ said...: What about duplicate content in XSLT?

I've asked: http://www.google.com/support/forum/p/Webmasters/thread?tid=5c7e45eef252b0a1&hl=en

but nobody from Google staff answered.

Whatever, why don't Google have any method for contact?; February 12, 2009 1:08 PM
dickdetering said...: Webmaster tools is full of BUGS. I am 100% sure of that. My site is http://live-asian-webcam.com/ (consistantly everywhere with the slash in the end). After several weeks of 100% corrrect repair of my site, Google is still permanently warning me about duplicate content and not found urls. Several weeks ago I took the following measures:
1) robot.txt (google is unable to detect it while its own analysis tool reports that everything is correct !)
2) The robot txt file is even in the sitemap which has NO errors according to Google !
3) A perfect permanent Google friendly 301 redirect is NOT detected by Google.
4) According to Google the status of the resubmitted sitemap (including the robot txt file and including the homepage) is OK. However Google keeps on ignoring robots.txt, keeps on warning that it cannot find homepage etc.
5) Google completely ignores the canonical link element.
6) Google completely ignores a removal request of a duplicate page.
7) Google completely ignores a 100% correct internal and external linking structure.
What a bad software is that !!! It's about time Google fires some of its engineers !; March 28, 2009 10:50 PM
Hadara said...: Hi,
I am new to blogging. I publish my posts on both my personal blog and my company blog , all under my name. Is that ok? I asked around and no one can give a sure answer. I red your post passionately but I couldn't understand the answer to my question.. Thank you
Hadara; April 16, 2009 10:07 PM
Susan Moskwa said...: That is duplicate content (since the same content is available on two different sites), but doesn't sound manipulative. What will likely happen is that only one version will appear in search results--either the version from your personal blog or your corporate blog, but not both.; May 6, 2009 11:15 AM
J said...: This blog

http://www.informationengineer.org/

simply scraping my blog posts. And it takes over my ranking for my blog post previously was! Below are some other posts it copied exactly from my blog

http://www.informationengineer.org/2009/05/29/free-microsoft-powerpoint-download.html

http://www.informationengineer.org/2009/06/10/powerpoint-free-online-editor.html

http://www.informationengineer.org/2009/06/10/newmalwarej-trojan-remover-software.html

http://www.informationengineer.org/2009/06/10/newmalwarej-trojan-remover-software.html

http://www.informationengineer.org/2009/06/10/system-security-451-serial.html

http://www.informationengineer.org/2009/06/09/flv-to-mpg.html

http://www.informationengineer.org/2009/06/09/clear-cookies-in-firefox-by-using-cookies-remover.html

I wondering if Google says it is very smart then why it can't differentiate between the original or not? If you see that site is built solely on scraping other people's content and never get banned.; June 10, 2009 9:56 PM
Ellithy said...: my blog Nogoom FM has got 180 duplicate urls because of showcomments= (urls generated for every comment by blogger)
I use re="canonical"
however duplicate links increase !!; June 25, 2009 5:47 AM
belhana said...: thanks; June 29, 2009 12:42 AM
Mark said...: Hi Susan,

Chiming in here a little late but we have a situation which is becoming common place and I wonder what your take on it is.

Our German site is hosted on a .de TLD. We do well in Germany but in Austria and Switzerland, also German speaking countries, we are lagging behind. We want to have a .ch and .at version of our site as these TLD's are more localized.

My plan is to duplicate the site onto other German language domains and host these sites from their respective countries. I am however, worried that this might have an adverse effect on our main .de site.

Please let me know your thoughts as I really don’t want to create any problems for our main site or waste time by duplicating the sites on other German TLD's

Thanks in advance,

Mark; July 1, 2009 12:30 AM
danwan2 said...: Let's say I write a very good article and other bloggers want to republish that. Let's say 100 Blogger copy my article, repost it on their blog with a link back to my site.

What will happen?

When I understand it correctly, the other pages probably will not show up in the search results. But what about my site. Will the backlinks from the other blogs help my site to rank higher?

What if they have other articles on the same page? Is duplicate content calculated on a per sentence, paragraph or per page?

Thanks
Dan; August 16, 2009 11:55 AM
Malathy said...: More than the original post, the comments and answers to the comment threw more light on the duplicate content issue.
I thank everyone who has posted questions and shared their voice.; September 4, 2009 5:55 PM
Mark said...: If you know who (or what site) is re-posting your original content or scraping it, put that site address into Whoisit and find who owns the site and where it is hosted.
Send an email to abuse@thehostingcompany with a followed registered letter demanding copyright infringement. Include links to your original content (with proof it is yous, dates on the posts should do it) as well as links to the offenders on that host's servers and I have never seen a case where that domain wasn't taken down with-in a few days. (By law they have 30.) If the site is not taken down, you can get a lawyer and sue both the host for allowing copyrighted (intellectual property) materials on their servers 30 days after being notified of the infringement as well as those infringing your rights. These are international laws but not all countries are signatories, so there are some times that you have little or no recourse.; September 21, 2009 11:13 AM
ml30 said...: I recently re-wrote all of my duplicate HTML about 2 months ago, but Google Spider is still reflecting the old duplicate content in Webmaster tools?

Any idea how this can be reconsidered by google.

Thanks
Martin
(Low to lower webmaster savviness!); October 9, 2009 7:05 AM
blog header guy said...: Well, I'm still mystified despite your attempt at demystification. :)

Hypothetical:

I write an article about acquiring public domain content and submit a variation of this article to several article directories for publication.

I also post it on my sites new blog.

I am the author, so why would I be penalized for duplicate content on my own site?

I can certainly appreciate your dilemma, but there should be some recourse for me to get my content indexed.

Should I post my content first on my blog and then submit it to an article directory?

Thanks.; October 13, 2009 6:45 AM
Micheal said...: hi Susan

thanks for the article.

In my case, I have a domain name that uses the word "university" and I know that some clients use that word, and some use "college".

So, i'd like to capture those customers that search for university and college.

My experience is that people value the domain name as an indicator of the quality of the search result, so I'd like to use collegeurl.com and universityurl.com for people who search for "college" and "university" respectively.

thing is, if I use separate websites, apart from the admin hassle, the content will be generally the same across both. So, I run the risk of content duplication, if I understand the rules correctly.

What can I do? I'd like customers who think "college" to think that my site "collegeurl" is the one they need, and similarly for people who think "university".

If I had cookies for sale in the US, and wanted UK people to buy them, I'd expect them to search for "biscuits". But same product, same site more or less.

my feeling is that people value the domain name as an indicator of quality, so I'd like to use the domain name where possible.

Can I have two very similar websites, or can I do a 301 to one of them and that will be OK for SEO?; October 26, 2009 9:43 PM
angrezy said...: I would like to know duplicate content with respective to Tags and Catgories.
Eg: If my category name is jackson-songs

And i created a tag with jackson , songs.

Does this become a duplicate content?; December 7, 2009 1:39 PM
s said...: Thank you for clearing this up for us. However i am still confused. You say "There's no such thing as a "duplicate content penalty." At least, not in the way most people mean when they say that...There are some penalties that are related to the idea of having the same content as another site..." Isn't having the same content on another site called duplicate content? For example if i have an article published in many different article directories, what would be Google's reaction to that? would that be penalized or not?
Thank you very much in advance.; December 19, 2009 3:26 PM
info said...: if their site has higher ranking, i.e. amazon.com your site will be excluded from "search results". that is the way it works now.; December 21, 2009 6:05 PM
Susan Moskwa said...: @s: Yes, but there are some cases where this is okay and some where it's bad. If other people are republishing your article with your permission, that shouldn't be a problem. If someone is scraping your content and republishing it without your permissions, that's a problem. You have to look at the context in which duplication is happening.; December 22, 2009 9:55 AM
Dries said...: How will Google react if an amount of data has been thrown away (let's say 4 domains with exact duplicate content) and redirects are all properly set to the new pages? Does that lift a penalty (if there is one on those domains actually)?; January 12, 2010 9:43 AM
Paul Luther said...: I think part of what contributes to the confusion surrounding this topic comes from Google itself.

Just look at what appears to be contradictions.

At the start of the article you state: "There's no such thing as a duplicate content penalty".

Then you go on to say:
"There are some penalties that are related to the idea of having the
same content"

Ok. So there are some things that one could do that would be related to duplicate content but not exactly the same thing.

But then you advise:
"Don't create multiple pages, subdomains, or domains with
substantially duplicate content".

Also:

"Having duplicate content can affect your site in a variety of ways; but unless you've been duplicating deliberately it's unlikely that one of those ways will be a penalty..."

"...but unless you've been duplicating deliberately it's unlikely that one of those ways will be a penalty"

As you can see, even on the same page there is confusion.

Why would someone have to worry about using duplicate content deliberately if you stated that "There's no such thing as a "duplicate content penalty." ?

Either there are cases that you can be penalized for using duplicate content, which would mean your opening statement was not a correct statement.

Or one should not worry about being penalized for using duplicate content. Which makes your last statement incorrect.

This contributes to the confusion.
Especially when one person can't even agree with their own statements on the same page.; February 8, 2010 1:53 AM
Dave said...: OK, I've always been worried about duplicate content - particularly with regards to articles I've written. So many 'SEO experts' say you need to spin every article to make it appear unique. I write articles for my blogs (example Healthy Herbal Living and then submit the same articles to sites like Ideamarketers and Articlebase in the hope of getting recognition, syndication and, ultimately, backlinks. Perhaps now I can worry a little less about being penalised for having identical content, although I still feel a little uneasy!; February 11, 2010 2:07 PM
Editor said...: That seems a little confusing because I think of duplicate content in an entirely different way.
Isn't it naturally going to "penalize" one site or another if google seeks variety in top 10?

Are you telling me a site that plagerizes mine could end up better than me? Hypothetically what if a major site with tons of authority like the new york times started scraping every site and targeting all the #1 keywords they could find? What's to stop them from being thought of as the "original source" simply because they already have so much authority and history and site age and so on?; February 11, 2010 7:49 PM
FurnessPoker said...: Great article, it has almost always been a concern of the seo side of my site.; February 12, 2010 7:10 PM
Johnny Walker said...: I'd love some advice on a possible side-effect on your "duplicate content penalties". I recently put together a guide on a forum. Once it was finished I moved the guide over to my blog (ie. deleted the guide from the forum) and put a link on the forum directing the user to the new content.

Unfortunately for me, Google scraped my blog before it updated its cache of the forum. So according to its calculations, I had ripped content off a forum and put it on my blog.

My blog took a massive hit because of this.

A few days later it updated its cache of the forum, so it now has a link pointing to the content's new home... but my Blog is still being penalised.

Your system seems illequipped to handle content being legitimately MOVED (not copied) to a different domain.; February 23, 2010 6:59 AM
JadeDragon said...: So what happened when eHow decided to clone the whole eHow.com site as a allegedly eHow UK site. Many of the writers took a big hit in earnings as a result. eHow then redirected URLs but the redirects caused many pages to fall from the SERPs completely, even when searching for the exact article title.

I can't see the point of geo-localization when the information is universal like most eHow articles.

Is the information in this article still correct?; March 8, 2010 11:37 AM
say what? said...: What about duplicate content that is only a small percentage of the page? For example, my site has two pages about MPLS VPN. On the main MPLS VPN page, it has one paragraph that is the same as another one on an MPLS VPN sub page. But there is a lot of different content as well. Would Google treat these as duplicate pages?; March 16, 2010 8:42 AM
Psysum said...: Hi,

What if i post same article content of 2 different websites.

1) Posting my article first on a website whose ranking is lower than the other.
2) What would be the time or period of reposting the article
3) Would the lower rank site would be benefitted to have the article 1st?

Thanks for the help; March 30, 2010 8:26 AM
Kris said...: I noticed Google indexing my page over http and https, would that be consideration duplication content? Why would Google index over https when I don't have any links over https on my website?; March 31, 2010 3:45 PM
Karol said...: So here is my question.

I provide (sell) specialized content to multiple websites. They use the content as a monthly post/bulletin on their websites.

Each website has about 20 pages and they are unique in every way other than that one page.

Will that one page which holds "duplicate data" harm their rankings in anyway? At any given time there could be tens of websites with this duplicate content.

Thanks,
PJ; March 31, 2010 5:47 PM
Susan Moskwa said...: @Kris: Yes, that would be considered duplication, but this is a perfect example of unintentional duplication that shouldn't hurt your site in any way -- Google will just pick one version (http or https) to show in search results.

This could have happened because someone else linked to your site using https. It could also happen if you have relative links on your site; if someone enters an https part of your site and then clicks on a relative link, that link will be in https too.; April 5, 2010 1:35 PM
Susan Moskwa said...: @Karol: If the site has mostly unique content but a page or two with your syndicated content, the syndicated pages will be competing against other URLs with that syndicated content, but the rest of the site (with original content) shouldn't be affected.

If, however, the site is mostly made up of syndicated or copied content (perhaps from a variety of sources), it's unlikely to rank well and could potentially be considered in violation of our guidelines.; April 5, 2010 1:38 PM
info said...: Yes. Google is good about it. Content aggregators from other sites should be punished.; April 5, 2010 6:01 PM
Terry and Tony said...: I like what Stuey said on September 16, 2008 1:24 AM, as the same happens with our sites. Some of our good content or images are taken by sites who already do quite well in the search results on Google, and then appear higher up on page one or two than we do for the same search terms - when it was OUR original content to start with. It hasn't happened too often, but as we add more and more pages to our sites and blogs, it is starting to happen a bit more. We find it's not the end of the world, or that we don't get too upset, if at least they link to our original post or article too, but they don't always.; April 9, 2010 1:56 PM
Daniel Kamen said...: Hey Guys. What if I have a site and publish my press release. Then I share it with PR sites or document sharing sites. Is this considered duplicate content?; April 22, 2010 11:58 AM
Clean Star said...: I found this post very interesting and it brings up two questions in me:
I'm using a CMS for my site which creates non search engine friendly urls and also a plugin that redirects to search engine friendly urls. Now I have noticed Google crawls both kind of urls and I'm wondering if this means duplicate content or not.
Also for my shop I receive the product descriptions from my supplier and I know I am not the only vendor he has. So there will be more websites out there using the same product descriptions and probably the same category names.
Is this hurting my site and is this a reason why google decreases steadily the amount of indexed pages for my shop from once more than 2000 out of around 2500 (Oct-dec. 2009) to by now less than 600!!! out of more than 2700 pages.; April 26, 2010 10:16 AM
Hikari said...: Nice!

I have alternative domains in my site providing the exact same content. I was told I'd be penalized because of that, but I've seen foruns have alternative domains too and they are well ranked.

Don't penalize duplicate content, because that may harm innocent sites. Just choose the most relevant URL/domain and use it.; May 4, 2010 5:26 PM
Will said...: Hi Susan,

A very interesting read, thank you. I have a specific question relating to 'tag' pages in WordPress - I wonder how these post list pages affect duplicate content?

They tend to use excerpts that display the first 55 words of a post. These tag/post lists pages do not display the entire post - but the 55 words they display, are they enough to count as duplicate content and affect the ranking for the original post?

Thank you

Will; August 28, 2010 9:43 AM
Susan Moskwa said...: @Will: They could potentially be considered duplicate content depending on a searcher's query (if the user searches for text that appears in that snippet as well as in the original article). Keep in mind, though, that it doesn't "affect the ranking" of the original post other than that search engines are likely to display one URL and not the other. (It's not like it just makes the original post rank lower in general.); August 30, 2010 1:27 PM
Will said...: @Susan Moskwa

Thank you for the reply, good to know that the ranking for the original post will not be affected.

fyi - In the past few days I've started using a couple of features in WordPress that seem useful:

1) There is the option to write a manual excerpt for each post, this will be used in place of the "first 55-words excerpt" if present. I find this useful not only from a duplicate content point of view, but also because the first 55-words of a post don't always give an accurate reflection of its content. This manual excerpt is often similar to the meta description for the post.

2) Use Tag Descriptions! The nice thing about them is that they can be used to display information for a specific tag before all the posts for that tag are listed. I think this is really good because if a visitor to a WordPress website clicks through to a tag page/post list, it might not mean that much to them to see a whole bunch of posts with excerpts showing. The visitor might click away thinking it's not what they want. So as well as helping with duplicate content, tag descriptions can be used to enhance the tag, and give visitors the information they want. Tag pages no longer have to be just posts lists, they can be unique pages too.

Will; September 5, 2010 5:22 AM
kurtp said...: How about content that is adjusted for location? I manage a mobile notary site in San Diego that has a dozen different area pages talking about notary work in those areas with links to some other internal pages. The content isn't completely duplicate, but there are elements that are the same. Might I be penalized for this? I sure hope not, as these are natural and important.

Kurt
San Diego Mobile Notary; September 20, 2010 3:11 PM
cantstopdigging said...: Thanks for clarifying, I was afraid of having duplicate content, but now I can see there is nothing wrong with it and Google can deal with it without penalizing webmasters.; October 14, 2010 6:30 AM
EgyDoctor said...: i have a blog about free vista programs and tips http://vista-progs.blogspot.com and i used to publish my posts in different related forum for visitors to identify my blog would google thinks it is a duplicate content .... and how could i overcome its drawbacks on my blog seo and indexing...
and how about leaving an anchor text in this duplicate post ... could have any benefit or not ???
: (; October 26, 2010 3:14 PM
www.oblong.org said...: Hello Google,
I have a duplicate video on 2 different pages. Can I apply a Canonical tag to just the "object" i.e. the video.; November 12, 2010 7:16 AM
Phil Page said...: I have a personal blog and a corporate blog. The personal blog has some Google advertising, while the corporate blog does not. I write short articles for the corporate blog, and then add paragraphs within or at the end to expand on the content in the personal blog. I'm the author of all information; some is duplicated on one the corporate site, but adds content and value on the personal site. I dont want to do redirects or links back and forth. I'd like both to be recognized in a search engine since they are for different audiences and different purposes, but both authored by me.; November 18, 2010 4:35 AM
Phil Page said...: This comment has been removed by the author.; November 18, 2010 4:35 AM
Phil Page said...: One more question regarding syndication. My corporate website content will be syndicated by other customer sites, while recognizing the original corporate website. When finding the duplicated content, does Google just recognize the 'original' content and then disregard the syndicated for ranking purposes?; November 18, 2010 5:43 AM
Susan Moskwa said...: @Phil Page: If you publish posts as you describe in your first comment, without doing anything to indicate to search engines that they're duplicates, it's likely that search engines will simply choose which one to show to searchers based on each searcher's query. This sounds like what you'd like to happen.

Re: your question about syndication, Google tries to return whichever URL it thinks is most relevant to a user's query. Often this will be the original post (if an article is syndicated verbatim in more that one location), but that could vary if there's some content on one of the syndication sites that makes their URL more relevant to a particular user's query (for example, if the query was related to your article's topic but also related to their business's site somehow). If you don't ever want syndicated versions of your articles outranking your own, you might want to require others to put a rel="canonical" or a noindex meta tag on their versions of your articles.; November 29, 2010 3:14 PM
Global Success Network said...: Thanks for this information ! We shall write more original content on our blogs and website !; November 30, 2010 2:45 AM
PL@Yb0y said...: Dear Sir!
I want to ask you what happens if i publish a post on my blog and also publish the same post on EzineArticles.com, will it affect me in a negative way?; December 25, 2010 11:29 AM
Frank said...: If I have a shopping cart program on one domain and then simply copy the database and cart program to another domain, like www.paintball.co.za/catalog and www.paintballsouthafrica.co.za, will I get penalised?; January 4, 2011 8:47 AM
Walter said...: My client has a substantial print media advertising budget and wants to place unique URL's in each print ad for tracking purposes. They must be brief, so the solution she envisions is multiple domains with identical home pages and links from the home pages to the main site. I would use a robots.txt file to preclude content indexing. Is this a viable solution?; January 6, 2011 9:24 AM
John said...: It seems to me Google doesn't know who wrote the original materials and that seems to be a big issue with lots of folks and duplicate content. I've notice that some of my pages have now been dupicated word for word and I'm sure that has hurt my page rankings; though Google may indicate the contrary. Copyscape has pointed out many pages that I wrote that are now on other people's sites. Most are sites in foreign countries where I don't have any recourse to fix the problem. I don't think there's any way to prevent this and Google is going to continue penalizing people for their own works that have been copied. In essence, it forces people who write original content to continually revise their materials substantially to avoid the penalty. There are not enough hours in the day to accomplish this task. So, I think Google is being difficult and making it harder to do business on the internet by demanding "quality, original content". After 10 minutes of brand new material being on line, it's already being copied. The vicious cycle being all over again.

In my honest opinion, Google is going to have to find another way of evaluating content on sites. Otherwise folks will eventually begin to look for other avenues to conduct business. Where there is a will, there's a way. Google may be king now, but later it will eventually come back and hurt them in other ways. Facebook is actively looking at ways of taking away the market share Google now enjoys. If not careful, Google may eventually find itself in 2nd, 3rd or last place.; February 15, 2011 11:32 AM
Peter said...: I should preface this by saying I dont how Google decides on if a site has copied data or not.
But what about this scenario :-

If Bob has a domain that is 10 years old and full of his own content.

His neighbour Bill starts a new business / domain and posts his content.

Bill and Bob are competitors - so Bob decides to copy word for word Bill's content page and post on his own website.

How does Google decide which content was original and which isnt ? As I believe it - Google will check the age of the domain, so in this case Google will decide that Bob was first, when in reality it was Bill.

I'd love your thoughts on this; February 26, 2011 11:13 AM
Susan Moskwa said...: @Peter:
Google's goal is to return the content that is most relevant to a user's query. We don't just "relevance" purely on the age of each domain. Signals of authority and trustworthiness--such as how long a piece of content has been live, or who links to it--factor into our decision, among others. Often if someone scrapes someone else's website, the original site/content will still have more authority signals since it's been around longer, so it will be able to outperform the scraper. We are also able to tell when the content on a particular URL has changed; so just because Bob's domain is older than Bill's doesn't mean we automatically assume it originated the content.; February 28, 2011 10:52 AM
Red Wrappings said...: We have an e-commerce site that has multiple country TLDs. We serve the same product to each country but the currency is tailored to the country, so the US is in US$, Australia is in AU$ and the UK is in GBpounds. There are also differing delivery conditions in different countries. However the description of each product is the same. We have no wish to deceive anyone, just provide prices and delivery conditions relevant to the customer in their country. We suspect however that our traffic may be inadvertently diluted because of this strategy. Can you comment on the likely result of this strategy and suggest the best course of action for us whilst retaining multiple currencies, thank you.; March 2, 2011 4:17 AM
Susan Moskwa said...: @Red Wrappings:
Generally, I recommend that you take everything that's similar between these pages, and put it on one URL; and take everything that's different (prices, shipping, local addresses, etc.) and either put that on different pages, or in a section of the original page where users can pick which one applies to them. For example, you could set a cookie such that once a user selects their locale, they're always shown the relevant prices (with an option to change their locale later on), and anyone without the cookie set will see all the options.; March 7, 2011 2:57 PM
Vijay's Blog said...: Hi John Mu,
Thanks for the Google references above.
Friends of mine own 3 dental clinics in Melbourne.
I am about to create separate websites for each of them (all will be hosted on the same server):
• http://www.smile-point.com.au
• http://www.wyndham-smilecare.com.au
• htttp://www.manor-lakes-dental.com.au
They wish to create a knowledge bank (blog) of tooth problems/remedies for each of their sites.
All 3 of their sites are of equal importance/relevance (i.e. no canonical links).

Would they be penalised if they were to post the same content to 3 separate blogs?
Whats the best way to go about this?

Cheers, Vijay
melbournevijay at gmail dot com; March 12, 2011 4:02 PM
Richard said...: Two questions:

A: I "deliberately" have duplicate content on two different sites, my own website, and www.articlesbase.com. Articlesbase has an Alexa ranking of about 1000, which far surpasses my site. So, even if the individual webpages are roughly equal in hits, ranking, etc., does my duplicated articles on another website penalize my own website by forcing the crawlers to choose my duplicated content to display from the other site?

B: More than one other website has copied and pasted by original work without asking, and they both have a higher Alexa rating than my site. How can I protect my original writing? If I complain to Google, will they do anything, and even if they do something, will that help my site get off page 10 on the search engines?; March 25, 2011 8:59 PM
Daniel Kamen said...: my two sense is initially your internal links will lose the value as if they were coming from an external site, but you also must consider if these sites have same content, have the same look and feel, interlink and on same server there is the potential for them to think there is something fishy happening and getting a penalty. Always safer and better to have on different serversg; March 28, 2011 1:40 PM
sumi said...: there was spelling mistake on google india news webpage dated 29-mar-2011 09:15 am. in top right section of cricket updates. new zealand "WOM" the match instead of "WON". to whom shall we inform for correcting the spelling mistake?; March 28, 2011 8:55 PM
WWu777 said...: What about forum archives that create SEO friendly html static pages of my forum? Is that considered duplicate content? Here is my forum archive for instance:

http://www.happierabroad.com/forum/archive/index.php; April 17, 2011 9:44 PM
PC Tricks Tips said...: How do we check if someone copying our blog or web site content and post them in blog, website or forum?; July 1, 2011 1:26 AM
3eyegroup said...: Hi, i have one site, but i want to buy another domains point to same site.
ex: my site is www.domain.com and i buy www.otherdomain.com but both have the same hosting and the same site.
Google maybe penalized my website?; July 26, 2011 8:14 AM
Susan Moskwa said...: @3eyegroup: Make sure you 301 redirect the new domain(s) to your preferred domain, rather than serving a copy of the site at each URL, and you should be fine.; July 27, 2011 3:03 PM
Phil Page said...: How does google handle press releases? They are often reprinted word-for word on other blogs, as is their purpose. I receive Google updates every day of websites cut and pasting press releases. Are they penalized, or are the original websites posting the press releases penalized?; July 29, 2011 8:24 AM
nathey20 said...: Awwwwhhh I love you Google! You guys seem so nice!!

I promise the next time someone mentions a duplicate content penalty that I'll point them in this pages' direction! :-); September 27, 2011 1:55 AM
beafarr said...: Hi there, i have the following situation. A client of mine has a corporate site in English. Lets say this corporate site URL is: www.name.com. This client has many dealers (they sale a product) on 16 countries, so we bought domains such as: www.name.fi, www.name.ae, www.name.ca and so on.

The plan is that while all the content from www.name.com gets translated into all the different languages for all the dealers, in the mean time we want to have at least an English version under all the 16 domains- this content will have the same content as the corporative site (www.name.com) cause all the dealers sell the same products, and all have the same features etc… The corporate site is a purely informative site (product features that we cant change form dealer to dealer of course)

The question is: Do I get penalized by Google by doing this? It will be duplicated content but all is coming form the same company but on different URL for the 16 countries.

When we have all the translation for the 16 countries, we will add them and maybe we will take out the English version after that.

Can someone help me???
Thanks; September 29, 2011 3:41 AM
Kuldeep Bansal said...: i want to know why google decreased links of my site allexam-results.com & Bollywoodstorm.com from searching results ?; October 19, 2011 6:03 AM
The Final Webmaster. . said...: hey guys i have a confusing issue, i.e i have a blog which is about bloggger tricks,the issue is few 10 of other blogs have same html codes which i have posted in my blog post. .text content is different but few html codes are same n few other blogs. .does this creates a duplicate content issue??pls help me. .if possible pls review my blog http://pccafeworld.blogspot.com and suggest me which is duplicate content in eyes of google; November 3, 2011 10:34 PM
Ezekiel said...: I am working on a redesign of an existing website. Do to a few IT complications it is likely that the original site and new site (on a different domain) will be active for quite sometime. The content from the old site will be used on the new site.

While this clearly is duplicate content, is there a way to avoid penalization?; November 9, 2011 10:13 AM
Gina said...: I have a right-hand column that is the same on all my pages.

At the top, it has links to my "Top 5 pages", and under that some comments and compliments from visitors.

Would that be considered as duplicate content?; November 21, 2011 8:13 PM
Chankey Pathak said...: Someone is copying my site’s (www.chankeypathak.com) contents to his blog (chankey-pathake.blogspot.com). What should I do? I have already submitted a request to the Google. Is that enough?; December 9, 2011 12:48 PM
Gina said...: @Chankey Pathak: did you report the copyright infringement by clicking on "Report Abuse" at the top of the copycat's blog?

That's the place to do it.

I see that she has about 24 blogs, and all of them are Blogger blogs, no doubt they are all stolen -- it seems she just changes one letter, as she did for yours (chankey-pathake) -- another one she stole is thedirectdownload.com which she renamed 2thedirectdownload.com -- check the list, you'll see what I mean. Just look at the blog title on her profile, click on the link and you can clearly see how she has added a letter or number to an existing blog.

You could write to all the owners of the original blogs and denounce her; hopefully, they will all report her and Google will ban her for good.

Also, there have been new posts from Google about copying content since this one, and one of the things that they say is that Google is able to recognize which content came first, so in that sense you are protected.

I see you have posted insults on her blog, which she didn't even bother deleting. This is not the best strategy on your part. Do it the smart way instead.

Good luck!; December 11, 2011 9:58 AM
Arizona Mildman said...: I see a million (maybe an exaggeration) sites out there now trying to sell what they call Spintax Software that automatically paraphrases all words in an article except for the keywords you want repeated, in order to "AVOID DUPLICATE CONTENT" with a loosely worded post from Google that they say is a direct quote from Google about duplicate pages not being allowed. Accordiing to this article, unless you are a "scraper" (non intelligent context thief without the brains to paraphrase and edit what you just read) and can't write and investigate your own material, then, this whole industry is a scam and a way for software people to sell something to people who have too much time and money on their hands. Am I correct in assuming this?; January 5, 2012 12:10 PM
Artsy Chicas said...: Nobody seems to have answered any posts regarding legitimate duplication e.g.
1. geolocalisation: when a company sells products in more than one country or region
2. When large resellers e.g. Amazon, BestBuy, Asda sell the same range of products and subscribe to the same content providers for product enrichment data e.g. descriptions, vinification for wines, recipes etc.

Are we to assume they get round this by buying their way into the indexes and that smaller businesses that want to do the same thing will have to suffer because Google can't accommodate multi-region\multi-tenant sites?; January 6, 2012 2:57 AM
Susan Moskwa said...: This is a great solution for multilingual/multiregional content that's effectively duplicate.

For reseller sites, it may be hard to outrank the original source or other resellers if you're using the exact same product description or content feed that other sites do. Writing your own original content can help, especially if it's good -- www.woot.com is a great example of original copywriting helping to sell products.; January 17, 2012 11:30 AM
Education at your Control said...: What about symbolic links?? We own several domains, all hosted on our own dedicated server. Each domain maintains a unique front page and symbolic links to our main site's shopping cart in a sub-directory. Is this acceptable? Will one or all get penalized?; April 19, 2012 10:25 AM
Inspirational Quotes said...: I have a quotes blog with blogspot, and I recently pinged the blog with Pingomatic. I think it is highly likely I used the .co.uk domain, rather than the standard .com accidently, as I copy and paste the url into Pingomatic.

Since then, my blog has dropped out on Google search. Would pinging the blog from the local domain .co.uk rather than .com affect the blog and its position? Is it possible that Google would see this as copied content and drop the blog from its search results?; April 29, 2012 5:21 AM
Kelly Watt said...: I understand there is a no index file you can embed so Google does not index URLs that have duplicate content, such as on a local host during the web development process. If so, how can you do this to prevent the website in development from being indexed before you point the domain to the completed site files? Also, is it possible to rectify old indexed pages from a local host during development after the new site is live, and there are duplicate pages indexed in Google? Examples are found on our company website www.wikads.com where indexed pages from other sites can be found from the past.; May 11, 2012 9:29 PM
Google Webmaster Central said...: Hi everyone,

Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, free to visit and/or post your topic in our Webmaster Central Help Forum.

Thanks and take care,
The Webmaster Central Team; June 6, 2012 4:08 AM

Demystifying the "duplicate content penalty"

Friday, September 12, 2008 at 8:30 AM

135 comments:

Useful links

Labels

We love feedback!

New to Webmaster Central?

Subscribe via email

More Blogs from Google