Duplicate content is content that search engines recognize as re-posted or re-purposed. It can be the product of webmaster oversight or sloppiness, or an attempt by unscrupulous webmasters to manipulate search rankings by gaming Google's system. Either way, duplicate content can hurt your web efforts in the form of Google penalties and diluted keyword focus. Here are a few of the more common types of duplicate content.
Duplicate Content By Bad Design
Many websites contain duplicate content that is the product of bad web design. For example, some sites use the same title tag across every one of their pages, thinking this will increase their keyword value for the terms they've targeted. In practice, this draws keyword focus away from the pages that should be associated with your targeted keyword and redistributes that keyword value throughout the site willy-nilly.
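To make this concrete, here is a minimal sketch of the problem and the fix. The page names and the "Acme Agency" brand are hypothetical examples, not drawn from any real site:

```html
<!-- Bad: every page reuses the same title, diluting keyword focus -->
<!-- about.html, services.html, and contact.html all contain: -->
<title>Integrated Marketing | Acme Agency</title>

<!-- Better: each page gets a unique, descriptive title -->
<!-- about.html -->
<title>About Our Team | Acme Agency</title>
<!-- services.html -->
<title>Integrated Marketing Services | Acme Agency</title>
<!-- contact.html -->
<title>Contact Acme Agency</title>
```

With unique titles, the keyword "Integrated Marketing" stays concentrated on the one page you actually want ranking for it.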
Google and other search engines try to differentiate between websites whose duplicate content is caused by a server misconfiguration or design flaw and those that are genuinely ripping off other properties. That said, it is advisable to do everything in your power to keep this decision out of the "hands" of the search engines. Here are Google's guidelines for duplicate content.
Duplicate Content Caused By Scraping
Duplicate content can come from within a given website or from outside it. For example, a site could use the term "Integrated Marketing" as the title tag on numerous blog pages, or it could have a full page of copy describing "Integrated Marketing" copied and pasted from an authority site like Wikipedia. If the content was indexed first on Wikipedia, little if any value for that content will be attributed to the copycat site. Furthermore, where a high percentage of a site's content is duplicated, Google will take measures to lower the value of the offending pages.
To provide some SEO background: in the early days of SEO, black-hat webmasters looking to boost their rankings would often copy a competitor's high-ranking pages word for word in order to poach their search value. There were many instances of entire sites being ripped off, with the impostor websites outranking the original content for a given search. Google had to change its algorithm to account for this tactic, and it now has systems in place to devalue copied material.
Duplicate Content By CMS
With certain content management systems, such as Joomla!, duplicate pages get created automatically whenever a new page is created. For example, in its standard configuration Joomla! creates a PDF version and a print version of each new page. These pages are identical in content, so search engines see every copy beyond the first indexed version as a duplicate. The preferred way to rectify this is either (a) turning off the option in Joomla! that controls this feature, or (b) blocking search engines from crawling the PDF and print pages with your robots.txt file.
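For option (b), a robots.txt sketch might look like the following. The exact URL patterns depend on your Joomla! version and SEF/URL-rewriting settings, so treat these query-string patterns as assumptions to verify against your own site's PDF and print URLs:

```
# robots.txt - keep crawlers away from auto-generated PDF and print versions
# NOTE: these query-string patterns are assumptions; check the actual URLs
# your Joomla! install generates before using them.
User-agent: *
Disallow: /*?format=pdf
Disallow: /*&format=pdf
Disallow: /*?print=1
Disallow: /*&print=1
```

Note that the `*` wildcard is a nonstandard extension to robots.txt, but it is supported by the major crawlers, including Googlebot.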
Don’t give search engines an excuse to devalue your website. Avoid duplicate content!