Preventing Duplicate Content
Posted on January 01, 2014
There are three ways duplicate content can end up on the Web. All three are deliberate acts, and two of them are within a webmaster's control. To get the uncontrollable case out of the way first: this is content that some unscrupulous third party has plagiarized and posted as its own, without acknowledging the author or copyright owner.
The Duplicate Blog Post: SEO Sin
Posting the exact same blog column, update or opinion piece is the most common case of deliberately duplicated content. This is content that is exactly the same, word for word, published on different sites. Let's say you have an article about tulips published on site A. You also have an account on site B, and you decide to publish the article there as well. In fact, there are "black hat" SEO practitioners (so called because they are the underground, somewhat shady kind) who spam dozens of sites with duplicate content just to collect numerous inbound links, which search engines count in a site's favor. This is duplicate content that, expert opinion holds, the search engines are likely to notice. In the best case, one of the copies will not be indexed. At worst, both sites could be penalized with lower rankings. The reason is that Google and other search engines are said to favor fresh, original content and, accordingly, to penalize sites that republish the same material.
When to Use 301 Redirection
And then there is the legitimate situation of temporarily having the same content on two or more pages or domains. This can happen when the webmaster has changed the URL structure of the site or is in the process of migrating it to a new domain. Since it is desirable to keep the hard-earned ranking of the original pages, a canny webmaster will want to preserve their inbound links. The way to do this is with a 301 redirect, one of the best ways to avoid penalties for content that is duplicated during a move. 301 is the HTTP status code for "Moved Permanently": when a visitor requests the original page URL, the server answers with a 301 and the new address, which tells browsers (and search engine spiders as well) that the page has moved and sends them on to the new location. Needless to say, the old URLs must stop serving the original content and redirect instead, because as long as both copies stay live you clearly still have duplicate content. 301 redirects are implemented in the Web server configuration (Apache or IIS) or in server-side code such as PHP or ASP, a task usually handled by whoever administers the server.
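To make the mechanics concrete, here is a minimal sketch of a 301 redirect in Python, using only the standard library's http.server module. It is an illustration, not a recommended production setup: the paths, the example.com addresses and the names REDIRECTS, lookup_redirect, RedirectHandler and serve are all made up for the example, and on a real site the same mapping would more likely live in the Apache or IIS configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from old paths to their new addresses.
REDIRECTS = {
    "/tulips.html": "https://www.example.com/articles/tulips",
    "/about.php": "https://www.example.com/about",
}


def lookup_redirect(path):
    """Return the new URL for a moved path, or None if the path has not moved."""
    # Drop any query string so /tulips.html?ref=x still matches.
    return REDIRECTS.get(path.split("?", 1)[0])


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = lookup_redirect(self.path)
        if target is None:
            self.send_error(404, "Not Found")
            return
        # 301 = Moved Permanently; the Location header carries the new address.
        self.send_response(301)
        self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Answer HEAD requests the same way as GET requests.
    do_HEAD = do_GET


def serve(port=8000):
    """Run the redirect server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling serve() and then requesting http://127.0.0.1:8000/tulips.html should produce a 301 response whose Location header points at the new address. Note that the old path no longer serves the article at all, so only one copy of the content remains reachable.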