Duplicate Content Guidelines

Last updated: Jun 03, 2022

Figure 1: A pink sticky note labeled “Original” surrounded by many yellow sticky notes labeled “Copy”, illustrating duplicate content. Duplicate content occurs when one piece of content is exactly the same as another. The guideline below explains this in more detail.

Avoiding Duplicate Content

Duplicate content is often created unintentionally, but it can still hurt a site's ranking in the SERP, and deliberate duplication can lead to a penalty from Google. Here is how to deal with duplicate content.

Avoid Creating Duplicate Content

In general, duplicate content refers to substantial blocks of content, within a domain or across domains, that either completely match other content or are very similar to it. Mostly, duplicate content is not meant to deceive. Examples of harmless duplicate content include:

  • Discussion forums that generate both regular and stripped-down pages targeted at mobile devices
  • Store items displayed or linked via multiple different URLs
  • Printer-only versions of web pages

If your site contains many pages with similar content, you can indicate your preferred URL to Google in a number of ways. (This is called “canonicalization”.)

However, in some cases, content is intentionally duplicated across domains to manipulate search engine rankings. Such practices lead to a poor user experience, because visitors see essentially the same content repeated in their search results.

Google puts a lot of effort into indexing and displaying pages with distinct information. For example, if your site has both a “regular” and a “printer-only” version of each article, and neither is blocked with the noindex meta tag, Google will choose one of them to list.

In the rare cases where Google determines that duplicate content is being displayed with the intent of manipulating rankings and deceiving users, it will also make appropriate adjustments to the indexing and ranking of the sites involved.

As a result, the site's ranking may drop, or the site may be removed entirely from Google's index, in which case it will no longer appear in search results.

There are steps you can take to proactively address duplicate content and ensure that visitors see the content you want them to see.

1. Use 301 redirects

If you have reorganized your website, use 301 redirects (“RedirectPermanent”) to redirect users, Googlebot, and other spiders to the new URLs. (In Apache, you can do this in the .htaccess file; in IIS, you can do this through the admin console.)
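
Where the redirect has to be issued by the application itself rather than by Apache or IIS, the same idea can be sketched as a small WSGI app in Python. This is only an illustration: the `MOVED` mapping and its paths are hypothetical, not taken from any real site.

```python
# Hypothetical mapping of old paths to their new locations after a reorganization.
MOVED = {
    "/old-page": "/new-page",
    "/blog/2019/post": "/articles/post",
}


def app(environ, start_response):
    """WSGI app: answer moved paths with a 301 so users and crawlers follow the new URL."""
    path = environ.get("PATH_INFO", "/")
    target = MOVED.get(path)
    if target is not None:
        # 301 signals a permanent move, unlike the temporary 302.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"page content"]
```

For a local try-out, the app can be served with the standard library's `wsgiref.simple_server.make_server`.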

2. Be consistent

Keep your internal linking consistent. For example, do not link to all three of the following variants of the same page; pick one and use it everywhere:

http://www.example.com/page/

and

http://www.example.com/page

and

http://www.example.com/page/index.html
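
One way to keep internal links consistent is to pass every link through a single normalizing function before it is written into a page. The sketch below assumes, purely for illustration, that the site prefers the trailing-slash form and treats `index.html` as the directory URL; your own site may prefer a different form.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: index files are served at the directory URL on this site.
INDEX_FILES = ("index.html", "index.htm")


def normalize_internal_url(url: str) -> str:
    """Map the common variants of a page URL onto one preferred form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # Drop a trailing index file: /page/index.html -> /page/
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # Add a trailing slash unless the last segment looks like a file (has an extension).
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment))
```

With this function, all three example URLs above come out as http://www.example.com/page/.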

3. Use top-level domains (TLDs)

Figure 2: Examples of top-level domains (TLDs): .gov for government sites; .int for international organizations; .info for informational sites; .net for internet and network purposes; .com for commercial sites; .org for organizations; .pro for professional use; .eu for the European Union; and .edu for educational sites. Other TLDs are not shown in the picture.

To help Google serve the right version of a document, use country-specific top-level domains wherever possible to handle country-specific content. Google is more likely to know that

http://www.example.de

contains content focused on Germany than it is for

http://www.example.com/de

or

http://de.example.com

4. Syndicate content carefully

If you syndicate your content on other sites, Google will always show the version it deems most suitable for users in each particular search, which may or may not be the version you prefer.

It therefore helps if each site that carries your content also links back to your original article. You can also ask those who use your content to add the noindex meta tag, which prevents search engines from indexing their version.

5. Minimize boilerplate repetition

For example, instead of including lengthy copyright text at the bottom of every page, include a brief summary and link it to a page with more details.

The Parameter Handling Tool could also be used to define how Google should handle URL parameters, although Google retired that tool from Search Console in 2022.

6. Avoid publishing stubs

Users don't like seeing “empty” pages, so avoid placeholders where possible. For example, don't publish pages that have no real content yet. If you do create a placeholder page, use the noindex meta tag to block it from being indexed.
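
A quick way to audit placeholder pages is to check their HTML for a robots meta tag containing noindex. The sketch below uses only Python's standard library; the function and class names are illustrative, and it only looks at `<meta name="robots">`, not at HTTP headers or crawler-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list, e.g. "noindex, nofollow".
            for item in attr.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```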

7. Understand your content management system

Make sure you are familiar with how content is displayed on your website. Blogs, forums, and related systems often show the same content in multiple formats. For example, a blog entry can appear on the blog homepage, on an archive page, and on a page of other entries with the same label.
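
To see where a content management system repeats the same text, one option is to fingerprint each page's body text and group the URLs that share a fingerprint. This is a minimal sketch: it assumes you have already extracted the text of each page into a dictionary, and the URLs in the usage note are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and case, so trivial differences are ignored."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicate_groups(pages: dict[str, str]) -> list[list[str]]:
    """Return lists of URLs whose body text is identical after normalization."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

For instance, if `/blog/` and `/archive/2022/` both carry the text of the same entry, they end up in the same group, while a page with different text is left out.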

8. Minimize similar content

If you have many similar pages, consider expanding each page or consolidating the pages into one. For example, if you have a travel site with separate pages for two cities, but the information on both pages is the same, you can either merge them into a single page that covers both cities or expand each page so that it contains unique content about its city.
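
Pages that are similar rather than identical can be spotted by comparing their text pairwise. The sketch below uses `difflib.SequenceMatcher` from Python's standard library and compares word by word; the 0.9 threshold is an arbitrary assumption to tune for your own site, not a value used by Google.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Return a ratio between 0 and 1 for how much two texts overlap, word by word."""
    return SequenceMatcher(None, text_a.split(), text_b.split()).ratio()


def near_duplicates(pages: dict[str, str], threshold: float = 0.9):
    """Yield (url, url, score) for every pair of pages at or above the threshold."""
    urls = sorted(pages)
    for i, first in enumerate(urls):
        for second in urls[i + 1:]:
            score = similarity(pages[first], pages[second])
            if score >= threshold:
                yield first, second, score
```

Two city pages that differ only in the city name would be reported as a pair here. Because every page is compared with every other page, this approach suits small sites or a sample of pages rather than a full crawl.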

Google does not recommend blocking crawler access to duplicate content on your site, whether with a robots.txt file or by other methods. If search engines can't crawl pages that contain duplicate content, they can't automatically detect that these URLs point to the same content, and they will therefore have to treat them as separate, unique pages.
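
To check whether your robots.txt is blocking duplicate URLs from crawlers, you can test the rules with `urllib.robotparser` from Python's standard library. The helper name, the rules, and the URLs below are hypothetical, and the standard-library parser may not interpret every rule exactly as Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given robots.txt rules stop the crawler from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules that hide printer-only pages from all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

hidden = blocked_urls(
    ROBOTS_TXT,
    ["http://www.example.com/article", "http://www.example.com/print/article"],
)
```

Any duplicate URL that shows up in `hidden` is one that search engines cannot crawl and so cannot recognize as a duplicate.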

A better solution is to allow search engines to crawl these URLs, but mark them as duplicates using the rel="canonical" link element, the URL parameter handling tool, or a 301 redirect. If duplicate content causes Google to crawl your site too often, you can also adjust the crawl rate setting in Search Console.
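
As a small illustration of the canonical link element, the helper below builds the tag that a duplicate page would place in its `<head>` to point at the preferred URL. The function name is hypothetical; the URL is escaped so that characters such as `&` in query strings stay valid inside the attribute.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Build the <link rel="canonical"> element for a duplicate page's <head>."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'
```

Every variant of a page, such as its printer-only version, would carry the same tag pointing at the one preferred URL.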

Duplicate content on a site is not grounds for action against that site unless the duplication appears intended to deceive and to manipulate search engine results. If your site has a duplicate content issue and you don't follow the guidelines above, Google will simply choose a version of the content to show in search results.

However, if Google's review indicates that you engaged in deceptive practices and your site has been removed from search results, review your site thoroughly and consult the Google Webmaster Guidelines for more information. Once you've made your changes and are sure that your site no longer violates the guidelines, submit it to Google for reconsideration.

In some situations, Google's algorithm may select a URL from an external site that hosts your content without your permission. If you believe another site is duplicating your content in violation of copyright law, contact the site's host to request removal.

Additionally, you can ask Google to remove the pages that duplicate your content from search results by filing a request under the Digital Millennium Copyright Act.

Those are the guidelines for avoiding duplicate content and the ranking penalties that can come with it.

Use the Content Guidelines for Your Website

AGENCY WEBSITE

Agency websites should contain no duplicate content, to avoid penalties from Google.

E-COMMERCE WEBSITE

E-commerce websites are particularly prone to duplicate content because the sales copy for similar items tends to be alike.

BRAND WEBSITE

Brand websites must be free of duplication so that search engines judge the content to be of good quality.

BLOG WEBSITE

Blog websites can reduce duplicate content by checking posts regularly with a plagiarism checker.
