How Google Might Identify Duplicate Content

As SEOs, we have all wondered just how Google decides which piece of content is the original source and which is the duplicate. It is extremely frustrating when you notice somebody ranking with content that you produced instead of your own page. So how does Google decide which is the duplicate content?

We will never know for certain, but reading a new patent today, it seems that to decide which page to display out of a set of identical or extremely similar pages, Google looks for “quality signals”. Just what these signals are we can’t be 100% sure, but we can assume that they will take into account:

Link information

Creation date

Some form of page scoring (possibly PageRank, as a page with some PR may indicate it has been around longer)

Anchor text information

Popularity information (social buzz surrounding a piece)

Age of the website producing the content

Don’t get me wrong, I am not saying that the signals above are ranking factors. All I am saying is that they may be used when deciding which copy is the original, so don’t confuse the two.

What we can take from the new patent is that Google is definitely going to get better at identifying original content, so always make sure you produce new material rather than copying.

Another clue to what may be coming from Google to help identify original content is the release of the new “News Attribution Metatags”.

These tags are there for the benefit of news producers. The idea is that if you are the source of a news article, you tag it to say so. If you are syndicating that content, you tag yourself as the syndicator. Google can then quickly check the tags and display results accordingly.
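As I understand the announcement, the two tags are plain meta elements named `original-source` and `syndication-source`, each with the relevant URL in its `content` attribute. Here is a rough sketch of how a crawler might read them, using only Python's standard library. The example.com URL and the function names are made up for illustration, and this says nothing about how Google actually processes the tags.

```python
from html.parser import HTMLParser

# Tag names as I understand them from Google's announcement.
ATTRIBUTION_TAGS = {"original-source", "syndication-source"}


class AttributionParser(HTMLParser):
    """Collects the content of any attribution meta tags found in a page."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name in ATTRIBUTION_TAGS and content:
            self.found.setdefault(name, []).append(content)


def read_attribution(html):
    """Return {tag name: [urls]} for the attribution tags in an HTML string."""
    parser = AttributionParser()
    parser.feed(html)
    return parser.found


# A syndicated copy pointing back at the (hypothetical) original story.
SYNDICATED_PAGE = """
<html><head>
<title>Big story</title>
<meta name="syndication-source" content="https://example.com/news/big-story">
</head><body>Story text here</body></html>
"""

if __name__ == "__main__":
    print(read_attribution(SYNDICATED_PAGE))
```

A page with neither tag simply returns an empty dictionary, so the absence of a claim is easy to tell apart from a claim.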

You may be thinking that if it’s that easy, you will just tag all your duplicate content as the source. Google has clearly stated that it will come down hard on any website abusing these tags and, based on the signals above, it has plenty of ammunition for finding duplicate content. So is it worth abusing?

The “Agent Rank” patent from Google seems to be what Google is aiming for by releasing the “News Attribution Tags”: they work in a similar way, allowing you to tag a page as the source, but on a much smaller, more trackable vertical.

If the tags are successful, I think we will see “Agent Rank” rolled out. Agent Rank would also come in handy for Google Places, which seems to use reviews as a ranking factor. What I mean is that if Agent Rank comes into force and I have a high Agent Rank, a review I leave on somebody’s Google Places page will count for more than one from somebody with a lower Agent Rank.
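Purely as a thought experiment, the review idea could look something like the sketch below, where each review's star rating is weighted by the reviewer's Agent Rank. The 0 to 1 rank scale and the weighted-average formula are my own assumptions. Nothing here is taken from the patent or from how Google Places actually works.

```python
def weighted_rating(reviews):
    """Average star rating where each review counts in proportion to the
    reviewer's (hypothetical) Agent Rank.

    reviews: list of (stars, agent_rank) tuples.
    Returns None when there is no weight to average over.
    """
    total_weight = sum(rank for _, rank in reviews)
    if total_weight == 0:
        return None
    return sum(stars * rank for stars, rank in reviews) / total_weight


def plain_rating(reviews):
    """Ordinary unweighted average, for comparison."""
    if not reviews:
        return None
    return sum(stars for stars, _ in reviews) / len(reviews)


if __name__ == "__main__":
    # One trusted reviewer leaves 5 stars, one low-ranked reviewer leaves 1 star.
    reviews = [(5, 0.9), (1, 0.1)]
    print(plain_rating(reviews), weighted_rating(reviews))
```

With these two reviews the plain average sits at 3 stars, while the weighted version lands at roughly 4.6 because the high-ranked reviewer's opinion dominates.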

All in all, this whole article revolves around one thing: content is still king, and if you continue to write top-quality original content you won’t go far wrong.

