Polluting user-contributed reviews

A recent First Monday article by David and Pinch (2006) documents an interesting case of book review pollution on Amazon. A user review of one book compared it critically to another. Immediately afterward, a “user” posted another review that blatantly plagiarized a favorable review of the first book, and further user reviews plagiarized more.

When the author of the first book discovered the plagiarism, he notified Amazon, which at the time had a completely hands-off policy on user reviews and so refused to intervene even for blatant plagiarism. (The policy has since changed.) This is another example of the problem of keeping bad-quality contributions out.

David and Pinch remind us that when an Amazon Canada programming glitch revealed reviewer identities,

a large number of authors had “gotten glowing testimonials from friends, husbands, wives, colleagues or paid professionals.” A few had even ‘reviewed’ their own books, and, unsurprisingly, some had unfairly slurred the competition.

David and Pinch address the issue of review pollution at some length. First, they catalogue six discrete layers of reputation in the Amazon system, including user ratings of reviews by others and a mechanism to report abuse. Then they conducted an analysis of 50,000 reviews of 10,000 books and CDs. They identified the following categories of review pollution automatically (using software algorithms):

  • Reviews copied from one item to another in order to promote the sales of a specific item.
  • Reviews posted by the same author on multiple items (or multiple editions of the same item) trying to promote a specific product, agenda, or opinion.
  • Reviews posted by the same author using multiple reviewer identities to bolster support for an item.
  • Reviews (or parts thereof) posted by the same reviewer to increase their own credibility and/or to build their identity.
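
To make the first category concrete, here is a minimal sketch of how copied reviews might be flagged automatically. It is not David and Pinch's algorithm, which this post does not describe. It assumes a common near-duplicate technique instead: break each review into overlapping three-word "shingles" and compare reviews on different items by Jaccard similarity. The function names, the (review_id, item_id, text) record layout, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles in a review (lowercased, split on whitespace)."""
    words = text.lower().split()
    if len(words) < k:
        # Very short reviews become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_copied_reviews(reviews, threshold=0.5):
    """Flag pairs of reviews on *different* items whose text overlaps heavily.

    reviews: list of (review_id, item_id, text) tuples (an assumed layout).
    Returns a list of (review_id_a, review_id_b, similarity) tuples.
    """
    prepared = [(rid, item, shingles(text)) for rid, item, text in reviews]
    flagged = []
    for (id_a, item_a, sh_a), (id_b, item_b, sh_b) in combinations(prepared, 2):
        if item_a == item_b:
            continue  # only cross-item copying is of interest here
        score = jaccard(sh_a, sh_b)
        if score >= threshold:
            flagged.append((id_a, id_b, score))
    return flagged


# Made-up sample data, not taken from the study.
sample = [
    ("r1", "book-A", "This is a wonderful and deeply insightful book about gardening"),
    ("r2", "book-B", "This is a wonderful and deeply insightful book about cooking"),
    ("r3", "book-B", "Terrible pacing and a dull plot throughout"),
]
flagged = find_copied_reviews(sample)
```

On the sample data, only the r1/r2 pair crosses the threshold. Comparing every pair is quadratic in the number of reviews, so a corpus of 50,000 reviews would probably call for something cheaper, such as hashing shingles into buckets first. The sketch also ignores punctuation and says nothing about who posted each review, so the other three categories would need reviewer identity data as well.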

They also make an interesting point about the arms-race limitations of technical pollution screens:

The sorts of practices we have documented in this paper could have been documented by Amazon.com themselves (and for all we know may have indeed been documented). Furthermore if we can write an algorithm to detect copying then it is possible for Amazon.com to go further and use such algorithms to alert users to copying and if necessary remove material. If Amazon.com were to write such an algorithm and, say, remove copied material, this will not be the end of the story. Users will adapt to the new feature and will no doubt try and find new ways to game the system.

