Wikipedia is in trouble

I’m going out on a limb here: unless Wikipedia comes up with a coherent contribution policy that is consistent with the economic value of its content, it will start to deteriorate.
In a widely published Associated Press story, Brian Bergstein reports that Jimmy Wales, Wikipedia founder, Board Chair Emeritus, and currently President of for-profit Wikia, Inc., blocked the account of a small entrepreneur, Gregory Kohs, who was selling his services to write Wikipedia articles about businesses (openly, with attribution). Wales reportedly told Kohs that his MyWikiBiz was “antithetical to Wikipedia’s mission”, and that even posting his stories on his personal page inside Wikipedia so independent editors could grab them and insert them in the encyclopedia was “absolutely unacceptable”.
Before I get into my dire forecast, what is antithetical about someone who is paid as a professional writer to prepare content, especially if he is open about that fact? There are three “fundamental” Wikipedia editorial policies with which all contributions must comply:

  1. Neutral point of view (NPOV)
  2. Verifiability
  3. No original research

The first two are relevant here. NPOV means all content “must be written from a neutral point of view (NPOV), representing fairly and without bias all significant views.” Verifiability means “any reader should be able to check that material added to Wikipedia has already been published by a reliable source.” Kohs stated in his corporate materials that he is committed to compliance with these two policies: he would prepare the content for interested parties, but it would be neutral and verifiable. Of course, on any particular contribution other editors might disagree and choose to revise the content, but that is the core process of Wikipedia.
The problem is deep: arguably all contributors have a subjective (non-neutral) point of view, no matter how much they may wish and believe otherwise. What is rather remarkable about Wikipedia is how well the group editing process has worked to enforce neutrality (and verifiability) through collective action. In any case, there is no clear reason to believe a paid professional writer is going to be systematically non-neutral any more or less than a volunteer writer.
In part, this is just a simple statement about incentives. A reasonable starting point is to accept that everyone who makes the effort to research and write material for Wikipedia is doing it for some motivating reason. Research and writing take time away from other desirable activities, so unless the writer is consistently irrational, she by revealed preference believes she is getting some benefit out of writing greater than the opportunity cost of the foregone time. It follows directly that point of view might be biased by whatever is motivating a given writer. To believe otherwise is naive. Dangerously naive, for the future of Wikipedia.
Even if the “everyone is motivated by something” argument is too subtle for some true believers in massive social altruism, there is an obvious problem with Wikipedia’s position on Gregory Kohs: surely there are many, many writers who are being paid for time and effort they devote to Wikipedia, but who are not being open about it. Consider, for example, employees of corporations, non-profits, educational institutions, etc., who are asked to maintain a Wikipedia entry on their employer and who do so from an IP address not traceable to it (e.g., from home). We already know from past experience that political operatives have made sub rosa contributions.
So, the problem of distinguishing between a priori neutral and a priori non-neutral contributors is deep and possibly not amenable to any reasonably effective solution. This is a fundamental problem of hidden information: the contributor knows things about her motivations and point of view that are not observable by others. Rather, others can only infer her motivations, by seeing what she writes, and at that point, the motivations are moot: if her content is not neutral or verifiable, other editors can fix it, and if she systematically violates these principles, she can be banned based on what she did, not who she purports to be.
Indeed, given the intractability of knowing the motivations and subjective viewpoints of contributors, it might seem that the sensible policy would be to encourage contributors to disclose any potential conflicts of interest, to alert editors to be vigilant for particular types of bias. This disclosure, of course, is exactly what Kohs did.
And now, for my prediction that Wikipedia is in trouble. Wikipedia has become mainstream: people in all walks of life rely on it as a valuable source of information for an enormous variety of activities. That is, the content has economic value: economic in the sense that it is a scarce resource, valuable precisely because for many purposes it is better than the next alternative (it is cheaper, or more readily available, or more reliable, or more complete, etc.). Having valuable content, of course, is the prime directive for Wikipedia, and it is, truly, a remarkable success.
However, precisely because the content has economic value to the millions of users, there are millions of agents who have an economic interest in what the content contains. Some are interested merely that content exist (for example, there are not many detailed articles about major businesses, which was the hole that Kohs was trying to plug). Others might want that content to reflect a particular point of view.
Because there is economic value to many who wish to influence the content available, they will be willing to spend resources to do the influencing. And where there are resources — value to be obtained — there is initiative and creativity. A policy that tries to ex ante filter out certain types of contributors based on who they are, or on very limited information about what their subjective motivations might be, is as sure to be increasingly imperfect and unsuccessful as is any spam filtering technology that tries to set up ex ante filtering rules. Sure, some of this pollution will be filtered, but there will also be false positives, and worse, those with an interest in influencing content will simply find new clever ways to get around the imperfect ex ante policies about who can contribute. And they will succeed, just as spammers in other contexts succeed, because of the intrinsic information asymmetry: the contributors know who they are and what their motivations are better than any policy rule formulated by another can ever know.
So, trying to pre-filter subjective content based on extremely limited, arbitrary information about the possible motivations of a contributor will just result in a spam-like arms race: content influencers will come up with new ways to get in and edit Wikipedia, and Wikipedia’s project managers will spend ever increasing amounts of time trying to fix up the rules and filters to keep them out (but they won’t succeed).
This vicious cycle has always been a possibility, and indeed, we’ve seen examples of pollution in Wikipedia before. The reason I think the problem is becoming quite dangerous to the future of Wikipedia is its very success. Because Wikipedia has become such a valuable source of content, content influencers will be willing to spend ever increasing amounts to win the arms race.
Wikipedia is, unavoidably (and hooray! this is a sign of the success of its mission) an economic resource. Ignoring the unavoidable implications of that fact will doom the resource to deteriorating quality and marginalization (remember Usenet?).
Ironically, at first blush there seems to be a simple, obvious alternative right at hand: let Wikipedia be Wikipedia. The marvel of the project is that the collective editorial process maintains very high quality standards. Further, by allowing people to contribute, and then evaluating their contributions, persistent abusers can be identified and publicly humiliated (as Jimmy Wales himself was when he was caught making non-neutral edits to the Wikipedia entry about himself). Hasn’t Wikipedia learned its own key lessons? Let the light shine, and better the devil you know.
(Wikipedia itself offers an enlightening summary of the battle of Kohs’s efforts to contribute content. This summary serves to emphasize the impossibility of Wikipedia’s fantasy of pre-screening contributors.)

Good in or bad out?

In his New York Times Circuits Newsletter, David Pogue writes about Microsoft’s recent gift of $2200 laptops to about 90 bloggers who write about technology — laptops loaded with about-to-be-released Vista and Office 2007.
Reviewers need access to the technology they are reviewing, but as Pogue notes, MS could lend the computers.
But I’m more interested in the general point Pogue makes: we live in a culture in which most journalists are trained, and managed by editors who direct them, to adhere to ethical guidelines that, among other things, prohibit accepting gifts from the subjects of stories and reviews presented as objective. But technology is moving faster than culture, and a whole new class of influential communicators (bloggers) has emerged who for the most part are not trained or managed to follow a specific code of ethics.
If bloggers want durable credibility and success, the culture (theirs and the greater context in which they are embedded) will need to evolve practices and standards that establish and maintain trust. Without trust — especially at blogs that specialize in providing information for costly decisions, like purchasing consumer electronics and software — bloggers will lose their audiences. The speed of the development of reliable practices and reputation mechanisms may determine which parts of the blogosphere succeed, and whether much of it degenerates into a morass of spam-like paid (but disguised) product placement announcements.

Spam as security problem

Here is the blurb Rick Wash and I wrote for the USENIX paper (slightly edited for later re-use) about spam as a security problem ripe for ICD treatment. I’ve written a lot about spam elsewhere in this blog!

Spam (and its siblings spim, splog, spit, etc.) exhibits a classic hidden information problem. Before a message is read, the sender knows much more about its likely value to the recipient than does the recipient herself. The incentives of spammers encourage them to hide the relevant information from the recipient to get through the technological and human filters.

While commercial spam is not a traditional security problem, it is closely related due to the adversarial relationship between spammers and email users. Further, much spam carries security-threatening payloads: phishing and viruses are two examples. In the latter case, the email channel is just one more back door access to system resources, so spam can have more than a passing resemblance to hacking problems.


Here’s a paragraph Rick Wash and I wrote for the USENIX paper, somewhat revised for later use, concerning spyware:

An installer program acts on behalf of the computer owner to install desired software. However, the installer program is also acting on behalf of its author, who may have different incentives than the computer owner. The author may surreptitiously include installation of undesired software such as spyware, zombies, or keystroke loggers. Rogue installation is a hidden action problem: the actions of one party (the installer) are not easy to observe. One typical design response is to require a bond that can be seized if unwanted behavior is discovered (an escrowed warranty, in essence), or a mechanism that screens unwanted behavior by providing incentives that induce legitimate installers to take actions distinguishable from those who are illegitimate.
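
The bond idea can be made concrete with a toy calculation. The sketch below is not the screening model from the paper; it assumes a hypothetical installer posts the bond only when its expected payoff is non-negative, and every number in it (gains, seizure probabilities, candidate bond sizes) is invented for illustration.

```python
def participates(gain, p_seize, bond):
    """Assumed decision rule: an installer posts the bond only if its
    expected payoff (gain minus expected bond forfeiture) is non-negative."""
    return gain - p_seize * bond >= 0


def separating_bonds(legit, rogue, candidates):
    """Return the candidate bond sizes that a legitimate installer accepts
    and a rogue installer declines, so that posting the bond is a signal."""
    return [
        b for b in candidates
        if participates(legit["gain"], legit["p_seize"], b)
        and not participates(rogue["gain"], rogue["p_seize"], b)
    ]


# Hypothetical parameters: the rogue installer gains more per install but is
# far more likely to be caught and to forfeit the bond.
legit = {"gain": 10.0, "p_seize": 0.01}
rogue = {"gain": 30.0, "p_seize": 0.60}

bonds = separating_bonds(legit, rogue, range(0, 1000, 100))
print(bonds)
```

Under these made-up numbers, a bond that is too small lets both types in, and one that is too large drives out the legitimate installers as well; only the bonds in between separate the two types.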

ICD for information security

Rick Wash and I published a paper at the USENIX Hot Topics in Security Workshop in August 2006 titled “Incentive-Centered Design for Information Security”. From the abstract:

Humans are “smart components” in a system, but cannot be directly programmed to perform; rather, their autonomy must be respected as a design constraint and incentives provided to induce desired behavior. Sometimes these incentives are properly aligned, and the humans don’t represent a vulnerability. But often, a misalignment of incentives causes a weakness in the system that can be exploited by clever attackers. Incentive-centered design tools help us understand these problems, and provide design principles to alleviate them. We describe incentive-centered design and some tools it provides. We provide a number of examples of security problems for which incentive-centered design might be helpful. We elaborate with a general screening model that offers strong design principles for a class of security problems.

I will start posting some short examples from this position paper concerning thoughts about the relevance of ICD to a variety of information security problems.