
Submissions/Wikipedia's health: A socio-technical overview

This is an accepted submission for Wikimania 2015.

Submission no.
Title of the submission
Wikipedia's health: A socio-technical overview
Type of submission (discussion, hot seat, panel, presentation, tutorial, workshop)
Author of the submission
Aaron Halfaker
E-mail address
Country of origin
Affiliation, if any (organisation, company etc.)
Wikimedia Foundation
Personal homepage or blog
Abstract (at least 300 words to describe your proposal)

Socio-technical systems like Wikipedia are fascinating, but poorly understood. While we all celebrate Wikipedia's success, it's unclear how that success came about and how it could be applied to other open knowledge projects. We need to develop a theory of open knowledge systems in order to achieve our mission. In this presentation, I'll show how systems science can help us develop such a theory and discuss some of my research that suggests we have bigger problems than the "editor decline".

Introduction & call for theory

Why do we need Theory?

Wikipedia works in practice, not in theory -- so they say. And that's great until you want to create another Wikipedia -- or Wikipedia stops working.

I'm guessing that a lot of people in this room would love to have simple and robust answers to the following questions:

  • What makes an open knowledge community like Wikipedia work and not work?
  • How do such communities mature with time and how do their needs change at scale?
  • How can we best support open knowledge construction with technology?

I work for the Wikimedia Foundation. Right now, some of our primary concerns are related to Wikipedia's community health. But what the heck is "community health"? Right now, I don't think that we have a very good way of thinking about it. We tend to use crude metrics like the number of "active users" and some intuitive canaries like the survival rate of newcomers -- both of which have been tanking. So, is Wikipedia unhealthy? Could the trends we see be benign -- or maybe even desirable? Either way, they may be inevitable. What's most important is that we all need to know how community health works in order to achieve our mission and do our jobs.
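To make the "crude metrics" concrete, here is a minimal sketch of a newcomer survival-rate calculation. The edit log, the 60-day window, and the definition of "survival" (editing again within the window of one's first edit) are all invented for illustration -- they are not the Foundation's actual definitions:

```python
from datetime import datetime, timedelta

# Hypothetical edit log: (username, timestamp) pairs.
edits = [
    ("alice", datetime(2015, 1, 1)),
    ("alice", datetime(2015, 3, 1)),   # returns within the window -> survived
    ("bob",   datetime(2015, 1, 5)),   # never returns -> did not survive
]

def survival_rate(edits, window=timedelta(days=60)):
    """Fraction of newcomers who edit again within `window` of their first edit."""
    first_edit = {}
    survived = set()
    for user, ts in sorted(edits, key=lambda e: e[1]):
        if user not in first_edit:
            first_edit[user] = ts
        elif ts - first_edit[user] <= window:
            survived.add(user)
    return len(survived) / len(first_edit)

print(survival_rate(edits))  # 0.5
```

Even a toy definition like this makes the point: "health" depends entirely on which window and which threshold you pick, which is why we need better theory behind the metrics.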

System theory, the Paramecium, and Wikipedia

Systems theory can be helpful here. By thinking about Wikipedia as a living system within an ecology, I think that we can start to address our questions.

I won't have enough time to give a full introduction to systems theory or a review of recent work. Instead, I'll draw a comparison that systems theory affords, in an attempt to get us thinking about how to apply such strategies to the projects we are responsible for.

To give you a sense for what I mean, let's examine a living system that may be more intuitive.

A biological example

Single cellular organisms have permeable membranes. By this I mean that water and many other chemicals can flow in and out of the cell relatively freely. This is critical for a lot of reasons, but can also cause problems. One problem is osmosis in freshwater environments. When the concentration (PPM) of salt inside of a cell is greater than outside, water rushes in. If this continues without remedy, a cell will literally burst.

In order to survive in fresh water, Paramecia have evolved "contractile vacuoles". These specialized sub-systems allow paramecia to thrive in freshwater environments -- occupying an ecological niche that would otherwise have been impossible.

Knowledge of contractile vacuoles and the types of problems that they solve can be powerful when understanding the health of a paramecium. Imagine we were looking at a particular paramecium with as much scrutiny as we tend to bring to the systems that we study. We might notice that our paramecium is having trouble maintaining the right amount of water pressure and seems to be suffering for it. Because we know something about the problem of osmosis and the subsystem of contractile vacuoles, we'd be likely to look there first.

A socio-technical example

Like other open knowledge communities, Wikipedia also has a permeable membrane. This is critical for a lot of reasons, but it can also be a problem. One problem is bad-faith users who cause vandalism and damage. We might imagine that, if vandals began to outnumber good-faith editors, Wikipedia would succumb to damage and cease to be a relevant source of information.

Enter the socio-technical quality control subsystems in Wikipedia. By combining a specialized set of editor roles (CVU, NPP, TeaHouse Hosts, etc.) and intelligent software tools (ClueBot NG, Huggle, Snuggle, HostBot, etc.)[1][2], English Wikipedia editors are spending less and less time filtering damage from the recent changes stream[3][4][5][6].

This subsystem allows Wikipedia to keep its open, permeable membrane and also occupy an interesting ecological niche among the attention of internet users. I'd argue that occupying this ecological niche is one of the things that allowed Wikipedia to become so dominant.

Knowledge of this subsystem and the roles that it plays in Wikipedia can be powerful when considering Wikipedia's community health. If we were to examine Wikipedia and find that a lot of good newcomers were getting reverted for "vandalism" -- and that this was causing a decline in the rate at which they stick around -- which is actually happening[7] -- this knowledge would help us critically consider what might be going wrong with Wikipedia's socio-technical quality control system.
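The comparison implied above -- retention of newcomers who were reverted versus those who were not -- can be sketched in a few lines. The records below and the binary "returned" definition are invented for illustration; the actual studies[7] use much more careful measures:

```python
# Hypothetical per-newcomer records: was their first edit reverted,
# and did they make another edit afterwards (i.e. "stick around")?
newcomers = [
    {"reverted": True,  "returned": False},
    {"reverted": True,  "returned": False},
    {"reverted": True,  "returned": True},
    {"reverted": False, "returned": True},
    {"reverted": False, "returned": True},
    {"reverted": False, "returned": False},
]

def retention(group):
    """Fraction of a group that returned to edit again."""
    return sum(n["returned"] for n in group) / len(group)

reverted = [n for n in newcomers if n["reverted"]]
welcomed = [n for n in newcomers if not n["reverted"]]

print(retention(reverted))  # ~0.33
print(retention(welcomed))  # ~0.67
```

A gap between these two rates is exactly the kind of signal that should send us looking at the quality control subsystem, the way water-pressure trouble sends a biologist looking at the contractile vacuole.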

System thinking in open knowledge projects

In this example we see how conceptualizing a paramecium as a system composed of specialized sub-systems helps us think critically about the organism's health. Just as the paramecium has many sub-systems that contribute to its overall health, Wikipedia seems to have many as well. Personally, I like to think of Wikipedia's subsystems as fitting into 5 major categories.

  • Quality control -- Filters new content & contributors for bad faith
    • Healthy: Most vandalism is removed very quickly. Newcomers who make mistakes (not bad-faith) are welcomed and trained. Bad-faith actors are quickly identified and blocked.
    • Unhealthy: Much vandalism persists for a long time. Newcomers who make mistakes (not bad-faith) are often reverted, warned and banned. Bad-faith actors persist and continue causing harm.
  • Work allocation -- Directs people towards work
    • Healthy: The people with the right skills and interest write the most important articles first. Editors fill specialized roles at near-optimal rates.
    • Unhealthy: The people with the wrong skills or interest are often directed towards work on less important content. Many roles are understaffed -- others overstaffed.
  • Regulation of behavior -- Ensures consistent application of rules and process
    • Healthy: Application of rules is consistent. Power is distributed widely. When rules don't make sense under local conditions, they are ignored.
    • Unhealthy: Rules are often unknown by editors. When they are known, they are applied inconsistently or under conditions where they do not make sense.
  • Community management -- Maintains a healthy, active, & motivated population of volunteers
    • Healthy: Newcomers take advantage of training resources. Social spaces exist to help work out technical and interpersonal troubles without sanctions. Long-term editors have many options for reducing wiki-stress without leaving.
    • Unhealthy: Newcomers are adrift in complex unknowns and they fail because of it. The primary recourse for disagreements is sanctioning. Long-term editors regularly leave due to stress.
  • Reflection / adaptation -- Discovers problems/opportunities and proposes changes in process/technology
    • Healthy: Analysis, discussion and experimentation are common. Experiments lead to learning if not actual improvements. Improvements are adopted broadly.
    • Unhealthy: Analysis and experimentation rarely happen. Many discussions about problems/opportunities lead nowhere. Improvements that are demonstrated are not adopted.

I suspect that these subsystems are relatively common across other open knowledge systems as well. If I knew how all of these systems worked, I could tell you how to build the next open knowledge project so that it works and is sustainable. I think that this work is a critical part of Reflection / adaptation, and I'll have partially succeeded if you leave here agreeing with me.

  • WikiCulture & Community
Length of session (if other than 30 minutes, specify how long)
30 minutes
Will you attend Wikimania if your submission is not accepted?
Slides or further information (optional)
Slides for a previous presentation about Wikipedia's community health

Special requests

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).

  1. Sage (Wiki Ed) (talk) 01:12, 25 February 2015 (UTC)
  2. mako 00:05, 28 February 2015 (UTC)
  3. Jean-Frédéric (talk) 13:44, 1 March 2015 (UTC)
  4. guillom (talk) 23:59, 7 March 2015 (UTC)
  5. Tar Lócesilion (talk) 11:48, 10 March 2015 (UTC)
  6. CT Cooper · talk 23:59, 10 March 2015 (UTC)
  7. Marcio De Assis (talk) 15:50, 21 June 2015 (UTC)
  8. Ad Huikeshoven (talk) 07:00, 3 July 2015 (UTC)
  9. Add your username here.


  1. Geiger, R. S., & Ribes, D. (2010, February). The work of sustaining order in wikipedia: the banning of a vandal. In Proceedings of the 2010 ACM conference on Computer supported cooperative work (pp. 117-126). ACM.
  2. Morgan, J. T., Bouterse, S., Walls, H., & Stierch, S. (2013, February). Tea and sympathy: crafting positive new user experiences on wikipedia. In Proceedings of the 2013 conference on Computer supported cooperative work (pp. 839-848). ACM.
  3. https://meta.wikimedia.org/wiki/Research:Vandal_fighter_work_load
  4. https://meta.wikimedia.org/wiki/Research:Patroller_work_load
  5. Geiger, R. S., & Halfaker, A. (2013, August). When the levee breaks: without bots, what happens to Wikipedia's quality control processes?. In Proceedings of the 9th International Symposium on Open Collaboration (p. 6). ACM.
  6. West, A. G., Kannan, S., & Lee, I. (2010, July). Stiki: an anti-vandalism tool for wikipedia using spatio-temporal analysis of revision metadata. In Proceedings of the 6th International Symposium on Wikis and Open Collaboration (p. 32). ACM.
  7. Halfaker, A., Geiger, R. S., Morgan, J. T., & Riedl, J. (2012). The rise and decline of an open collaboration system: How Wikipedia’s reaction to popularity is causing its decline. American Behavioral Scientist, 0002764212469365.