Submissions/The Pleasures and Pains of Analyzing All the Wikis in Realtime
After careful consideration, the Programme Committee has decided not to accept the below submission at this time. Thank you to the author(s) for participating in the Wikimania 2015 programme submission; we hope to still see you at Wikimania this July.
- Submission no.
- 3046
- Title of the submission
The Pleasures and Pains of Analyzing All the Wikis in Realtime
- Type of submission (discussion, hot seat, panel, presentation, tutorial, workshop)
Tutorial & Discussion
- Author of the submission
- Max Klein
- Anthony Di Franco
- E-mail address
isalixgmail.com di.francogmail.com
- Username
- Country of origin
U.K. and U.S.A.
- Affiliation, if any (organisation, company etc.)
- Personal homepage or blog
- Abstract (at least 300 words to describe your proposal)
What started as the problem of tracking citations eventually led us to develop a much more general solution: a tool to track all the edits of all wikis in realtime. This opens up a new world of possibilities, such as tracking trends in what people are writing about and letting users receive alerts on edits based on custom queries over article and edit content. These ideas are still far off, but we can bring them closer by joining together to build the platform. This session is a tutorial on what exists so far, and will present an agenda for collaborating on the next steps.
The name "Cocytus" shares many syllables with the words describing our original goal: tracking citations as they appear in the recent changes stream. Cocytus is the river of lamentation that flows around Hades.
In this presentation we will cover the state-of-the-art technologies, current efforts, and pitfalls in monitoring Wikipedia in real time.
Technologies we will cover are:
- RCStream and WebSockets.
- Wikimedia Labs.
- MediaWiki diff API.
- Wikitext parsing.
- Stream rebroadcasting.
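To make the pipeline behind this list concrete, here is a minimal sketch of the citation-tracking step, not the actual Cocytus code. It assumes an edit event arrives as JSON with `server_url` and `revision.old`/`revision.new` fields (modelled on the recent changes stream), builds a MediaWiki `action=compare` request URL for the two revisions, and counts citation templates with a deliberately simple regular expression instead of a full wikitext parser. The function names and the sample event are hypothetical, and the `/w/api.php` path is an assumption that holds for Wikimedia-hosted wikis.

```python
import json
import re
from urllib.parse import urlencode

# Matches the opening of citation templates such as {{cite web|...}} or
# {{citation|...}}. A real wikitext parser would be more robust; this is a sketch.
CITE_RE = re.compile(r"\{\{\s*(?:cite\s+\w+|citation)\s*[|}]", re.IGNORECASE)


def compare_url(event):
    """Build an action=compare URL for the old and new revisions of an edit event."""
    params = {
        "action": "compare",
        "fromrev": event["revision"]["old"],
        "torev": event["revision"]["new"],
        "format": "json",
    }
    # Assumption: the wiki serves its API at /w/api.php.
    return f'{event["server_url"]}/w/api.php?{urlencode(params)}'


def count_citations(wikitext):
    """Count citation templates in a piece of wikitext."""
    return len(CITE_RE.findall(wikitext))


def citations_added(old_text, new_text):
    """Net number of citation templates an edit added (negative if removed)."""
    return count_citations(new_text) - count_citations(old_text)


# Hypothetical sample event, shaped like a recent changes stream message.
raw = (
    '{"type": "edit", "wiki": "enwiki", "title": "Cocytus", '
    '"server_url": "https://en.wikipedia.org", '
    '"revision": {"old": 100, "new": 101}}'
)
event = json.loads(raw)
url = compare_url(event)
```

In a live setting the event would come from the stream connection and the revision text from the API response; both are left out here so the sketch runs without network access.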
We also hope to brainstorm and organize future uses and development of a community platform.
Our Future Uses Brainstorm:
- Using the changes queue directly
- Trend tracking with dynamic topic modeling
- Real-time Wikimedia analytics in the style of social media analytics and search
- Alerts based on stream queries.
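As one illustration of the last item, alerts based on stream queries could be sketched as named predicates applied to each incoming event. This is only an assumed design, not an existing feature: the `Rule` class, the `alerts` function, the rule names, and the sample events are all hypothetical, and the `wiki`, `title`, `comment`, and `length` fields are assumed to follow the shape of recent changes stream messages.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass
class Rule:
    """A named query: the predicate decides whether an event triggers an alert."""
    name: str
    predicate: Callable[[dict], bool]


def alerts(events: Iterable[dict], rules: Iterable[Rule]) -> Iterator[Tuple[str, str]]:
    """Yield (rule name, page title) for every rule that matches an event."""
    rules = list(rules)  # allow the rules to be reused for each event
    for event in events:
        for rule in rules:
            if rule.predicate(event):
                yield rule.name, event["title"]


def size_change(event: dict) -> int:
    """Absolute change in page length, or 0 if the event carries no lengths."""
    length = event.get("length", {})
    return abs(length.get("new", 0) - length.get("old", 0))


rules = [
    Rule("enwiki-river-edits",
         lambda e: e.get("wiki") == "enwiki" and "river" in e.get("comment", "").lower()),
    Rule("large-edits", lambda e: size_change(e) > 1000),
]

# Hypothetical sample events standing in for the live stream.
sample = [
    {"wiki": "enwiki", "title": "Cocytus", "comment": "Expand river mythology",
     "length": {"old": 500, "new": 700}},
    {"wiki": "dewiki", "title": "Hades", "comment": "Überarbeitung",
     "length": {"old": 1000, "new": 4000}},
    {"wiki": "enwiki", "title": "Styx", "comment": "typo",
     "length": {"old": 10, "new": 12}},
]

fired = list(alerts(sample, rules))
```

Because `alerts` is a generator, the same function would work unchanged on an unbounded stream of events rather than a fixed list.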
Lastly, we hope to record the experiences of all contributing developers and submit them as feedback to the Wikimedia Foundation and Wikimedia Labs.
- Track
- Technology, Interface & Infrastructure
- Length of session (if other than 30 minutes, specify how long)
- 30 minutes
- Will you attend Wikimania if your submission is not accepted?
- Yes
- Slides or further information (optional)
- Special requests
Interested attendees
If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).
- Daniel Mietchen (talk) 01:05, 15 February 2015 (UTC)
- EpochFail (talk) 13:43, 27 February 2015 (UTC)
- DarTar (talk) 00:41, 6 March 2015 (UTC)
- What a pain in the toe it is that there are so many languages and projects :) Amir E. Aharoni (talk) 16:16, 6 March 2015 (UTC)