Submissions/Content Translation - what’s running under the hood

This is an accepted submission for Wikimania 2015.

Submission no.
3027
Title of the submission

Content Translation - what’s running under the hood

Type of submission (discussion, hot seat, panel, presentation, tutorial, workshop)

Presentation

Author of the submission

Kartik Mistry, Niklas Laxström, Santhosh Thottingal

E-mail address

kmistry@wikimedia.org, nlaxstrom@wikimedia.org, sthottingal@wikimedia.org

Username

KartikMistry, Nikerabbit, Santhosh.thottingal

Country of origin

India, Finland

Affiliation, if any (organisation, company etc.)

Wikimedia Foundation

Personal homepage or blog
Abstract (at least 300 words to describe your proposal)

The Content Translation tool from the Wikimedia Foundation consists of many parts, including the Content Translation server (cxserver), a machine translation backend (Apertium), a few external machine translation services, a dictionary service and much more.

This talk will also explore the "dark side" of the server-side setup: how we worked with the nice upstream developers of projects like Apertium, with the Technical Operations and Release teams, and with other people. Are you interested in how one runs a translation service at Wikimedia scale? Or in how changes go from developers’ machines to production through multiple testing and verification phases? This talk will tell you the highlights.

We will explain how the tool takes an article in the source language, exposes it in an editor not unlike VisualEditor, and saves the translated version as wikitext again. This complex pipeline depends on Parsoid and on clever algorithms that preserve markup such as bolding and links even when the machine translation backend only supports plain text. The tool also understands concepts unique to wiki pages, such as links and categories, and adapts them automatically using Wikidata and the APIs of the wikis.
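To make the markup-preservation idea concrete, here is a minimal sketch of one way it could work. It is an illustration only and not cxserver's actual algorithm. It assumes flat, non-nested <b> and <i> spans without attributes, and the names toy_translate, TOY_TABLE, strip_markup and transfer_markup are hypothetical. The plain-text backend is replaced by a tiny lookup table. The approach: strip the inline tags, translate the plain text, translate each annotated fragment on its own, and wrap the matching fragment in the translated sentence.

```python
import re
from typing import Callable, List, Tuple

# Assumption: only flat <b>/<i> spans without attributes are handled here.
INLINE_TAG = re.compile(r"<(b|i)>(.*?)</\1>")

# Hypothetical stand-in for a machine translation backend that only
# accepts and returns plain text.
TOY_TABLE = {
    "the red car": "el coche rojo",
    "red": "rojo",
}


def toy_translate(text: str) -> str:
    """Look the text up in the toy table; return it unchanged if unknown."""
    return TOY_TABLE.get(text, text)


def strip_markup(html: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the plain text and a list of (tag, inner_text) annotations."""
    annotations = [(m.group(1), m.group(2)) for m in INLINE_TAG.finditer(html)]
    plain = INLINE_TAG.sub(lambda m: m.group(2), html)
    return plain, annotations


def transfer_markup(html: str, translate: Callable[[str], str]) -> str:
    """Translate html via a plain-text translator and re-apply inline tags."""
    plain, annotations = strip_markup(html)
    translated = translate(plain)

    # Locate each translated fragment in the translated plain text.
    spans: List[Tuple[int, int, str]] = []
    for tag, inner in annotations:
        target = translate(inner)
        if not target:
            continue
        start = translated.find(target)
        if start == -1:
            continue  # fragment not found: this annotation is dropped
        end = start + len(target)
        if any(start < e and s < end for s, e, _ in spans):
            continue  # skip spans that overlap one already placed
        spans.append((start, end, tag))

    # Insert tags from right to left so earlier offsets stay valid.
    for start, end, tag in sorted(spans, reverse=True):
        translated = (
            translated[:start]
            + f"<{tag}>" + translated[start:end] + f"</{tag}>"
            + translated[end:]
        )
    return translated
```

With the toy table above, transfer_markup("the <b>red</b> car", toy_translate) returns "el coche <b>rojo</b>": the bold span follows the word even though the translation reordered it. A real service would also need to handle links with attributes, nested markup and fragments that translate differently in context, which this sketch does not attempt.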

This talk will provide a better view of how Content Translation works and of the various issues that have to be dealt with when setting up such a complex service.

Track
  • Technology, Interface & Infrastructure
Length of session (if other than 30 minutes, specify how long)

30 minutes

Will you attend Wikimania if your submission is not accepted?
Slides or further information (optional)
Special requests


Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).

  1. Runa Bhattacharjee (talk) 18:15, 24 February 2015 (UTC)
  2. Correogsk or Gustavo (Editrocito or Heme aquí) 00:40, 1 March 2015 (UTC)
  3. (Like, obviously :) --Amir E. Aharoni (talk) 14:39, 6 March 2015 (UTC)
  4. Roxyuru (talk) 12:50, 21 May 2015 (UTC)
  5. eranroz (talk) 08:14, 4 July 2015 (UTC)
  6. Krinkle (talk) 19:27, 17 July 2015 (UTC)
  7. Ale201093 (talk) 04:07, 18 July 2015 (UTC)
  8. MRG90 (talk) 19:10, 18 July 2015 (UTC)