Submissions/Analyzing 50 billion Wikipedia pageviews in 5 seconds
This is an accepted submission for Wikimania 2015.
- Submission no.
- Title of the submission
Analyzing 50 billion Wikipedia pageviews in 5 seconds
- Type of submission (discussion, hot seat, panel, presentation, tutorial, workshop)
- Author of the submission
- E-mail address
- Country of origin
Chile (Living in USA)
- Affiliation, if any (organisation, company etc.)
- Personal homepage or blog
- Abstract (at least 300 words to describe your proposal)
Wikipedia has been publishing its pageviews for years (http://dumps.wikimedia.org/other/pagecounts-raw/). This is deeply interesting data, and some services build on it to help people understand and analyze it (http://stats.grok.se/). But how can we give users a way to answer arbitrary questions without needing hours to download this data and load it into their own clusters? In this talk I'll show and share how I use Google BigQuery to answer ad hoc questions in seconds. Even better, Google makes this data and engine available to everyone, with a monthly free processing quota. In this talk I'll feature interesting queries and results, with some jaw-dropping examples (I promise).
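As background for the talk, here is a minimal sketch of working with the raw pagecounts-raw dumps linked above. Each line of an hourly dump file is space-separated: project code, page title, hourly view count, and bytes transferred. The function name `top_pages` and the sample lines are illustrative assumptions, not part of the submission or of any BigQuery schema.

```python
# Minimal sketch: aggregating Wikimedia's hourly pagecounts-raw format.
# Each dump line is space-separated: project, title, views, bytes,
# e.g. "en Main_Page 242332 4737756101". Names here are hypothetical.
from collections import Counter

def top_pages(lines, project="en", n=3):
    """Return the n most-viewed titles for a project from dump lines."""
    counts = Counter()
    for line in lines:
        parts = line.split(" ")
        if len(parts) != 4:
            continue  # skip malformed lines
        proj, title, views, _bytes = parts
        if proj == project:
            counts[title] += int(views)
    return counts.most_common(n)

# Hypothetical sample lines in the dump format:
sample = [
    "en Main_Page 1000 123456",
    "en Barack_Obama 500 99999",
    "de Hauptseite 800 55555",
    "en Main_Page 250 11111",
]
print(top_pages(sample))  # → [('Main_Page', 1250), ('Barack_Obama', 500)]
```

Doing this over years of hourly files is exactly the slow path the talk contrasts with BigQuery, where the same aggregation runs as a single query over the hosted dataset.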
TL;DR: By attending this talk, you'll learn how to query billions of Wikipedia pageviews in seconds, for free.
Technology, Interface & Infrastructure
- Length of session (if other than 30 minutes, specify how long)
- 30 minutes
- Will you attend Wikimania if your submission is not accepted?
- Not sure
- Slides or further information (optional)
- Special requests
If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with a hash and four tildes (# ~~~~).