
The Data Journalism Handbook

Version 0.1

Contributors
Contributors to this book include:

David Banisar, Article 19
Caelainn Barr, EU Data Journalist
Mariano Blejman, Hacks/Hackers
Marianne Bouchart, Data Journalism Blog
Liliana Bounegru, European Journalism Centre
Brian Boyer, Chicago Tribune
Jane Park, Creative Commons
Paul Bradshaw, City University London
Lucy Chambers, Open Knowledge Foundation
Helen Darbishire, Access Info Europe
Steve Doig, Cronkite School of Journalism
David Erwin, New York Times
Lisa Evans, Guardian Datablog
Tom Fries, Bertelsmann Stiftung
Duncan Geere, Wired.co.uk
Rich Gordon, Northwestern University
Jonathan Gray, Open Knowledge Foundation
Ted Han, DocumentCloud
Kate Hudson, Open Journalism
Francis Irving, ScraperWiki
Lizzie Jackson, Ravensbourne College
Nicolas Kayser-Bril, Data Journalist
John Keefe, New York Public Radio
Friedrich Lindenberg, Open Knowledge Foundation
Lorenz Matzat, OpenDataCity
Aidan McGuire, ScraperWiki
Cynthia O'Murchu, Financial Times
Aron Pilhofer, New York Times
Anthony Reuben, BBC
Simon Rogers, Guardian Datablog
Amanda Rossi, freelance journalist
Fabrizio Scrollini, London School of Economics
Adam Thomas, Source Fabric
Sascha Venohr, Zeit Online
Jerry Vermanen, De Stentor
César Viana, Estácio de Sá University
Farida Vis, University of Leicester
Lulu Pinney, Infographic design (Telling Information)

This work is licensed under a Creative Commons Attribution Sharealike license.

Table of contents
The Data Journalism Handbook
Contributors
Table of contents
0. Preface
0.1 The purpose of this book
0.2 Add to this book
0.3 Share this book
1. Introduction
1.1 What is data journalism?
1.2 Why is it important?
1.3 How is it done?
1.4 Examples, case studies and interviews
1.4.1 Data powered stories
1.4.2 Data served with stories
1.4.3 Data driven applications
1.5 Making the case for data journalism
1.5.1 Measuring impact
1.5.2 Sustainability and business models
2. Getting data
2.1 Where does data live?
2.1.1 Open data
2.1.2 Social data services
2.1.3 Research data
2.2 Asking for data
2.2.1 Freedom of Information laws
2.2.2 Helpful public servants
2.3 Getting your own data
2.3.1 Scraping data
2.3.2 Crowdsourcing data
3. Understanding data
3.1 Data literacy
3.2 Working with data
3.3 Tools for analysing data
3.4 Annotating data
4. Delivering data
4.1 From datasets to stories
4.2 Publishing data
4.3 Visualising data
4.4 Data driven applications
4.5 Engagement, outreach and community
5. Appendix
5.1 Further resources

Notes:
First draft deadline: Sunday, November 6th, 17.00 GMT (please inform us if you finish your contribution earlier so we can start editing it)
Project hashtag: #ddjbook
Project URL:

0. Preface
0.1 The purpose of this book
Overview: Explain what this book does and doesn't aim to do.
Authors: Jonathan Gray, Liliana Bounegru
Length: 0.5-1 page

0.2 Add to this book


Overview: Explain how to contribute to future versions of this book.
Authors: Jonathan Gray
Length: 0.5 page

0.3 Share this book


Overview: Encourage people to share this book.
Authors: Jonathan Gray
Length: 0.5 page

1. Introduction
1.1 What is data journalism?

Overview: Define and describe data journalism and how it differs from other forms of journalism.
Authors: Paul Bradshaw, Jonathan Gray, [Heather Brooke], [Simon Rogers], [Nicolas Kayser-Bril], [Richard Gordon]
Length: 1-2 pages (with quotes from different people)

UPDATE: Input from Paul Bradshaw, Jonathan Gray
STILL NEED: Snappy quotes from different people on what data journalism is, and what it isn't.
EDITOR: Liliana

1.2 Why is it important?


Overview: Put data journalism into context and explain why it matters and what potential it has.
Authors: Tom Fries, [Paul Bradshaw], [Jonathan Gray], [Heather Brooke], [Simon Rogers], [Nicolas Kayser-Bril], [Richard Gordon]
Length: 1 page (with quotes)

UPDATE: Input from Tom Fries and Nicolas Kayser-Bril
STILL NEED: Snappy quotes from different people on why data journalism is important.
EDITOR: Liliana

1.3 How is it done?


Overview: Explain different ways of doing data journalism (e.g. journalists who can code vs. coders for hire, off-the-shelf tools vs. custom web applications, in-house graphics departments vs. hired data visualisation experts, etc.). Give examples of how it is being done in different newsrooms.
Authors: Lucy Chambers, [Aron Pilhofer], [Simon Rogers], [Anthony Reuben], [Cynthia O'Murchu], [Sascha Venohr], [Caelainn Barr]
Length: 2-3 pages (with examples and quotes)

UPDATE: Input from Zeit Online, notes from the Guardian and Chicago Tribune
STILL NEED: More case studies, quotes and examples. In particular, get input from BBC, Chicago Tribune, FT, Guardian and NYT. Also cover how to find developers, designers and issue experts.
EDITOR: Liliana

1.4 Examples, case studies and interviews


1.4.1 Data powered stories
Overview: Give and describe successful examples of data powered stories you worked on. Describe how you produced these stories. The aim is to give journalists and decision-makers in newsrooms who might be interested in data journalism a sense of what the potential of data powered stories is and how they could go about producing them.
What data did you use and how did you obtain it?
What prompted you to start this project?
What did the project aim to achieve?
How long did you work on the project?
How many people worked on it?
What was the cost of the project?
What were the skills necessary for this project? (domain knowledge, coding, research, visualisation, etc.)
What is the role of datasets in these stories? (e.g. give rise to new stories, enrich stories, contextualize stories, help journalists explore topics in new ways, etc.)
What was your approach? (exploratory vs. hypothesis approach)
What techniques and tools did you use?
How did you present the data powered story?
What is the potential of data powered stories?
Why should journalists/newsrooms be interested in producing such projects?
What were the challenges in producing these stories?
What tips and advice would you give to journalists who want to work on similar projects?
Please include relevant links, videos and images.
Authors: Caelainn Barr, James Ball, Sascha Venohr, [Anthony Reuben], Cynthia O'Murchu, [Heather Brooke]
Length: 1.5-3 pages per example

UPDATE: Zeit Online
STILL NEED: More case studies, e.g. from Amanda on Brazilian citizen journalists, from Chicago Tribune, data journalism on the radio, Guardian (Lisa or James).
EDITOR: Lucy/Kat

1.4.2 Data served with stories


Overview: Give and describe successful examples of data served with stories you worked on. Describe how you produced these projects. The aim is to give journalists and decision-makers in newsrooms who might be interested in data journalism a sense of what the potential of data served with stories is and how they could go about producing them.
What data did you use and how did you obtain it?
What prompted you to start this project?
What did the project aim to achieve?
How long did you work on the project?
How many people worked on it?
What was the cost of the project?
What were the skills necessary for this project? (domain knowledge, coding, research, visualisation, etc.)
What is the role of datasets in these stories? (e.g. provide additional context or insight, etc.)
What was your approach? (exploratory vs. hypothesis approach)
What techniques and tools did you use?
How did you present the story and the data served with it?
What is the potential of such projects?
Why should journalists/newsrooms be interested in producing such projects?
What were the challenges in producing these projects?
What tips and advice would you give to journalists who want to work on similar projects?
Include relevant links, videos and images.
Authors: Caelainn Barr, James Ball, Sascha Venohr, [Anthony Reuben], [Cynthia O'Murchu], [Heather Brooke]
Length: 1.5-3 pages per example

UPDATE: Needs doing!
STILL NEED: Guardian, BBC. Who else serves data with stories?
EDITOR: Lucy/Kat

1.4.3 Data driven applications


Overview: Give and describe successful examples of data driven applications you worked on. Describe how you produced these applications. The aim is to give journalists and decision-makers in newsrooms who might be interested in data journalism a sense of what the potential of data driven applications is and how they could go about producing them.
What data did you use and how did you obtain it?
What prompted you to start this project?
What did the project aim to achieve?
How long did you work on the project?
How many people worked on it?
What was the cost of the project?
What were the skills necessary for this project? (domain knowledge, coding, research, visualisation, etc.)
What was your approach?
What techniques and tools did you use?
How did you present the outcome?
What is the potential of such projects?
Why should journalists/newsrooms be interested in producing such projects?
What were the challenges in producing these projects?
What tips and advice would you give to journalists who want to work on similar projects?
Include relevant links, videos and images.
Authors: Aron Pilhofer, Marcus Bösch
Length: 1.5-3 pages per example

UPDATE: Needs doing!
STILL NEED: Guardian, NYT, BBC
EDITOR: Lucy/Kat

1.5 Making the case for data journalism


1.5.1 Measuring impact
Overview: Give an overview of the potential of data journalism (e.g. engaging with new audiences, the future of journalism on the web) and how it could be measured. Include results of the EJC survey on training needs for data journalism.
Authors: Liliana Bounegru, [Lorenz Matzat]
Length: 1 page

1.5.2 Sustainability and business models


Overview: Discuss costs, sustainability and business models for data journalism. Provide successful and less successful examples and explain what lessons can be learned from them.
Authors: Lorenz Matzat
Length: 1-2 pages

UPDATE: 1.5 still needs doing!
STILL NEED: Input from Guardian, Deutsche Welle, Zeit Online, NYT, etc.
EDITOR: Liliana

2. Getting data
2.1 Where does data live?
2.1.1 Open data

Overview: An overview of open data sources: what they contain, how to find them, how to search them, and examples of open data being used by journalists.
Authors: Jonathan Gray, Brian Boyer
Length: 1-3 pages (with links and examples)

2.1.2 Social data services


Overview: An overview of community driven websites which aim to help you find the data you need, such as GetTheData.org and TheDataHub.org, and their function in enabling collaboration around datasets.
Authors: Jonathan Gray
Length: 0.5-1 page (with links and examples)

2.1.3 Research data


Overview: An overview of sites where research data can be found.
Authors:
Length: 0.5-1 page (with links and examples)

UPDATE: Great input and notes from Brian Boyer/Chicago Tribune, Jane Park/Creative Commons, John Keefe/WNYC, Chrys Wu/HacksHackers.
STILL NEED: Needs to be written up and expanded.
EDITOR: Friedrich

2.2 Asking for data


2.2.1 Freedom of Information laws
Overview: An overview of FOI legislation, an example of making an FOI request, information on resources in this area, and how to get help from FOI experts.
Authors: Helen Darbishire (Access Info), Fabrizio Scrollini (London School of Economics)
Length: 1-3 pages (with links and examples)

2.2.2 Helpful public servants


Overview: How talking directly with public servants or engaging with official open data initiatives might help you to find the data you need.
Authors: [Jonathan Gray]
Length: 0.5-1 page (with links and examples)

UPDATE: First draft almost done.
STILL NEED: Editing and peer review.
EDITOR: Liliana/Friedrich

2.3 Getting your own data


2.3.1 Scraping data
Overview: Explain the basic idea of web scraping, why it can be necessary, and how journalists have used it, followed by a guide for absolute beginners based on an interesting case study.
Authors: Francis Irving, Aidan McGuire, [Friedrich Lindenberg]
Length: 2-3 pages (with links, examples, and a basic tutorial)

UPDATE: Input from Friedrich Lindenberg, Federica Cocco, Glenn McMahon and Francis Irving.
STILL NEED: Needs to be written up and expanded.
EDITOR: Friedrich

2.3.2 Crowdsourcing data


Overview: Explain the basic idea of crowdsourcing data, how various projects have used it, and how to do it (e.g. using Google Spreadsheets, forms, maps, Twitter hashtags, etc.).
Authors: [Simon Rogers], [Lisa Evans]
Length: 1-3 pages (with links and examples)

UPDATE: Input from Marianne Bouchart and others (not in the Google doc yet), Guardian (notes)
STILL NEED: Nicolas Kayser-Bril (water data) and other examples
EDITOR: Liliana/Friedrich

3. Understanding data
3.1 Data literacy
Overview: Explain data literacy and its importance (including statistical/numerical literacy, use of mathematics, technical literacy, etc.).
Authors: James Ball, Nicolas Kayser-Bril, Richard Gordon
Length: 1-3 pages

UPDATE: Input from Lisa Evans, Richard Gordon, Lizzie Jackson, Amanda Rossi, JV Chamary, Fabrizio Scrollini
STILL NEED: Input from Nicolas Kayser-Bril, and quotes from Lisa Evans and Amanda on verifying data, citizen journalism, etc.
EDITOR: Liliana

3.2 Working with data


Overview: What you need to work with datasets: background knowledge, technical ability, etc. (case study approach with lessons learned from each project presented)
Authors: James Ball, Steve Doig
Length: 1-2 pages per case study

UPDATE: Input from Claire Miller and Steve Doig
STILL NEED: Further input and ideas
EDITOR: Liliana

3.3 Tools for analysing data


Overview: An overview of different types of tools for analysing and working with datasets, with examples of how they can be used and how they have been used by journalists.
Authors: [Nicola Hughes], [Lisa Evans], [Friedrich Lindenberg], [Nicolas Kayser-Bril]
Length: 1-2 pages per case study

UPDATE: Needs doing!
STILL NEED: Input from Friedrich.
EDITOR: Friedrich

3.4 Harnessing external expertise


Overview: How to enable people to annotate and comment on datasets.
Authors: [Aron Pilhofer]
Length: 1 page

UPDATE: Needs doing!
STILL NEED: Input from Guardian, OWNI, NYT?
EDITOR: Liliana

4. Delivering data
4.1 From datasets to stories
Overview: Explain how to find stories in datasets (various approaches), including examples and case studies. Also look at the broader role of data journalists in the newsroom, how they work with other journalists, etc.
Authors: Caelainn Barr, [Cynthia O'Murchu], [Heather Brooke], [Lisa Evans], [Sascha Venohr]
Length: 0.5-1 page per approach/case study

UPDATE: Some material
STILL NEED: Expanding and editing
EDITOR: Jonathan

4.2 Publishing data


Overview: An overview of ways to publish data, including examples: embedding data, raw data (formats), live data, updating data, APIs. Who is your data for? Also a section on knowing the law, ethics and privacy, and open licensing.
Authors:
Length: 1-2 pages

UPDATE: Needs doing!
STILL NEED:
EDITOR: Jonathan

4.3 Visualising data


Overview: How to visualise data: off-the-shelf tools and custom visualisations, with step-by-step guides demonstrated on an example.
Authors: [Lulu Pinney], [Alastair Dant]
Length: 1-2 pages per case study

UPDATE: Good start!
STILL NEED: Needs expanding and editing, and more examples.
EDITOR: Jonathan

4.4 Data driven applications


Overview: Step-by-step guide, tips and tricks for how newsrooms can produce data driven applications.
What are the resources (skills, costs, etc.) needed?
What are the steps to take when you want to build a data driven application?
What useful lessons did you learn from your own experience?
Why should newsrooms be interested in producing data driven applications?
What is the potential of such projects?
Authors: Aron Pilhofer
Length: 2-3 pages (including examples)

UPDATE: Needs doing! Aron?
STILL NEED: Ideas on how to get started, design process, etc.
EDITOR: Jonathan

4.5 Engagement, outreach and community


Overview: Knowing your audience (and pitching appropriately), dissemination and outreach, social media, building community, engaging with existing communities (designers, developers, etc.).
Authors:
Length: 1-2 pages

UPDATE: Duncan (Wired) working on it now. Needs more input.
EDITOR: Jonathan

5. Appendix
5.1 Further resources
Overview: Lists of links, resources, examples and other bits and pieces that don't fit in the handbook.
Authors: Everyone!
Length: 5 pages

UPDATE: Needs doing!
STILL NEED: Lots of ideas from everyone.
EDITOR: Jonathan
