Collection, cleaning, analysis, and visualization of data were the focus of a data journalism workshop organized by the Press Club of India on Saturday, 10 December 2022, at its premises on Raisina Road.
Anoushka Dalmia, data journalism fellow at DataLeads, was the trainer for the three-hour session organized in association with the Google News Initiative.
During data collection, sources need to be verified for authenticity (what), accuracy (who), place (where), time (when), and last-updated status, Dalmia explained, adding that the key sources of data must be inspected, regulated, purchased or registered.
Among the tools demonstrated at the session were Google Dataset Search and the Wayback Machine web archive. The Wayback Machine helps users search collections of digital content; recover web pages, details and data; and trace a website's history through its archived copies. It lets users explore more than 651 billion web pages.
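For readers who want to try the archive lookup programmatically, here is a minimal Python sketch. It was not shown at the workshop; it assumes the Internet Archive's public availability endpoint and its JSON response layout, and the helper names (build_query, closest_snapshot, lookup) are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's public "availability" lookup.
API = "https://archive.org/wayback/available"


def build_query(page_url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return (archive_url, timestamp) from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None


def lookup(page_url, timestamp=None):
    """Query the archive over the network and return the closest snapshot."""
    with urlopen(build_query(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

Calling lookup("example.com", "20221210") would ask for the capture nearest to the workshop date; a result of None means no archived copy was reported.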
Data cleaning involves preparing data for interpretation, understanding, and eventual visualization. Web scraping and Tabula were demonstrated as ways to extract data from human-readable output, such as web pages and PDFs, and transfer it to Google Sheets.
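As a rough illustration of what cleaning extracted data can involve, the Python sketch below tidies a small CSV export using only the standard library. It is not the workshop's method; the column names and sample values are hypothetical, and the rules (trimming whitespace, normalizing headers, dropping empty rows, parsing numbers with thousands separators) are just common first steps.

```python
import csv
import io


def clean_rows(raw_csv):
    """Normalize headers, trim cells, and drop rows that are entirely empty."""
    reader = csv.reader(io.StringIO(raw_csv))
    # Headers become lowercase with underscores, e.g. "Cases Reported" -> "cases_reported".
    header = [h.strip().lower().replace(" ", "_") for h in next(reader)]
    cleaned = []
    for row in reader:
        cells = [c.strip() for c in row]
        if not any(cells):  # skip blank lines and rows of empty cells
            continue
        cleaned.append(dict(zip(header, cells)))
    return cleaned


def to_int(text):
    """Parse values such as "1,204"; return None for non-numeric entries."""
    digits = text.replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


# Hypothetical sample, standing in for a table extracted from a PDF.
raw = ' State , Cases Reported \nDelhi ,"1,204"\n\n , \nGoa,n/a\n'
rows = clean_rows(raw)
counts = {r["state"]: to_int(r["cases_reported"]) for r in rows}
```

Returning None for entries that cannot be parsed keeps missing values visible rather than silently turning them into zeros.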
For data verification, the session covered Google Pinpoint, which helps explore and analyze large collections of documents. It is part of Journalist Studio, a collection of tools meant to help reporters and data journalists produce their articles and stories in a more efficient, creative, and secure manner.
The final step is data visualization, which adds value to stories and grabs readers' attention. Flourish, Datawrapper, Tableau and Infogram were among the tools discussed at the workshop for creating quality data graphics that bring stories to life.
The workshop ended with the DataLeads team announcing a series of data journalism workshops across several Indian towns and cities.