Jobs or garbage – why is data not easily accessible in India?

Media rumble

Rakesh Dubbudu, founder of FACTLY, a well-known data journalism portal

Successive Indian governments have followed a closed-door policy on sharing government data with the public. Both the UPA and the NDA, during their respective regimes, have tried to hide data and have made it very difficult for people to get authentic data that should normally be available in any democratic nation. All mature democracies encourage the sharing of data to expand the public domain. India also does this, but not so transparently.

There are three important issues here. The first is the availability of data. The second is the regular updating of data. The third is the authenticity of the data. Most government and private institutions in India do not have the most critical data that they should have. For example, we do not have any updated job data, garbage data or water data in the country. The employment data was first discontinued in 2009 by the UPA, and the little that we had in the form of the labor bureau's annual survey data was discontinued in 2017 by the NDA. The garbage and groundwater data was last updated almost a decade ago, in the 2010-11 Census. Several data sets that appear on government websites keep vanishing without notice and are replaced by new data sets without any explanation, continuity or transition information.

Making data access difficult is not a recent phenomenon

The callous attitude to data and the lack of transparency around it are not a recent phenomenon. They have been there for over a decade. The trend has been to hide data from citizens and the press and unlock it only after user verification. The UPA went as far as to remove data from the public domain and then create the Right to Information Act (RTI Act), under which getting data from the government became a skill and a test of patience, a fine art that only activists and crusaders have acquired. As a result, the common citizen was left out of the discourse and a lot of critical government data disappeared from the websites. The UPA claimed it had given the citizens a right to social justice.

The present government is equally guilty, though it claims to be different. The data available has become even more selective and disjointed. In some cases the method of calculating data has been changed. Then again, data series often begin in 2014, which defies logic; statistics covering at least 10 years should be displayed. Also, to access most of the data hosted by the National Informatics Centre (NIC), you have to log in with a user ID and password. This is not an accepted global practice and does nothing for the ease of doing business. Keeping data under lock and key is not seen in any of the mature democracies.

Critical data such as solid waste and water data has also disappeared from the CPCB, CWC and other government websites, despite being crucial for Swachh Bharat and other flagship programs. At the recent Media Rumble in Delhi we attended a session on sourcing data, primarily on how data makes great stories. I spoke to Rakesh Dubbudu, the speaker at the session and the founder of FACTLY, a well-known data journalism portal.

Data entry is outsourced to contractors

Dubbudu confirmed the difficulty of finding authentic, updated data but said that there are several sources that can be tried. “The RBI data is one of the most authentic and comprehensive data available. There are two types of data here – one, which RBI generates itself, for which annual data as well as quarterly releases are available. Then there is data that the RBI collates painstakingly from other sources including the states. Fifty parameters are tracked including the state’s GDP and fiscal deficit that can be fairly informative. These are available in its annual reports of states and are usually dependable,” said Dubbudu. “Then there is data from MOSPI, the Ministry of Statistics and Programme Implementation which again is authentic and detailed. However, at times these may not be fully updated.”

Dubbudu added that data from parliament proceedings is normally authentic because false statements made there can be taken up by a privileges committee. “Also there are parliament standing committee reports which have high quality data that is usually the latest. The CAG reports are also a great source of information with wonderful insights; although they are often issued two years after the time the event occurred. From 2009-10 onwards most ministries have started publishing annual data; it gives complete data of at least two to three years. For crime data the NCRB data is there but usually outdated. Then there are scheme websites like those of the Ujjwala scheme where the data is granular but keeps changing as it is frequently updated,” said Dubbudu.

Reasons behind the existence of unreliable data

When asked why data is available in plenty but with uncertain integrity, Dubbudu said, “One of the reasons of lack of authentic data is because government recruitment methods have not been updated. Today since all information is entered in a database, one of the biggest recruitment by the government should be of data entry operators. But there are no in-house data entry operators in the government. Everything is outsourced to contractors and there is little accountability.” This could be the reason for data errors, especially because the contracts are usually short term, lasting six months or a year. The contractual worker's lack of ownership, responsibility and long-term commitment could therefore be one reason we lack data integrity.
