Jobs or garbage – why is data not easily accessible in India?

Media rumble

Rakesh Dubbudu, founder of FACTLY, a well-known data journalism portal

Successive Indian governments have followed a closed-door policy on sharing government data with the public. Both the UPA and the NDA, during their respective regimes, have tried to hide data and have made it very difficult for people to get authentic data that should normally be available in any democratic nation. Mature democracies encourage the sharing of data to expand the public domain. India also does this, but not as transparently.

There are three important issues here. The first is the availability of data. The second is the regular updating of data. The third is the authentication of data. Most government and private institutions in India do not have the most critical data that they should have. For example, we do not have any updated job data, garbage data or water data in the country. The employment data was first discontinued in 2009 by the UPA, and the little that we had in the form of the Labour Bureau annual survey data was discontinued in 2017 by the NDA. The garbage and groundwater data was last updated almost a decade ago, in the 2010-11 Census. Several data sets that appear on government websites keep vanishing without notice and are replaced by new data sets without explanation, continuity or transition information.

Making data access difficult is not a recent phenomenon

The callous attitude to data and the lack of transparency around it are not recent phenomena. They have been there for over a decade. The trend has been to hide data from citizens and the press and to unlock it only after user verification. The UPA went as far as to remove data from the public domain and then create the Right to Information Act (RTI Act), under which getting data from the government became a skill and a test of patience – a fine art that only activists and crusaders have acquired. As a result, the common citizen was left out of the discourse and a lot of critical government data disappeared from the websites. The UPA claimed it had given citizens a right to social justice.

The present government is equally guilty, though it claims to be different. The data available has become even more selective and disjointed. In some cases the method of calculating data has been changed. Data also often begins at 2014, which defies logic; statistics for at least 10 years should be displayed. Also, to access most of the data hosted by the National Informatics Centre (NIC), you have to log in with a user ID and password. This is not an accepted global practice and does nothing for the ease of doing business. Keeping data under lock and key is not seen in any of the mature democracies.

Critical data such as solid waste and water data has also disappeared from the CPCB, CWC and other government websites, despite being crucial for Swachh Bharat and other flagship programs. At the recent Media Rumble in Delhi, we attended a session on sourcing data, primarily on how data makes great stories. I spoke to Rakesh Dubbudu, the speaker at the session and the founder of FACTLY, a well-known data journalism portal.

Data entry is outsourced to contractors

Dubbudu confirmed the difficulty of finding authentic, updated data but said that there are several sources that can be tried. “The RBI data is among the most authentic and comprehensive data available. There are two types of data here. One is data that the RBI generates itself, for which annual data as well as quarterly releases are available. Then there is data that the RBI collates painstakingly from other sources, including the states. Fifty parameters are tracked, including the state’s GDP and fiscal deficit, which can be fairly informative. These are available in its annual reports on the states and are usually dependable,” said Dubbudu. “Then there is data from MOSPI, the Ministry of Statistics and Programme Implementation, which again is authentic and detailed. However, at times it may not be fully updated.”

Dubbudu added that data from parliamentary proceedings is normally authentic because, if members or ministers lie, a privileges committee will look into it. “Also, there are parliamentary standing committee reports, which have high-quality data that is usually the latest. The CAG reports are also a great source of information with wonderful insights, although they are often issued two years after the event occurred. From 2009-10 onwards, most ministries have started publishing annual data, which gives complete data for at least two to three years. For crime data, the NCRB data is there but is usually outdated. Then there are scheme websites like those of the Ujjwala scheme, where the data is granular but keeps changing as it is frequently updated,” said Dubbudu.

Reasons behind existence of unreliable data

When asked why data is available in plenty but with uncertain integrity, Dubbudu said, “One of the reasons for the lack of authentic data is that government recruitment methods have not been updated. Today, since all information is entered in a database, one of the biggest recruitments by the government should be of data entry operators. But there are no in-house data entry operators in the government. Everything is outsourced to contractors and there is little accountability.” This could be the reason for data errors, especially because the contract is usually short term, for six months or a year. The contractual worker's lack of ownership, responsibility and long-term commitment could therefore be a reason we lack data integrity.

