Journalist Resource May 30, 2024

How We Investigated Welfare Algorithms in India (Part I)



A new era of ‘machine governance’ is increasingly replacing traditional methodologies for deciding...

Multiple Authors

AI Accountability Fellow Kumar Sambhav Shrivastava and journalist grantee Tapasya investigated opaque welfare algorithms in India that wrongfully cut off benefits to thousands of its poorest citizens. In this piece, Shrivastava shares how the story got started, the questions that drove the reporting, and the lessons they learned. Part II of their methodology, with tips and insights into what it takes to access public records in India, can be read here.

In India, several states have adopted algorithmic systems that draw on multiple government databases to create "360-degree profiles" of citizens and verify their eligibility for social welfare schemes. The country runs some of the world's largest social welfare programs, spending close to $256 billion, or 13% of its GDP, per year on them. Concerned that benefits were being usurped by ineligible claimants, federal and state governments increasingly relied on technology to weed out "fraud." But these algorithmic systems are opaque and untested, and their effectiveness, mechanisms, and impacts had never been scrutinized.

Over the past two years, my colleague Tapasya and I investigated the use and impact of such welfare algorithms. We reported from three states, filed over 50 Right to Information (RTI) applications, reviewed government and corporate records, and interviewed dozens of officials and welfare claimants whose lives were impacted by these algorithms.

Part 1 of our series, published with Al Jazeera, revealed how India's "pioneering technology" for "automating welfare decisions" wrongly tagged slum dwellers as car owners and inflated the incomes of poor widows in the state of Telangana, depriving several thousand poor families of their right to subsidized food. Part 2 revealed how an algorithm declared several thousand living people "dead" in the state of Haryana. The government stopped their old-age and widow pensions, and they now have to fight through bureaucratic red tape to prove that they are alive and that the algorithm was wrong.
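To make concrete how cross-database matching can misfire in this way, here is a deliberately simplified, hypothetical Python sketch. It is not the actual Samagra Vedika code, which the government has never disclosed; all names and fields are invented. It only illustrates how matching welfare rolls against a vehicle registry on name alone can tag the wrong household as a car owner.

```python
# Hypothetical illustration only: how naive cross-database matching can
# wrongly flag a welfare claimant as a car owner. This is NOT the actual
# government algorithm, whose source code remains undisclosed.

welfare_rolls = [
    {"id": "W1", "name": "bismillah bee", "district": "hyderabad"},
    {"id": "W2", "name": "ravi kumar", "district": "warangal"},
]

# Vehicle registry containing a different person who shares a name
# with a welfare claimant, in a different district.
vehicle_registry = [
    {"owner_name": "bismillah bee", "district": "nizamabad", "vehicle": "car"},
]

def flag_car_owners(claimants, registry):
    """Match on name alone, ignoring district and address, as an
    opaque 360-degree profiling system might."""
    owners = {record["owner_name"] for record in registry}
    return [c["id"] for c in claimants if c["name"] in owners]

flagged = flag_car_owners(welfare_rolls, vehicle_registry)
# Claimant W1 is flagged as a car owner (and would lose subsidized food)
# even though the car belongs to a namesake in another district.
print(flagged)
```

The toy example shows why opacity matters: without seeing the matching logic, an excluded family has no way to learn that a namesake in another district triggered their removal.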

Bismillah Bee with the old ‘Below Poverty Line’ card of the family issued in 2006. Her family was denied access to Telangana's food security schemes for possessing a car even though she doesn't own one. Image courtesy of The Reporters’ Collective, India.

Our reporting showed that the government had not disclosed any evidence about the accuracy or efficacy of these algorithms to the public. When erroneous decisions of algorithms led to wrongful exclusions of the poor, the onus was on the removed beneficiaries to prove to government agencies that they were entitled to the welfare benefits. Even when they did so, officials often favored the decisions of the algorithms. 

The stories generated significant debate around transparency, governance, and accountability regarding government algorithms. Tech policy experts demanded that the source code of such algorithms be made public and pointed out the need for bureaucratic accountability to prevent their misuse. Representatives of the excluded families are planning to use these stories as evidence of harm in court. NITI Aayog, the central government’s think tank, internally took note of the stories as an example of what could go wrong with “digital public infrastructure.”

How we got started

We conceptualized this investigation as a collaborative project by a team of journalists and researchers. Our hypothesis was based on previous research conducted by Divij Joshi on automated decision-making in welfare delivery and a story I had reported on the mass surveillance threat of Samagra Vedika, the welfare algorithmic system in Telangana. We knew from our research that Indian states were increasingly deploying algorithmic systems that created 360-degree profiles of citizens for welfare delivery without presenting any evidence of their efficacy or accuracy. There were many news reports in the local press about people complaining of arbitrary exclusion from welfare schemes after deployment of such systems. 

These are the three questions we decided to explore: 

  1. How did these algorithmic systems function? We wanted to investigate the design, logic, code, and data that these systems used to make decisions and understand how fair and technologically sound they were. 
  2. How did welfare governance change with the deployment of these systems and how was the bureaucracy handling these technologies?  
  3. How did the decisions of these algorithms affect people, and if the algorithms made inaccurate decisions, what was the extent of the harm?

The end goal of the project was to investigate the lack of transparency and accountability of the governments and corporations using these technological systems and the scale of their impact on the poor.

Our methodology

We planned to report the story through a combination of desk research, public records requests (known as RTI in India), and field reporting. Our most ambitious goal was to access the source code of the algorithms. We hoped that by using the “file inspection” provision of the Indian RTI law, we could obtain the source code from the files related to the deployment of the algorithmic systems. As a backup, in case we did not get the source code, we decided to ask in the RTI applications for the datasets and the logic that the algorithmic systems used to make decisions. 

We had the following plan of action to uncover the design and functioning of the algorithm:

  • File RTI requests to access the procurement documents and source code of the software, and to inspect the files related to its deployment
  • File RTI requests for the nature and structure of datasets used by the software in decision-making
  • Review the documents and literature used by the government and private corporations to promote the systems they developed
  • File RTI requests to access copies of presentations and the minutes of meetings between corporations and government authorities
  • Interview present and former employees of corporations and government officials

After uncovering the design and the functions of the algorithm, the next step was to investigate the accuracy of its outcomes and predictions. We planned to do that through the following method: 

  • File RTI applications to access the list of families excluded from welfare in specific locations over a specific period, along with their addresses and reasons for exclusion
  • File RTI requests to get details of the court cases filed by citizens against their exclusions
  • Visit some of these families and verify the accuracy of the algorithmic decisions about these families with on-the-ground reporting and other government data sources

Lessons and tips

Our team was the first set of journalists and researchers to carry out an investigation of this scale on welfare algorithms in India. For over two years, we researched and scanned through thousands of government documents, pursued RTI applications through appeals over many months, and made several reporting trips across three Indian states. While the findings of the project were eye-opening, the lessons from the project were just as insightful to us as journalists. Here are a few that I would like to share: 

  1. Establishing the scale of harm is challenging, but it is essential to make the stories powerful:

Finding consolidated data that establishes the central argument of the story (in this case the number of erroneous decisions by the algorithm) is as important as finding individual case studies. But it is often one of the most challenging tasks in reporting. Such numbers are hidden in raw, uncleaned data sets, which need to be carefully examined, triangulated, and tabulated to find that one data point that makes the story stand out.

In this case, while we found strong case studies of individual families being harmed by erroneous decisions of algorithms through field reporting, court documents, and a network of sources, the consolidated figures on the errors of algorithms were hard to come by either through field reporting or RTI applications. Eventually, after going through a large stack of documents, we were able to deduce the numbers from jumbled data sets presented in courts and the state assembly.
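The consolidation step described above can be sketched in a few lines of Python. This is a hypothetical illustration with invented record IDs, not our actual data pipeline: it shows the basic idea of normalizing and deduplicating overlapping exclusion lists (say, one drawn from court filings and one from assembly records) before counting unique affected households.

```python
# Hypothetical sketch: consolidating two overlapping, messy exclusion
# lists into a single deduplicated count. Record IDs are invented for
# illustration; real lists had inconsistent spacing and casing.

court_list = ["RATION-001 ", "ration-002", "RATION-003"]
assembly_list = ["ration-002", "Ration-004", "RATION-003 "]

def normalize(record):
    # Strip stray whitespace and unify case so the same household
    # is not counted twice across sources.
    return record.strip().lower()

unique_exclusions = {normalize(r) for r in court_list + assembly_list}
print(len(unique_exclusions))  # 4 unique excluded households
```

In practice, triangulating records also means checking addresses and scheme IDs across sources, since name-only matches can both merge distinct people and miss duplicates.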

  2. Focus on the technological specifications of algorithmic systems, their functioning, and evidence of their opaqueness and unaccountability: 

Since the primary objective of the project was to examine the accountability of the algorithms, it was essential to focus on their technological specifications and show evidence of their opaqueness. As such information is not systematically recorded in government records, nor is it part of conversations among the communities, it is not easy to capture through the traditional approach of reporting on governments and communities. 

While our initial drafts captured the human stories of harm from the field and the bureaucratic process that exacerbated the harm, we could not get a sharp focus on the functioning of the algorithmic system or explain its connection with the harm. We were able to do that in the subsequent drafts after combining bits of information available in many documents and consolidating it to build the connections. 

  3. Scaling up stories from local instances to national trends:

While our field reporting was focused on the state schemes, it was essential to research and build the larger national picture of the algorithmic takeover of welfare and to show that these shifts at the state level were part of the national trend. To do this we integrated a timeline into the narrative of how the welfare policies changed nationally and algorithmic systems evolved and got deployed over years. 

  4. Breaking the reportage into a series of stories:

We reported from three states on three different welfare algorithmic systems. One challenge we faced was how to break our reportage into a series of multiple stories that are distinct from each other. A series works best when we tell something new in each part and each part incrementally takes the story forward. While initially we had hoped to publish the series in three parts, eventually we had to settle on consolidating the most important pieces of the report into two parts. 

  5. Project management principles:

When executing a long-term project, it is essential for all team members to be aware of basic project management principles. This involves clearly defining the roles, responsibilities, and accountabilities both in terms of the journalistic work as well as managerial roles. Some investigations can take months, sometimes even years, to be completed. Writing agreements among the team members about their responsibilities and commitments as well as guidelines on work ethic, communication, and credit sharing can go a long way in making collaborative projects smooth. It is also essential to make sure there is legal support in place that is easily accessible to the team members to address any potential legal challenge or ethical dispute that may arise during the execution or post-publication of the project. 


AI Accountability Network