
Journalist Resource, May 15, 2024

How Digital Witness Lab Analyzed Data From BJP WhatsApp Groups Ahead of the Indian Elections


WhatsApp has become a battlefield for contemporary Indian elections.


Image courtesy of Rest of World.

A detailed methodology of the analysis in our story 'Inside the BJP's WhatsApp Machine.'


This is a case study of how India’s Bharatiya Janata Party used WhatsApp to spread its message in a small Indian town. We studied messages from 20 WhatsApp groups in the weeks surrounding what many considered to be the beginning of Prime Minister Narendra Modi’s 2024 election campaign: the Ram Temple inauguration in Ayodhya in January 2024.

This methodology provides further details regarding the statistical analyses presented in Rest of World’s investigation “Inside the BJP’s WhatsApp Machine.”




Background

Role of WhatsApp in Indian Elections

WhatsApp has around 400 million monthly active users in India, making it one of the most popular platforms in the country. The use of the platform by political parties to disseminate political messaging and mobilize voters in India started in earnest during the 2019 general election cycle. There have been many reports documenting how political parties, especially the Bharatiya Janata Party (BJP), currently led by Prime Minister Narendra Modi, have mobilized vast armies of volunteers to spread their messages to potential voters via WhatsApp messages.

There has also been significant coverage on the spread of misinformation on WhatsApp in India, and hate speech on the platform has been extensively documented by researchers and journalists. In response, WhatsApp has taken steps to prevent the rapid spread of messages on the platform by limiting group sizes as well as the number of times a message can be forwarded. The platform has also added labels to indicate to users when a message is “forwarded” or “forwarded many times.”

Messages that are forwarded five or more times are considered to be frequently forwarded by WhatsApp and are labeled “forwarded many times.” These messages have a forwarding limit of one chat at a time, unlike regular messages, which can be forwarded to five chats at a time. In the weeks after it was introduced, this limit reduced the spread of “highly forwarded” messages by 70%, according to the company, but it’s unclear what the nature of the content was in those messages. While the limit was a necessary step, rapid forwarding is not the only risk that arises from the use of the platform for political campaigning.

An important aspect that gets less attention is the lack of identification of political accounts. WhatsApp does not have a mechanism for informing users if a group has been created for political campaigning purposes. The only verification service WhatsApp provides is for Business accounts, the purpose of which is to prevent impersonation and fraud. (In any case, the company states on its website that political parties and campaigns aren’t permitted to use the Business service.) This is in stark contrast to other social media platforms, including Meta’s Facebook, where political accounts and political ads are identified. WhatsApp’s new “channels” feature, which is intended to be a place for public broadcasts, provides verification of business and popular accounts, but this feature is still in a relatively early stage of deployment and does not appear to communicate to users the verified account’s category.

Ram Temple Inauguration in Ayodhya

On January 22, 2024, as the BJP geared up for India’s upcoming general election, Prime Minister Narendra Modi inaugurated a new temple dedicated to the Hindu deity Ram in Ayodhya, India. The Ram Temple replaces a centuries-old mosque, the Babri Mosque, which was destroyed in 1992 amid sectarian violence that resulted in over 2,000 deaths.

The temple’s inauguration has been met with mixed reactions. Some view it as a vindication of Hindu nationalism — restoring the birthplace of the deity Ram. Others see it as a defeat for secularism — the culmination of a long Hindutva, or Hindu nationalist, campaign. Many simply celebrate the creation of a temple for Ram in his mythological birthplace. Many also saw the temple’s opening as the beginning of Modi’s 2024 election campaign.

The temple inauguration was celebrated across India, including in Mandi, a town in the northern state of Himachal Pradesh. While Mandi is not in the geographic vicinity of Ayodhya, this town of roughly 26,000 residents can provide a snapshot of how the event was celebrated on WhatsApp by the community and used for political mobilization.

Method

Selection and Classification of Public WhatsApp Groups

In large part because WhatsApp provides no centralized listing of groups, we could not determine the full universe of Mandi-focused WhatsApp groups. Based on reporting, there are likely hundreds of political Mandi groups alone. Our study examines just 20 Mandi groups, of various types. Although the sample is relatively small, and not necessarily representative of all such groups, it provides an unprecedented glimpse into WhatsApp activity in this community.

Srishti Jaswal, a journalist living in Mandi, contributed the data we examined. Using a convenience sample approach, she joined groups that were being promoted in the Mandi community. These included groups created by political parties, groups created by community members that share information about the local region, and some created by religious groups to share non-political content. She joined some of these groups through publicly shared links and through invitations from group administrators. 

WhatsApp makes no official distinction between public and private groups, so we had to define our own criteria. We consider groups to be private (and hence off-limits) when they are small in size, group members know one another, and there is an implicit expectation of privacy. WhatsApp groups of friends, relatives, family members, or other personal communities, such as school alumni or former colleagues, fall into this category.

We consider WhatsApp groups to be public if they contain at least six members and are created for the purpose of broadcasting messages to a community or discussing public events. Even though we only study groups where there is no reasonable expectation of privacy, we still redact members’ phone numbers and display names from all messages shared with us.
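The redaction step described above might look something like the following minimal sketch. The field names (`sender_phone`, `sender_name`) and the keyed-hash pseudonym are assumptions made for illustration, not a description of the actual pipeline. A stable pseudonym is used here only so that per-account counts remain possible once the identifiers are gone.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored outside the dataset.
SECRET_KEY = b"replace-with-a-private-key"


def pseudonymize(phone: str) -> str:
    """Map a phone number to a stable pseudonym (assumed scheme, not from the source)."""
    digest = hmac.new(SECRET_KEY, phone.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:12]


def redact_message(message: dict) -> dict:
    """Return a copy of the message without the phone number or display name."""
    redacted = {
        key: value
        for key, value in message.items()
        if key not in ("sender_phone", "sender_name")
    }
    # Keep only a pseudonymous sender id so accounts can still be counted.
    redacted["sender_id"] = pseudonymize(message["sender_phone"])
    return redacted


# Invented example message; the phone number and name are placeholders.
example = {"sender_phone": "+910000000000", "sender_name": "Example Name", "text": "hello"}
clean = redact_message(example)
```

The same phone number always maps to the same `sender_id`, while the original number and display name never reach the analysis dataset.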

The 20 groups included in this study met our criteria for being considered public. The smallest had 32 members at the time of the inauguration; the largest had more than 1,500 members. We categorized them based on their content and administration. The messages in these groups shared various types of information, ranging from political to local news, religious material, and motivational content, within a specific geographical area. They broadly fell into three categories:

Political

Groups that included the name of politicians or political parties in the title, often including the word “official.” We verified that these groups were managed by party workers and volunteers for the purpose of sharing party content and talking points. The political groups in our dataset only include those affiliated with the BJP. The journalist made considerable efforts to find groups associated with other political parties, but these either didn’t exist or weren’t as popular in the region.

Surrogate

Groups that are not identified as being political but are either managed by party workers or have party workers as administrators. From interviews with party workers we verified that these groups are used to share political content in an unofficial capacity.

Organic

Groups formed for sharing local news and information about Mandi and Himachal Pradesh, religious content, inspirational quotes, morning greetings, jokes, et cetera. Political discussion may occur in these groups but they are not run by party workers or designed specifically for this purpose.

Message Selection

In mid-March 2024, we selected all messages collected from January 8, 2024, through February 3, 2024, based on Coordinated Universal Time (UTC) timestamps. (Due to limitations of the infrastructure, it is possible that the corpus is missing some auto-expiring and user-deleted messages that disappeared before collection.)  

We then converted all timestamps to Indian Standard Time (IST), which is 5.5 hours ahead of UTC. For this reason, we have two partial days of data when converted to IST: 

  • The first 5.5 hours of January 8 in India are missing.
  • Only the first 5.5 hours of February 4 in India are included.

For charts below that indicate daily message volumes, we exclude those two partial dates. (All dates in the analyses and charts below reflect Indian Standard Time.)
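The conversion and the two partial dates can be sketched with Python's standard library. This is an illustrative sketch under the assumption that each message carries a timezone-aware UTC timestamp; the function names are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

# Indian Standard Time is UTC+5:30.
IST = timezone(timedelta(hours=5, minutes=30))

# IST dates that are only partially covered by the UTC collection window.
PARTIAL_IST_DATES = {date(2024, 1, 8), date(2024, 2, 4)}


def to_ist(utc_timestamp: datetime) -> datetime:
    """Convert a timezone-aware UTC timestamp to IST."""
    return utc_timestamp.astimezone(IST)


def ist_date_for_daily_charts(utc_timestamp: datetime):
    """Return the IST calendar date, or None if that date is a partial day."""
    ist_date = to_ist(utc_timestamp).date()
    return None if ist_date in PARTIAL_IST_DATES else ist_date


# The edges of the collection window, expressed in UTC.
first = datetime(2024, 1, 8, 0, 0, tzinfo=timezone.utc)
last = datetime(2024, 2, 3, 23, 59, tzinfo=timezone.utc)
```

Here `to_ist(first)` falls at 05:30 on January 8 in IST and `to_ist(last)` falls at 05:29 on February 4, which is why those two IST dates are incomplete.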

Highly Forwarded Message Definition

Of those messages, we focused our content annotation/labeling on messages that were forwarded two or more times. In the parlance of this methodology document, we considered those to be “highly forwarded” messages. (See the “Limitations” section of this methodology for the differences between what we call “highly forwarded” and other meanings of that term.)

WhatsApp displays these messages to the user as “Forwarded.” Of the 8,169 messages in our data, 751 had been forwarded two or more times.

An alternative approach could have been to limit the study to messages that receive the “Forwarded Many Times” label (those that are forwarded five or more times). Because fewer than 150 messages in our sample reached that threshold, we decided to use the looser criterion. We also took into account the fact that Mandi has a population of roughly 26,000 and it is plausible that a message could be popular in the region without being forwarded five times.
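The two candidate thresholds can be expressed as a simple filter. This is a minimal sketch that assumes each message record exposes a numeric `forward_count` field (a hypothetical name); the sample records are invented.

```python
def highly_forwarded(messages, threshold=2):
    """Keep messages whose forward count meets the threshold.

    A missing count is treated as zero, which is an assumption of this sketch.
    """
    return [m for m in messages if m.get("forward_count", 0) >= threshold]


# Invented records for illustration only.
sample = [
    {"id": 1, "forward_count": 0},
    {"id": 2, "forward_count": 1},
    {"id": 3, "forward_count": 2},
    {"id": 4, "forward_count": 7},
    {"id": 5},
]

study_set = highly_forwarded(sample)                # this methodology's definition
many_times = highly_forwarded(sample, threshold=5)  # WhatsApp's "forwarded many times" cutoff
```

Keeping the threshold as a parameter makes it easy to check how sensitive a result is to the looser criterion.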

Content Annotation Procedure 

Of the 751 highly forwarded messages, our system failed to download necessary media for 38 of them, leaving us with 713 messages we could properly assess. We manually labeled those 713 messages based on the following themes:

  • If a message was related to the inauguration of the Ram Temple in Ayodhya 
  • If a message promoted national identity
  • If a message expressed positive or negative sentiments about:
    • The BJP
    • The Congress Party
    • Hindus
    • Muslims
  • If a message referenced Hinduism or Islam (the religions, rather than their adherents) more generally
  • If a message contained disinformation
  • If a message contained only content that wasn’t relevant to our study, such as satire or memes, promotion of local events, initiatives and issues directly impacting the local community.

Given the history of the Ram Temple in Ayodhya and how closely the issue is tied to the BJP’s Hindutva ideology, we tried to distinguish messages that expressed positive sentiment about Hinduism generally from messages that expressed positive sentiment about Hinduism in the context of Hindutva ideology. For exact definitions used by the researchers for labeling, please refer to the “Appendix - Label Definitions” section.

We conducted two pilot studies to help refine the label definitions. In each, two coders assessed a random subset of 100 messages. For the main study, researchers labeled all the messages based on the revised definitions and could add a “needs review” label to messages they deemed more complex. A third researcher then labeled every message on which the initial labelers disagreed or which was tagged “needs review.” The project’s principal investigator made the final decision on those messages after reviewing all three sets of labels and the reasons the labelers provided. Ultimately, we focused our analyses on the few labels we felt most confident in assessing. See the Appendix for a more complete discussion.
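The review routing described above could be sketched as follows. This is an illustration, not the tooling actually used: it assumes two initial labelers per message, represents each labeler's output as a set of label strings, and assumes that agreed, unflagged labels are accepted as final.

```python
def needs_third_review(labels_a, labels_b, flagged_a=False, flagged_b=False):
    """True if the initial labelers disagree or either tagged 'needs review'."""
    return flagged_a or flagged_b or set(labels_a) != set(labels_b)


def final_labels(labels_a, labels_b, flagged_a=False, flagged_b=False, adjudicated=None):
    """Resolve the final label set for one message.

    `adjudicated` stands in for the outcome reached by the third labeler and
    the principal investigator; it is required whenever review is triggered.
    """
    if needs_third_review(labels_a, labels_b, flagged_a, flagged_b):
        if adjudicated is None:
            raise ValueError("message requires adjudication before it can be finalized")
        return set(adjudicated)
    return set(labels_a)


agreed = final_labels({"Pro-BJP"}, {"Pro-BJP"})
resolved = final_labels(
    {"Pro-Hindu"}, {"Religious-Hinduism"}, adjudicated={"Religious-Hinduism"}
)
```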

In order to understand the messages in the WhatsApp environment, we designed a tool to display the messages in context, providing the name of the group in which the message appeared, plus the nine messages preceding it. The design of this context viewer roughly mimics WhatsApp where messages appear in bubbles and media is loaded inline. This interface allowed labelers to consider the context of the message and view any media associated with the messages to be labeled. Of the 713 messages labeled, 648 were media messages, of which 364 were images, 196 video or audio, and 88 PDFs.

Findings

Defining the Pre-Inauguration, Inauguration, and Post-Inauguration Timeframes

Several of our analyses below compare the volume of messages sent before, during, and after the inauguration period. We defined these (using the Indian Standard Time the message was sent) as:

  • Pre-inauguration: January 14–20, 2024 (seven days)
  • Inauguration: January 21–23, 2024 (three days)
  • Post-inauguration: January 24–30, 2024 (seven days)

Although that timeframe could be extended earlier/later, we found that such modifications change the overall findings very little.
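These three windows translate directly into a small lookup. The sketch below is illustrative; the names `PERIODS`, `period_for`, and `period_length_days` are hypothetical, and dates are assumed to already be in Indian Standard Time.

```python
from datetime import date

# Inclusive start and end dates (IST) for each analysis window.
PERIODS = {
    "pre-inauguration": (date(2024, 1, 14), date(2024, 1, 20)),
    "inauguration": (date(2024, 1, 21), date(2024, 1, 23)),
    "post-inauguration": (date(2024, 1, 24), date(2024, 1, 30)),
}


def period_for(ist_date):
    """Return the window an IST date falls in, or None if it is outside all three."""
    for name, (start, end) in PERIODS.items():
        if start <= ist_date <= end:
            return name
    return None


def period_length_days(name):
    """Number of calendar days in a window, counting both endpoints."""
    start, end = PERIODS[name]
    return (end - start).days + 1
```

Shifting the start or end dates in `PERIODS` is all that is needed to rerun a comparison with an extended timeframe.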

Total Message Volume

Observing the total number of messages received in the groups examined, we can see a clear increase on January 22.

Top Senders

We observed 819 WhatsApp accounts sending messages to the groups during our full timeframe, 244 of which sent at least one message on January 22. 

In the full set of messages across the full timeframe, 270 accounts (33% of the 819) sent only one message; 170 (21%) sent 10 or more. The most active account sent 414 messages.

On January 22, 114 accounts (47% of the 244) sent one message; 22 (9%) sent 10 or more. The most active account sent 41 messages.
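Sender statistics of this kind can be derived from a list of per-message sender identifiers. The sketch below assumes such a list is available (here with invented pseudonymous ids) and is not the code used for the analysis.

```python
from collections import Counter


def sender_activity_summary(sender_ids):
    """Summarize how many accounts sent exactly one message, or ten or more."""
    per_sender = Counter(sender_ids)
    accounts = len(per_sender)
    if accounts == 0:
        raise ValueError("no messages to summarize")
    sent_one = sum(1 for count in per_sender.values() if count == 1)
    sent_ten_plus = sum(1 for count in per_sender.values() if count >= 10)
    return {
        "accounts": accounts,
        "sent_one": sent_one,
        "sent_one_share": sent_one / accounts,
        "sent_ten_plus": sent_ten_plus,
        "sent_ten_plus_share": sent_ten_plus / accounts,
        "max_messages": max(per_sender.values()),
    }


# Invented toy data: one id per message sent.
toy_senders = ["a"] * 12 + ["b"] + ["c"] * 3 + ["d"]
summary = sender_activity_summary(toy_senders)

# Recomputing a reported share from the counts given above.
share_one_message = round(100 * 270 / 819)  # 33
```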

Comparing Political, Organic and Surrogate Groups

Through reporting, we categorized the 20 groups according to the rubric described above:

  • 10 “political” groups, with 3,662 messages in total
  • 5 “surrogate” groups, with 3,089 messages in total
  • 5 “organic” groups, with 1,418 messages in total

Looking at the number of messages by category over time revealed that the substantial increase in messages circulating in our sample during the inauguration period was largely driven by political and surrogate groups, not organic groups.

Organic groups’ total daily average went from 62 messages during the pre-inauguration period to 72 messages during the inauguration period.

In contrast, political groups’ total daily average went from 124 messages before the inauguration to 234 messages during the inauguration period; and for surrogate groups the total daily average went from 129 messages before the inauguration to 255 during the inauguration period.

The significance of this spike can be observed by normalizing the daily counts for all three categories, so that their message volumes are divided by their average daily volume in the seven-day “pre-inauguration” timeframe noted above.
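The normalization amounts to dividing each day's count by the category's mean daily count over the baseline window. The sketch below is illustrative: the function name and the toy daily counts are invented, while the before/during averages in `reported` are the figures quoted above.

```python
def normalize_by_baseline(daily_counts, baseline_dates):
    """Divide every daily count by the mean count over the baseline dates."""
    baseline = [daily_counts[d] for d in baseline_dates]
    baseline_mean = sum(baseline) / len(baseline)
    if baseline_mean == 0:
        raise ValueError("baseline average is zero; cannot normalize")
    return {d: count / baseline_mean for d, count in daily_counts.items()}


# Invented toy series: three baseline days followed by an event day.
toy_counts = {"day1": 10, "day2": 20, "day3": 30, "event": 60}
normalized = normalize_by_baseline(toy_counts, ["day1", "day2", "day3"])

# Ratio of inauguration-period to pre-inauguration daily averages,
# using the (before, during) figures reported in the text.
reported = {"political": (124, 234), "surrogate": (129, 255), "organic": (62, 72)}
ratios = {name: during / before for name, (before, during) in reported.items()}
# ratios: political about 1.9, surrogate about 2.0, organic about 1.2
```

After normalization, a value of 1.0 means "a typical pre-inauguration day" for that category, so categories with very different raw volumes can be compared on one scale.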

While the political groups on average had an increase in content, the two groups with the biggest spike in content around the temple inauguration were surrogate groups.

Increase in message volume by group

Of the 20 WhatsApp groups in our sample, 16 received messages on January 22. (The remaining four had low message volumes even before the inauguration.) Of those 16, nearly all received more messages on January 22 than their daily average during the pre-inauguration period. Eleven received at least 10 messages, and for all eleven the January 22 message count was above the pre-inauguration daily average.

The most active group on January 22 was a surrogate group that received 251 messages — more than three times its average of 69 messages per day during the pre-inauguration period. 

Whether measured by raw difference, ratio, or standard deviations, all of the seven largest increases came in political and surrogate groups; none were organic groups.
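The three measures can be computed per group as sketched below. The exact definition of the standard-deviation measure is not spelled out in the text, so this sketch assumes a z-score of the January 22 count against the group's pre-inauguration daily counts. The seven baseline values are invented and merely chosen to average 69; only the average of 69 and the count of 251 come from the text.

```python
from statistics import mean, stdev


def increase_measures(baseline_daily_counts, event_day_count):
    """Compare one day's count with a baseline window in three ways."""
    base_mean = mean(baseline_daily_counts)
    base_sd = stdev(baseline_daily_counts)  # sample standard deviation
    return {
        "raw_difference": event_day_count - base_mean,
        "ratio": event_day_count / base_mean,
        "z_score": (event_day_count - base_mean) / base_sd,
    }


# Hypothetical pre-inauguration daily counts for one group (mean of 69).
hypothetical_baseline = [60, 65, 70, 75, 80, 64, 69]
measures = increase_measures(hypothetical_baseline, 251)
```

With these inputs the ratio is 251 / 69, a little over 3.6, consistent with the "more than three times" description of the most active group.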

Distribution of Forwarding Counts

The vast majority of messages in our study had been forwarded (using WhatsApp’s forwarding mechanism) zero times or one time.

Given this distribution, and as noted above, we focused our content annotation on the most forwarded messages — those that had been forwarded two or more times.
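A distribution like this can be tabulated by bucketing forward counts. The bucket boundaries below follow the thresholds discussed in this methodology (two or more for "highly forwarded," five or more for WhatsApp's label); the input list is invented, and the function name is hypothetical.

```python
from collections import Counter


def forwarding_distribution(forward_counts):
    """Count messages per forwarding bucket: 0, 1, 2-4, and 5 or more."""
    def bucket(n):
        if n == 0:
            return "0"
        if n == 1:
            return "1"
        if n < 5:
            return "2-4"
        return "5+"

    return Counter(bucket(n) for n in forward_counts)


# Invented forward counts for illustration.
toy_distribution = forwarding_distribution([0, 0, 0, 1, 1, 2, 3, 5, 9])

# Share of the corpus that met the "highly forwarded" definition,
# from the counts reported in this methodology.
highly_forwarded_share = 751 / 8169  # about 9% of all messages
```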

Examining the Impact of the Inauguration on Forwarding Behavior

While the overall proportion of highly forwarded messages was small, the volume of highly forwarded messages was much higher during the inauguration period (the day before, of, and after the event) — an average of 69 per day, compared to roughly 24 per day during the week before and week after. 

That increase appears to be driven entirely by an increase in Ram Temple-related messages.

(Notes: The spikes on January 21 and 23 were each primarily driven by just one group, though the particular group differed those two days. Additionally, the majority of the 38 messages that we could not label due to missing media were concentrated on two days, Jan. 13 and Jan. 16; although we cannot know what they contained, the remaining messages those days suggest that relatively few of those missing-media messages would have been labeled as Ram-related.)

We observed no clear pattern, conversely, in highly forwarded messages that were not about the Ram Temple Inauguration.

This led us to conclude that the Ayodhya temple was a very popular topic of discussion in the WhatsApp groups we were monitoring around the time of its inauguration.

Comparing Inauguration Content to Non-Inauguration Content 

We compared highly forwarded messages in our time period that were about the temple inauguration (36% of the corpus, 95% credible interval: 33%–40%) to those that were not (64%). We found that the vast majority (84% [79%–88%]) of messages about the inauguration were also about Hinduism (as opposed to merely logistical or informational, for example), a rate much higher than among non–Ram Temple messages (15% [12%–18%]). We also found that they were more likely to express anti-Muslim sentiment (14% [11%–19%] of Ram Temple-related messages vs. 5% [3%–7%] of non–Ram Temple messages). None of the messages that were about the temple inauguration were pro-Congress (Congress being the main opposition party in India) or anti-BJP.
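The text does not state how the credible intervals were computed. One common approach, shown here purely as an assumption, is a Beta posterior for a proportion under a uniform prior. The count of 257 out of 713 is back-calculated from the rounded 36% and is illustrative, not a figure from the analysis.

```python
import math


def beta_credible_interval(successes, total, level=0.95,
                           prior_a=1.0, prior_b=1.0, grid=20001):
    """Equal-tailed credible interval for a proportion with a Beta prior.

    The posterior is Beta(prior_a + successes, prior_b + failures); its
    quantiles are approximated by summing the density over a fine grid.
    """
    a = prior_a + successes
    b = prior_b + (total - successes)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    xs = [(i + 0.5) / grid for i in range(grid)]  # midpoints avoid log(0)
    density = [
        math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
        for x in xs
    ]
    mass = sum(density)
    tail = (1 - level) / 2
    cumulative = 0.0
    lower = upper = None
    for x, d in zip(xs, density):
        cumulative += d / mass
        if lower is None and cumulative >= tail:
            lower = x
        if cumulative >= 1 - tail:
            upper = x
            break
    return lower, upper


# Illustrative count only: 257 of 713 is roughly 36%.
low, high = beta_credible_interval(257, 713)
```

Under these assumptions the interval should land close to the 33%–40% range reported above, though the actual prior and method may have differed.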

We didn’t see substantial differences in disinformation between messages mentioning the temple inauguration compared to the rest. However, our analysis on this front was not comprehensive. While some of the messages clearly spread disinformation that had been debunked by fact checking organizations, many required further investigation to be conclusively labeled as disinformation and ultimately we did not have the capacity to do this work in the given time period. 

During the annotation process, labelers tried to tease out the difference between messages that were promoting Hinduism broadly versus messages that were specifically promoting a more nationalist Hindutva ideology. This proved to be very challenging since there are many concepts, symbols, and phrases that could apply to both. We decided to take a conservative approach in our analysis and apply the “pro-Hindu” label only to messages where the reference to Hindutva was explicit or the ties to Hindutva principles were clear. Even with this conservative labeling, we found a positive correlation between a message being about the temple and being labeled “pro-Hindu.”

Limitations

Sample size and convenience sample

The groups in our data collection are a convenience sample and represent only a small portion of groups in Mandi. To protect user privacy we collected data only from groups that we deemed public. We do not have insight on how political messaging was carried out in private chats.

Lack of diversity in political representation

We have only BJP groups in our dataset because those were the only ones that the reporter was able to find even after making a considerable effort. Ideally we would have liked to have compared content from different political parties.

Forward-counts do not capture full scope of message dissemination

While the forward counts that WhatsApp attaches to message data are a good indicator of popular content, they do not fully capture a message’s popularity. We have observed anecdotally that forwarding limits are often circumvented by copy-pasting messages from one group to another. Moreover, we only know the number of times a message had been forwarded at the point it was received, and only for that particular chain of forwards; the message data does not indicate how many times it may have been forwarded elsewhere.

Missing media

We could not label 38 highly-forwarded messages due to issues with our data pipeline collecting their media content. It is possible that these messages would have been different in their general tendencies than the other 713. We observed, however, that most of these messages came from before the core inauguration period.

Ephemeral messages

Due to limitations of the infrastructure, it is possible that our corpus is missing some auto-expiring and user-deleted messages that disappeared before collection.

Exposure vs. counts

While we know the number of members in each of the groups, we do not know how often the members actively see or check messages in any group.

Ambiguous nature of the content

Many of the messages in our dataset did not fall clearly into our label definitions. For example, we found instances of videos of violence that seemed to suggest violence against a religious group, but it wasn't clear what exactly happened or if the video had been creatively edited. Given our time constraints we weren't able to independently verify these videos and out of caution decided not to label messages that didn't clearly fit our definitions.

Temporal nature of WhatsApp Groups

WhatsApp groups are highly dynamic in nature, and we observed constant fluctuation in the members, admins, and names of the groups that were shared with us. Our group definitions are accurate for the time period of the study, but it is possible that the specific groups in our dataset no longer serve the same purpose.

Conclusion  

We examined 8,169 messages (and carefully labeled a subset of 713 “highly-forwarded” messages) sent in 20 public WhatsApp groups in one town in India. Although the data, and our findings, are not necessarily representative of activity on WhatsApp more broadly in the country, they provide a key glimpse into how political parties use the platform to drive home their talking points with voters. 

We found a substantial increase in message volume on the day of the Ram Temple inauguration that was almost entirely driven by political and surrogate groups belonging to the BJP (rather than organic community groups). 

We also observed a substantial increase in highly-forwarded messages in the few days surrounding the inauguration, entirely attributable to an increase in forwarding of messages related to the Ram Temple. 

Our analysis also found that highly-forwarded messages about the temple were much more likely to promote Hinduism than messages not about the temple, and were slightly more likely to express anti-Muslim sentiment. In addition, we identified no messages about the temple inauguration that were pro-Congress or anti-BJP.

Taken together, our findings demonstrate how WhatsApp allows political parties to blur the lines between political and personal speech. Because WhatsApp labels neither accounts nor groups as belonging to political parties, an average voter might have difficulty distinguishing campaign messaging from organic community activity. It appears that campaigns have taken advantage of this ambiguity in their messaging about the Ram Temple inauguration, ramping up messaging not only in their official groups but also in “surrogate” groups that may appear more organic to voters than they really are. 

Appendix - Label Definitions

We attempted to apply the following 16 labels to our dataset. Through the labeling process we realized that we could not use some of them due to ambiguity in their definitions as well as a lack of consistency in how they were being applied by labelers. Ultimately we only used 11 labels for the quantitative analysis presented here.

Labels we used

  • Ram-Temple-Ayodhya (36% of messages labeled)
    • Content related to the inauguration of the Ram Temple in Ayodhya on January 22, 2024. This label also includes content related to the Ram Temple in Ayodhya more generally. 
  • Pro-BJP (12%)
    • Pro-BJP messages support the Bharatiya Janata Party (BJP), its leaders, and its political agenda. Content in this category promotes the party's achievements, policies, or positions on various issues. Materials that are informing party members of meetings or other administrative issues are not considered pro-BJP, even if the people who attend these events are supporters of the party. Messages that contain documentation of party meetings and events without any further context were not considered “Pro-BJP.”
  • Anti-BJP (1%)
    • Anti-BJP messages express opposition or criticism towards the BJP. This may include content highlighting perceived flaws in the party's actions, policies, or leadership.
  • Pro-Congress (2%)
    • Pro-Congress messages support the Indian National Congress party, its leaders, policies, and ideologies. This category includes content that advocates for the party's positions, achievements, or initiatives. Materials that are informing party members of meetings or other administrative issues are not considered pro-Congress, even if the people who attend these events are supporters of the party.
  • Anti-Congress (8%)
    • Anti-Congress messages express opposition or criticism towards the Indian National Congress party. This may include content highlighting perceived flaws in the party's actions, policies, or leadership. 
  • Pro-Muslim (None)
    • The Pro-Muslim category encompasses content that promotes a positive perspective and appreciation for the Muslim community. Pro-Muslim messages may highlight the positive aspects of Islam, its traditions, values, and the peaceful coexistence of its followers within a global context.
  • Anti-Muslim (8%)
    • This category includes content that demonstrates a negative bias or hostility towards the Muslim community. This label includes messages that create an inaccurate interpretation of Muslims and Islam by reconstructing the faith traditions of the religion and stereotyping them into themes of violence, civilizational subversion, and fundamental otherness.  
  • Pro-Hindu (16%)
    • Messages in this category are focused on Hindutva, encompassing discussions on the political and cultural ideology associated with Hindu nationalism. This includes discourse on promoting Hindu values, traditions, and beliefs. Content in this category aims to advocate and celebrate the principles of Hindutva within the group. Images that contain phrases or symbols commonly found in content promoting Hindutva, such as the phrase “Jai Shree Ram” or images of Hindu gods but where the ties to Hindutva principles are not clear, shouldn't be labeled Pro-Hindu.
  • Anti-Hindu (None)
    • Anti-Hindu content expresses views contrary to Hinduism and its associated practices, including critiques of religious rituals, traditions, festivals, and spiritual beliefs. This label is intended for discussions that may challenge or oppose aspects of the Hindu faith within the group. 
  • Religious-Hinduism (40%)
    • Messages under this label specifically discuss or reference Hindu religious beliefs, traditions, rituals, and values. This category focuses on Hinduism, covering topics such as scriptures, mythologies, religious ceremonies, festivals, and philosophical discussions.
  • Religious-Islam (None)
    • Messages under this label specifically discuss or reference Islamic religious beliefs, traditions, rituals, and values. This category focuses on Islam, covering topics such as the Quran, Hadith, religious practices, festivals, and theological discussions.

Labels we applied, but ultimately didn’t use

  • Promotion of National Identity 
    • WhatsApp messages or media that intend to invoke a sense of loyalty and devotion towards India are labeled under Promotion of National Identity. This category encompasses content promoting national identity and patriotism, often highlighting Hindu traditions and values, and associating them with a shared national spirit.
  • Irrelevant Content
    • This category includes messages that lack relevance to the primary theme or purpose of the WhatsApp group. It encompasses content such as generic good morning messages, greetings related to festivals, or inspirational quotes that, while positive, do not directly contribute to the discussions or activities within the group.
  • Disinformation
    • Messages labeled as disinformation contain false or misleading information. The purpose behind such content is to manipulate readers, spreading inaccurate details intentionally. Messages labeled under disinformation are also fact-checked.
  • Local Community
    • Local Community messages focus on topics relevant to a specific community or geographical area. This includes discussions about local events, issues, and initiatives directly impacting the community. Local newspapers often fall into this category as well.
  • Satire/Meme
    • Satirical or meme content is intended for humor, often using satire, irony, or visual elements in the form of memes. This label is a supplementary label used to understand the messaging of the media and will be tagged along with a label corresponding to the theme of the media. 

Acknowledgments

We would like to thank Arvind Narayanan, Kiran Garimella, and Andy Guess for reviewing this report. We would also like to thank Victoria Turk from Rest of World for editorial guidance. 
