Annett Heft, Kilian Buehling, Xixuan Zhang, Dominik Schindler & Miriam Milzner. (2023) Challenges of and approaches to data collection across platforms and time: Conspiracy-related digital traces as examples of political contention. Journal of Information Technology & Politics 0:0, pages 1-17.
Summary
The study scrutinized data from Telegram, emphasizing the impact of message deletion on the platform. It explored how message deletion compromises dataset integrity and reliability and alters the outcomes of computational analyses.
Using a series of messages from public Telegram entities, the study repeatedly sampled and analyzed the data and found that message deletion introduces significant biases in both data collection and analysis. The deletion of a considerable number of text messages from group chats skewed distributions and potentially reduced the accuracy of computational content analysis methods, highlighting the need for careful interpretation of results from such platforms.
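The repeated-sampling idea can be illustrated with a minimal sketch. It assumes two snapshots of the same chat, each a mapping from message ID to message type; the function name `deletion_rates` and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def deletion_rates(first_snapshot, later_snapshot):
    """Share of messages per type seen in the first snapshot but missing later.

    Each snapshot maps a message ID to a message type (e.g. "text", "photo").
    This data layout is an assumption made for illustration.
    """
    totals = Counter(first_snapshot.values())
    deleted = Counter(
        msg_type
        for msg_id, msg_type in first_snapshot.items()
        if msg_id not in later_snapshot
    )
    return {msg_type: deleted[msg_type] / totals[msg_type] for msg_type in totals}


# Hypothetical toy data: two text messages disappear between collections.
first = {1: "text", 2: "text", 3: "photo", 4: "text", 5: "photo"}
later = {1: "text", 3: "photo", 5: "photo"}

rates = deletion_rates(first, later)
print(rates)
```

Comparing rates across types in this way is one simple check for whether some data types are lost more often than others between collection rounds.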
Key Findings
Message Deletion & Biases: Message deletion on Telegram introduces biases to the computational collection and analysis of data, affecting the consistency of datasets, the quality of social network analyses, and the results of computational content analysis methods.
Data Types Disproportionately Prone to Deletion: The study found that some data types are deleted disproportionately often on Telegram, which reduces the reliability of the collected data.
Effects of Message Ephemerality: Message ephemerality reduces dataset consistency, making it challenging to obtain complete and reliable data for analysis, thus impacting the outcomes of content analysis using computational methods such as topic modeling and dictionaries.
Impact on Computational Methods: Message deletion can bias the results of computational content analysis methods, reducing the reliability of studies that rely on them.
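To make the effect on dictionary-based content analysis concrete, here is a minimal sketch under stated assumptions: the keyword list, the messages, and the helper `dictionary_share` are all hypothetical, and the direction and size of the shift in this toy example say nothing about what the study measured.

```python
def dictionary_share(messages, keywords):
    """Fraction of messages containing at least one dictionary keyword."""
    if not messages:
        return 0.0
    hits = sum(
        any(word in text.lower().split() for word in keywords)
        for text in messages
    )
    return hits / len(messages)


# Hypothetical dictionary and messages, for illustration only.
keywords = {"hoax", "plot"}
full = ["the plot thickens", "good morning", "it is a hoax", "see you later"]
# Assume one keyword-bearing message was deleted before a later collection.
after_deletion = ["the plot thickens", "good morning", "see you later"]

share_full = dictionary_share(full, keywords)
share_after = dictionary_share(after_deletion, keywords)
print(share_full, share_after)
```

The same dictionary yields a different share depending on when the data was collected, which is why results from a single collection round call for cautious interpretation.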
Implications
Awareness of Deletion Impact: Researchers need to be cognizant of the extent of message deletion on Telegram and its potential impact on their analysis. Understanding the limitations and biases introduced by missing data is crucial for accurate analysis.
Caution with Analysis Methods: Computational content analysis and social network analysis methods should be approached with caution when analyzing Telegram data due to the potential biases and inaccuracies introduced by message deletion.
Collaborative Practices: Collaborative practices of privacy-sensitive data sharing among scholars using Telegram data are essential to address the issues of missing data and to improve the reliability and replicability of studies.
Strategic Approach to Data Collection: Researchers should strategically approach data collection and analysis on platforms like Telegram, considering the inherent challenges and biases, to ensure the validity and reliability of their findings.
Source
Citation
@article{infoepi_lab2023,
author = {{InfoEpi Lab}},
publisher = {Information Epidemiology Lab},
title = {Telegram {Message} {Deletion} {Affects} {Study} {Results}},
journal = {InfoEpi Lab},
date = {2023-09-25},
url = {https://infoepi.org/posts/2023/09/25-telegram-deleted.html},
langid = {en}
}