This research paper examines accusations of political bias in technology companies' moderation of misinformation, arguing that behavioral differences in misinformation sharing between political groups can lead to uneven enforcement outcomes even under unbiased policies. The study analyzes data from Twitter, Facebook, and surveys spanning multiple years and countries, finding that conservative users consistently share more low-quality news as judged by both experts and politically balanced groups of laypeople. This behavioral difference is offered as a potential explanation for the disproportionate suspension of right-leaning users, indicating that unequal outcomes do not necessarily reflect platform bias.
Description: Table 1 categorizes 60 news domains into 'Mainstream', 'Hyper-partisan', and 'Fake News', providing corresponding quality scores from fact-checkers and politically balanced layperson ratings.
Relevance: This table is crucial for understanding how news quality is measured and categorized in the study, forming the basis for analyzing the relationship between political affiliation and shared news quality.
Description: Figure 1 illustrates the relationship between political leanings and the quality of shared news, using density plots and charts to demonstrate the association of conservatism with low-quality news sharing.
Relevance: This figure visually supports the study's core finding, showing that differences in user behavior, rather than platform bias, may contribute to unequal enforcement outcomes.
The study provides important insights into the debate about political bias in social media misinformation moderation. By demonstrating that conservative users share more low-quality news across various platforms, it suggests that behavioral differences may explain uneven enforcement outcomes, rather than platform bias. This has significant implications for how social media companies are perceived and how they develop and implement moderation policies. Future research could further explore the causal mechanisms behind these behavioral differences, examine the role of political elites in misinformation dissemination, and test the robustness of findings across additional platforms and contexts.
This research paper investigates political bias accusations against technology companies moderating misinformation. It argues that differing rates of misinformation sharing between political groups can lead to uneven enforcement outcomes, even with unbiased policies. Using data from Twitter, Facebook, and surveys across multiple years and countries, the study finds a consistent pattern of conservative users sharing more low-quality news, as judged by both experts and politically balanced layperson groups. This difference in behavior, the research suggests, can explain the disproportionate suspension of right-leaning users, highlighting that unequal outcomes don't necessarily indicate platform bias.
The abstract effectively summarizes the research question, methodology, key findings, and implications in a clear and concise manner, making it easy for readers to grasp the main points of the study.
The abstract clearly establishes the relevance and importance of the research question by highlighting the ongoing debate surrounding political bias in social media moderation and the need for a more nuanced understanding of the issue.
The abstract presents key findings that are both interesting and potentially impactful, suggesting that observed political asymmetries in social media sanctions may be driven by differences in behavior rather than platform bias.
While the abstract mentions key findings, providing specific numbers or effect sizes would strengthen its impact and provide readers with a more concrete understanding of the results.
Rationale: Quantifying the difference in suspension rates or the extent of low-quality news sharing would make the findings more impactful and persuasive.
Implementation: Include specific statistics, such as the relative risk of suspension or the difference in the average number of low-quality links shared, to illustrate the magnitude of the observed effects.
Acknowledging the limitations of the study, such as the focus on specific platforms or time periods, would enhance the abstract's credibility and provide a more balanced perspective.
Rationale: Acknowledging limitations strengthens the research by demonstrating a nuanced understanding of the study's scope and potential generalizability.
Implementation: Add a brief sentence acknowledging the study's limitations, such as "While these findings are based on specific datasets and timeframes, they offer valuable insights into the complex relationship between political behavior and social media moderation."
Social media companies face pressure to combat misinformation, leading to policies like post removals and user suspensions. However, these policies have sparked accusations of political bias. This paper argues that differing misinformation sharing behaviors among political groups can lead to unequal enforcement outcomes even with unbiased policies. The introduction highlights public concern about misinformation and the resulting actions taken by social media companies, emphasizing the controversy surrounding perceived political bias in these actions.
The introduction clearly articulates the problem of perceived political bias in social media's handling of misinformation, setting the stage for the research question.
The introduction provides relevant background information on the prevalence of misinformation and the public's desire for platform intervention, establishing the context for the study.
The introduction effectively introduces the paper's central argument, explaining how differential behavior can lead to unequal outcomes even under neutral policies, using a clear analogy.
While the core argument is presented, a more explicitly stated research question would enhance clarity and focus.
Rationale: A specific research question helps guide the reader and provides a clearer framework for evaluating the study's findings.
Implementation: Formulate a concise research question, such as, "To what extent can differential misinformation sharing behaviors explain the observed political asymmetries in social media enforcement actions?"
Briefly previewing the main findings of the study would increase reader engagement and provide a roadmap for the paper.
Rationale: Previewing the key findings helps readers understand the direction and significance of the research.
Implementation: Include a brief sentence or two summarizing the main findings, such as, "Our analysis reveals a consistent pattern of...leading to...This suggests that..."
While the introduction mentions the implications of the research for platform bias accusations, briefly elaborating on the broader societal implications would strengthen its impact.
Rationale: Highlighting the broader implications of the research increases its relevance and potential impact.
Implementation: Add a sentence or two discussing the potential implications for public discourse, political polarization, or trust in social media.
This section presents the results of the study, focusing on Twitter suspensions after the 2020 US election. It shows that while pro-Trump users were more likely to be suspended, they also shared significantly more low-quality news, even when judged by politically balanced groups. This pattern of conservatives sharing more low-quality news is found across multiple datasets and platforms, suggesting that unequal suspension rates may stem from differences in online behavior rather than platform bias. The section also explores how simulating politically neutral suspension policies based on low-quality news sharing or bot likelihood still results in disproportionate impacts on certain political groups.
The section presents a thorough analysis of multiple datasets, using various metrics and methods to support its claims. This strengthens the validity and generalizability of the findings.
Addressing potential bias in news quality evaluation, the study incorporates ratings from politically balanced groups of laypeople, adding an important layer of objectivity to the analysis.
The use of simulations to explore the impact of hypothetical, politically neutral suspension policies provides valuable insights into the potential for disparate impact even in the absence of bias.
While the study acknowledges the correlational nature of the data, some language could be further refined to avoid implying causation where it cannot be established.
Rationale: Maintaining precision in causal language is crucial for ensuring accurate interpretation of the findings.
Implementation: Rephrase such sentences to state the observed association explicitly without suggesting a causal link. For example, "This observed association between low-quality news sharing and suspension rates among right-leaning users warrants further investigation to determine the underlying causal mechanisms."
While the section reports effect sizes, providing more context and interpretation of their magnitude would enhance understanding.
Rationale: Providing context for effect sizes helps readers understand the practical significance of the findings.
Implementation: Explain what the reported effect sizes mean in practical terms. For example, "This substantial difference in average quality suggests..." or provide comparative benchmarks.
While the simulations are valuable, explicitly discussing their limitations, such as the simplified nature of the modeled policies, would strengthen the analysis.
Rationale: Acknowledging the limitations of the simulations increases transparency and helps readers understand the scope of the inferences that can be drawn.
Implementation: Add a paragraph discussing the limitations of the simulations, such as the assumptions made about the suspension policies and the potential for real-world policies to be more complex.
Table 1 categorizes 60 news domains into 'Mainstream', 'Hyper-partisan', and 'Fake News', providing corresponding quality scores from professional fact-checkers and politically balanced layperson ratings. The fact-checker rating is based on assessments from 8 professional fact-checkers, while the politically balanced layperson rating is an average of trustworthiness scores from Democrats and Republicans (from a sample of 970 laypeople). Higher scores indicate higher quality (trustworthiness).
Text: "by fact-checkers and journalists; see Table 1 for a list of the domains used and ref. 38 for details) from 8 professional fact-checkers38"
Context: of 60 news domains (the 20 highest volume sites within each category of mainstream, hyper-partisan and fake news, as determined
Relevance: This table is crucial for understanding how news quality is measured and categorized in the study. It provides the foundation for analyzing the relationship between political affiliation and the quality of news shared.
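The "politically balanced layperson rating" described above is simple enough to state in code. Below is a minimal sketch in Python using made-up ratings and placeholder column names, not the study's actual survey data; the key point is that each party's respondents are averaged first, so Democrats and Republicans receive equal weight regardless of sample size.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (respondent, domain) pair.
# Column names and values are placeholders, not the study's survey data.
ratings = pd.DataFrame({
    "domain":      ["example-news.com"] * 4,
    "party":       ["Democrat", "Democrat", "Republican", "Republican"],
    "trust_score": [4.2, 3.8, 2.9, 3.1],
})

# Average within each party first, then average the two party means.
party_means = ratings.groupby(["domain", "party"])["trust_score"].mean().unstack()
balanced = party_means[["Democrat", "Republican"]].mean(axis=1)
print(balanced)  # higher values = judged more trustworthy
```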
Figure 1 illustrates how social media users' political leanings relate to the quality of news they share. Subfigures (a) and (b) use density plots to compare the distribution of low-quality news sharing scores for users associated with Biden and Trump hashtags, based on fact-checker and layperson ratings, respectively. Subfigure (c) is a table showing the top 5 most shared news domains by each group. Subfigure (d) presents a bar chart showing the correlation between conservatism and low-quality news sharing across seven different datasets.
Text: "t(8,943)= 1.2 × 102, P<0.0001; Fig. 1a)."
Context: than people who used Biden hashtags (t-test,
Relevance: This figure visually demonstrates the core finding of the study: a strong association between political leaning and the quality of news shared on social media. It supports the argument that differences in behavior, rather than platform bias, may contribute to unequal enforcement outcomes.
This figure illustrates how politically neutral enforcement policies related to low-quality news sharing and bot activity can lead to disproportionate suspension rates for Republicans. It contains three subplots. Subplot (a) shows the predictive accuracy (AUC) of various factors, including political orientation and low-quality news sharing, in predicting Twitter suspensions. Subplots (b) and (c) use simulations to show the expected suspension rates for Democrats and Republicans under different policy 'harshness' levels. Subplot (b) focuses on low-quality news sharing, where harshness is the probability of suspension per low-quality link shared. Subplot (c) focuses on bot activity, where harshness is the minimum probability of being human required to avoid suspension.
Text: "Fig. 3 | Suspending users for sharing links to low-quality news sites or for having a high bot score would disproportionately affect Republicans."
Context: This figure is referenced in the context of discussing how unbiased policies can still lead to unequal suspension rates due to differences in behavior between political groups.
Relevance: This figure is crucial for understanding the central argument of the paper. It visually demonstrates that even with politically neutral enforcement policies, differences in behavior, such as sharing low-quality news or bot activity, can lead to unequal outcomes, with Republicans being disproportionately affected.
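To make the simulation logic concrete, here is a minimal sketch of how a per-link "harshness" policy of the kind described for subplot (b) could be modeled. This is a simplified reconstruction, not the authors' code: it assumes each low-quality link independently triggers suspension with probability equal to the harshness level, and the sharing-count distributions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_suspension_rate(low_quality_counts, harshness):
    """Probability of suspension if each low-quality link independently
    triggers suspension with probability `harshness` (a simplifying assumption)."""
    p_suspended = 1.0 - (1.0 - harshness) ** np.asarray(low_quality_counts)
    return p_suspended.mean()

# Hypothetical per-user counts of low-quality links shared (illustrative only).
dem_counts = rng.poisson(lam=1.0, size=10_000)
rep_counts = rng.poisson(lam=4.0, size=10_000)  # heavier sharing by assumption

for h in (0.01, 0.05, 0.10):
    print(f"harshness={h:.2f}  "
          f"Dem rate={expected_suspension_rate(dem_counts, h):.3f}  "
          f"Rep rate={expected_suspension_rate(rep_counts, h):.3f}")
```

Because the suspension rule never looks at partisanship, any gap between the two groups in this sketch comes entirely from the assumed difference in sharing behavior, which is the paper's central point.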
This figure displays the distribution of politically balanced layperson ratings of news domains, plotted against the ratings provided by professional fact-checkers. It's a scatter plot where each point represents a news domain. The x-axis shows the fact-checker rating, and the y-axis shows the average rating from politically balanced layperson groups (average of Democrat and Republican ratings). Orange diamonds highlight the news domains classified as 'low-quality' in the study's simulations.
Text: "Extended Data Fig. 2 | Distribution of politically balanced layperson ratings of news domains."
Context: This figure is mentioned in the context of explaining how 'low-quality' news sources were defined for the simulations of politically neutral suspension policies.
Relevance: This figure is important because it shows how the 'low-quality' news sources used in the simulations were determined. It helps address potential concerns about bias in the selection of low-quality sources by showing the agreement between layperson and expert ratings.
This figure uses density plots to show the distribution of bot scores, generated by Bot Sentinel, for Twitter users who primarily shared either Biden or Trump hashtags during the 2020 election. A higher bot score indicates a higher likelihood of being a bot. The x-axis represents the bot score (from 0 to 1), and the y-axis represents the relative frequency of users with that score. The distribution for Trump hashtag users (red) is shifted noticeably to the right compared to the distribution for Biden hashtag users (blue). This shift suggests that users who shared Trump hashtags were, on average, rated as more likely to be bots.
Text: "Indeed, as with sharing links to low-quality news sites, users on the political right had significantly higher estimated likelihoods of being a bot (0.70 < r < 0.76 depending on political orientation measure, P < 0.0001 for all; Extended Data Fig. 4), and simulating suspension on the basis of likelihood of being a bot leads to much higher suspension rates for Republican accounts than Democrat accounts (Fig. 3c; see the Methods and Supplementary Information section 2 for details)."
Context: The authors are discussing how conservative users not only shared more low-quality news but also had higher bot scores, which could contribute to the higher suspension rates.
Relevance: This figure is relevant because it provides evidence for another behavioral difference between the two political groups, beyond just news sharing quality. The higher bot scores among Trump supporters could contribute to their higher suspension rates, even under a politically neutral anti-bot policy.
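A corresponding sketch for the bot-score policy in Fig. 3c: suspend any account whose estimated probability of being human falls below a threshold. The score distributions below are synthetic stand-ins, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical bot scores in [0, 1]; higher = more bot-like (illustrative only).
dem_bot = rng.beta(a=2, b=6, size=10_000)
rep_bot = rng.beta(a=4, b=4, size=10_000)  # shifted right by assumption

def suspension_rate(bot_scores, min_prob_human):
    """Suspend any account whose estimated probability of being human
    (1 - bot score) falls below the policy threshold."""
    prob_human = 1.0 - np.asarray(bot_scores)
    return float((prob_human < min_prob_human).mean())

for threshold in (0.3, 0.5, 0.7):
    print(f"min P(human)={threshold:.1f}  "
          f"Dem rate={suspension_rate(dem_bot, threshold):.3f}  "
          f"Rep rate={suspension_rate(rep_bot, threshold):.3f}")
```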
This table presents the results of four different regression models (Probit, Probit Ridge, Logit, and Logit Ridge) predicting Twitter account suspension during the 2020 election study. The independent variables include political orientation, low-quality news sharing, bot scores, toxic language use, number of followers, number of friends, and other control variables. Each cell in the table shows the estimated regression coefficient and its standard error in parentheses. Asterisks indicate statistical significance (*p<0.05, **p<0.01, ***p<0.001). The table shows that low-quality news sharing and bot scores are significant predictors of suspension across all models, while political orientation is not a significant predictor in the non-ridge regression models.
Text: "We then use probit regression to predict whether the user was suspended as of the end of July 2021, with P values Holm–Bonferroni corrected to adjust for multiple comparisons (see Supplementary Information section 1 for a full list of control variables and Extended Data Table 1 for regression models)."
Context: The authors are explaining their statistical approach to analyze the factors contributing to Twitter suspensions, using regression models to predict suspension based on various user characteristics and behaviors. They refer to Extended Data Table 1 for the full regression results.
Relevance: This table is crucial because it directly addresses the question of whether political orientation or other factors, like low-quality news sharing and bot activity, better explain Twitter suspensions. The regression results help disentangle the effects of these correlated variables.
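For readers who want to see what this kind of analysis looks like in practice, here is a hedged sketch of a probit regression predicting suspension, with a Holm-Bonferroni correction of the predictor p-values, using statsmodels on synthetic data. Variable names are placeholders and the generated data is invented; the paper's actual models include many more controls.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)
n = 5_000

# Synthetic stand-in data; names are placeholders, not the paper's variables.
# Note: "conservative" has no effect in this data-generating process, mirroring
# the reported finding that behavior, not orientation, predicts suspension.
df = pd.DataFrame({
    "conservative":      rng.integers(0, 2, n),
    "low_quality_share": rng.poisson(2.0, n),
    "bot_score":         rng.uniform(0, 1, n),
})
logits = -2.5 + 0.4 * df["low_quality_share"] + 1.0 * df["bot_score"]
df["suspended"] = rng.binomial(1, 1 / (1 + np.exp(-logits)))

X = sm.add_constant(df[["conservative", "low_quality_share", "bot_score"]])
probit_fit = sm.Probit(df["suspended"], X).fit(disp=False)
print(probit_fit.summary())

# Holm-Bonferroni adjustment of the predictor p-values (skipping the intercept).
reject, p_holm, _, _ = multipletests(probit_fit.pvalues.iloc[1:], method="holm")
print(pd.DataFrame({"p_raw": probit_fit.pvalues.iloc[1:].values,
                    "p_holm": p_holm,
                    "reject": reject},
                   index=X.columns[1:]))
```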
This figure shows that conservatives shared more misinformation than liberals on Twitter. It uses violin plots to display the distribution of false news URLs shared by each group, based on both fact-checker ratings (a) and politically balanced crowd ratings (b). The y-axis represents the log10(count of primary posts containing the URL + 1), allowing for visualization of the distribution across a wide range of share counts. Panels (c) and (d) show the correlation between conservatism and the fraction of shared COVID-19 claims rated as false by fact-checkers (c) or inaccurate by layperson crowds (d) across 16 countries. The overall effect is calculated using random effects meta-analysis, and error bars indicate 95% confidence intervals.
Text: "Conservatives shared more false claims than liberals"
Context: This is the title of Figure 2, which explores the relationship between political leaning and sharing misinformation on Twitter and in a cross-national survey about COVID-19.
Relevance: This figure directly supports the paper's central argument by demonstrating the behavioral asymmetry in misinformation sharing between conservatives and liberals. This difference in behavior is crucial for understanding how politically neutral enforcement policies can still lead to disparate outcomes.
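The "overall effect ... calculated using random effects meta-analysis" can be illustrated with a standard DerSimonian-Laird pooling of per-country correlations after Fisher's z-transformation (the y-axis transform in panels (a) and (b) is simply log10(count + 1)). The sketch below is a textbook version with invented country-level values; the paper's exact meta-analytic specification may differ.

```python
import numpy as np

def random_effects_meta(correlations, sample_sizes):
    """DerSimonian-Laird random-effects pooling of correlations via Fisher's z.
    A textbook sketch; the paper's exact model may differ."""
    r = np.asarray(correlations, dtype=float)
    n = np.asarray(sample_sizes, dtype=float)
    z = np.arctanh(r)                  # Fisher z-transform
    v = 1.0 / (n - 3.0)                # approximate within-study variance of z
    w = 1.0 / v
    z_fixed = np.sum(w * z) / np.sum(w)
    q = np.sum(w * (z - z_fixed) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(r) - 1)) / c)   # between-study variance
    w_star = 1.0 / (v + tau2)
    z_pooled = np.sum(w_star * z) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    lo, hi = np.tanh(z_pooled - 1.96 * se), np.tanh(z_pooled + 1.96 * se)
    return np.tanh(z_pooled), (lo, hi)

# Hypothetical per-country correlations and sample sizes (illustrative only).
r_by_country = [0.12, 0.20, 0.05, 0.18, 0.09]
n_by_country = [800, 650, 900, 700, 750]
pooled, ci = random_effects_meta(r_by_country, n_by_country)
print(f"pooled r = {pooled:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```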
This figure compares the distribution of low-quality news site sharing scores between Twitter users who used Trump hashtags and those who used Biden hashtags. It uses four separate density plots, each representing a different news quality rating set: Lasser et al. (2022), Media Bias/Fact Check, Ad Fontes Media, and Republican-Only Layperson ratings. The x-axes show standardized (z-scored) low-quality news sharing scores, where higher scores indicate lower-quality sharing. The y-axes represent the relative frequency of each score within each group. The consistent rightward shift of the Trump hashtag user distribution in all four plots indicates that these users shared lower-quality news regardless of the rating system used.
Text: "Extended Data Fig. 1"
Context: Mentioned on page 2 in the context of comparing the average quality of domains shared by people who used Trump hashtags versus Biden hashtags, using various quality rating sources.
Relevance: This figure strengthens the main finding by showing that the observed difference in low-quality news sharing between Trump and Biden supporters is robust across multiple independent news quality rating systems. This robustness helps rule out the possibility that the results are driven by biases in any single rating system.
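A small sketch of the standardization step described above: compute each user's low-quality sharing score under a given rating set, then z-score it so the Trump- and Biden-hashtag distributions can be overlaid on a common axis. Column names and values are placeholders, not the actual rating systems' scales.

```python
import pandas as pd

# Hypothetical per-user average domain-quality scores under two rating systems.
users = pd.DataFrame({
    "group":            ["Trump"] * 3 + ["Biden"] * 3,
    "mbfc_quality":     [0.35, 0.50, 0.40, 0.70, 0.65, 0.80],
    "adfontes_quality": [0.30, 0.45, 0.38, 0.72, 0.60, 0.78],
})

for col in ["mbfc_quality", "adfontes_quality"]:
    low_quality = 1.0 - users[col]  # flip so higher = lower-quality sharing
    users[col + "_z"] = (low_quality - low_quality.mean()) / low_quality.std(ddof=0)

print(users.groupby("group")[["mbfc_quality_z", "adfontes_quality_z"]].mean())
```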
This section details the methodology used in the 2020 election Twitter suspension study. Researchers collected tweets from users who used either #Trump2020 or #VoteBidenHarris2020, focusing on those who shared links to news domains. They used various methods to assess news quality, including ratings from professional fact-checkers and politically balanced groups of laypeople. User political orientation was determined through hashtag usage, followed accounts, and shared news sites. Finally, the researchers simulated politically neutral suspension policies based on low-quality news sharing and bot likelihood to assess potential disparate impacts.
The methods section provides a thorough description of the data collection process, including the specific hashtags used, the sample size, and the data retrieval methods. This level of detail enhances the reproducibility of the study.
Using multiple methods to measure political orientation, including hashtag usage, followed accounts, and shared news sites, provides a more robust and nuanced assessment of user political leanings.
Incorporating news quality ratings from politically balanced groups of laypeople helps mitigate potential concerns about bias in the evaluation of news sources.
The methods section mentions excluding "elite" users with more than 15,000 followers. Providing a clearer justification for this exclusion would strengthen the methodology.
Rationale: A clear rationale for excluding certain users is essential for transparency and understanding the potential limitations of the sample.
Implementation: Provide a more detailed explanation of why elite users are considered unrepresentative and how their exclusion might affect the study's findings. Consider providing data on the characteristics of the excluded users to support the rationale.
While the methods section outlines the general approach to the policy simulations, more specific details on the simulation parameters and assumptions would be beneficial.
Rationale: Providing more details on the simulations would enhance the reproducibility of the study and allow readers to better understand the limitations of the simulated scenarios.
Implementation: Specify the range of probabilities used for the suspension policies, the criteria for defining "low-quality" news sites, and any other relevant parameters or assumptions made in the simulations. Consider providing the code used for the simulations in the supplementary materials or a public repository.
The methods section only briefly mentions the additional datasets used. Providing more details about the data collection, sample characteristics, and methodologies for each dataset would improve the transparency and reproducibility of the analyses.
Rationale: A more detailed description of the additional datasets would allow readers to better understand the context and limitations of the analyses performed on these datasets.
Implementation: For each additional dataset, provide information on the sample size, demographics, data collection methods, time period covered, and any relevant exclusion criteria. Consider adding a table summarizing the key characteristics of each dataset.
This figure investigates whether there are specific topics for which liberals share more misinformation than conservatives on Twitter. It presents two forest plots. The top plot uses fact-checker ratings of falsity, while the bottom uses ratings from politically balanced crowds. Each plot shows the coefficient from a linear regression predicting log10(number of misinformation shares + 1) for each topic (US Politics, Social Issues, COVID-19, Business/Economy, Foreign Affairs, Crime/Justice). A positive coefficient would indicate that conservatives shared more, while a negative coefficient would indicate that liberals shared more. The plots also include an overall estimate across all topics and the weight (%) each topic contributes to the overall analysis. The error bars represent 95% confidence intervals. Importantly, no confidence interval lies entirely below zero, and the overall estimates are positive, indicating no evidence of any topic on which liberals shared more misinformation than conservatives.
Text: "For methodological details, see the Methods; for further analyses, see Supplementary Information section 3.6 and Extended Data Fig. 3."
Context: This is mentioned in the context of analyzing sharing of URLs deemed inaccurate by fact-checkers or politically balanced layperson ratings, estimating user ideology based on followed accounts, and finding conservatives shared more inaccurate URLs.
Relevance: This figure is relevant as it addresses a potential counter-argument: are there any topics where liberals share more misinformation? The findings reinforce the overall trend of greater misinformation sharing by conservatives, showing this pattern holds across various topics and isn't reversed for any specific subject.
This section contains supplementary figures that provide additional context and support for the findings discussed in the main text of the research paper. These figures offer further details and robustness checks related to the relationship between political affiliation, news quality, and Twitter suspensions.
The figures provide visual and statistical support for the claims made in the main text, strengthening the overall argument.
The use of multiple news quality rating systems and the comparison with layperson ratings demonstrate the robustness of the findings to different evaluation methods.
The figures and table provide detailed information about the data and analyses, allowing for a more in-depth understanding of the results.
The justification mentions that the figures support findings in the main text, but it would be helpful to explicitly state which specific findings each figure supports.
Rationale: Clear cross-referencing would improve the connection between the extended data figures and the main narrative of the paper.
Implementation: Add specific references to the relevant sections or figures in the main text when discussing each extended data figure. For example, "Extended Data Figure 1 supports the findings presented in Figure 2 of the main text."
While the figures present data, a brief discussion of the implications of each figure would enhance their value and connect them more directly to the research questions.
Rationale: Explaining the implications of each figure would help readers understand their significance and how they contribute to the overall argument.
Implementation: Add a short paragraph after each figure caption summarizing the key takeaways and their relevance to the research questions.
Given the amount of data presented, interactive figures could enhance exploration and understanding. For instance, an interactive version of Extended Data Table 1 could allow readers to filter and sort the data.
Rationale: Interactive figures can make complex data more accessible and engaging for readers.
Implementation: Explore the possibility of creating interactive versions of the figures and table, perhaps as supplementary online material.
This supplementary table provides further details on the regression analyses predicting Twitter account suspensions. It expands on the findings presented in the main text, showing the results of different regression models (Probit, Probit Ridge, Logit, and Logit Ridge) and including a wider range of control variables.
The table provides detailed statistical information, including coefficients, standard errors, and significance levels, allowing for a thorough assessment of the regression results.
The inclusion of multiple regression models (Probit, Probit Ridge, Logit, and Logit Ridge) allows for comparison and assessment of the robustness of the findings across different statistical approaches.
The inclusion of control variables helps to account for potential confounding factors and isolate the effects of the key predictors of interest.
While the table provides coefficients and significance levels, adding effect sizes (e.g., standardized coefficients, odds ratios) would enhance the interpretability and practical significance of the findings.
Rationale: Effect sizes provide a standardized measure of the magnitude of the association between predictors and suspension, allowing for easier comparison and understanding of the practical importance of the findings.
Implementation: Calculate and include effect sizes for each predictor variable in the table. For example, for Probit and Logit models, report average marginal effects or standardized coefficients. For ridge regression models, consider reporting standardized coefficients or other appropriate effect size measures.
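As one concrete way to act on this suggestion, statsmodels can report average marginal effects directly from a fitted probit model. The sketch below is self-contained and uses synthetic data with placeholder predictors, not the paper's variables.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 5_000

# Synthetic stand-in data with placeholder predictors (illustrative only).
X = pd.DataFrame({
    "low_quality_share": rng.poisson(2.0, n).astype(float),
    "bot_score":         rng.uniform(0, 1, n),
})
p = 1 / (1 + np.exp(-(-2.0 + 0.3 * X["low_quality_share"] + 0.8 * X["bot_score"])))
y = rng.binomial(1, p)

fit = sm.Probit(y, sm.add_constant(X)).fit(disp=False)
# Average marginal effects: change in Pr(suspended) per unit change in each predictor,
# averaged over the sample. Easier to interpret than raw probit coefficients.
print(fit.get_margeff(at="overall").summary())
```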
The table includes four different regression models, but the rationale for choosing these specific models is not explicitly stated.
Rationale: Explaining the reasons for selecting these particular models would enhance the transparency and methodological rigor of the analysis.
Implementation: Add a brief explanation in the table caption or a footnote justifying the choice of Probit, Probit Ridge, Logit, and Logit Ridge models. Discuss the assumptions of each model and why they are appropriate for the data and research question. For example, if ridge regression is used to address multicollinearity, mention this explicitly.
The table mentions the inclusion of control variables, but it doesn't specify which control variables were used.
Rationale: Providing a list of the control variables would improve the transparency and reproducibility of the analysis.
Implementation: Include a list of the control variables in the table caption or a footnote. Briefly describe each control variable and its potential relevance to the outcome variable. If the full list is extensive, consider providing it in a supplementary table.
This reporting summary outlines the statistical methods, software, data availability, and ethical considerations of the study. It confirms adherence to Nature Portfolio's reporting standards for reproducibility and transparency. The summary details the software used for data collection and analysis (Python, R, and STATA), affirms that the data necessary for reproducing the results is available online, and addresses the study's focus on publicly available social media data, which did not require ethical approval beyond MIT's observational study protocol.
The summary follows a clear structure, addressing key aspects of reporting in a concise and organized manner, facilitating easy access to essential information.
The summary highlights the importance of reproducibility and transparency, aligning with Nature Portfolio's policies and promoting rigorous research practices.
The summary provides a clear and accessible link to the publicly available dataset, enabling others to reproduce and verify the findings.
While the summary mentions data availability, it would be beneficial to explicitly state any restrictions on data access or use, even if none exist.
Rationale: Providing information about data restrictions, or lack thereof, enhances transparency and clarifies the terms of data use for other researchers.
Implementation: Add a sentence explicitly stating whether there are any restrictions on data access or use. For example, "There are no restrictions on data access or use." or specify any applicable limitations.
While the summary states that ethical approval was not required, providing a brief explanation of the ethical considerations taken into account when using public social media data would strengthen the ethical reporting.
Rationale: Addressing ethical considerations, even in observational studies using public data, demonstrates a commitment to responsible research practices and can help anticipate potential ethical concerns.
Implementation: Add a sentence or two discussing the ethical considerations related to using public social media data, such as privacy concerns and potential risks to individuals. For example, "While the data used in this study is publicly available, we took steps to ensure the privacy of individuals by anonymizing user data and avoiding the collection of sensitive personal information."
The summary mentions adherence to reporting standards but lacks specific details about the statistical methods employed. Providing more information about the specific tests used and any corrections applied would enhance transparency.
Rationale: Providing specific details on the statistical methods used would allow readers to better understand the analytical approach and assess the validity of the findings.
Implementation: Include a brief description of the specific statistical tests used in the study, such as t-tests, chi-square tests, or regression analyses. Mention any corrections applied for multiple comparisons or other statistical adjustments.