Combating Online Scientific Misinformation

PhD Thesis at EPFL

The drastic shift towards digital communication in our mediasphere has profoundly changed how information is produced and consumed, which in turn has substantial implications for the social and political landscape. Misinformation, as a side effect of mass information diffusion, has become a fundamental problem for governments, platforms, and the general public in light of critical events such as elections, pandemics, and wars. In this thesis, we focus on the problem of online scientific misinformation, proposing methods that tackle the problem at three levels of granularity: claims, articles, and sources.


SciClops

To combat claim-based scientific misinformation, we introduce SciClops, a method for detecting and contextualizing scientific claims to assist manual fact-checking. Our method involves three steps: (1) extracting scientific claims using a domain-specific, fine-tuned transformer model, (2) clustering similar claims together with related scientific literature using a method that exploits their content and the connections among them, and (3) highlighting check-worthy claims broadcast by popular yet unreliable sources. Our experiments show that SciClops effectively assists non-expert fact-checkers in verifying complex scientific claims, enabling them to outperform commercial fact-checking systems.
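The third step above, highlighting check-worthy claims from popular yet unreliable sources, can be illustrated with a minimal sketch. This is not the SciClops implementation: the `Claim` fields, the `check_worthiness` formula, and the example scores are all hypothetical, chosen only to show how popularity and unreliability might be combined into a single ranking signal.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    source_popularity: float   # hypothetical score in [0, 1]
    source_reliability: float  # hypothetical score in [0, 1]


def check_worthiness(claim: Claim) -> float:
    """Assumed heuristic: high when the source is popular and unreliable."""
    return claim.source_popularity * (1.0 - claim.source_reliability)


def rank_claims(claims: list[Claim]) -> list[Claim]:
    """Return claims ordered from most to least check-worthy."""
    return sorted(claims, key=check_worthiness, reverse=True)


# Made-up example claims and scores, for illustration only.
claims = [
    Claim("Vitamin X cures the flu", 0.9, 0.2),
    Claim("Regular exercise lowers blood pressure", 0.8, 0.9),
    Claim("Compound Y reverses ageing", 0.3, 0.1),
]
ranked = rank_claims(claims)
for claim in ranked:
    print(f"{check_worthiness(claim):.2f}  {claim.text}")
```

Under this assumed heuristic, a claim from a popular but unreliable source ranks above one from an equally unreliable but obscure source, which in turn ranks above a claim from a popular and reliable source.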

SciClops Overview



SciLens

To combat article-based scientific misinformation, we introduce SciLens, a method for evaluating the quality of scientific news articles. Our method involves a series of quality indicators for news articles that derive from: (1) their content, including the use of attributed quotes, (2) their scientific context, including their semantic similarity and web proximity to the scientific literature, and (3) their social context, including their social media reach and stance. Our experiments show that non-experts with access to these indicators evaluate the quality of articles more accurately than non-experts without access to them. Moreover, SciLens can also produce fully automated quality scores for articles, which agree more closely with expert evaluations than the manual evaluations of non-experts do.
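As a rough illustration of one scientific-context indicator, the semantic similarity between a news article and a related paper, the sketch below computes the cosine similarity of bag-of-words vectors. This is a simple stand-in, not the representation SciLens uses; the function names `tokenize` and `cosine_similarity` and the example texts are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the term-count vectors of two texts."""
    vec_a, vec_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


# Made-up example texts, for illustration only.
article = "New study links coffee consumption to lower heart disease risk"
abstract = "We study the association between coffee consumption and heart disease"
print(round(cosine_similarity(article, abstract), 3))
```

A score near 1 suggests the article closely tracks the vocabulary of the paper, while a score near 0 suggests little lexical overlap. A production system would more plausibly rely on learned text embeddings than on raw term counts.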

SciLens Overview



SciLander

For combating source-based scientific misinformation, we introduce SciLander, a method for learning representations of news sources reporting on scientific topics. Our method involves heterogeneous source indicators that capture: (1) the copying of news stories between sources, (2) the semantic shift of terms across sources, (3) the usage of jargon, and (4) the stance towards specific citations. SciLander uses these indicators as signals of source agreement to train unsupervised source embeddings. Our experiments show that the learned source representations outperform state-of-the-art baselines on the task of news veracity classification while encoding information about the reliability, political leaning, and partisanship bias of these sources.
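The first indicator above, the copying of news stories between sources, can be approximated in a minimal sketch by measuring the overlap of word shingles between two articles. This is an assumed simplification, not the SciLander method: the helper names `shingles` and `copy_score`, the shingle length `k`, and the example texts are hypothetical.

```python
def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (contiguous word windows) in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def copy_score(text_a: str, text_b: str, k: int = 3) -> float:
    """Jaccard overlap of shingle sets; 1.0 means identical shingle sets."""
    set_a, set_b = shingles(text_a, k), shingles(text_b, k)
    if not set_a or not set_b:
        return 0.0  # texts shorter than k words yield no shingles
    return len(set_a & set_b) / len(set_a | set_b)


# Made-up example articles from two hypothetical sources.
story_a = "Scientists report that the new vaccine is safe and effective in trials"
story_b = "Scientists report that the new vaccine is safe according to early data"
print(round(copy_score(story_a, story_b), 3))
```

Pairs of sources whose articles repeatedly score high on such an overlap measure could be treated as agreeing, which is the kind of signal the paragraph above describes as input for training source embeddings.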

SciLander Overview



Details

Publications
  • [ICWSM'23] M. Gruppi, P. Smeros, S. Adali, C. Castillo, K. Aberer. SciLander: Mapping the Scientific News Landscape. [pdf, bib]
  • [CIKM'21] P. Smeros, C. Castillo, K. Aberer. SciClops: Detecting and Contextualizing Scientific Claims for Assisting Manual Fact-Checking. [pdf, slides, bib]
  • [VLDB'20] A. Romanou, P. Smeros, C. Castillo, K. Aberer. SciLens News Platform: A System for Real-Time Evaluation of News Articles. [pdf, bib]
  • [WWW'19] P. Smeros, C. Castillo, K. Aberer. SciLens: Evaluating the Quality of Scientific News Articles Using Social Media and Scientific Literature Indicators. [pdf, slides, bib]
System
Code, Models, Datasets
Support