Authors: Akash Suri, Aryan Pathania, Divyayush Verma, Rajat Takkar
Abstract: The proliferation of fake news on social media platforms poses significant threats to public discourse and democratic processes. While numerous machine learning approaches have been proposed for fake news detection, limited attention has been given to understanding why different models classify news as fake and whether these explanations are consistent across algorithms. This paper presents a comparative and explainable machine learning framework that addresses two critical research questions: (1) Do different ML models agree on which textual features indicate fake news? (Trust Gap Analysis), and (2) Do fake news patterns learned from one domain generalize to another? (Cross-Dataset Robustness). We evaluate four classical machine learning algorithms—Logistic Regression, Naive Bayes, Support Vector Machine, and Random Forest—using TF-IDF features on two distinct datasets: ISOT (political news, 44,898 articles) and WELFake (general news, 72,134 articles). Using SHAP (SHapley Additive exPlanations) for model interpretability, we compute Jaccard similarity and Spearman rank correlation to quantify agreement between model explanations. Our results reveal that different models exhibit varying levels of agreement on fake news indicators, with implications for model selection in real-world deployment. Furthermore, cross-dataset analysis identifies “universal” fake news features that generalize across domains versus “topic-specific” features that are domain-dependent. This work contributes a novel analytical framework for evaluating the trustworthiness and generalizability of fake news detection systems.
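The abstract's agreement metrics can be sketched in plain Python. This is a minimal, self-contained illustration, not the paper's implementation: the feature names and importance values are invented placeholders, standing in for per-feature mean |SHAP| scores from two models.

```python
# Hypothetical sketch of the two agreement metrics named in the abstract:
# Jaccard similarity of top-k feature sets, and Spearman rank correlation
# of importance scores over shared features. All values are illustrative.

def jaccard_top_k(imp_a, imp_b, k=5):
    """Jaccard similarity of the two models' top-k feature sets."""
    top_a = set(sorted(imp_a, key=imp_a.get, reverse=True)[:k])
    top_b = set(sorted(imp_b, key=imp_b.get, reverse=True)[:k])
    return len(top_a & top_b) / len(top_a | top_b)

def spearman(xs, ys):
    """Spearman rho computed as Pearson correlation of ranks (no ties)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order, start=1):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Placeholder mean |SHAP| importance per feature for two models.
lr_imp = {"reuters": 0.9, "said": 0.7, "video": 0.5, "image": 0.4, "breaking": 0.3}
rf_imp = {"reuters": 0.8, "video": 0.6, "said": 0.5, "hillary": 0.4, "breaking": 0.2}

jac = jaccard_top_k(lr_imp, rf_imp, k=5)

shared = sorted(set(lr_imp) & set(rf_imp))
rho = spearman([lr_imp[f] for f in shared], [rf_imp[f] for f in shared])
print(f"Jaccard@5 = {jac:.2f}, Spearman rho = {rho:.2f}")
# → Jaccard@5 = 0.67, Spearman rho = 0.80
```

Jaccard captures whether the models point at the same vocabulary at all, while Spearman captures whether they also order that shared vocabulary the same way; the two can disagree, which is why the paper reports both.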