The Explainable AI Paradox: When Transparency Improves Decision Quality And When It Creates Overconfidence

Authors: Dr. Harsha Sammangi, Poloju Pravalika

Abstract: Explainable artificial intelligence (XAI) is widely promoted as a remedy for algorithmic opacity, premised on the assumption that revealing a model's reasoning improves human oversight and decision quality. This study investigates the Explainable AI Paradox: the possibility that explanations increase trust, reliance, and adoption while also producing overconfidence in flawed models, improving decision quality when models are valid but amplifying errors when they are not. In a controlled experiment, 921 managers made pricing, lending, or inventory decisions assisted by AI systems of deliberately varied validity and were randomly assigned to one of five explanation conditions: no explanation, feature-importance, counterfactual, uncertainty-aware, or combined feature-importance-plus-uncertainty. The study measured decision accuracy, confidence calibration, reliance behavior, override justification quality, and simulated financial outcomes. Relative to no explanation, feature-importance explanations increased reliance (+9.8 percentage points, p < .001) and confidence (+0.61 scale points, p < .001), but also produced the largest overconfidence increase (+0.084, p < .001) and the only negative financial outcome effect (–0.14 SD, p < .01) among all explanation types. These costs were concentrated in the flawed-model conditions: a significant Explanation × Model-Quality interaction (β = 0.047 to 0.089 across outcomes, all p < .001) confirms that the benefits of feature-importance explanations accrue under valid models while their overconfidence costs accrue under flawed models. Uncertainty-aware explanations, by contrast, improved calibration (–0.058, p < .001), reduced overconfidence (–0.047, p < .001), and produced the only significant positive financial outcome (+0.29 SD, p < .001) relative to no explanation.
A twelve-stage design intervention pilot demonstrates that combining five calibration-oriented design principles — feature reliability tagging, confidence-first ordering, disagreement prompts, active verification nudges, and explanation-accuracy feedback — reduces the Overconfidence Index by 83% (from 0.084 to 0.014, p < .001) relative to unmanaged feature-importance explanations. Thematic analysis of 40 participant interviews identifies six mechanisms underlying these patterns, including a 'plausibility heuristic substitution' through which surface-level explanation coherence substitutes for independent verification. The paper contributes a theory of the Explainable AI Paradox to behavioral information systems research, identifies model-quality and explanation-type interactions as the central moderating mechanism, and provides a five-level maturity roadmap and design decision framework for deploying explainable, uncertainty-aware managerial AI systems.

DOI: http://doi.org/10.5281/zenodo.20843984
