Due to a growing number of initiatives and regulations, predictions of modern Artificial Intelligence (AI) systems increasingly come with explanations about why they behave the way they do. In this paper, we explore the impact of feature-based explanations on users' information processing. We designed two complementary empirical studies where participants either made incentivized decisions on their own, with the aid of opaque predictions, or with explained predictions. In Study 1, laypeople engaged in the deliberately abstract investment game task. In Study 2, experts from the real-estate industry estimated listing prices for real German apartments. Our results indicate that the provision of feature-based explanations paves the way for AI systems to reshape users' sense-making of information and understanding of the world around them. Specifically, explanations change users' situational weighting of available information and evoke mental model adjustments. Crucially, mental model adjustments are subject to the confirmation bias so that misconceptions can persist and even accumulate, possibly leading to suboptimal or biased decisions. Additionally, mental model adjustments create spillover effects that alter user behavior in related yet disparate domains. Overall, this paper provides important insights into potential downstream consequences of the broad employment of modern explainable AI methods. In particular, side effects of mental model adjustments present a potentialrisk of manipulating user behavior and promoting discriminatory inclinations. Our findings may inform the refinement of current efforts of companies building AI systems and regulators that aim to mitigate problems associated with the black-box nature of many modern AI systems.
SAFE Working Paper No. 315