This paper explores the interplay of feature-based explainable AI (XAI) tech- niques, information processing, and human beliefs. Using a novel experimental protocol, we study the impact of providing users with explanations about how an AI system weighs inputted information to produce individual predictions (LIME) on users’ weighting of information and beliefs about the task-relevance of information. On the one hand, we find that feature-based explanations cause users to alter their mental weighting of available information according to observed explanations. On the other hand, explanations lead to asymmetric belief adjustments that we inter- pret as a manifestation of the confirmation bias. Trust in the prediction accuracy plays an important moderating role for XAI-enabled belief adjustments. Our results show that feature-based XAI does not only superficially influence decisions but re- ally change internal cognitive processes, bearing the potential to manipulate human beliefs and reinforce stereotypes. Hence, the current regulatory efforts that aim at enhancing algorithmic transparency may benefit from going hand in hand with measures ensuring the exclusion of sensitive personal information in XAI systems. Overall, our findings put assertions that XAI is the silver bullet solving all of AI systems’ (black box) problems into perspective.
SAFE Working Paper No. 315