Using experimental data from a comprehensive field study, we explore the causal effects of algorithmic discrimination on economic efficiency and social welfare. We harness economic, game-theoretic, and state-of-the-art machine learning concepts to overcome the central challenge of missing counterfactuals, which generally impedes the assessment of the downstream economic consequences of algorithmic discrimination. In this way, we are able to precisely quantify downstream efficiency and welfare ramifications,
which provides us with a unique opportunity to assess whether the introduction of an AI system is actually desirable. Our results highlight that the capability of AI systems
to enhance welfare critically depends on the degree of inherent algorithmic bias. While an unbiased system in our setting outperforms humans and creates substantial
welfare gains, the positive impact steadily decreases and ultimately reverses the more biased an AI system becomes. We show that this relationship is particularly concerning in selective-labels environments, i.e., settings where outcomes are observed only if decision-makers take a particular action, so that the data is selectively labeled, because
commonly used technical performance metrics such as precision can be deceptive. Finally, our results indicate that continued learning, by creating feedback loops, can remedy algorithmic discrimination and its associated negative effects over time.