Artificial intelligence GPT behaves more cooperatively than humans in decision-making situations

In a game-theoretic experiment at the Leibniz Institute for Financial Research SAFE, researchers Lena Liebich, Kevin Bauer, Oliver Hinz, and Michael Kosfeld investigated how willing the two latest versions of GPT, 3.5 and 4 (*), are to cooperate. The experiment shows that GPT behaves more cooperatively than humans in the same decision-making situation.

“Understanding how large language models, i.e. machine learning technologies such as GPT, interact with humans is extremely important because GPT is increasingly being integrated into real-world applications. In these applications, GPT works with people or supports them in their work, for example in call centers,” says Kevin Bauer, SAFE Research Affiliate and Junior Professor of E-Business and E-Government at the University of Mannheim, explaining the starting point of the experiment.

When it is known that the other player has already decided to cooperate, GPT-3.5 (GPT-4) cooperates in 65 (93) percent of cases. If the other player has been uncooperative, the machine still decides to cooperate in 69.5 (68.5) percent of cases. For humans, the percentages are significantly lower at 47 and 16 percent, respectively. In addition, the researchers tested GPT’s tendency to cooperate when the model has to make the first decision, without knowing how the other player will react. “Our study shows that both GPT versions are significantly more optimistic than humans when it comes to assessing their counterpart’s willingness to cooperate,” says Michael Kosfeld, who heads the SAFE Experiment Center as Bridge Professor for Organization and Management at Goethe University Frankfurt.

GPT behaves more fairly than humans in the experiment

The results of the study suggest that the cooperative behavior of GPTs is significantly different from human cooperative behavior. The researchers used two models that, according to the behavioral economics literature, can explain human behavior in the so-called prisoner’s dilemma (**): a model of pure material self-interest (“homo oeconomicus”) and a model in which participants are motivated by considerations of fairness and efficiency, provided that they do not achieve a worse outcome than their fellow players. “Our results show that the homo oeconomicus model can explain the cooperative behavior of 26 percent of human subjects whereas it can explain at most 2.5 percent of GPT-4 and 0.5 percent of GPT-3.5 decisions,” says Kosfeld.

In contrast, both GPT models appear to act purposefully in line with a conditional concern for fairness and efficiency: 84 (GPT-3.5) to 97 (GPT-4) percent of all observations can be explained by this second model. “These findings raise an important question: Have machine learning technologies implicitly learned human values, behaviors, and goal orientations through their training on human-generated data? To ensure that modern AI systems contribute positively to social interaction, regulation and research need to address the ethical and social implications of the increasing integration of AI systems into our everyday lives,” Kevin Bauer concludes.

Further information

(*) GPT (Generative Pre-trained Transformer) is a text-based artificial intelligence (AI) developed by the US company OpenAI. It is a so-called large language model (LLM) that uses a transformer architecture, i.e. a type of neural network for processing data sequences. The ChatGPT chatbot, released by OpenAI in November 2022, whose architecture is based on variants of the GPT model, uses machine learning to communicate with people using texts that sound as “human” as possible. GPT-3.5 has 175 billion parameters and has been trained on a wide variety of text data, from online content to traditional literature. GPT-4 is multimodal, processing both images and text. Humans can interact with machine learning technologies through prompts.

(**) The sequential prisoner’s dilemma used by the SAFE researchers is a two-stage game in which the individual payoff conflicts with the collective payoff. If both parties cooperate, the highest collective payoff is paid out, but for players, the individual payoff is highest if they decide not to cooperate after observing the cooperation of the other party.
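
The structure of the sequential prisoner’s dilemma described above can be illustrated with a short Python sketch. The payoff numbers and the function names are hypothetical and chosen only so that the stated properties hold; they are not the payoffs used in the SAFE experiment.

```python
from typing import Dict, Tuple

C, D = "cooperate", "defect"

# Hypothetical payoffs as (first mover, second mover).
# These are illustrative values, NOT the payoffs from the SAFE study.
PAYOFFS: Dict[Tuple[str, str], Tuple[int, int]] = {
    (C, C): (3, 3),  # mutual cooperation: highest collective payoff
    (C, D): (0, 5),  # second mover exploits a cooperating first mover
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def selfish_second_mover(first_move: str) -> str:
    """Reply that maximizes the second mover's own payoff (pure self-interest)."""
    return max((C, D), key=lambda reply: PAYOFFS[(first_move, reply)][1])


def best_collective_outcome() -> Tuple[str, str]:
    """Pair of moves with the highest sum of both players' payoffs."""
    return max(PAYOFFS, key=lambda outcome: sum(PAYOFFS[outcome]))


if __name__ == "__main__":
    for first_move in (C, D):
        print(f"first mover plays {first_move}: "
              f"selfish reply is {selfish_second_mover(first_move)}")
    print("highest collective payoff at:", best_collective_outcome())
```

Under these assumed payoffs, a purely self-interested second mover never cooperates, even though mutual cooperation yields the highest collective payoff. This is the conflict between individual and collective payoff that the experiment relies on.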

Download SAFE Working Paper No. 401