AI Safety in Multi-agent LLM Systems

Day: Feb 7, 2025
Time: 2:30–4pm
Session ID: Track 11
Location: CC2

IASEAI Program Overview

Abstract:

Imagine a community of LLM agents. Will they learn to cooperate with one another, or will they act selfishly? We know that human greed can cause the Tragedy of the Commons, but what about LLMs? Our AI safety benchmarking platform, GovSim, tests whether LLMs repeat the Tragedy of the Commons, as humans often do. We find that even the best model (GPT-4o) survives less than 54% of the time, which raises an important AI safety alarm for multi-agent systems.

Speakers:

Zhijing Jin