AI Safety in Multi-agent LLM Systems

Day: Feb 7, 2025
Time: 2:30–4pm
Session ID: Track 11
Location: CC2
Abstract:

Imagine a community of LLM agents. Will they learn to cooperate with one another, or will they act selfishly? We know that human greed can cause the Tragedy of the Commons, but what about LLMs? Our AI Safety benchmarking platform, GovSim, tests whether LLMs repeat the Tragedy of the Commons, as humans often do. We find that even the best model (GPT-4o) survives <54% of the time, raising an important AI Safety alarm for multi-agent systems.

Speakers: