International AI Safety Assurance Agreements: Lessons From History, Feasibility Factors, and Hypotheses for the Future

Day: Feb 6, 2025
Time: 11:30am–1pm
Session ID: Track 01
Location: CC9-CC13
Abstract:

For reasons including lack of trust, concerns about AI risk, and the desire to regulate AI domestically without jeopardizing competitiveness, states may want to make credible claims about the safety of AI development within their jurisdictions, and to verify similar claims made by other states. Under what conditions could international agreements for mutual assurance of safe AI development be established? Through a qualitative analysis supported by expert interviews and case studies of international agreements in aviation safety, nuclear safety, and financial intelligence, I identify critical factors influencing the desirability and feasibility of such agreements. These include the availability of verification mechanisms that protect sensitive information, the degree of interstate rivalry, and the costs associated with safety policies. I highlight two primary challenges in fostering international cooperation on AI safety: (1) the trade-off between security and transparency, and (2) the trade-off between the feasibility and effectiveness of agreements. To address these challenges, I propose a framework for the institutional design of "minimum viable agreements" and examine strategies to prevent defection while encouraging participation. The feasibility of various governance arrangements is evaluated along three dimensions: their accountability targets (whether they focus on models, AI labs, or jurisdictions), their types of obligations (technology-based, performance-based, or process-based), and their assurance mechanisms (including mutual recognition, treaty-based reporting, and unilateral or multilateral inspections).

Speakers: