📢 We just released the initial findings from our collaboration with the Center for Security and Emerging Technology (CSET) at Georgetown University to Map the AI Governance Landscape - a first step toward systematically analyzing how governance documents address AI risks and mitigations.
🔍 What we did
• We tested an approach for using LLMs to categorize legal and governance documents from CSET’s ETO AGORA (AI GOvernance and Regulatory Archive).
• To develop and validate our approach, we compared the performance of five LLMs and human reviewers at classifying six documents.
• We then used Claude Sonnet 4.5 to classify more than 950 governance documents according to several taxonomies covering AI risks, mitigations, and related governance concepts.
📊 What we found
1️⃣ Models can match or exceed human reviewers at classification:
Our analysis suggested that, on average, for the six documents we examined, Claude Sonnet 4.5, Claude Opus 4.1 and GPT-5 achieved comparable or greater agreement with human consensus than the agreement achieved between two human reviewers.
2️⃣ Differences in coverage across the full CSET AGORA archive:
• The most covered risk subdomains were ‘Governance failure’, ‘AI system security vulnerabilities & attacks’ and ‘Lack of transparency or interpretability’.
• The sectors with most coverage were ‘Public Administration (excluding National Security)’, ‘Scientific R&D’, and ‘National Security’.
• The least covered risk subdomains were ‘AI Welfare and Rights’, ‘Multi-agent risks’ and ‘Economic and cultural devaluation of human effort’.
• The least covered sectors were ‘Accommodation, Food, and Other Services’, ‘Arts, Entertainment, and Recreation’ and ‘Real Estate and Rental and Leasing’.
See links in comments for more information.
💡 What’s next?
• We aim to keep testing and improving our approach and to add further taxonomies.
• Outputs will include reports, visualizations, and a searchable database to show which risks and mitigations are addressed or neglected in current AI governance.
• All outputs will be shared under open-access terms.
🔗 How can you engage?
We welcome feedback and support to refine and improve our approach; please see the attached report and links in the comments.
🙏 Thanks to the team behind this pilot: Simon Mylius (the project leader), Yan Zhu, Mina Narayanan, Adrian Thinnyun, Alexander Saeri, Jess Graham, Michael Noetel, and Neil Thompson
Tagging a few people who may be interested in this work:
Kevin Fumai Ravit Dotan, PhD Yonah Welker Sam Burrett Chris Kraft Kuba Szarmach Rafah Knight Bugge Holm Hansen Tony Moroney Nafis Alam Elena Gurevich Luiza Jarovsky, PhD Katharina Koerner Oliver Patel, AIGP, CIPP/E, MSc Samuel Salzer Phil Venables Pascal BORNET Reid Blackman, Ph.D. Renan Araujo Kevin Klyman Martin Ebers Dean Whitehouse