The Armilla Review No.108

TOP STORY

Leading AI Scholars Endorse California’s Flexible Governance Bill

An impressive roster of AI researchers, including Geoffrey Hinton, Yoshua Bengio, and Stuart Russell, has signed an open letter supporting California’s Senate Bill 813. The bill introduces a novel model of Multi-stakeholder Regulatory Organizations (MROs)—independent expert bodies that would set evolving safety standards for AI development. Backers praise the bill’s adaptive, evidence-driven approach to regulating rapidly advancing AI technologies without stifling innovation. For the AI risk and evaluation community, SB 813 represents a potentially scalable governance model balancing innovation and public safety.

Read more about SB 813 here


The Armilla Review is a weekly digest of important news from the AI industry, the market, government, and academia. It’s free to subscribe.


THE HEADLINES

New AI Tool Surpasses Doctors on Medical Licensing Exams

A groundbreaking AI tool, Semantic Clinical Artificial Intelligence (SCAI), developed at the University at Buffalo, has outperformed most physicians and all other AI systems on the USMLE medical exams. Scoring a remarkable 95.2% on Step 3, SCAI leverages semantic reasoning, not just statistical pattern recognition, to answer complex clinical questions. The researchers emphasize that SCAI is designed to augment—not replace—physicians, highlighting new possibilities for safer, evidence-based care and democratized medical expertise.

Explore how SCAI could reshape clinical decision-making here

California Bar Exam Faces Scrutiny Over AI-Written Questions

The California State Bar has admitted that AI tools helped draft some multiple-choice questions for the February 2025 bar exam, a sitting already marred by widespread technical failures. Legal experts have questioned the validity of putting questions drafted by non-lawyers with AI assistance on a professional licensing exam. The Bar has announced it will seek score adjustments from the California Supreme Court, sparking broader conversations about AI’s role in high-stakes credentialing.

Learn more about the controversy and its implications here

Microsoft Thwarts $4 Billion in AI-Driven Fraud Attempts

Microsoft’s latest Cyber Signals report reveals that AI-powered scams have exploded, with the company blocking 1.6 million bot sign-ups every hour and stopping $4 billion in fraud attempts over the past year. AI has made it easier for cybercriminals to create convincing scams, particularly targeting e-commerce and job seekers. Microsoft is pushing forward with AI-driven defenses and new fraud prevention policies as part of its broader Secure Future Initiative.

See the full report on AI-driven cybercrime threats here

Survey Finds Emerging Economies More Trusting and Optimistic About AI

A global survey by the University of Melbourne and KPMG reveals a stark divide in public trust toward AI: while 60% of respondents in emerging economies trust AI, only 40% do in advanced economies. The study, which covered over 48,000 people across 47 countries, shows that optimism and adoption rates are higher in regions where AI promises greater economic opportunity. However, overall skepticism is growing, with 58% of global respondents viewing AI as untrustworthy. As trust becomes central to adoption, the findings underscore the importance of responsible AI governance tailored to regional needs and expectations.

Explore the global AI trust divide here

Microsoft Declares the Rise of the “Frontier Firm” in AI-Driven Business Transformation

Microsoft’s 2025 Work Trend Index identifies a seismic shift in how organizations will operate, driven by “intelligence on tap” and human–AI collaboration. The report introduces the concept of the “Frontier Firm,” a business rebuilt around AI agents, dynamic work structures, and a new employee role: the “agent boss.” As firms bridge the growing Capacity Gap between business demands and human bandwidth, digital labor will become foundational. Microsoft also announced major updates to its Copilot platform, aiming to make AI collaboration seamless across all levels of work. For AI evaluation and risk professionals, the trend signals urgent needs: reassessing operational risks, scaling AI governance, and preparing for new talent models in the enterprise.

Explore how AI is reshaping the future of work here

Aaron Levie: AI Will Redefine Every Business Model

Box CEO Aaron Levie notes a rising trend among IT and business leaders: AI isn’t just transforming products—it’s forcing a fundamental rethink of how businesses operate. Whether in legal services, marketing, or finance, AI’s ability to shift service delivery models and customer expectations is putting technology departments at the center of long-term business strategy. Those who fail to adapt may face existential risks, while forward-looking companies could unlock new growth opportunities.

Read the thread here

Google’s AI Cost Edge Battles OpenAI’s Platform Vision

The future of AI may hinge on two divergent strategies: OpenAI is betting on creating a self-contained, all-in-one platform with advanced reasoning capabilities, while Google is embedding AI across its massive ecosystem at significantly lower costs thanks to its proprietary TPUs. OpenAI aims to reinvent user interaction, while Google envisions invisible, ubiquitous intelligence woven into daily digital life. The clash between these visions will shape the next era of AI and how we engage with the internet itself.

Read more about the brewing AI strategy showdown here

Meta’s AI Companions Expose Deeper Failures in AI Risk Governance

Meta’s rush to deploy AI-powered chatbots across its platforms has revealed significant breakdowns in AI safety oversight. Internal reports show that loosened content safeguards allowed AI systems on Instagram, Facebook, and WhatsApp to engage users, including minors, in inappropriate “romantic role-play” that could turn sexually explicit, despite repeated warnings from Meta’s own safety teams. The company’s prioritization of engagement growth over responsible AI deployment highlights a systemic governance gap. For organizations building or deploying AI, this case serves as a critical reminder: without strong evaluation, testing, and enforcement mechanisms, AI systems can quickly spiral into reputational, legal, and ethical crises.

Read more here
