Microsoft makes building trustworthy AI agents easier and more secure


AI agents are the hottest topic in tech at the moment -- with good reason. Agents take the assistance offered by traditional AI chatbots a step further by performing tasks for you. Whether the task is simple, like sending or responding to an email, or complex, like approving procurement orders, agents need access to your data, which makes safety especially important.
At Microsoft Build, its annual developer conference, the company unveiled platform updates during the keynote, and AI agents were a big topic. Beyond releasing powerful agents that can optimize people's workflows in Microsoft 365 applications, GitHub, and more, Microsoft also released new features that make building a trustworthy agent easier.
Also: 60% of AI agents work in IT departments - here's what they do every day
"The amazing thing about agents is that they are actually able to do so much more -- they use tools, they take actions on your behalf -- and so the space of what can go wrong is much more significant," said Sarah Bird, CPO of Responsible AI at Microsoft to ZDNET.
AI agent monitoring
Your AI agents will have access to your organization's sensitive data, so you'll want to put them to the test to ensure they are as accurate and resistant to attack as possible. Microsoft has introduced new tools to make that job easier.
Agent Evaluators help users measure how well agents understand and carry out user goals, stay aligned to requests, and select tools, according to Microsoft. A new AI Red Teaming Agent automates the red teaming of generative AI systems by recreating realistic attack scenarios, helping teams identify vulnerabilities and mitigate risks.
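These evaluators surface through Microsoft's azure-ai-evaluation Python package, currently in preview. The sketch below is a minimal example under that assumption: IntentResolutionEvaluator and its parameters reflect the preview SDK and may change, and the endpoint, deployment, and key values are placeholders.

```python
# Minimal sketch of scoring one agent interaction, assuming the preview
# azure-ai-evaluation package (pip install azure-ai-evaluation) and an
# Azure OpenAI deployment to act as the judge model.
from azure.ai.evaluation import IntentResolutionEvaluator

# Judge-model configuration; endpoint, deployment, and key are placeholders.
model_config = {
    "azure_endpoint": "https://<your-resource>.openai.azure.com",
    "azure_deployment": "gpt-4o",
    "api_key": "<your-api-key>",
}

evaluator = IntentResolutionEvaluator(model_config=model_config)

# Score whether the agent's response actually resolved the user's request.
result = evaluator(
    query="Book a meeting room for Tuesday at 10am.",
    response="I've reserved Room 4B for Tuesday, 10:00-11:00.",
)
print(result)  # e.g. {'intent_resolution': 4.0, ...}
```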
"By using new agentic technology and turning our safety evaluation system into an agent, that will just adversarially red team your system; it is way, way easier to use, and it also results in better testing, because the agent is able to iteratively adversarially attack your system and try to get through," said Bird.
Similarly, the Agent Observability features in Azure AI Foundry, Microsoft's platform for building, deploying, and managing AI apps and agents, allow developers to view built-in metrics such as performance, quality, cost, and safety via a single dashboard.
Also: Tech leaders are seemingly rushing to deploy agentic AI - here's why
These metrics are available throughout the AI development cycle, from early ideation in the Agents Playground to production deployment via GitHub and Azure DevOps. Evaluations run automatically on every code update, making it easier to monitor and iterate on AI systems.
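Concretely, running evaluations on every code update usually means an evaluation pass wired into CI that fails the build when quality regresses. Here is a rough sketch of such a gate; run_agent, score_response, and the threshold are illustrative placeholders, not part of any Microsoft SDK.

```python
# Hypothetical CI gate: evaluate the agent against a fixed test set and
# fail the build if average quality drops below a threshold.
import sys

TEST_CASES = [
    ("Summarize this contract.", "summary"),
    ("Schedule a call with finance.", "scheduling"),
]
QUALITY_THRESHOLD = 0.8  # illustrative bar, tune per project

def run_agent(prompt: str) -> str:
    return "stub response"            # replace with a call to your agent

def score_response(prompt: str, response: str) -> float:
    return 1.0                        # replace with a real evaluator score

scores = [score_response(p, run_agent(p)) for p, _ in TEST_CASES]
average = sum(scores) / len(scores)
print(f"average quality: {average:.2f}")

if average < QUALITY_THRESHOLD:
    sys.exit(1)  # non-zero exit fails the GitHub Actions / Azure DevOps job
```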
Defender alerts in Foundry provide real-time visibility into security threats in model deployments, enabling developers to assess and respond to suspicious activity. These alerts include contextual recommendations and direct links to resolution actions in the Azure Portal. Through a new integration with Microsoft Purview, users can apply enterprise-grade security, governance, and compliance controls to AI systems built with Azure AI Foundry models.
Microsoft Entra Agent ID, Spotlighting, and more
Microsoft Entra ID allows organizations to safely manage their workforce's access to applications, information, and company data across clouds and on-premises environments. Now, Microsoft is introducing the preview of Microsoft Entra Agent ID, which expands access management to AI agents built with Azure AI Foundry and Microsoft Copilot Studio.
Once an agent is created in either platform, it will automatically be assigned an identity in Microsoft Entra, so admins can immediately control access permissions, the same way they can control access for human identities.
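Because agent identities live in Entra alongside user and application identities, admins can inspect them with the same tooling. The sketch below calls the real Microsoft Graph servicePrincipals endpoint via azure-identity; how Entra Agent ID entries are tagged in the directory is an assumption here, so check Microsoft's documentation before filtering on any agent-specific attribute.

```python
# Listing directory identities via Microsoft Graph, assuming azure-identity
# is installed and the caller has Graph read permissions. How Entra Agent ID
# entries are tagged is an assumption; consult the docs for the actual
# attribute before filtering on it.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token(
    "https://graph.microsoft.com/.default"
).token

resp = requests.get(
    "https://graph.microsoft.com/v1.0/servicePrincipals?$top=20",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()

for sp in resp.json().get("value", []):
    # An agent identity would appear here like any other principal.
    print(sp["displayName"], sp["id"])
```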
"An agent is this new thing that you're deploying in your system that isn't quite like an application, isn't quite like a user, but it might behave like an application, or it might sometimes behave like a user," said Bird. "We need to extend all of our existing management and governance, and security tooling, to handle this new category of things."
Also: 100 leading AI scientists map route to more 'trustworthy, reliable, secure' AI
Microsoft also introduced a Spotlighting capability in preview, built into Azure AI Content Safety to strengthen the Prompt Shields guardrail and make it better at detecting and stopping indirect prompt injections that manipulate the AI for malicious outcomes, according to the release.
"Prompt shields look for attacks that are coming through the user interface, but also look for attacks that are coming hidden in the data," said Bird. "One of the things we've done for Build is extend it to work for attacks that are hidden in tool calls or other things that our agents are doing."
Other new guardrails include a PII detection guardrail in Foundry, which redacts sensitive information and adds support for user-defined content filtering, and a task adherence guardrail in Content Safety, which ensures AI agents stay on track with their assigned tasks.
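Foundry's PII guardrail is configured at the platform level, but the underlying idea, detecting and redacting sensitive entities before they leave the system, is one that Azure's existing Language service also exposes. As a rough stand-in for the concept (not the Foundry guardrail itself), assuming the azure-ai-textanalytics package and a provisioned Language resource:

```python
# Conceptual stand-in for PII redaction, using Azure's existing Language
# service (azure-ai-textanalytics), not the new Foundry guardrail itself.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-key>"),
)

documents = ["Contact Jane Doe at 555-0100 about invoice 8841."]
for doc in client.recognize_pii_entities(documents):
    if not doc.is_error:
        # The service returns the text with detected entities masked out.
        print(doc.redacted_text)
```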