From Firewalls to Agent Governance: How AI Security Reinvented Itself


Key Takeaways

  • Generation One AI security tools were built to inspect prompts on the network path. That path no longer carries the highest-risk AI activity in the modern enterprise.
  • When Copilot or a similar platform reaches into SharePoint, email, or a CRM, the exposure happens at the data layer, inside applications a network firewall cannot see.
  • The agentic era introduces at least five actors that matter for security: the data, the human identity, the agent identity, the model, and the trigger. Generation One tools were built to track one conversation, not five actors.
  • Most enterprises cannot answer basic questions about their agent footprint: how many agents are running, who built them, what data they can access, and whether their behavior matches their intended purpose.
  • Generation Two AI security starts with agent inventory, continues with data and identity profiling, and hinges on behavioral baselining: knowing what an agent normally does so anomalous behavior becomes visible.

Written by Oz Wasserman (CPO and Co-Founder, Opsin) and Yaron Singer (CEO and Co-Founder, Robust Intelligence - acquired by Cisco)

The first generation of AI security was right about the threat, but in an evolving agentic world, it's no longer enough.

The team at Robust Intelligence built one of the first AI firewalls back in 2019, at a time when enterprises were just starting to understand that AI models could be weaponized. We were years ahead of the market, and the Cisco acquisition in 2024 proved the category was real. But something has changed in the last eighteen months, and the tools that defined Generation One of AI security, as groundbreaking as they were, were not designed for the world enterprises now operate in.

Here, we break down how the threat surface moved, why the old architecture cannot follow it, and what enterprise AI security has to look like next.

Generation One: The AI Firewall Era

The original AI security problem was relatively contained. A user sat at a keyboard, typed a prompt, and a model responded over a network. That network traffic was visible. You could intercept it, inspect it, and make a real-time decision: should this information flow or not?

Robust Intelligence's AI Firewall was built precisely for this interaction model. A sentinel sitting between the user and the model, watching every input and output. AI Validation stress-tested models before deployment. The AI Firewall protected them in production. It was the right architecture for the world at the time.

The mental model was simple, and it worked: one human, one model, one conversation. Block jailbreaks. Catch prompt injection. Stop sensitive data from leaking out in a response. Inspect the wire, and you could secure the interaction.

That assumption held for as long as AI was something that lived at the edge of the enterprise. As soon as it moved inside, everything changed.

The Paradigm Shift: Enterprise AI Changes Everything

Then came the enterprise AI wave. Microsoft Copilot. ChatGPT Enterprise. Claude Enterprise. Google Gemini for Workspace. These were not research tools or developer APIs. They were productivity platforms deployed inside organizations with decades of accumulated, often overprivileged, often unclassified data. The interaction model changed completely.

It was no longer a user typing a prompt and waiting for a response. It was an AI system reaching into SharePoint, Teams, email, CRM, HR files, finance systems, source code repositories. Pulling context from everywhere, acting on it, and sometimes taking actions on the user's behalf.

The network firewall could not see inside these in-app data connections. When Copilot summarizes a document a user should never have had access to in the first place, no prompt was injected, no model was jailbroken. The AI did exactly what it was asked to do. The problem is what it was allowed to see. The threat was no longer the model itself. It was what the model could see, touch, and change.

This is the inflection point most security teams missed. They were still buying tools designed to inspect prompts and responses, while the actual exposure was happening at the data layer, inside applications their network firewalls had no visibility into.

The Agentic Layer: The Problem Multiplies

Layered on top of enterprise AI is the agentic era, and this is where Generation One security breaks down completely.

Through MCP connections and native integrations, agents now connect to every structured and unstructured system in the enterprise. They do not wait to be asked. They act autonomously. Reading files. Drafting communications. Triggering workflows. Querying databases. Updating records. All without a human in the loop for each individual action.

Consider what an AI firewall was built to inspect: a conversation between a person and a model. Two actors. The agentic era has at least five.

  1. The data the agent is touching, with its own classification and access requirements.
  2. The identity of the human who created or deployed the agent.
  3. The agent's own identity, which often does not exist as a first-class concept in the IAM stack.
  4. The model powering the agent, which may change without notice.
  5. The prompt or autonomous trigger driving the action in the moment.
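The five actors above can be sketched as a single record attached to every agent action. This is a minimal illustration, not any product's schema: the `AgentActionEvent` class, its field names, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AgentActionEvent:
    """One agent action recorded with all five actors (hypothetical schema)."""
    data_resource: str        # 1. the data being touched
    data_classification: str  #    and its classification label
    owner_identity: str       # 2. the human who created or deployed the agent
    agent_identity: str       # 3. the agent's own identity
    model: str                # 4. the model powering the agent
    trigger: str              # 5. the prompt or autonomous trigger
    timestamp: datetime


def missing_actors(event: AgentActionEvent) -> list[str]:
    """Return the names of the actor fields that are empty in this record."""
    actor_fields = (
        "data_resource",
        "owner_identity",
        "agent_identity",
        "model",
        "trigger",
    )
    return [name for name in actor_fields if not getattr(event, name)]


# Illustrative values only; none of these names come from a real system.
event = AgentActionEvent(
    data_resource="hr/compensation-2024",
    data_classification="confidential",
    owner_identity="alice@example.com",
    agent_identity="",  # left empty: the agent has no identity of its own here
    model="example-model-v1",
    trigger="scheduled:weekly-report",
    timestamp=datetime(2025, 1, 6, 2, 0, tzinfo=timezone.utc),
)
print(missing_actors(event))
```

A record like this makes the gap concrete: a network capture would at best hold the trigger text, while the other four fields have to come from the application and identity layers.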

None of that context exists at the network level. You cannot catch what you cannot see, and the wire only shows you the smallest fraction of what is actually happening.

The result is a governance gap. Most enterprises today cannot answer basic questions about their agent footprint. 

  • How many agents are running? 
  • Who built them? 
  • What data are they connected to? 
  • What are they doing right now, and does their behavior match their intended purpose?

These are not far-reaching questions. They are the questions a CISO would have asked about any other class of identity or workload years ago. For agents, though, most organizations still do not have the answers.

Why The Old Architecture Cannot Follow

It’s worth being precise about why Generation One tools cannot simply be retrofitted for this new world. An AI firewall sits on the network path between a user and a model. That path still exists for some interactions, but the highest-risk AI activity in the modern enterprise no longer travels it. When Copilot reads a SharePoint site, the model and the data live on the same side of the firewall. When an MCP-connected agent queries a CRM, the call is an authenticated API request, not a prompt traversing a network boundary you can inspect.

Validation tooling has the same problem. Stress-testing a model before deployment tells you something useful about the model. It tells you very little about how that model will behave once it is wired into a specific company's data, with a specific company's permissions, executing a specific company's workflows. The risk is no longer in the model. The risk is in the deployment.

This is not a critique of the first generation of tools. They were built for the threat as it existed, and they were correct about that threat. The point is simpler: the threat moved.

What Generation Two Looks Like

The question Generation Two has to answer is not "is this prompt safe?" but rather, "where are all your agents, what can they access, and are they behaving the way they were designed to?" That reframe changes everything about how the technology has to be built.

It starts with discovery. The average enterprise has 850 or more agents running across the organization, the vast majority of which are unknown to the security team. Some are sanctioned; most are not. Some are powerful Copilot agents with broad access, while others are lightweight automations a single team built last quarter. You cannot govern what you cannot see, so the first job is inventory.

It continues with profiling. For each agent: what data is it connected to, through MCP or direct upload or native integration? What was it built to do? Who built it, and under whose identity does it act? What systems can it touch, and at what privilege level?
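As a rough sketch of what an inventory plus profile could hold, the Python below models one record per agent and answers the basic footprint questions from it. The `AgentProfile` fields, the `summarize` helper, and the sample agents are hypothetical and chosen only to mirror the questions in the text.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    """Hypothetical profile record for one discovered agent."""
    name: str
    purpose: str      # what the agent was built to do
    built_by: str     # who built it
    acts_as: str      # the identity it acts under
    sanctioned: bool  # known to and approved by the security team
    # Connected system -> privilege level ("read" or "write")
    connections: dict[str, str] = field(default_factory=dict)


def summarize(inventory: list[AgentProfile]) -> dict:
    """Answer the basic footprint questions from the inventory."""
    return {
        "total": len(inventory),
        "unsanctioned": sum(1 for a in inventory if not a.sanctioned),
        "agents_per_builder": dict(Counter(a.built_by for a in inventory)),
        "with_write_access": sorted(
            a.name for a in inventory if "write" in a.connections.values()
        ),
    }


# Illustrative inventory; names and systems are made up for the example.
inventory = [
    AgentProfile("meeting-notes", "summarize meeting notes",
                 "alice@example.com", "alice@example.com", True,
                 {"teams": "read", "crm": "write"}),
    AgentProfile("hr-reporter", "weekly HR reporting",
                 "bob@example.com", "svc-hr-reports", True,
                 {"hr-files": "read"}),
    AgentProfile("quick-export", "unknown",
                 "alice@example.com", "alice@example.com", False,
                 {"sharepoint": "read", "finance": "write"}),
]

summary = summarize(inventory)
print(summary)
```

Keeping `built_by` and `acts_as` as separate fields is a deliberate choice in this sketch, since the person who built an agent and the identity it runs under are often not the same.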

Then comes the part that matters most, which is behavioral baselining. What is this agent actually doing, day to day, hour to hour? Once you have that baseline, risk evaluation becomes meaningful. An agent quietly accessing HR files at 2am looks very different from the same agent doing the same thing at 10am during a planned reporting cycle. The action is the same. The context is everything.
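The 2am versus 10am example can be illustrated with a deliberately crude baseline that only remembers which hours of the day an agent has touched a given resource. The `HourlyBaseline` class and the agent and resource names are assumptions for this sketch; a production baseline would track far more signals and use statistical tolerance rather than exact hour matches.

```python
from collections import defaultdict
from datetime import datetime


class HourlyBaseline:
    """Remembers the hours of day at which each agent touched each resource."""

    def __init__(self) -> None:
        # (agent, resource) -> set of hours (0-23) seen during the learning period
        self._seen: dict[tuple[str, str], set[int]] = defaultdict(set)

    def observe(self, agent: str, resource: str, when: datetime) -> None:
        """Record one normal access while building the baseline."""
        self._seen[(agent, resource)].add(when.hour)

    def is_anomalous(self, agent: str, resource: str, when: datetime) -> bool:
        """Flag access to a never-seen resource or at a never-seen hour."""
        hours = self._seen.get((agent, resource))
        if not hours:
            return True
        return when.hour not in hours


baseline = HourlyBaseline()

# Learning period: the agent reads HR files at 10am on five consecutive days.
for day in range(1, 6):
    baseline.observe("hr-reporter", "hr-files", datetime(2025, 1, day, 10, 0))

at_2am = baseline.is_anomalous("hr-reporter", "hr-files", datetime(2025, 1, 7, 2, 0))
at_10am = baseline.is_anomalous("hr-reporter", "hr-files", datetime(2025, 1, 7, 10, 0))
print(at_2am, at_10am)
```

The action passed to `is_anomalous` is identical in both calls; only the timestamp differs, which is the point the paragraph above makes about context.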

This is the five-actor picture: data, human identity, agent identity, model, and trigger. Generation One security was never designed to hold all five in view at once. Generation Two has to.

The Shift From AI Firewalls to Agent Governance

The story of AI security is not one of failure but of evolution.

The first generation proved the category was real and the threat was serious. It built the muscle inside enterprise security teams to think about AI as a first-class risk surface rather than a curiosity. Without that foundation, no one would be ready for what comes next, as the threat surface moves from the model to the enterprise itself. The firewall era asked: is this AI safe? The agent governance era asks a harder question: what is this AI doing to your business, right now, without anyone watching?

That question requires a fundamentally different answer: visibility into systems the network never touched, an identity model that treats agents as actors, not features, and behavioral context that no validation suite can produce in a lab. It also requires a governance posture that assumes agents will keep multiplying faster than any team can manually track.

We are early in this shift. Most enterprises are still buying Generation One tools to defend against a Generation Two problem. The teams who recognize the gap first will be the ones who keep AI moving inside their organizations without losing control of it.

The firewall did its job. The next layer has to do a different one.

FAQ

Why can't existing AI firewalls protect against agentic AI risk?

AI firewalls were designed to intercept traffic on the network path between a user and a model. When an AI agent operates inside the enterprise, querying SharePoint via Microsoft Graph, pulling CRM records through an authenticated API, or executing workflows through an MCP connection, that activity does not cross a network boundary the firewall can inspect. The model and the data often live on the same side of the perimeter. The firewall sees nothing because nothing relevant travels through it.

What is excessive agency, and why does it matter for enterprise AI security?

Excessive agency, listed as LLM08 in the 2023 edition of the OWASP Top 10 for LLM Applications and as LLM06 in the 2025 edition, refers to an AI agent being granted more capability, access, or autonomy than its intended function requires. In practice, this means an agent built to summarize meeting notes that also has write access to a CRM, or a Copilot deployment that inherits legacy SharePoint permissions no human administrator would consciously grant today. The risk is not that the agent misbehaves in an obvious way. The risk is that it performs exactly as designed, on data it should never have been able to reach.
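One simple way to surface excessive agency is to compare what an agent has been granted against what its stated purpose needs. The sketch below assumes permissions can be written as `system:level` strings; that format, the `excess_permissions` helper, and the sample grants are illustrative and not taken from any specific platform.

```python
def excess_permissions(granted: set[str], required: set[str]) -> set[str]:
    """Return permissions the agent holds but its purpose does not require."""
    return granted - required


# A meeting-notes summarizer that also ended up with CRM access.
granted = {"teams:read", "crm:read", "crm:write"}
required = {"teams:read"}

excess = sorted(excess_permissions(granted, required))
print(excess)
```

An empty result means the grants match the purpose; anything else is a candidate for removal before the agent ever acts on it.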

What is behavioral baselining for AI agents, and how does it differ from traditional anomaly detection?

Behavioral baselining for AI agents means establishing what a specific agent normally does across data access, query patterns, timing, and connected systems, so that deviations become meaningful signals rather than noise. This differs from traditional anomaly detection in that the baseline must account for all five actors involved in an agentic transaction: the data being touched, the human identity that deployed the agent, the agent's own identity, the underlying model, and the trigger driving the action. A query into HR records at 2am by an agent whose baseline shows activity only during business hours is a different signal than the same query at 10am during a scheduled payroll cycle. Context is what makes the detection actionable.

How does Generation Two AI security handle agent identity when most IAM stacks do not treat agents as first-class identities?

Most enterprise IAM stacks were built for human users and, more recently, service accounts. Agents fall into a governance gap: they often act under a human user's delegated permissions, inherit access that the human may not fully understand, and operate across sessions and integrations that no single identity record captures cleanly. Generation Two security approaches this by profiling agents as distinct actors, separate from the human who created them, tracking what systems each agent is connected to, at what privilege level, and through what integration path. That profile is what makes behavioral baselining possible and what gives security teams a foundation for remediation that goes to root cause rather than suppressing individual alerts.

About the Author
Oz Wasserman
Oz Wasserman is the Co-Founder and CPO of Opsin, with over 15 years of cybersecurity experience focused on security engineering, data security, governance, and product development. He has held key roles at Abnormal Security, FireEye, and Reco.AI, and has a strong background in security engineering from his military service.
