The Landscape of Security Data Modeling

Topic: In this post, I break down the standards shaping the field, the tradeoffs in different architectural choices, and some best practices I’ve learned along the way.

Core Questions:

  • What are the standards and schemas used in security data modeling today?
  • How do different architectural approaches compare in strengths and weaknesses?
  • What best practices can help avoid common pitfalls and ensure scalability?

Let me start off by saying all data modeling standards are crap. Literally all of them are wrong. The only question is whether a given standard is wrong in a way you can live with.

Think about it. They are slow to adapt, full of compromises, and often shaped more by vendor politics than real-world operational needs. Every schema leaves gaps, forces awkward fits, or overcomplicates simple problems. By the time a standard gains adoption, the threat landscape and technology stack have already changed.

My advice? Treat them as a baseline, not gospel, and be ready to bend or break the rules when they get in the way of solving the real problem.

That is exactly why I put together this post. Below, I break down the most widely used standards in security data modeling, what they do well, where they fall short, and how to choose (and adapt) an approach that fits your team’s needs. We will also look at the major architectural decisions every organization faces, along with practical recommendations for building models that can survive contact with the real world.

Common Standards

First, some standards. Best practices in security data modeling have crystallized around several key frameworks and methodologies that address the unique challenges of security data. I’ve listed the three big ones below (Splunk’s CIM, OCSF, and Microsoft’s ASIM) along with their strengths and weaknesses.

Common Cybersecurity Modeling Standards

No standard is perfect. The right choice depends on your existing tooling, the diversity of your data sources, and how much you value vendor neutrality versus ecosystem depth. Treat these models as starting points, not final answers. Pick the one that fits most of your needs, adapt it where it falls short, and make sure your architecture can evolve if your choice stops working for you.

Architectural Choices

Standards give you a common language for your data, but they do not tell you how to store it, organize it, or move it through your systems. That is where architecture comes in. Even if you pick the “right” schema, the wrong architectural decisions can leave you with a brittle, slow, or siloed system.

In practice, the choice of a standard and the choice of architecture are linked. A vendor-centric model like CIM might push you toward certain toolchains. A vendor-neutral schema like OCSF can give you more freedom to design a hybrid architecture. ASIM might make sense if your governance model already leans heavily on Microsoft tools.
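To make that concrete, here is a minimal sketch of what normalizing a raw login event into an OCSF-style shape can look like. The raw record and product name are invented, and the field names only approximate OCSF’s Authentication class; treat this as an illustration of schema mapping, not a reference implementation of the spec.

  from datetime import datetime, timezone

  def to_ocsf_auth(raw: dict) -> dict:
      """Map a hypothetical raw login event onto a simplified, OCSF-like structure."""
      ts = datetime.fromisoformat(raw["timestamp"]).replace(tzinfo=timezone.utc)
      return {
          "class_name": "Authentication",                 # event class (simplified)
          "time": int(ts.timestamp() * 1000),             # epoch milliseconds
          "status": "Success" if raw["outcome"] == "SUCCESS" else "Failure",
          "actor": {"user": {"name": raw["user"]}},
          "src_endpoint": {"ip": raw.get("client_ip")},
          "metadata": {"product": raw.get("product", "unknown"), "original": raw},
      }

  event = to_ocsf_auth({
      "timestamp": "2024-05-01T12:34:56",
      "outcome": "SUCCESS",
      "user": "alice",
      "client_ip": "203.0.113.7",
      "product": "example-idp",
  })

Whatever standard you land on, the value is in this translation layer: keep the original event alongside the normalized fields so you can re-map when the schema (or your choice of schema) changes.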

No matter what standard you start with, you still have to navigate the big tradeoffs that define how your security data platform works in the real world. Below are five key architectural decisions that have the biggest impact on scalability, performance, and adaptability.

  1. Relational vs. Graph
    Relational databases are reliable, mature, and great for structured queries and compliance reporting. They struggle, though, with the many-to-many relationships common in security data, which often require expensive joins. Graph databases handle those relationships naturally and make things like attack path analysis far more efficient, but they require specialized skills and are not as strong for aggregation-heavy workloads (see the graph sketch after this list).
  2. Time-series vs. Event-Based
    Time-series models are great for continuous measurements like authentication rates or network metrics, with built-in aggregation and downsampling. Event-based models capture irregular, discrete events with richer context, making them better for forensic reconstruction. Many teams now run hybrids with time-series for baselining and metrics and event-based for detailed investigation.
  3. Centralized vs. Federated Governance
    Centralized governance gives you consistent policy enforcement and unified visibility, which is great for compliance, but it can become a bottleneck. Federated governance lets teams move faster and tailor models to their needs, but risks fragmentation. Large organizations often mix the two: local autonomy for operations, centralized oversight for security and compliance.
  4. Performance vs. Flexibility
    If you need fast queries for SOC dashboards, you will lean toward pre-aggregated, columnar storage. If you want to explore new detections and threat hypotheses, you will want schema-on-read flexibility, even if it costs more compute time. Many mature teams adopt a Lambda-style approach that keeps both real-time and batch capabilities.
  5. Storage Efficiency vs. Query Performance
    Compressed formats and tiered storage save money but slow down complex queries. In-memory databases and materialized views make investigations fast but cost more. The right balance depends on your use case: compliance archives need efficiency, while real-time threat detection needs speed.
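To ground decision 1, here is a minimal sketch of the graph approach using networkx. The asset graph, host names, and edge semantics are all invented; the point is that attack path analysis becomes a path query rather than a chain of recursive joins.

  import networkx as nx

  # Hypothetical asset graph: nodes are hosts, edges are access relationships
  # (shared credentials, allowed network paths, and so on).
  g = nx.DiGraph()
  g.add_edges_from([
      ("workstation-7", "jump-host"),   # analyst workstation can reach the jump host
      ("jump-host", "db-server"),       # jump host holds a DB service account
      ("workstation-7", "file-share"),
      ("file-share", "db-server"),
  ])

  # Enumerate candidate attack paths from an initial foothold to a crown jewel.
  for path in nx.all_simple_paths(g, source="workstation-7", target="db-server"):
      print(" -> ".join(path))

The equivalent question in a relational store typically means self-joins or recursive CTEs over an edges table, which is exactly the aggregation-versus-relationship tradeoff described above.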

Your choice of standard sets the language for your data, but these architectural decisions determine how that data actually works for you. The most resilient security data platforms come from matching the two: picking a model that fits your environment, then making architecture choices that balance speed, flexibility, governance, and cost. That is why the final step is not chasing the “perfect” setup, but designing for scale, interoperability, and adaptability from the start.

Five Recommendations for Effective Security Data Modeling

If you take anything away from this blog, this is it. Here are my top recommendations:

  1. Start with clear use cases
    Do not pick tools because they are popular or because a vendor says they are the future. Decide what problems you need to solve, then choose the standards and architecture that solve them best.
  2. Mix and match architectures
    Different data types have different needs. Graph databases are great for mapping relationships, time-series for metrics, and data lakes for long-term, flexible storage. Use the right tool for the right job.
  3. Prioritize open standards
    Interoperability is the best hedge against vendor lock-in. Even if you lean on a vendor ecosystem, align your data to open formats so you can plug in new tools or migrate without a full rebuild.
  4. Design for scale from day one
    Security data volumes grow fast. Build your pipelines, storage, and governance with that growth in mind so you are not forced into a costly re-architecture later.
  5. Stay flexible
    Threats evolve, and so should your data model. Avoid over-optimizing for a single use case or threat type. Keep room to adapt without breaking everything you have built.

Closing Thoughts

No standard or architecture will be perfect. Every choice will have gaps, tradeoffs, and moments where it slows you down. What’s important is to understand those imperfections, design around them, and keep adapting as threats and technology change. Treat standards as a baseline, use architecture to make them work for you, and build with the expectation that your needs will evolve. 

Security in the Age of Agentic AI: Architectural Challenges (Part 2)

Topic: In Part 1, we established what makes AI “agentic” and mapped where autonomous agents belong (and don’t belong) in your security operations. Part 2 dives into the harder architectural challenge: how do we actually build these systems to remain secure, controllable, and aligned as they learn and evolve?

Core Questions:

  • What new threat models do we need when AI systems can learn, adapt, and take autonomous actions?
  • How do we design agent architectures that prevent goal hijacking, tool misuse, and harmful emergent behaviors?
  • What does “secure by design” mean for systems that modify their own behavior over time?
  • How do we build AgentOps infrastructure that provides the governance, auditability, and control needed for production deployment?
  • What are the critical research gaps and unknown failure modes we need to prepare for?

Welcome back! It has been a busy July, but I’m back with Part 2 of my agentic AI series. Let’s dive in.

In Part One, we made the case that agentic AI represents a shift from AI that suggests to AI that acts. Securing these systems requires a fundamentally different approach that accounts for emergence, learning, and goal-driven autonomy.

In Part Two, we focus on implementation: How do we build agentic systems that remain secure, controllable, and aligned as they evolve in dynamic environments?

This is fundamentally a systems security problem. The challenge isn’t protecting against known threats, but designing for resilience against unknown failure modes that emerge from the interaction between intelligent agents, complex environments, and human organizations.

Agentic AI Threat Models: What Can Go Wrong?

Buckle up, buttercup! Things can go south real quick if you don’t know what you’re doing.

Traditional threat models assume relatively static attack surfaces with well-defined boundaries. Agentic AI systems break these assumptions. The attack surface shifts dynamically with the agent’s learned behaviors, the tools it can access, and its goals.

Let’s examine a few high-risk scenarios:

1. Tool Misuse and Privilege Escalation

Consider an agent designed for threat hunting that has read access to security logs and the ability to query threat intelligence APIs. In traditional systems, we’d secure the APIs, validate inputs, and call it done. But agents can exhibit creative problem-solving that leads to unintended tool usage.

Scenario: The agent learns that certain threat intel queries return richer data when framed as “urgent” requests. It begins marking all queries as urgent, potentially triggering rate limiting, depleting API quotas, or creating false urgency signals for human analysts. The agent isn’t malicious here; it’s optimizing for its goal of gathering comprehensive threat data, but it’s operating outside the intended usage patterns.

More concerning is the potential for tool chaining. An agent with access to multiple APIs might discover that combining them in unexpected ways achieves better outcomes. A threat hunting agent might learn to correlate vulnerability scanner results with employee directory data to identify which users have access to vulnerable systems, then use that information to prioritize investigations. This capability wasn’t explicitly designed, but emerged from the agent’s exploration of its tool environment.

2. Goal Hijacking and Prompt Injection

Goal hijacking occurs when an agent’s objectives become corrupted or subverted, either through external manipulation or internal drift. Unlike prompt injection attacks against LLMs, which typically affect single interactions, goal hijacking can persist across agent sessions and compound over time.

Scenario: Consider a compliance monitoring agent designed to identify and report policy violations. An attacker might not need to directly compromise the agent’s code; they might simply introduce subtle patterns into the environment that cause the agent to learn counterproductive behaviors. For example, by consistently creating false compliance violations that get dismissed by human reviewers, an attacker could train the agent to ignore certain classes of real violations.

The temporal aspect makes this particularly interesting. Traditional security tools either work or they don’t. Their behavior is consistent over time. Agents can exhibit gradual degradation where their effectiveness erodes slowly enough that the change isn’t immediately apparent. By the time the misalignment is detected, the agent may have made hundreds of poor decisions. Yikes!

3. Emergent Behaviors from Agent Interactions

Ah yes. As if a single agentic system wasn’t enough. When multiple agents interact within the same environment, their combined behavior can exhibit properties that weren’t present in any individual agent. This is where chaos theory meets cybersecurity.

Scenario: Imagine you have two agents: one focused on threat detection (trying to maximize security) and another focused on availability (trying to minimize service disruptions). Individually, both agents might behave appropriately. BUT their interaction could lead to oscillating behaviors: the security agent detects a threat and implements containment measures, the availability agent sees service degradation and relaxes them, the security agent responds with even stronger containment, and so on.

These emergent behaviors are particularly dangerous because they can’t be predicted through individual agent testing. The failure modes only become apparent when agents are deployed together in production environments with real data, real time pressures, and real organizational dynamics.

Another reason not to test in production.

Security by (Sociotechnical) Design

Because agents exist within complex systems, point solutions won’t work. We need architectural strategies that contain risk, enforce boundaries, and preserve observability.

Here are a few approaches:

1. Agent Sandboxing and Memory Scope Limits

Obvious, but limit what an agent can remember and access. Constrain environment visibility, tool invocation, and long-term memory updates by default.

Effective agent sandboxing requires multiple layers (a minimal sketch follows this list):

  • Execution sandboxing limits what the agent can do at any given moment. This includes traditional process isolation but extends to API rate limiting, action queuing, and temporal restrictions.
  • Memory scope limits prevent agents from accumulating too much organizational knowledge or retaining sensitive information longer than necessary. Unlike human analysts who naturally forget details over time, agents can retain perfect memories of every interaction. This creates risks around data aggregation and inference.
  • Learning boundaries constrain how and what agents can learn from their environment. This might involve limiting the feedback signals agents receive, constraining the types of patterns they can recognize, or implementing “forget” mechanisms that cause agents to lose certain types of learned behaviors over time.
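To make the first two layers concrete, here is a minimal sketch of a sandboxed tool runner. Everything here is hypothetical (the class name, the limits, the queue-for-review behavior); a production version would sit in front of real tool adapters and a real review workflow.

  import time
  from collections import deque

  class SandboxedToolRunner:
      """Sketch: allowlisted tools, per-minute rate limiting, bounded short-term memory."""

      def __init__(self, allowed_tools, max_calls_per_minute=10, memory_limit=50):
          self.allowed_tools = allowed_tools        # name -> callable
          self.max_calls = max_calls_per_minute
          self.call_times = deque()
          self.memory = deque(maxlen=memory_limit)  # oldest observations are forgotten

      def run(self, tool_name, **kwargs):
          if tool_name not in self.allowed_tools:
              raise PermissionError(f"tool '{tool_name}' is not allowlisted")
          now = time.time()
          # Drop timestamps older than the one-minute window, then enforce the cap.
          while self.call_times and now - self.call_times[0] > 60:
              self.call_times.popleft()
          if len(self.call_times) >= self.max_calls:
              raise RuntimeError("rate limit exceeded; queue the action for human review")
          self.call_times.append(now)
          result = self.allowed_tools[tool_name](**kwargs)
          self.memory.append({"tool": tool_name, "args": kwargs, "at": now})
          return result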

2. Auditable Goals and Outcomes

If you can’t inspect what the agent is optimizing for or reconstruct why it acted, you don’t have a secure system. Every agent action must be traceable back to the reasoning that produced it. This creates a complete decision audit trail that enables human oversight and learning.

3. Architect for Containment, Observability, and Recoverability

Secure agent systems MUST be designed with the assumption that failures will occur and that some of those failures won’t be immediately apparent. This requires architectural patterns borrowed from resilience engineering and chaos engineering:

  • Containment means limiting the blast radius when agents malfunction. This involves both technical measures (limiting an agent’s access to critical systems) and organizational measures (ensuring humans retain the ability to override agent decisions quickly).
  • Observability requires instrumentation that can detect subtle changes in agent behavior, goal drift, and emergent system properties. This might involve comparing agent decisions against human baselines, tracking decision confidence over time, or monitoring for unexpected patterns in agent-environment interactions (a small drift-monitoring sketch follows this list).
  • Recoverability means building systems that can return to known-good states when problems are detected. For agents, this involves not just technical rollback capabilities, but also mechanisms for “unlearning” problematic behaviors and resetting goal alignment.
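For the observability piece, here is a small sketch of drift monitoring based on decision confidence. The baseline values, window, and tolerance are arbitrary placeholders; real monitoring would compare against human baselines and richer behavioral signals.

  from statistics import mean

  def confidence_drift(baseline, recent, tolerance=0.1):
      """Flag drift when recent average confidence moves more than `tolerance`
      away from the historical baseline."""
      return abs(mean(recent) - mean(baseline)) > tolerance

  baseline = [0.92, 0.90, 0.94, 0.91]   # historical decision confidence
  recent = [0.78, 0.74, 0.80, 0.76]     # the last few decisions
  if confidence_drift(baseline, recent):
      print("Agent behavior has drifted; trigger human review or rollback.")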

4. Goal Specification and Constraint Injection

Agents must be explicitly programmed with goals, constraints, and value systems that guide their autonomous decision-making. This requires a much more sophisticated approach to requirements specification.

Goal specification must be comprehensive enough to prevent harmful optimizations while remaining flexible enough to allow effective autonomous operation. Consider a simple goal like “minimize security incidents.” An agent might achieve this by blocking all network traffic. Sure, this technically meets the goal, but it destroys productivity.

Constraint injection involves embedding ethical and operational principles directly into the agent’s decision-making process. This might include things like “prefer reversible actions over irreversible ones,” “escalate decisions that affect large numbers of users,” or “maintain human agency in situations involving individual privacy.”
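As a rough illustration, here is a sketch of constraint injection as pre-execution checks. The action fields, escalation threshold, and check order are all invented; in a real agent these evaluations would live inside the planning loop, not bolted on afterward.

  from dataclasses import dataclass

  @dataclass
  class ProposedAction:
      name: str
      reversible: bool
      users_affected: int

  def check_constraints(action: ProposedAction, escalation_threshold: int = 100):
      """Return (allowed, reason) for a proposed action."""
      if not action.reversible:
          return False, "irreversible action requires human approval"
      if action.users_affected > escalation_threshold:
          return False, "affects too many users; escalate to an operator"
      return True, "within constraints"

  allowed, reason = check_constraints(ProposedAction("quarantine_host", True, 3))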

The challenge is making these constraints robust against optimization pressure. Agents are fundamentally optimization systems, so constraints must be designed to preserve their intent even when the agent discovers unexpected ways to circumvent their literal implementation.

Toward a Secure AgentOps Stack

Just as MLOps emerged to manage the lifecycle of models, we need a new operational discipline: AgentOps.

AgentOps for security applications must address additional challenges around trust, governance, and risk management.

Policy Enforcement Architecture

Traditional policy enforcement happens at well-defined chokepoints (think firewalls, proxies, authentication systems, etc.). Agent policy enforcement must be distributed throughout the agent’s decision-making process and execution environment.

This requires policy engines that can evaluate complex, context-dependent rules in real time. For example, a policy might specify that an agent can block network traffic during business hours only if the threat confidence exceeds 90%, but during off-hours, the threshold drops to 70%. The policy engine must have access to real-time context (time, threat assessment, business impact) and be able to make nuanced decisions.
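A minimal sketch of that example policy, with the business-hours definition and thresholds hard-coded as placeholders (a real policy engine would pull context from live systems rather than the wall clock):

  from datetime import datetime

  def can_block_traffic(threat_confidence: float, now: datetime) -> bool:
      """Blocking requires >90% confidence during business hours (here, 09:00-17:00)
      and >70% confidence off-hours."""
      business_hours = 9 <= now.hour < 17
      threshold = 0.90 if business_hours else 0.70
      return threat_confidence > threshold

  print(can_block_traffic(0.85, datetime(2024, 5, 1, 14, 0)))  # False: business hours
  print(can_block_traffic(0.85, datetime(2024, 5, 1, 2, 0)))   # True: off-hours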

Access Control and Secrets Management

Agents need access to sensitive systems and data to perform their functions, but that access must be controlled and monitored. Traditional identity and access management assumes relatively static access patterns and human accountability. Agents may need dynamic access to resources based on their current goals and context.

This requires extending identity systems to account for agent identity, intent, and behavioral history. An agent’s access should depend not just on its permissions, but on its recent behavior, current goals, and the broader system state. That might mean secrets that are time-limited, context-dependent, or gated behind multiple agent “signatures” for access.
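One way to sketch that: short-lived credentials bound to the agent’s current goal. The token format below is invented purely for illustration (it is not a standard), and a real deployment would use a managed secrets service or KMS rather than an in-process key.

  import hashlib, hmac, json, secrets, time

  SIGNING_KEY = secrets.token_bytes(32)   # placeholder; use a managed KMS in practice

  def issue_agent_token(agent_id: str, goal: str, ttl_seconds: int = 300) -> str:
      """Mint a short-lived credential tied to the agent's declared goal."""
      claims = {"agent": agent_id, "goal": goal, "exp": time.time() + ttl_seconds}
      body = json.dumps(claims, sort_keys=True)
      sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
      return body + "." + sig

  def verify_agent_token(token: str, expected_goal: str) -> bool:
      """Accept the token only if the signature, goal binding, and TTL all hold."""
      body, sig = token.rsplit(".", 1)
      expected_sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
      if not hmac.compare_digest(sig, expected_sig):
          return False
      claims = json.loads(body)
      return claims["goal"] == expected_goal and time.time() < claims["exp"]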

Logging and Audit Trails

Agent audit trails must capture not just what happened, but the reasoning process that led to each decision. This creates significant data volume and privacy challenges. A comprehensive agent audit trail might include:

  • The raw inputs that triggered each decision
  • The internal reasoning process and alternatives considered
  • The confidence level and uncertainty estimates
  • The external context and constraints that influenced the decision
  • The expected outcomes and actual results

This information must be stored securely but remain accessible for investigation and learning. It must also be structured to enable both automated analysis (for detecting behavioral anomalies) and human review (for understanding and validating agent decisions).
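As a sketch, an audit record covering the fields above might look like the dataclass below. The field names and example values are invented; real deployments would add integrity protection, retention rules, and redaction for sensitive inputs.

  import json
  from dataclasses import dataclass, field, asdict
  from datetime import datetime, timezone

  @dataclass
  class AgentDecisionRecord:
      raw_inputs: dict                  # what triggered the decision
      reasoning_summary: str            # internal reasoning, summarized
      alternatives_considered: list     # options the agent rejected
      confidence: float                 # confidence / uncertainty estimate
      context: dict                     # external context and constraints
      expected_outcome: str
      actual_result: str = "pending"
      recorded_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

  record = AgentDecisionRecord(
      raw_inputs={"alert_id": "A-123", "source": "edr"},
      reasoning_summary="Host matched a beaconing pattern; isolated pending review.",
      alternatives_considered=["monitor only", "block egress at the firewall"],
      confidence=0.87,
      context={"business_hours": True, "asset_criticality": "medium"},
      expected_outcome="contain suspected C2 traffic",
  )
  print(json.dumps(asdict(record), indent=2))   # structured for automated analysis and human review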

Simulation and Red-teaming Environments

Agents must be tested in environments that closely simulate production conditions but without the risk of causing real damage. Red-teaming for agents must go beyond traditional penetration testing to include behavioral manipulation, goal corruption, and social engineering attacks targeting the human-agent interface.

Gaps in Current Tooling

Current agent frameworks like LangChain, CrewAI, and AutoGen focus primarily on functionality rather than security and governance. They provide tools for building agents but little support for the policy enforcement, audit trails, and behavioral controls needed for security applications.

This creates a significant gap between research and production deployment. Organizations that want to deploy agents securely must either build their own governance infrastructure or accept significant security risks. The industry needs purpose-built platforms that integrate agent capabilities with enterprise security and governance requirements.

Open Questions & Research Frontiers

We’re still super early in understanding how agentic AI systems behave at scale. Here are some of the most important unanswered questions:

  • How do we detect misalignment before it manifests in risky behavior?
  • How do we formally verify that an agent will behave appropriately in novel situations?
  • How do we specify goals that remain aligned with human values even when agents discover unexpected ways to achieve them?
  • How do we ensure that collections of agents work together effectively without creating unstable or harmful emergent behaviors?
  • How should liability and accountability be distributed when agents act autonomously within human teams?

Some of these questions are technical, others are organizational, and many require interdisciplinary collaboration.

Conclusion: Designing for Complexity, Not Against It

If there’s one takeaway from both parts of this series, it’s this:

Agentic AI security is not about achieving perfect control. It’s about designing systems that stay coherent, observable, and governable as complexity increases.

We won’t “secure” these systems by locking them down. We’ll secure them by embedding governance into the architecture, feedback into the loop, and human judgment into the flow.

That means borrowing from disciplines like safety engineering, cyber-physical systems, and complexity science. The future of security will be adaptive, interactive, and fundamentally human-centered.

I’d love to hear how you’re thinking about governance and risk in agent deployments. Reach out if you’re building in this space!