#249
October 27, 2025

EP249 Data First: What Really Makes Your SOC 'AI Ready'?

Guest:

Topics: SIEM and SOC

29:29

Subscribe at Spotify

Subscribe at Apple Podcasts

Subscribe at YouTube

Topics covered:

  • We often hear about the aspirational idea of an "Iron Man suit" for the SOC—a system that empowers analysts to be faster and more effective. What does this ideal future of security operations look like from your perspective, and what are the primary obstacles preventing SOCs from achieving it today?
  • You've also raised a metaphor of AI in the SOC as a "Dr. Jekyll and Mr. Hyde" situation. Could you walk us through what you see as the "Jekyll"—the noble, beneficial promise of AI—and what are the factors that can turn it into the dangerous "Mr. Hyde"?
  • Let's drill down into the heart of the "Mr. Hyde" problem: the data. Many believe that AI can fix a team's messy data, but you've noted that "it's all about the data, duh." What's the story?
  • “AI-ready SOC”: What is the foundational work a SOC needs to do to ensure its data is AI-ready, and what happens when it skips this step?
  • And is there anything we can do to use AI to help with this foundational problem?
  • How do we measure progress towards an AI SOC? What gets better, and when? How would we know?
  • What SOC metrics will show improvement? Will anything get worse? 

Do you have something cool to share? Some questions? Let us know:

Summary

The podcast discussion centered on the shift from viewing AI in the SOC as a complete replacement for human analysts (the "Terminator" metaphor) to seeing it as an augmentative tool—the "Iron Man suit" for the analyst. Merza identified the current opportunity for AI adoption as stemming from the widespread availability and observed capabilities of large language models (LLMs) and agentic frameworks.

However, he introduced the concept of the "Mr. Hyde" of AI in the SOC, rooted in three core challenges: Data, Process, and Governance. The most critical insight discussed was that for an AI system to be robust and effective in a dynamic security environment, it must be wrong some of the time. Striving for zero errors leads to an overfitted model that misses real threats, mirroring the failures of overly strict traditional security controls. The discussion concluded by establishing a "Triad of Value" of Speed, Consistency, and Depth as the essential metrics for measuring progress toward an AI-ready SOC.

Detailed Summary and Analysis

The Aspiration: Iron Man Suit vs. Terminator

The conversation opened by exploring the long-held aspiration for an Iron Man suit for the SOC analyst—a tool that augments and empowers the human operator, making them vastly more productive. The distinction was immediately drawn between this concept and the alternative, the "Terminator"—a fully autonomous machine intended to replace the analyst.

The Opportunity: The current momentum for this aspiration is driven by the visible success of LLMs and agentic frameworks. Security practitioners, facing mission-critical challenges, see these tools as a potential solution to their chronic problems. Early, simple testing with general-purpose LLMs generates a wave of optimism.

The Philosophical Shift: Merza noted a recent and necessary shift in the industry away from the replacement narrative (eliminating Tier 1 or Tier N analysts) and towards the augmentation model. The Iron Man suit analogy is preferred because the human remains the driver of the capability. The complexity of security operations (investigations, threat hunting, compliance) is currently too great for any system to run autonomously across all pillars.

The "Jekyll and Hyde" of SOC AI

The discussion pivoted to the dual nature of AI in the SOC, presenting the beneficial Dr. Jekyll and the challenging Mr. Hyde.

Dr. Jekyll (The Noble Promise): The positive reality is the elimination of boring, repetitive work. Customers are currently using AI to automatically resolve thousands of well-understood alerts (e.g., "open port on EC2 instance"). This is achieved by connecting to multiple data lakes/sources and chaining investigative steps to produce a kill chain-driven result. The key benefit is that humans only see the alert when it is truly important, preventing desensitization caused by alert fatigue.
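
As a concrete illustration of that chaining pattern, the sketch below (not taken from the episode; the data sources, helper functions, and alert fields are hypothetical) shows how a triage agent might run a well-understood alert through a fixed sequence of enrichment and investigation steps before deciding whether a human needs to see it.

```python
# Hypothetical sketch of chained, kill-chain-oriented alert triage.
# Source names, helpers, and fields are illustrative, not a real API.
from dataclasses import dataclass, field

@dataclass
class Alert:
    id: str
    kind: str                      # e.g. "open_port_ec2"
    entity: str                    # host, user, or resource the alert fired on
    findings: list = field(default_factory=list)

def enrich_asset(alert: Alert) -> None:
    # Pull asset context (owner, exposure, tags) from a cloud inventory or CMDB.
    alert.findings.append(("asset_context", {"internet_facing": False}))

def check_initial_access(alert: Alert) -> None:
    # Query email/endpoint telemetry for plausible initial-access activity.
    alert.findings.append(("initial_access", {"suspicious_logins": 0}))

def check_exfiltration(alert: Alert) -> None:
    # Look for unusual outbound transfers from the affected entity.
    alert.findings.append(("exfiltration", {"bytes_out_anomaly": False}))

PIPELINE = [enrich_asset, check_initial_access, check_exfiltration]

def triage(alert: Alert) -> str:
    for step in PIPELINE:
        step(alert)
    # Escalate only if any investigative step produced evidence worth attention;
    # otherwise close the alert with the collected evidence attached.
    suspicious = any(v for _, data in alert.findings for v in data.values())
    return "escalate_to_human" if suspicious else "auto_close_with_evidence"

print(triage(Alert(id="A-1", kind="open_port_ec2", entity="i-0abc123")))
```

The point of the sketch is the structure, not the individual checks: the human only sees the alert when the chained evidence says it matters.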

Mr. Hyde (The Dark Side): The negative side manifests in two primary areas:

Vendor Hype and Buyer Misconception: Product manufacturers often overpromise, selling the belief that a "super awesome thing" can be simply acquired to deliver mass benefits, cost savings, and workforce reduction. This feeds into the struggle of security leaders who are convinced there is a shortage of "security unicorns" (those who can master 17 different platforms and topics) and thus seek technological shortcuts to solve a skills and complexity problem.

Technological Gaps and the "Generative" Problem: The core technical challenge lies in the "generative" part of Generative AI. Security practitioners require speed, consistency, and depth. A system that is fundamentally generative struggles to provide the necessary consistency for security work, which is governed by policy, regulation, and the need to follow best practices for auditor and adversary defense.

Challenges of AI-Native SOC Development

Merza outlined the common pitfalls encountered by well-resourced organizations attempting to build an AI SOC capability in-house:

Data challenges: Organizations often lack a proper inventory of their security-relevant data. They may believe all data is in one place when, in reality, it's fragmented across multiple cloud vendors and source systems.

Process challenges: A lack of standardized alert response procedures makes it impossible to model the analyst's behavior accurately, preventing the AI from performing reliable, repeatable work.

Governance challenges: This is the critical risk area. Concerns include:

Data Access: Should a security analyst (or their AI agent) have access to sensitive data like HR records?

SaaS Risk: Sending all alerts, investigations, and results to a third-party SaaS service raises compliance and liability concerns. An auditor could argue that the organization "knew about a threat or compliance violation" based on the AI's findings and failed to act.

The AI-Ready SOC and the Value of Error

The transition to an AI-Ready SOC requires fundamental changes, not just tooling acquisition.

AI-Readiness Checklist:

Data Inventory: Have a plan for where the necessary investigative data will come from, even if it's currently fragmented across multi-cloud environments.

Learn from Prior Work: Recognize the value of past analyst investigations, even inconsistent ones, as history to learn from and to train future actions.

Mindset Shift: Be comfortable with a machine closing an alert and concluding that "there is nothing to see here," acknowledging that humans are already doing this without proper investigation.

The Critical Insight: The Machine Must Be Wrong Sometimes: This was identified as the most important takeaway.

Overfitting Risk: If a vendor claims their machine is never wrong, the system is likely overfitted to avoid false positives, leading to a high rate of missed true attacks (false negatives).

Transparency as the Solution: Leaders must get accustomed to the machine making mistakes. The way to mitigate this is through transparency and inspection. The vendor must be able to openly state: "Under these circumstances, I will make a mistake, but I invite you to be okay with it, and here is how I made the process transparent so you can inspect and verify."
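
To make the overfitting trade-off concrete with invented numbers (not figures from the episode), the snippet below compares a calibrated triage model against one tuned to near-zero false positives; the counts are purely illustrative.

```python
# Illustrative confusion-matrix arithmetic with made-up numbers, showing why a
# "never wrong" (near-zero false positive) triage model tends to buy that
# property with missed real attacks (false negatives).
def rates(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    fpr = fp / (fp + tn)   # benign activity wrongly escalated
    fnr = fn / (fn + tp)   # real attacks wrongly auto-closed
    return fpr, fnr

# A calibrated model: admits some mistakes in both directions.
print("calibrated:", rates(tp=90, fp=50, fn=10, tn=9850))   # FPR ~0.5%, FNR 10%

# A model overfitted to avoid false positives: looks nearly perfect on paper,
# but quietly misses most true attacks.
print("overfitted:", rates(tp=30, fp=1, fn=70, tn=9899))    # FPR ~0.01%, FNR 70%
```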

The Increasing Alert Volume: Merza noted that customers anticipate roughly a one-third (33%) increase in alert volume year over year, despite better detection engineering. This is driven by growing technology footprints, more customers, and the deployment of new AI-capable tools (e.g., agent frameworks, guardrails on AI deployments) that introduce new risks requiring monitoring (e.g., transitive trust issues, new cloud misconfigurations such as exposed S3 buckets).

Measuring Progress: The Triad of Value

Measuring the success of AI adoption cannot rely on a single metric. Merza provided a Triad of Value (a three-legged stool) for measuring progress; a rough sketch of how these three measurements might be computed follows the list:

Speed: The time from alert generation to the moment a human is notified that an investigation is worth paying attention to (following contextualization, enrichment, and kill chain-driven analysis).

Consistency: The ratio of investigation quality across shifts, ensuring that an alert occurring at 1:00 AM is investigated against the same minimum quality baseline and process as an alert occurring at 4:00 PM when the "A-team" is working.

Depth: Did the AI conduct an analysis across a sufficient set of relevant points? For example, for a malware alert, did it check for initial access vectors (USB, email) and for exfiltration communications? The system must ensure a level of investigation deeper than that of a rushed analyst who simply closes the ticket.
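
As flagged above, here is a rough, hypothetical sketch of how the three legs could be computed from a set of investigation records; the record fields, shift labels, and required-check list are assumptions for illustration, not a real schema.

```python
# Hypothetical computation of the "Triad of Value" over investigation records.
# Field names (alert_ts, notified_ts, shift, checks_done) are illustrative.
from statistics import mean

REQUIRED_CHECKS = {"asset_context", "initial_access", "exfiltration"}

investigations = [
    {"alert_ts": 0, "notified_ts": 300, "shift": "night",
     "checks_done": {"asset_context", "initial_access"}},
    {"alert_ts": 0, "notified_ts": 120, "shift": "day",
     "checks_done": {"asset_context", "initial_access", "exfiltration"}},
]

def coverage(record: dict) -> float:
    # Share of the required investigative checks actually performed.
    return len(record["checks_done"] & REQUIRED_CHECKS) / len(REQUIRED_CHECKS)

# Speed: mean time from alert creation to human notification (seconds).
speed = mean(r["notified_ts"] - r["alert_ts"] for r in investigations)

# Consistency: off-hours investigation quality relative to the day shift.
def shift_quality(shift: str) -> float:
    return mean(coverage(r) for r in investigations if r["shift"] == shift)

consistency = shift_quality("night") / shift_quality("day")

# Depth: average share of required checks performed across all investigations.
depth = mean(coverage(r) for r in investigations)

print(f"speed={speed:.0f}s  consistency={consistency:.2f}  depth={depth:.2f}")
```

In this toy data set the night shift performs only two of the three required checks, so consistency comes out below 1.0, flagging exactly the 1:00 AM versus 4:00 PM gap described above.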

Key Discussion Points: Podcast Timeline

The conversation moved through several distinct phases, shifting focus from aspirational technology to governance reality.

Aspiration and Opportunity: The episode began by defining the goal as the Iron Man suit for the SOC analyst, enabled by the observed capabilities of LLMs and agentic frameworks.

Philosophical Distinction: A critical pivot was made from the "Terminator" model (replacement) to the Iron Man suit model (augmentation), emphasizing that the complexity of security operations requires the human to remain the driver.

The Dr. Jekyll Promise: The immediate, positive value of AI is the elimination of routine, boring work and the automated contextualization of thousands of alerts, thereby combating analyst desensitization.

The Mr. Hyde Threat: The negative challenges are rooted in two areas:

Industry Hype: Vendors overpromising full analyst replacement, preying on the CISO's struggle to find "security unicorns."

Technological Gap: Generative AI’s inherent design challenges the core security requirements of consistency and depth.

The Triad of Failure: Merza outlined three common reasons why in-house AI SOC projects fail: lack of Data Inventory, immature Process modeling, and severe Governance risks (especially regarding compliance liability and SaaS exposure).

The AI-Ready Mandate: Preparing the SOC requires a mindset shift that accepts that the machine must be wrong some of the time (an error rate greater than zero) to avoid being dangerously overfitted and missing attacks. Transparency is the only acceptable mitigation.

The Triad of Value: Progress must be measured holistically across three essential metrics: Speed, Consistency, and Depth of investigation.

Call to Action: The episode concluded with a recommended focus on foundational knowledge, specifically urging the community to read the Google "Attention is All You Need" paper to ground the AI conversation in technical reality.
