Introduction
The episode began with an introduction from Timothy Peacock, who welcomed Dominic Swierad to the podcast. The conversation immediately delved into using AI agents to secure Google, a topic both hosts noted was exciting and had been a long time coming. The discussion's primary focus was the practical application of AI, moving beyond the theoretical to the operational challenges and successes of integrating the technology into Google's security workflows.
Building Trust with Seasoned Security Professionals
Dominic addressed the initial challenge of introducing AI agents to Google's seasoned security professionals, a group described as "deeply skeptical." He explained that the skepticism wasn't about automation itself, as Google has a long history of automating security workflows. The real hurdle was the shift from machines being "excellent doers of human-defined plans" to becoming "excellent planners" that could also execute those plans autonomously.
The strategy to build trust came directly from the security engineers and privacy teams. The key insight was that for people to trust the technology, they had to understand it. The team started by integrating a simple generative AI chat interface directly into their existing tooling. This allowed engineers to interact with their own data, ask questions, and get a feel for how the core technology worked without the pressure of a full-blown agent. This low-stakes approach led to a significant and sticky surge in adoption, as engineers found it a convenient way to get unstuck and explore data. This initial success created a foundation of trust that later opened the door for discussions about more complex AI agents.
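As a rough illustration of that kind of embedded chat helper, here is a minimal sketch using the public google-genai client; the prompt, function names, and data plumbing are assumptions for illustration, since the internal Google integration described in the episode is not public.

```python
# Minimal sketch: a generative AI chat helper embedded in existing security tooling.
# Assumes GOOGLE_API_KEY is set; the prompt and data flow are illustrative only.
from google import genai

client = genai.Client()


def ask_about_ticket(question: str, ticket_text: str) -> str:
    """Let an engineer ask free-form questions about data already in the tool."""
    prompt = (
        "You are a security assistant embedded in an internal triage tool.\n"
        "Answer the engineer's question using only the ticket below.\n\n"
        f"TICKET:\n{ticket_text}\n\nQUESTION: {question}"
    )
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=prompt,
    )
    return response.text
```

The point of this shape is the low-stakes entry: the engineer stays in their existing tool and asks questions about data they already have, rather than delegating decisions to an agent.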
Selecting the Right Use Cases
The conversation then shifted to how the team prioritized the vast number of potential AI applications. The core principle was to avoid "sprinkling AI pixie dust on everything" and instead focus on areas that would deliver the most value. The team’s goal was to advance the state of security by identifying and addressing the most critical bottlenecks.
For example, while it might seem intuitive to use AI to push toward 100% detection coverage, a deeper analysis revealed that the operational teams could not absorb the resulting flood of alerts. The team therefore chose to focus on scaling the ops teams' capacity to handle more incidents efficiently.
Another crucial factor in selecting use cases was data availability. The team recognized that while AI is "magical," it's not "magic." Without great examples and curated data sets, a use case would be a non-starter. They focused on problems where they had a strong data foundation and could define what success looked like.
Quantifying ROI and Measuring Success
When it came to measuring success, Dominic emphasized that efficiency, while important, wasn't the sole metric. The team focuses on two primary areas:
Risk Reduction: Are the new capabilities helping to close security gaps that were previously unaddressable? Are they discovering new threats that human analysts were missing? This metric ensures the work is aligned with the ultimate goal of improving Alphabet's security posture.
Avoiding Repetition: Dominic referenced the "fool me once" concept, stating that if a security engineer discovers something once, no one should ever have to do it again. The success of the agents is measured by their ability to scale the security team's efforts, taking on repetitive tasks and freeing up engineers to focus on novel and complex threats.
Trust in the tools is measured by the volume and quality of new ideas from security engineers. The initial chat interface led to a trickle of ideas, but as trust grew, it turned into a "flood of concepts," with engineers actively proposing new applications for the technology.
Initial Successes and Privacy Considerations
Initial successes were built on the fundamental strengths of large language models: distillation and translation. The agents were effective at:
Summarizing Tickets: Taking an entire security ticket and distilling it into the "who, what, when, where, why, and how" so an analyst could make a quicker decision (a minimal sketch of this pattern follows this list).
Code Analysis: Looking at code to determine if it was potentially malicious.
File Sensitivity: Analyzing files to determine if they contained sensitive information, which had the added benefit of preserving privacy by reducing the number of human eyes that needed to view confidential data.
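To make the ticket-summarization item above concrete, here is a minimal sketch using the public google-genai structured-output support; the schema, prompt, and model name are illustrative assumptions, not the internal tooling described in the episode.

```python
# Minimal sketch: distill a security ticket into who/what/when/where/why/how.
# The schema and prompt are illustrative; assumes GOOGLE_API_KEY is set.
from pydantic import BaseModel
from google import genai
from google.genai import types


class TicketSummary(BaseModel):
    who: str
    what: str
    when: str
    where: str
    why: str
    how: str


client = genai.Client()


def summarize_ticket(ticket_text: str) -> TicketSummary:
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=f"Distill this security ticket into the six fields:\n\n{ticket_text}",
        config=types.GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=TicketSummary,
        ),
    )
    return TicketSummary.model_validate_json(response.text)
```

Constraining the output to a fixed schema is what makes the summary useful for a quick triage decision: the analyst always sees the same six fields, whatever the ticket looks like.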
Scaling Challenges and the Role of Data
Scaling the AI agents presented significant challenges, primarily related to the variety and unpredictability of real-world security data. The solution involved building a robust, continuous feedback loop. The team collects feedback from users, analyzes what worked and what didn't, and then uses that data to update and retrain the models.
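To make the feedback-loop idea concrete, here is a minimal sketch assuming a simple record of analyst verdicts; the field names and split logic are illustrative, not the team's actual pipeline.

```python
# Minimal sketch of a feedback loop: capture analyst verdicts on agent output,
# then split them into curated examples for the next prompt or model update.
# Field names and verdict values are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class AgentFeedback:
    ticket_id: str
    agent_output: str
    analyst_verdict: str  # e.g. "correct", "partially_correct", "wrong"
    analyst_note: str


def build_review_sets(feedback: list[AgentFeedback]) -> tuple[list[AgentFeedback], list[AgentFeedback]]:
    """Split feedback into positive examples and failure cases to study."""
    good = [f for f in feedback if f.analyst_verdict == "correct"]
    bad = [f for f in feedback if f.analyst_verdict == "wrong"]
    # Positive examples can seed few-shot prompts or fine-tuning data;
    # failures go back to engineers to refine instructions and guardrails.
    return good, bad
```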
In an interesting turn, the team is now using Gemini itself to help analyze and distill feedback from the high volume of security tickets, essentially using AI to improve the AI agents. This raises a parallel concern: if one agent is doing the work and another is doing the quality assessment, how do you ensure the second agent is also trustworthy? The solution is to decouple the two: the quality agent does not simply re-evaluate against past tickets but judges output against core security fundamentals defined by human engineers.
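A minimal sketch of that decoupling might look like the following, assuming a fixed, human-written rubric and an independently configured judge model; the criteria, names, and wiring are illustrative.

```python
# Minimal sketch: the quality check scores worker-agent output against a fixed,
# human-defined rubric rather than against the worker's own past tickets.
# Criteria and the judge callable are illustrative assumptions.
from typing import Callable

SECURITY_FUNDAMENTALS = [
    "Every claim about a host or user is backed by cited log evidence.",
    "The scope of impact (systems, accounts, data) is stated explicitly.",
    "Recommended actions follow least privilege and approved runbooks.",
]


def quality_check(worker_output: str, judge: Callable[[str], str]) -> str:
    """Score a worker agent's output with an independently configured judge."""
    rubric = "\n".join(f"- {c}" for c in SECURITY_FUNDAMENTALS)
    prompt = (
        "Evaluate the investigation summary below against each criterion.\n"
        f"Criteria:\n{rubric}\n\nSummary:\n{worker_output}\n\n"
        "Return pass/fail per criterion with a one-line reason."
    )
    return judge(prompt)  # `judge` wraps a model call configured separately from the worker
```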
Managing Risk and the Persona-Based Approach
Managing the risk of an agent making an error in a critical environment is paramount. The team's approach is to manage risk through identity and personas. A human security engineer may wear multiple hats, but an agent is given a highly scoped identity, such as a "malware investigation agent" or a "privacy agent." This approach forces a critical re-evaluation of the principle of least privilege, ensuring the agent can only perform specific, authorized actions. This is also crucial because an agent only has a "snapshot of the world" and lacks the broader context a human has from meetings, emails, and informal communication.
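A minimal sketch of the persona idea follows, assuming a simple in-process allowlist check; the persona names and actions are illustrative, not Google's actual identity system.

```python
# Minimal sketch: each agent gets a narrowly scoped persona with an explicit
# allowlist of actions, so least privilege is enforced before any tool call runs.
# Persona names and action strings are illustrative assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPersona:
    name: str
    allowed_actions: frozenset[str]


MALWARE_INVESTIGATOR = AgentPersona(
    name="malware_investigation_agent",
    allowed_actions=frozenset({"read_ticket", "fetch_binary_metadata", "query_sandbox_report"}),
)


def authorize(persona: AgentPersona, action: str) -> None:
    """Refuse any tool call outside the persona's scope."""
    if action not in persona.allowed_actions:
        raise PermissionError(f"{persona.name} is not permitted to perform {action!r}")
```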
The Evolving Role of Humans
Dominic believes the role of the human security engineer will shift significantly. The interaction with tooling will change, as engineers will need to record information and document their steps in a way that is clear and unambiguous for machines to interpret. This is a crucial skill, as a poorly documented note could lead an agent astray.
The human role will become more focused on the novel, cutting-edge threats, while the agents handle the repetitive, known, and "boring" tasks. This creates a symbiotic relationship: humans push the boundaries of defense, and once a novel threat becomes a known pattern, the agent takes over. The agent can then scale this knowledge across the entire organization, identifying patterns that a single human might miss. There is also a future where the agent can "push left," automating the response to a point where it can be handled by standard automation or preventative systems, further freeing up human talent.
Adversarial Use of AI and Future Outlook
On the flip side, Dominic is concerned about adversaries using agents to automate their attacks, noting that they are just as "lazy" as we are. He fears that AI will breathe "new life into old attacks," creating more deceptive and obfuscated threats at a larger scale, such as highly personalized phishing campaigns. This creates an "arms race" where defensive agents are needed to keep up with and defend against offensive agents. The solution is not just to react but to use AI to distill intelligence and provide a strategic view for security leaders.
Conclusion and Recommendations
To wrap up, Dominic provided two key recommendations for listeners:
Reading Material: He recommended staying current with research papers, but suggested starting with YouTube channels that distill complex papers into understandable concepts. He also mentioned books like "LLMs from Scratch" to understand the foundational technology.
Getting Started: For practical application, he recommended the Google ADK (Agent Development Kit) as an excellent starting point for building and experimenting with agents.
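As a hedged illustration of that starting point, the sketch below follows the ADK quickstart pattern (pip install google-adk); the tool and instructions are placeholders, and the ADK documentation should be treated as authoritative for the current API.

```python
# Minimal sketch of an ADK agent, following the quickstart pattern.
# The tool is a stub and the names are illustrative.
from google.adk.agents import Agent


def lookup_ioc(indicator: str) -> dict:
    """Toy tool: pretend to look up an indicator of compromise."""
    return {"indicator": indicator, "verdict": "unknown", "source": "stub"}


root_agent = Agent(
    name="triage_helper",
    model="gemini-2.0-flash",
    description="Answers basic triage questions and looks up indicators.",
    instruction="Use the lookup_ioc tool when the user provides an indicator.",
    tools=[lookup_ioc],
)
# Run locally with the ADK CLI (for example, `adk web`) to chat with the agent.
```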
The episode concluded with a recap and a call for listeners to subscribe and engage with the podcast community.
Timeline of Key Discussion Points
Building Trust: The conversation begins by addressing the challenge of gaining the trust of skeptical security professionals. The key strategy was to start with a simple, integrated generative AI chat interface to foster understanding before introducing full agents.
Prioritizing Use Cases: The discussion shifts to how the team selected initial projects, focusing on addressing security bottlenecks rather than just applying AI for its own sake. A key takeaway was the importance of having sufficient, quality data.
Defining Success: The hosts discuss the metrics for measuring ROI, moving beyond simple efficiency to focus on risk reduction and the elimination of repetitive work for security engineers.
Scaling and Feedback: The conversation explores the challenges of scaling agents and the necessity of a continuous feedback loop. This section introduces the concept of using AI to improve AI and the need for a quality-control agent.
Risk and Identity: Dominic outlines the persona-based approach to managing risk, where agents are given highly scoped identities to enforce the principle of least privilege. This highlights how agents force a more disciplined approach to access control.
The Evolving Human Role: The discussion focuses on how agents will change the job of a security engineer, shifting their focus to novel threats and requiring them to adapt how they document their work for machine consumption.
Adversarial AI: The final topic is the fear of adversaries using agents to automate and scale attacks, turning the defensive use of AI into a strategic arms race.