Building PhantomSOC: A Self-Improving AI Security Platform with Google Cloud and Arize Phoenix

Over the past few weeks, I've been building PhantomSOC, an autonomous incident response platform that doesn't just investigate security incidents; it also learns from them.

Most AI-powered SOC tools can analyze alerts and generate reports. However, they rarely improve their investigation quality over time. That was the problem I wanted to solve.

The Problem

Security teams face thousands of alerts every day. False positives consume valuable analyst time, while real attacks can slip through the cracks. Even when investigations are completed, lessons learned often remain trapped in reports rather than improving future investigations.

I wanted to build a system capable of:

Investigating incidents autonomously
Evaluating the quality of its own investigations
Detecting blind spots and overconfidence
Learning from past mistakes
Automatically improving future investigations

The Architecture

PhantomSOC consists of three primary layers:

Layer 1 — SOC Triage Agent

The SOC Agent receives alerts and determines whether they are:

False positives
Suspicious activity
Escalation-worthy threats

It performs threat scoring, MITRE ATT&CK mapping, and checks previous investigations through an investigation memory system.
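
The triage step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Alert` fields, the score thresholds, the table layout, and the helper names (`classify`, `open_memory`, `remember`, `recall`) are all assumptions made for the example, and the threat score is assumed to arrive from an upstream scorer.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical thresholds; the post does not state the real cut-offs.
SUSPICIOUS_AT = 40
ESCALATE_AT = 70


@dataclass
class Alert:
    alert_id: str
    source_ip: str
    technique: str     # MITRE ATT&CK technique ID, e.g. "T1110"
    threat_score: int  # 0-100, assumed to come from an upstream scorer


def classify(alert: Alert) -> str:
    """Map a threat score onto the three triage outcomes."""
    if alert.threat_score >= ESCALATE_AT:
        return "escalate"
    if alert.threat_score >= SUSPICIOUS_AT:
        return "suspicious"
    return "false_positive"


def open_memory() -> sqlite3.Connection:
    """Create an in-memory stand-in for the investigation memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE investigations ("
        "alert_id TEXT, source_ip TEXT, technique TEXT, verdict TEXT)"
    )
    return conn


def remember(conn: sqlite3.Connection, alert: Alert, verdict: str) -> None:
    """Store the verdict so later alerts can be compared against it."""
    conn.execute(
        "INSERT INTO investigations VALUES (?, ?, ?, ?)",
        (alert.alert_id, alert.source_ip, alert.technique, verdict),
    )


def recall(conn: sqlite3.Connection, alert: Alert) -> list:
    """Return prior verdicts that share the alert's source IP or technique."""
    rows = conn.execute(
        "SELECT alert_id, verdict FROM investigations "
        "WHERE source_ip = ? OR technique = ? ORDER BY alert_id",
        (alert.source_ip, alert.technique),
    )
    return rows.fetchall()


memory = open_memory()
first = Alert("A-001", "203.0.113.7", "T1110", 82)
remember(memory, first, classify(first))

second = Alert("A-002", "203.0.113.7", "T1078", 45)
print(classify(second), recall(memory, second))
# prints: suspicious [('A-001', 'escalate')]
```

The second alert scores as merely suspicious on its own, but the memory lookup surfaces an earlier escalation from the same IP, which is the kind of context an investigation memory is meant to provide.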

Layer 2 — Phantom Forensic Agent

Escalated incidents are handed to the Phantom Forensic Agent, which handles digital forensics and incident response (DFIR).

This agent:

Reconstructs attack timelines
Extracts indicators of compromise
Maps attack chains
Generates stakeholder-specific reports
Produces autonomous incident response runbooks
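
Two of the tasks listed above, timeline reconstruction and indicator extraction, can be illustrated with a small sketch. The log format (an ISO timestamp followed by a message), the regular expressions, and the function names are assumptions for the example; the post does not describe how the forensic agent actually parses evidence.

```python
import re
from datetime import datetime

# Simplified patterns for illustration; they do not validate octet ranges.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SHA256 = re.compile(r"\b[a-fA-F0-9]{64}\b")


def extract_iocs(lines):
    """Collect unique IPv4 addresses and SHA-256 hashes from log lines."""
    iocs = {"ipv4": set(), "sha256": set()}
    for line in lines:
        iocs["ipv4"].update(IPV4.findall(line))
        iocs["sha256"].update(SHA256.findall(line))
    return {kind: sorted(values) for kind, values in iocs.items()}


def build_timeline(lines):
    """Parse '<ISO timestamp> <message>' lines and sort them by time."""
    events = []
    for line in lines:
        stamp, _, message = line.partition(" ")
        events.append((datetime.fromisoformat(stamp), message))
    return sorted(events)


# Hypothetical, out-of-order evidence lines.
evidence = [
    "2026-01-10T09:17:40 payload dropped sha256=" + "ab" * 32,
    "2026-01-10T09:15:02 sshd failed login from 203.0.113.7",
    "2026-01-10T09:16:11 sshd accepted login from 203.0.113.7",
]

for when, message in build_timeline(evidence):
    print(when.isoformat(), message)
print(extract_iocs(evidence)["ipv4"])
```

Sorting by parsed timestamp rather than by arrival order matters because evidence from different sources rarely arrives in sequence.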

Layer 3 — Learning Meta-Agent

This is the most interesting part of the project.

After each investigation:

The investigation is evaluated by an LLM Judge
Quality scores are generated
Confidence drift is measured
Historical traces are queried through Arize Phoenix MCP
Blind spots are identified
Investigation playbooks are automatically updated

Future investigations immediately benefit from those improvements.
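
One way to picture the loop above is the sketch below. It assumes that confidence drift means the average gap between the agent's self-reported confidence and the judge's score, and that the CRITICAL and WARNING levels come from fixed thresholds. Both are assumptions for illustration, as are the `Evaluation` type, the threshold values, and the playbook format; the post does not define them.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical severity thresholds on the mean confidence gap.
CRITICAL_AT = 0.25
WARNING_AT = 0.10


@dataclass
class Evaluation:
    incident_id: str
    agent_confidence: float  # 0-1, self-reported by the investigating agent
    judge_score: float       # 0-1, assigned by the LLM judge


def confidence_drift(evaluations):
    """Mean of (confidence - judge score); positive means overconfident."""
    return mean(e.agent_confidence - e.judge_score for e in evaluations)


def drift_level(drift):
    """Translate a drift value into a severity label."""
    if drift >= CRITICAL_AT:
        return "CRITICAL"
    if drift >= WARNING_AT:
        return "WARNING"
    return "OK"


def update_playbook(playbook, blind_spots):
    """Return a new playbook with one check added per unseen blind spot."""
    updated = list(playbook)
    for spot in blind_spots:
        step = f"Check for {spot}"
        if step not in updated:
            updated.append(step)
    return updated


history = [
    Evaluation("INC-1", agent_confidence=0.9, judge_score=0.5),
    Evaluation("INC-2", agent_confidence=0.8, judge_score=0.6),
]
print(drift_level(confidence_drift(history)))  # prints: CRITICAL
print(update_playbook(["Check for brute force"], ["lateral movement"]))
```

Returning a new playbook instead of mutating the old one keeps each version available for comparison, which helps when judging whether an update actually improved later investigations.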

Why Arize Phoenix?

Instead of using Phoenix purely for observability, I used it as the foundation of the learning loop.

Every Gemini reasoning step is captured through OpenInference instrumentation.

The Learning Agent queries Phoenix MCP to answer questions like:

Which investigations scored below 70%?
Which blind spots appear repeatedly?
Where is the system consistently overconfident?
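
The three questions above amount to simple aggregations once evaluation results are available as records. The sketch below runs them over hand-written sample data; the record shape, field names, and the `min_count` and `min_gap` parameters are assumptions, and the code does not call Phoenix or its MCP server.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical flattened evaluation records; real ones would be derived
# from traces and evaluations stored in Phoenix.
RECORDS = [
    {"id": "INC-1", "category": "phishing", "score": 0.62,
     "confidence": 0.90, "blind_spots": ["lateral movement"]},
    {"id": "INC-2", "category": "phishing", "score": 0.66,
     "confidence": 0.85, "blind_spots": ["lateral movement", "persistence"]},
    {"id": "INC-3", "category": "brute_force", "score": 0.81,
     "confidence": 0.80, "blind_spots": []},
]


def below_threshold(records, threshold=0.70):
    """IDs of investigations whose quality score is under the threshold."""
    return [r["id"] for r in records if r["score"] < threshold]


def repeated_blind_spots(records, min_count=2):
    """Blind spots flagged in at least min_count investigations."""
    counts = Counter(s for r in records for s in r["blind_spots"])
    return sorted(s for s, n in counts.items() if n >= min_count)


def overconfident_categories(records, min_gap=0.15):
    """Categories whose mean (confidence - score) gap reaches min_gap."""
    gaps = defaultdict(list)
    for r in records:
        gaps[r["category"]].append(r["confidence"] - r["score"])
    return sorted(c for c, g in gaps.items() if mean(g) >= min_gap)


print(below_threshold(RECORDS))           # prints: ['INC-1', 'INC-2']
print(repeated_blind_spots(RECORDS))      # prints: ['lateral movement']
print(overconfident_categories(RECORDS))  # prints: ['phishing']
```

Grouping the confidence gap by category is one possible reading of "consistently overconfident"; grouping by agent or by ATT&CK tactic would work the same way.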

That information becomes the input for operational improvement, feeding directly into the playbook updates.

Results

After enabling the self-improvement loop:

DFIR quality score improved from 58% to 77% (+19 points)
SOC quality score improved from 50% to 75% (+25 points)
Confidence drift severity dropped from CRITICAL to WARNING
MITRE ATT&CK coverage doubled from 3 tactics to 6
Investigation memory successfully recalled related incidents
Executive reports and runbooks were generated automatically

Technology Stack

Google ADK
Gemini
Google Cloud Run
Google Cloud Storage
Arize Phoenix
Phoenix MCP
OpenInference
Python
SQLite
Pydantic

What I Learned

The most valuable lesson from this project was that observability becomes far more powerful when it is connected directly to learning.

Tracing alone tells you what happened.

Evaluation tells you whether it was good.

Learning systems use that information to continuously improve.

That's the direction I believe autonomous agents need to move toward.

Project Links

GitHub:
https://github.com/ssurekumar01111-hue/phantomsoc

Live Demo:
https://phantomsoc-745097138732.us-central1.run.app/dashboard

Demo Video:
https://youtu.be/mAJ5f7dyKsk

I'd love to hear feedback from the community and discuss ideas for making autonomous security operations more reliable, observable, and self-improving.
Built for #Google Cloud Rapid Agent Hackathon 2026, #Arize Track.
