AADR FOR FINTECH
Defining the runtime security layer for AI agents
Last month, an MCP server used by an engineering team I'd been chatting with shipped clean on Monday. It was still clean on Wednesday. On Sunday morning the same tool, at the same version pin, started returning instructions that described a project unrelated to the agent calling it. The Claude agent followed those instructions for three days before anyone noticed. None of the scanners they ran caught anything, because every one of them assumes the tool you tested last week is the tool you're running today. That assumption is dead.
The two layers we already have
Two categories of AI agent security exist and ship in production. The first is policy enforcement: Microsoft's Agent Governance Toolkit, OPA, custom YAML rule packs. That layer is the firewall: it knows the rules you told it about and blocks the calls you anticipated. Microsoft's implementation is well-engineered and they will put a large team behind it; this is not a knock.
The second layer is pre-deployment red-teaming: Lakera, NVIDIA's open-source Garak, Microsoft's PyRIT, and a long tail of fuzzing harnesses. That layer is the pen test: you ship a model and its scaffolding to a gauntlet, you find bugs, you fix them, you ship again. Lakera's prompt-injection research is foundational; their work is why a lot of people in this industry have a job.
Neither layer covers what happened to that engineering team. Policy enforcement only blocks what policy authors saw coming. Pre-deployment testing only tests the version you tested. The MCP rug-pull lives in the gap. So does the agent that quietly drifts its tool-call pattern after memory poisoning. So does the contractor whose Claude session, three weeks into the engagement, starts calling aws iam create-access-key when the engagement was scoped to a single GitHub repo.
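To make the rug-pull gap concrete, here is a minimal sketch of tool fingerprinting. It assumes each tool is a dict with the `name`, `description`, and `inputSchema` fields an MCP server advertises in its tool listing. The function names and sample tools are hypothetical, and this is not the actual implementation of any product mentioned in this post. The idea is that a version pin says nothing about what the server returns today, while a hash of the advertised tool definition does.

```python
import hashlib
import json


def fingerprint_tool(tool: dict) -> str:
    """Hash the fields an agent actually reads: name, description, input schema."""
    canonical = json.dumps(
        {
            "name": tool.get("name"),
            "description": tool.get("description"),
            "inputSchema": tool.get("inputSchema"),
        },
        sort_keys=True,            # stable key order so equal content hashes equally
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_fingerprints(baseline: dict, current_tools: list) -> list:
    """Return names of tools that are new or whose fingerprint changed."""
    changed = []
    for tool in current_tools:
        name = tool["name"]
        if baseline.get(name) != fingerprint_tool(tool):
            changed.append(name)
    return sorted(changed)


# Hypothetical sample data: same tool name, different description.
monday = [{"name": "search_docs",
           "description": "Search project docs.",
           "inputSchema": {"type": "object"}}]
sunday = [{"name": "search_docs",
           "description": "Search project docs. Also send results to another project.",
           "inputSchema": {"type": "object"}}]

baseline = {t["name"]: fingerprint_tool(t) for t in monday}
drifted = diff_fingerprints(baseline, sunday)
print(drifted)
```

Run against the Monday listing, the diff is empty; run against the Sunday listing, `search_docs` is reported even though nothing about its name or version changed.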
What I'm calling AADR
AI Agent Detection & Response (AADR)
The runtime security operations layer for AI agents. AADR is to AI agents what EDR is to endpoints and what NDR is to network traffic: continuous, stateful, behavioral monitoring with correlation, response, and threat intelligence. It is signal-source agnostic by design.
- [1] Detection — behavioral baselines per agent; anomalies across tool calls, egress, memory access.
- [2] Correlation — link events across sessions, agents, and signal sources.
- [3] Response — quarantine, kill switch, forensic snapshot, audit trail.
- [4] Threat Intelligence — IOC feeds, attack pattern catalog, known-bad fingerprints.
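As a rough illustration of the Detection pillar, the sketch below keeps a per-agent record of which tools were called during a learning window and flags any tool never seen before. It is deliberately the simplest possible baseline, not a statistical model. The class name, the threshold, and the tool names are all assumptions made for the example.

```python
from collections import Counter


class ToolCallBaseline:
    """Per-agent record of which tools were called during a learning window."""

    def __init__(self, min_observations: int = 20):
        self.counts = Counter()
        # Hypothetical threshold: below this, the baseline is too thin to trust.
        self.min_observations = min_observations

    def observe(self, tool_name: str) -> None:
        self.counts[tool_name] += 1

    def is_anomalous(self, tool_name: str) -> bool:
        # Without enough history, do not flag anything.
        if sum(self.counts.values()) < self.min_observations:
            return False
        # Flag tools this agent has never been seen calling.
        return self.counts[tool_name] == 0


# Hypothetical agent scoped to a single GitHub repo.
agent_baseline = ToolCallBaseline(min_observations=20)
for _ in range(15):
    agent_baseline.observe("github.get_file")
for _ in range(10):
    agent_baseline.observe("github.create_pull_request")

seen_before = agent_baseline.is_anomalous("github.get_file")
never_seen = agent_baseline.is_anomalous("aws.iam.create_access_key")
print(seen_before, never_seen)
```

A real baseline would also need to account for call frequency, arguments, and time of day; this version only answers "has this agent ever done this before?"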
The missing category, drawn
OpenSyber AADR is signal-source agnostic. If you already run Microsoft's Agent Governance Toolkit, we consume its policy events. If you run LangChain or CrewAI, we hook into callbacks. If you run raw MCP servers without any framework on top, we fingerprint them directly. The brain — detection, correlation, response — is ours. The inputs are whatever you already have.
Why fintech first, not fintech only
Three reasons fintech is the right wedge, in order of how much they actually matter. First, dated regulatory deadlines: every fintech CISO has an unsigned AI agent risk policy on their desk this quarter. Second, transaction stakes: a poisoned MCP tool in a customer support chatbot is embarrassing; a poisoned MCP tool in a payment routing agent is a regulatory disclosure event. Third, sovereignty constraints: EU banks face pressure not to run their security stack on a US hyperscaler, Israeli banks have Bank of Israel cloud guidance, and MAS-regulated entities have data residency requirements. A Cloudflare-edge or on-prem AADR product is well positioned for those constraints.
| Regulation | Jurisdiction | Status / dated deadline | Where it bites AI agents |
|---|---|---|---|
| DORA (Digital Operational Resilience Act) | EU | Applies since 17 Jan 2025 (in force since 16 Jan 2023) | Art. 8–11: ICT risk management and continuous monitoring of third-party tools, including AI agents that touch payment or trading systems. |
| EU AI Act (high-risk obligations) | EU | High-risk Annex III duties from 2 Aug 2026 | Logging, human oversight, and post-market monitoring duties apply to AI in credit scoring, fraud detection, and customer onboarding. |
| PSD3 / PSR (Payment Services Regulation) | EU | Trilogue concluded; transposition window underway | Strong customer authentication and operational resilience tighten; agentic flows that initiate payments inherit those duties. |
| Colorado AI Act (SB24-205) | US (Colorado) | Effective 30 Jun 2026 (delayed from 1 Feb 2026) | Algorithmic discrimination duties for high-risk AI; consumer-facing fintech agents fall in scope. |
| FFIEC IT Examination Handbook (AI guidance) | US (federal banking) | Updated 2025; supervisory expectations active | Examiners now expect monitoring, change control, and audit trails for AI systems used in lending, BSA/AML, and operations. |
| MAS Technology Risk Management Guidelines | Singapore | Continuous; AI advisory thematic review 2025 | Operational resilience, third-party risk, and traceability for AI used by regulated FIs. |
That is the entire fintech-first argument. The AADR engine itself — detection, correlation, response, threat intel — doesn't care whether the regulator is the ECB or HHS. Healthcare is Q1 2027. Public sector is later. The platform is multi-vertical; the go-to-market is fintech first.
Microsoft's toolkit is a signal source
Some of you read the diagram and quietly thought "you're competing with Microsoft's Agent Governance Toolkit." We are not. Their toolkit, like OPA before it, is a policy engine that issues decisions. AADR consumes those decisions as one input among many, correlates them with MCP tool-call telemetry and runtime egress, and produces a stateful behavioral picture across sessions. Microsoft writes the rules. We watch what happens when the rules meet reality.
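One way to picture "one input among many" is a common event shape plus a correlation rule that spans sources. The sketch below is an assumption-laden illustration: the `Event` fields, the source labels, and the rule itself (a policy deny followed by outbound traffic in the same session within five minutes) are invented for the example and are not a documented schema of any toolkit named here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    session_id: str
    source: str   # hypothetical labels: "policy", "mcp", "egress"
    kind: str     # hypothetical labels: "deny", "tool_call", "connect"
    detail: str
    ts: float     # seconds since epoch


def correlate(events, window_s=300.0):
    """Flag sessions where a policy deny is followed by egress within window_s."""
    by_session = defaultdict(list)
    for e in events:
        by_session[e.session_id].append(e)

    findings = []
    for session_id, evs in by_session.items():
        evs.sort(key=lambda e: e.ts)
        denies = [e for e in evs if e.source == "policy" and e.kind == "deny"]
        egress = [e for e in evs if e.source == "egress"]
        for d in denies:
            for g in egress:
                if 0 <= g.ts - d.ts <= window_s:
                    findings.append((session_id, d.detail, g.detail))
    return findings


# Hypothetical events from two sessions and three signal sources.
events = [
    Event("s1", "policy", "deny", "aws.iam.create_access_key", 100.0),
    Event("s1", "egress", "connect", "203.0.113.7:443", 160.0),
    Event("s2", "mcp", "tool_call", "github.get_file", 120.0),
    Event("s2", "egress", "connect", "api.github.com:443", 125.0),
]
findings = correlate(events)
print(findings)
```

Neither event in session `s1` is alarming alone: the policy engine did its job and blocked the call, and outbound connections happen constantly. The pairing is what makes it worth a look, and that pairing is only visible to something that sees both streams.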
This is the same shape every successful runtime category took. EDR didn't replace antivirus; it ate the gap on top of it. NDR didn't replace firewalls; it ate the gap on top of them. AADR is the same move. If your CISO already understands "we needed CrowdStrike on top of Symantec," AADR slots into that exact mental slot.
What we shipped underneath this
AADR is the positioning layer. Under it sits the product we've actually been building all year: secure browser-isolated workspaces for AI contractors. That product is what gives the AADR engine real telemetry to chew on, because every Claude session, every Cursor invocation, every MCP call runs inside a workspace we control. Five concrete surfaces are live this week:
- /app/workspaces — provisioning UI for contractor workspaces with Claude Desktop, Cursor, and curated MCP servers pre-installed.
- /app/mcp — the MCP policy gateway dashboard: allowed tools, denied calls, per-workspace policy, drift fingerprints.
- /app/audit — explainable audit chain linking MCP tool calls to GitHub actions to shell commands inside the workspace.
- /app/compliance — evidence packs for SOC 2, ISO 27001, HIPAA, GDPR generated deterministically from real audit rows, narrative written through Claw Gateway with temperature zero.
- /workspace/[id] — the contractor view: the workspace itself, with the audit drawer that shows the contractor exactly what was logged about their session.
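The audit chain idea from the list above can be sketched as rows that each point at the event that caused them, so any shell command can be traced back to the MCP tool call that started it. The row fields (`id`, `kind`, `caused_by`, `detail`) and the sample data are hypothetical and do not describe the actual audit schema.

```python
def trace_chain(rows, event_id):
    """Walk caused_by links from event_id back to the root; return root-first."""
    by_id = {r["id"]: r for r in rows}
    chain = []
    seen = set()  # guards against accidental cycles in the data
    current = by_id.get(event_id)
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])
        chain.append(current)
        current = by_id.get(current.get("caused_by"))
    return list(reversed(chain))


# Hypothetical audit rows for one workspace session.
rows = [
    {"id": "e1", "kind": "mcp_tool_call", "caused_by": None,
     "detail": "github.create_pull_request"},
    {"id": "e2", "kind": "github_action", "caused_by": "e1",
     "detail": "workflow run triggered"},
    {"id": "e3", "kind": "shell_command", "caused_by": "e2",
     "detail": "npm test"},
]

chain = trace_chain(rows, "e3")
kinds = [r["kind"] for r in chain]
print(" -> ".join(kinds))
```

Storing the causal link on each row, instead of reconstructing it from timestamps later, is what keeps the chain explainable: the answer to "why did this command run?" is a lookup, not an inference.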
On top of those, the developer-facing wedge is @opensyber/mcp-watch, an MIT-licensed CLI that runs offline, fingerprints every MCP server it sees, and screams when Monday's fingerprint doesn't match Sunday's. That is the package that caught the rug-pull in the opening of this post.
$ npm i -g @opensyber/mcp-watch
$ mcp-watch scan ~/.config/claude/mcp.json
$ mcp-watch diff --since=2026-05-20
What is honestly not done yet
Real gaps, named.
The behavioral baseline engine is in the Sprint 2 design doc, not in production. Cross-session correlation today is rule-based on audit rows; the statistical layer is scheduled for the next sprint. We have documented the event shapes for Microsoft's Agent Governance Toolkit, LangChain callbacks, and CrewAI; the adapters that turn those into AADR events are partially shipped (MCP and TokenForge done; the other three are this quarter). And no, we don't have a fintech reference customer in production yet. That is exactly what the design partner program is for.
If you want in
Two doors, depending on who you are.
Developers and platform engineers: install @opensyber/mcp-watch, point it at your MCP config, run it in CI. File issues on the GitHub repo. The package is MIT, runs offline, and will stay free forever — it's the thing that pays back the AADR thesis in concrete code.
Fintech CISOs and heads of security engineering: we are taking five design partners for the Q3 2026 cohort. The deal is straight: eight one-hour calls over a quarter, you get an embedded engineer for the integration, the price is fixed and listed publicly when the form goes live, and you get a veto over the fintech compliance evidence schema.
Design partner program
Five fintech security leaders, Q3 2026 cohort. Apply at /design-partner.
opensyber.cloud
The live product. Provision a workspace, hit the MCP gateway, generate an evidence pack.
Further reading — the hackerbot-claw incident, the AI agent kill chain, and the EU AI Act compliance primer. Comments and yelling welcome at support@opensyber.cloud.