Phase 1–4: 17 New Security Modules
The largest security expansion in AgentAIShield history — four phases of new detection, enforcement, and governance capabilities. AAIS now covers 27 security modules with 100% OWASP Top 10 for LLM Applications coverage and 1,000+ automated tests.
Phase 1 — Scanning
- Secret Scanner — 15 credential patterns (GitHub PAT, AWS, OpenAI, Anthropic, Stripe, JWT, PEM, DB strings, Slack, GCP). Auto-redacts before secrets leave the pipeline.
- Tool Call Scanner — Intercepts tool_use / function_call blocks and scans arguments for URL injection, SSRF, and shell injection.
- Content Scan API (POST /v1/scan/content) — Scan external content (email, Slack, web, docs) for injection, PII, and secrets before feeding to agents.
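To make the Secret Scanner's detect-and-redact idea concrete, here is a minimal sketch. It is not the product's implementation: the three regular expressions are illustrative stand-ins based on publicly known credential formats, and the function name redact_secrets and the [REDACTED:...] placeholder are assumptions made for this example.

```python
import re

# Illustrative patterns only (assumption): a small subset of well-known
# credential formats, not the scanner's actual 15-pattern rule set.
PATTERNS = {
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact_secrets(text):
    """Replace every match with a placeholder and report which patterns fired."""
    findings = []
    for name, pattern in PATTERNS.items():
        # subn returns the rewritten text and the number of replacements made.
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            findings.append(name)
    return text, findings
```

A caller would run redact_secrets on a prompt or tool argument before it leaves the pipeline and forward only the redacted text, keeping the findings list for alerting.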
Phase 2 — Agent Control
- Scope Enforcer — Per-agent tool and domain allowlists/blocklists with enforce/audit/log modes. API: GET/PUT /api/v1/agents/:id/scope.
- RAG Scanner (POST /v1/scan/rag) — Scans retrieval chunks for prompt injection and knowledge base poisoning before they enter LLM context.
- Multi-Agent Trust — Inter-agent trust classification (trusted/verified/untrusted/blocked). API: GET/PUT /api/v1/agents/:id/relationships.
- Session Guard — Detects session anomalies: IP change, user-agent swap, burst attacks, concurrent session collisions.
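The Scope Enforcer's allowlist/blocklist check can be sketched as follows. This is a hedged illustration, not the actual enforcement logic: the Scope dataclass, its field names, and the rule that only "enforce" mode blocks a call (while "audit" and "log" merely record the violation) are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    # Hypothetical shape of a per-agent scope (assumption, not the API schema).
    allowed_tools: set = field(default_factory=set)
    blocked_tools: set = field(default_factory=set)
    mode: str = "enforce"  # one of "enforce", "audit", "log"


def check_tool_call(scope, tool):
    """Decide whether a tool call is in scope and whether it may proceed."""
    # A call violates scope if it is blocklisted, or if an allowlist
    # exists and the tool is not on it.
    violation = tool in scope.blocked_tools or (
        bool(scope.allowed_tools) and tool not in scope.allowed_tools
    )
    # Assumption: only "enforce" mode actually stops the call.
    allowed = not (violation and scope.mode == "enforce")
    return {"tool": tool, "violation": violation, "allowed": allowed}
```

Running a new scope in a non-blocking mode first shows which calls would be denied before switching it to "enforce".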
Phase 3 — Advanced Detection
- Extraction Detector — Identifies model extraction and probing attempts via semantic similarity analysis across session queries.
- Drift Detector — Tracks slow behavioral drift over weeks. API: GET /api/v1/agents/:id/drift.
- Skill Scanner (POST /v1/skill/scan) — Supply chain security scanning for third-party skills against 14 red-flag rules.
- Output Validator (POST /v1/scan/output-injection) — Catches downstream injection (SQL, shell, HTML/XSS, JSON, Python) in agent outputs.
- Hallucination Detector — Flags ungrounded claims, confidence inflation, and fabricated citations in agent outputs.
- MCP Scanner — Audits MCP tool definitions for description injection, schema manipulation, and SSRF risks.
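The idea behind the Extraction Detector, flagging a session whose queries are unusually similar to each other, can be illustrated with a small sketch. Note the substitution: this example uses Jaccard token overlap as a simple stand-in for semantic (embedding-based) similarity, and the threshold and min_pairs values are arbitrary assumptions rather than product defaults.

```python
def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a crude proxy for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def looks_like_probing(queries, threshold=0.6, min_pairs=3):
    """Flag a session when enough query pairs are near-duplicates of each other."""
    similar_pairs = 0
    for i in range(len(queries)):
        for j in range(i + 1, len(queries)):
            if jaccard(queries[i], queries[j]) >= threshold:
                similar_pairs += 1
    return similar_pairs >= min_pairs
```

A production detector would likely compare embeddings instead of raw tokens, but the control flow (pairwise similarity within a session, then a threshold) stays the same.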
Phase 4 — Governance
- Shadow AI Discovery (POST /v1/scan/shadow-report) — Detects unauthorized LLM usage within your organization via network traffic analysis.
- Cascade Detector — Models multi-agent blast radius for any agent compromise. APIs: POST /api/v1/threats/analyze, GET /api/v1/threats/bulletins.
- Policy-as-Code — JSON policy rules with enforce/audit/log modes, version-controlled across your agent fleet. Full CRUD: GET/POST/PUT/DELETE /api/v1/policies.
- Identity Anchoring — Persistent cryptographic agent identity with accumulated trust scores. APIs: GET /api/v1/identities, POST /api/v1/identities/:id/attest.
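One way to picture the Cascade Detector's blast-radius modeling is as reachability in a directed graph of agents. The sketch below assumes that approach; the edges mapping (agent name to the agents that consume its output), the agent names, and the blast_radius function are all hypothetical and are not taken from the product.

```python
from collections import deque


def blast_radius(edges, compromised):
    """Return every agent reachable downstream of a compromised agent (BFS)."""
    seen = {compromised}
    queue = deque([compromised])
    while queue:
        agent = queue.popleft()
        for downstream in edges.get(agent, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    # The compromised agent itself is not part of its own blast radius.
    seen.discard(compromised)
    return seen
```

For a fleet where a planner feeds a researcher and a coder, and the coder feeds a deployer, compromising the planner puts all three downstream agents at risk, while compromising the coder affects only the deployer.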
Platform Stats
- Security modules: 27 total (up from 9)
- OWASP Top 10 for LLM Applications: 100% covered
- Automated test suite: 1,000+ tests
- API endpoints: 14 new endpoints across Scanning, Trust & Identity, Policy, and Threat Intelligence categories
Private Beta
AgentAIShield is in private beta. Sign up with an invite code to get early access and help shape the product.
What's Included
- TrustShield — Real-time hallucination detection with semantic verification scoring.
- Universal Ingest API — Provider-agnostic event ingestion for OpenAI, Anthropic, Gemini, and 10+ endpoints.
- Agent Trust Score — Continuous trust scoring with letter grades (A+ to F).
- PII Detection & Redaction — Detect and redact 40+ entity types in real time.
- Prompt Injection Defense — Classify and block injection attempts with under 1 ms of overhead.
- Multi-provider Proxy — Drop-in proxy for OpenAI, Anthropic, Gemini, Mistral, and more.
- Dashboard — Live request stream, trust scores, cost tracking, and security alerts.
- Google & Apple Sign-In — Quick signup with your existing accounts.
Coming Soon
- Skill Scanner API — automated security vetting for AI agent plugins.
- Behavioral analytics — cross-session agent trust tracking.
- Cost arbitrage engine — intelligent model routing to reduce LLM spend.
- Team management & RBAC.