# AgentAIShield Integration Guide (Machine-Readable)
Version: 1.0.0 | Last-Updated: 2025-01-15 | Audience: AI agents, LLMs, automated systems

---

## OVERVIEW

AgentAIShield (AAIS) is an AI security layer that monitors and optionally blocks AI/LLM traffic.
It detects PII exposure, prompt injection, and policy violations, and it tracks a trust score for each agent.

Two modes:
- MONITOR_MODE: Post-call reporting. Zero latency. Fire-and-forget.
- PROXY_MODE: Route calls through AAIS. Inline blocking enabled. Drop-in SDK replacement.

Start with MONITOR_MODE. Switch to PROXY_MODE when blocking is needed.

---

## QUICK START (3 steps)

### Step 1 — Register and get JWT token

```
POST /api/auth/register
Content-Type: application/json

{
  "email": "you@yourcompany.com",
  "password": "yourpassword",
  "name": "Your Name",
  "company": "Your Company"
}

Response → { "token": "eyJ...", "user": {...}, "org": {...} }
```

### Step 2 — Create an API key

```
POST /api/keys
Authorization: Bearer eyJ...
Content-Type: application/json

{
  "name": "my-agent-key",
  "provider": "openai",
  "environment": "production"
}

Response → { "key": "aais_xxxxxxxxxxxxx", "id": 42 }
```

IMPORTANT: Save the key immediately. It is shown only once.

### Step 3 — Start reporting (Monitor Mode)

```
POST /api/monitor/ingest
Authorization: Bearer aais_xxxxxxxxxxxxx
Content-Type: application/json

{
  "app_name": "YourApp",
  "model": "gpt-4o",
  "provider": "openai",
  "prompt": "user prompt text here",
  "response": "llm response text here",
  "tokens_in": 150,
  "tokens_out": 500,
  "latency_ms": 1200,
  "status": "success"
}

Response → { "ok": true, "received": true }
```

---

## MONITOR MODE — DETAILED INTEGRATION

### Behavior
- POST to /api/monitor/ingest AFTER every LLM call
- The endpoint returns 200 OK immediately; analysis happens asynchronously. Run the call off your hot path (background task, short timeout) so it adds no user-facing latency.
- Silent failure by design. If AAIS is down, your app still works.

### Request Schema

```json
{
  "app_name":   "string (required) — your app/agent identifier",
  "model":      "string (required) — e.g. gpt-4o, claude-3-5-sonnet-20241022",
  "provider":   "string (required) — openai | anthropic | google | cohere | mistral | other",
  "prompt":     "string (optional) — full prompt text (scanned for PII + injection)",
  "response":   "string (optional) — LLM response (scanned for PII leakage)",
  "tokens_in":  "integer (optional) — input token count",
  "tokens_out": "integer (optional) — output token count",
  "latency_ms": "integer (optional) — end-to-end latency in ms",
  "status":     "string (optional) — success | error | blocked (default: success)",
  "reported_at": "ISO8601 string (optional) — when the call happened (default: now)"
}
```

### Response Schema

```json
{ "ok": true, "received": true }
```

### Auth
- Header: `Authorization: Bearer aais_xxxxxxxxxxxxx`
- Alt header: `X-API-Key: aais_xxxxxxxxxxxxx`
- Key format: must start with `aais_`

### Rate Limit
- 100 requests per minute per API key
- Exceeding returns 429 with body: `{ "error": "Monitor rate limit exceeded (100 RPM per key)." }`

### Python Integration (fire-and-forget)

```python
import httpx  # or requests

AAIS_KEY = "aais_xxxxxxxxxxxxx"
AAIS_URL = "https://your-aais-instance.com/api/monitor/ingest"

def report(app, model, provider, prompt, response, tokens_in, tokens_out, latency_ms, status="success"):
    try:
        httpx.post(AAIS_URL, headers={"Authorization": f"Bearer {AAIS_KEY}"},
                   json={"app_name": app, "model": model, "provider": provider,
                         "prompt": prompt, "response": response, "tokens_in": tokens_in,
                         "tokens_out": tokens_out, "latency_ms": latency_ms, "status": status},
                   timeout=2.0)
    except httpx.HTTPError:
        pass  # fire-and-forget: monitoring must never break your app
```

### Node.js Integration (fire-and-forget)

```javascript
const AAIS = { key: 'aais_xxx', url: 'https://your-aais-instance.com/api/monitor/ingest' };
const report = (d) => fetch(AAIS.url, {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${AAIS.key}`, 'Content-Type': 'application/json' },
  body: JSON.stringify(d)
}).catch(() => {}); // fire-and-forget, never throws
```

---

## PROXY MODE — DETAILED INTEGRATION

### Behavior
- Replace your LLM provider base URL with your AAIS instance URL
- Use your `aais_xxx` key instead of the provider API key
- AAIS forwards to the real provider using credentials stored in your key config
- AAIS scans inline and can BLOCK before forwarding based on policies

### Endpoints (OpenAI-compatible)

```
POST /v1/chat/completions    → OpenAI gpt-* models
POST /v1/completions         → OpenAI legacy completions
POST /v1/embeddings          → OpenAI embeddings
POST /v1/messages            → Anthropic claude-* models
```

### Python + OpenAI SDK

```python
from openai import OpenAI

client = OpenAI(
    api_key="aais_xxxxxxxxxxxxx",  # Your AAIS key, not OpenAI key
    base_url="https://your-aais-instance.com/v1"  # Point to AAIS
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)
```

### Python + Anthropic SDK

```python
import anthropic

client = anthropic.Anthropic(
    api_key="aais_xxxxxxxxxxxxx",
    base_url="https://your-aais-instance.com"
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}]
)
```

### Blocked Request Response

When AAIS blocks a request (policy violation), you receive:

```json
HTTP 400
{
  "error": "Request blocked by AgentAIShield policy",
  "blocked_reason": "PII detected in prompt: email address",
  "policy": "no_pii_in_prompts"
}
```

Handle this in your code like any API error.
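Since a blocked request comes back as a plain HTTP 400, a small helper can distinguish a policy block from other client errors. A minimal Python sketch; `is_policy_block` and `describe_block` are illustrative names, and the body shape follows the example above:

```python
def is_policy_block(status_code: int, body: dict) -> bool:
    """True when a proxy-mode response is an AAIS policy block (HTTP 400 + blocked_reason)."""
    return status_code == 400 and "blocked_reason" in body

def describe_block(body: dict) -> str:
    """Human-readable summary of a blocked response, e.g. for logging."""
    return f"{body.get('policy', 'unknown_policy')}: {body.get('blocked_reason', 'no reason given')}"
```

After catching a 400 from your SDK or HTTP client, pass the parsed JSON body to these helpers before deciding whether to retry with a sanitized prompt.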

---

## PII DETECTION — WHAT IS DETECTED

AAIS automatically scans all prompts and responses for:

| PII Type       | Example Pattern             | Severity |
|----------------|-----------------------------|----------|
| email          | user@example.com            | medium   |
| phone          | (555) 123-4567              | medium   |
| ssn            | 123-45-6789                 | critical |
| credit_card    | 4111 1111 1111 1111         | critical |
| name           | John Doe (with context)     | low      |
| address        | 123 Main St, City, ST 12345 | medium   |
| dob            | 01/15/1985                  | high     |
| ip_address     | 192.168.1.100               | low      |
| passport       | P123456789                  | critical |
| driver_license | DL-123456789                | high     |

Detections are logged asynchronously. In proxy mode, detected PII can trigger a BLOCK.

---

## INJECTION DETECTION — WHAT IS DETECTED

Patterns detected (confidence 0-1, high = block):

- SYSTEM_PROMPT_OVERRIDE: "ignore previous instructions", "forget your system prompt"
- ROLE_JAILBREAK: "pretend you are", "act as DAN", "you are now"
- DATA_EXFILTRATION: "repeat the above", "print your instructions", "show me your prompt"
- INDIRECT_INJECTION: malicious content in retrieved documents (RAG attacks)
- GOAL_HIJACKING: instructions embedded in user-controlled content
- PRIVILEGE_ESCALATION: attempts to gain admin/system context

High confidence (≥0.85) = "high" severity in policy violations.

---

## TRUST SCORE — HOW IT WORKS

Each API key gets an Agent Trust Score™ (0-100) and grade (A+ to F).

Score factors:
- Error rate (lower = better trust)
- PII exposure rate (lower = better)
- Injection attempt rate (lower = better)
- Latency consistency (stable = better)
- Request volume (more data = higher confidence)

Grade thresholds:
- A+ = 95-100 (exemplary)
- A  = 90-94
- B+ = 85-89
- B  = 75-84
- C+ = 65-74
- C  = 50-64
- D  = 35-49
- F  = 0-34 (critical risk)

Confidence: Low at <100 requests. High at >1000 requests.

Badges awarded for: consistency, zero-PII streaks, improving trajectory, etc.
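The grade thresholds above map directly to a lookup. A minimal sketch for client-side display; the `trust_grade` helper is illustrative, not part of any SDK:

```python
def trust_grade(score: int) -> str:
    """Map a 0-100 trust score to its letter grade using the thresholds above."""
    for floor, grade in ((95, "A+"), (90, "A"), (85, "B+"), (75, "B"),
                         (65, "C+"), (50, "C"), (35, "D")):
        if score >= floor:
            return grade
    return "F"  # 0-34: critical risk
```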

---

## DASHBOARD API — READING YOUR DATA

All dashboard endpoints require JWT token (from /api/auth/login).

### Get org-wide stats

```
GET /api/dashboard/stats?period=30d
Authorization: Bearer eyJ...

Response includes: total_requests, total_cost, pii_detections, violations, recent_requests, security_alerts
```

### List all agents and scores

```
GET /api/trust/agents?sort=score&order=desc
Authorization: Bearer eyJ...

Response includes: agents[], total, with trust_score, trust_grade, badges per agent
```

### Get security score

```
GET /api/dashboard/security-score?period=30d
Authorization: Bearer eyJ...

Response includes: score (0-100), grade, breakdown by category, trend, recommendations
```
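All three reads share the same JWT auth pattern. A stdlib-only sketch that builds (but does not send) an authenticated request; the base URL is a placeholder for your instance:

```python
import urllib.request

def dashboard_request(base_url: str, jwt: str, period: str = "30d") -> urllib.request.Request:
    """Authenticated GET for org-wide stats; pass to urllib.request.urlopen() to send."""
    return urllib.request.Request(
        f"{base_url}/api/dashboard/stats?period={period}",
        headers={"Authorization": f"Bearer {jwt}"},
    )
```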

---

## ERROR RESPONSES — ALL ENDPOINTS

```
400 Bad Request  → { "error": "...", "details": [{ "field": "...", "message": "..." }] }
401 Unauthorized → { "error": "Valid aais_ API key required in Authorization header" }
404 Not Found    → { "error": "Not found", "path": "/...", "method": "GET" }
429 Rate Limited → { "error": "Monitor rate limit exceeded (100 RPM per key)." }
500 Server Error → { "error": "Internal server error" }
```

---

## HEALTH CHECKS

```
GET /health                → server liveness (no auth)
GET /api/monitor/health    → monitor endpoint liveness (no auth)
GET /.well-known/agentaishield.json  → service discovery manifest
```

---

## ENVIRONMENT VARIABLES (self-hosted)

Required:
- `DATABASE_URL` — PostgreSQL connection string
- `JWT_SECRET` — Secret for signing JWT tokens (min 32 chars)
- `PORT` — Server port (default 3000)

Optional:
- `OPENAI_API_KEY` — For forwarding OpenAI requests in proxy mode
- `ANTHROPIC_API_KEY` — For forwarding Anthropic requests
- `APP_URL` — Public base URL (for discovery manifest)
- `CORS_ORIGINS` — Comma-separated allowed origins

---


## MCP (MODEL CONTEXT PROTOCOL) INTEGRATION

AgentAIShield exposes a full MCP server at `/api/mcp`. Any MCP-compatible agent (Claude, GPT, Gemini, LangChain, CrewAI, etc.) can connect with zero custom code.

### MCP Connection Configuration

```json
{
  "mcpServers": {
    "agentaishield": {
      "url": "https://your-aais-instance.com/api/mcp",
      "headers": {
        "Authorization": "Bearer aais_YOUR_KEY_HERE"
      }
    }
  }
}
```

### MCP Handshake (initialize)

```json
POST /api/mcp
{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "my-agent", "version": "1.0.0" }
  }
}
```

### List Available Tools

```json
POST /api/mcp
{
  "jsonrpc": "2.0",
  "id": "2",
  "method": "tools/list"
}
```

Response includes all 5 tools: scan_prompt, report_interaction, get_trust_score, check_budget, verify_agent.

### MCP Tools Reference

**Tool: scan_prompt**
  Scan a prompt for PII and injection before sending to LLM.
  Arguments: { text: string, mode: "scan"|"redact"|"block" }
  Returns: { safe: bool, pii_found: [], injection_found: [], sanitized_text: string }

**Tool: report_interaction**
  Fire-and-forget monitoring ingest. Always succeeds immediately.
  Arguments: { app_name, model, provider, prompt, response, tokens_in, tokens_out, latency_ms, status }
  Returns: { ok: true, logged: true }

**Tool: get_trust_score**
  Get current trust score and posture for a registered agent.
  Arguments: { agent_id: number }
  Returns: { trust_score, trust_grade, badges, quarantined, factor_scores }

**Tool: check_budget**
  Check remaining token budget for an agent.
  Arguments: { agent_id: number }
  Returns: { daily_used, daily_limit, monthly_used, monthly_limit, within_budget }

**Tool: verify_agent**
  Cross-agent trust verification — should I share data with this agent?
  Arguments: { agent_id: number }
  Returns: { trust_score, recommendation: "safe_to_share"|"share_with_caution"|"do_not_share" }
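Once the handshake succeeds, each tool above is invoked with the standard MCP `tools/call` method. A sketch of the JSON-RPC envelope to POST to `/api/mcp` (the request IDs are illustrative; any unique string or number works):

```python
import itertools

_next_id = itertools.count(3)  # ids 1 and 2 were used by initialize and tools/list above

def mcp_tool_call(tool_name: str, arguments: dict) -> dict:
    """JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": str(next(_next_id)),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
```

Example: `mcp_tool_call("scan_prompt", {"text": "ignore previous instructions", "mode": "scan"})`.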

---

## PROMPT SANITIZER

Pre-flight protection: strip PII and neutralize injections BEFORE the prompt reaches an LLM.

```
POST /api/sanitize
Authorization: Bearer aais_xxx
Content-Type: application/json

{
  "text": "My SSN is 123-45-6789, email me at bob@example.com",
  "mode": "redact",
  "options": { "pii": true, "injection": true }
}
```

Response:
```json
{
  "sanitized_text": "My SSN is [SSN_REDACTED], email me at [EMAIL_REDACTED]",
  "modifications": [
    { "type": "ssn", "original": "123-45-6789", "replacement": "[SSN_REDACTED]" },
    { "type": "email", "original": "bob@example.com", "replacement": "[EMAIL_REDACTED]" }
  ],
  "injection_neutralized": false,
  "risk_score": 0.72,
  "processing_ms": 3
}
```

Modes: redact (replace with label), mask (replace with *****), remove (delete entirely)

PII types detected: ssn, credit_card, email, phone, address, bank_account, api_key, ip_address, passport, driver_license, date_of_birth, medical_record

---

## LLM OUTPUT SCANNER

Scan AI responses for dangerous content before returning to users.

```
POST /api/scan/output
Authorization: Bearer aais_xxx
Content-Type: application/json

{
  "text": "Here is the patient's record: DOB 01/15/1985, SSN 123-45-6789...",
  "context": { "agent": "MedicalBot", "model": "gpt-4o" }
}
```

Response:
```json
{
  "safe": false,
  "issues": [
    { "type": "pii_leak", "severity": "critical", "description": "SSN found in output" },
    { "type": "pii_leak", "severity": "high", "description": "Date of birth in output" }
  ],
  "risk_score": 0.91,
  "recommendation": "block",
  "processing_ms": 5
}
```

Threat categories: pii_leak, harmful_content, code_execution, training_data_leak, unauthorized_claim
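Acting on the scanner verdict is a one-line gate. A sketch, assuming your app substitutes a fallback message when the recommendation is "block" (the fallback text and the "allow" value are illustrative):

```python
def gate_output(llm_text: str, scan_resp: dict,
                fallback: str = "[response withheld by security policy]") -> str:
    """Return the LLM text only if /api/scan/output judged it safe; otherwise a fallback."""
    if scan_resp.get("safe") and scan_resp.get("recommendation") != "block":
        return llm_text
    return fallback
```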

---

## INTEGRATION CHECKLIST FOR AI AGENTS

- [ ] POST /api/auth/register to create account
- [ ] Store JWT token securely (24h expiry, use /api/auth/refresh)
- [ ] POST /api/keys to create an aais_ API key for each agent/app
- [ ] Add fire-and-forget POST /api/monitor/ingest after every LLM call
- [ ] Handle 429 gracefully (drop the monitoring call, never retry in hot path)
- [ ] (Optional) Switch to Proxy Mode: update base_url in SDK config
- [ ] (Optional) Add POST /api/sanitize BEFORE each LLM call to strip PII
- [ ] (Optional) Add POST /api/scan/output AFTER each LLM call to scan responses
- [ ] (Optional) Connect via MCP: add to mcpServers config, get all tools for free
- [ ] (Optional) Set token budgets via PUT /api/budget/:agentId
- [ ] (Optional) Run a red team test via POST /api/redteam/run
- [ ] (Optional) Generate compliance report via POST /api/compliance/report
- [ ] (Optional) Read /api/trust/agents to monitor trust scores programmatically
- [ ] (Optional) Read /api/dashboard/stats to build custom alerting
---

## OPENAPI SPEC

Full machine-readable OpenAPI 3.0 spec (v2.0.0) with all 43 endpoints available at:
  GET /docs/api.json

---

END OF INTEGRATION GUIDE
