Help Center — AgentAIShield

Getting Started

From signup to your first monitored agent event

6 articles

What is AgentAIShield and what does it do?

AgentAIShield is a trust and security platform for AI agents. It monitors your AI agents in real time, detecting threats like prompt injections, PII leaks, jailbreak attempts, and hallucinations.

Every agent gets an Agent Trust Score (A+ to F) — a single letter grade combining PII safety, injection resistance, policy compliance, cost efficiency, and reliability.

Key features include: real-time threat detection, explainability dashboards, NIST AI RMF compliance tools, red team automation, human-in-the-loop approvals, and outbound webhook alerts.

How do I create my first API key?

Go to Settings → API Keys in your dashboard (or navigate to /app.html#settings).

Click Create Key, give it a name and optional expiry, then copy the key. Store it securely — it's only shown once.

Your API key starts with aais_ and is used in the X-AAIS-Key header (or Authorization: Bearer) for all SDK/ingest calls.

How do I register my first AI agent?

You can register agents two ways:

Dashboard: Go to Agents → Register Agent, fill in the name, model, framework, and description.
SDK (auto-register): When you first send an event with a new agentId, the agent is automatically created in your registry.

After registration, your agent will appear on the Agents page with a Trust Score of N/A until it processes its first events.

How do I send my first event using the SDK?

Install the SDK. Python is shown below; a Node.js SDK is also available:

pip install agentaishield

Then instrument your agent:

from aais import AaisClient

client = AaisClient(api_key="aais_your_key_here")

client.track(
    agent_id="my-support-bot",
    action_type="llm_call",
    model="gpt-4o",
    provider="openai",
    prompt_tokens=12,
    completion_tokens=10,
    latency_ms=312,
    action_status="ok"
)

Within seconds, the event appears in your dashboard under Events, and the agent's Trust Score starts to compute.

What is the Agent Trust Score and how is it calculated?

The Agent Trust Score is a letter grade (A+ to F) that summarizes your agent's overall trustworthiness. It's computed from 5 factors:

PII Safety (25%): How often PII is detected in outputs
Injection Resistance (25%): How often prompt injection attempts succeed
Policy Compliance (20%): Adherence to your defined policies
Cost Efficiency (15%): Cost per request vs. configured budget
Reliability (15%): Uptime, latency, error rate

Scores update in real time as new events are ingested. View the breakdown on the Explainability page for each agent.
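The weighting above can be sketched as a simple weighted average. This is an illustration only: it assumes each factor is already expressed as a 0-100 score, and the letter-grade cutoffs are hypothetical, since the actual mapping from score to grade is not documented here.

```python
# Weights come from the factor list above; everything else is illustrative.
WEIGHTS = {
    "pii_safety": 0.25,
    "injection_resistance": 0.25,
    "policy_compliance": 0.20,
    "cost_efficiency": 0.15,
    "reliability": 0.15,
}

# Hypothetical cutoffs: the real score-to-letter mapping is not published here.
GRADE_CUTOFFS = [(97, "A+"), (90, "A"), (80, "B"), (70, "C"), (60, "D")]


def weighted_score(factors):
    """Combine per-factor scores (each assumed to be 0-100) into one 0-100 score."""
    missing = set(WEIGHTS) - set(factors)
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a 0-100 score to a letter using the hypothetical cutoffs above."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


example = {
    "pii_safety": 100,
    "injection_resistance": 90,
    "policy_compliance": 80,
    "cost_efficiency": 70,
    "reliability": 60,
}
score = weighted_score(example)
print(round(score, 2), letter_grade(score))
```

Because PII Safety and Injection Resistance together carry half the weight, a drop in either moves the overall score more than the same drop in Cost Efficiency or Reliability.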

How do I invite team members to my organization?

Go to Settings → Team in the dashboard. Click Invite Member, enter their email address, and select a role:

Owner: Full access including billing and deletion
Admin: All features except billing
Developer: API keys, agents, events — no team settings
Viewer: Read-only access to all data
Billing: Billing settings only

The invited user receives an email with a one-click join link. They don't need an existing account.

API Reference

Endpoints, authentication, and integration guides

5 articles

How do I authenticate API requests?

All API requests require an API key passed in the X-AAIS-Key header (or Authorization: Bearer):

curl -X POST https://agentaishield.com/api/v1/ingest \
  -H "X-AAIS-Key: aais_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"agent_id":"my-agent","action_type":"llm_call"}'

API keys are created under Settings → API Keys in the dashboard. Keys can have expiry dates and are revocable at any time.
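For reference, a Python equivalent of the curl call might look like the sketch below. It uses only the standard library and builds the request without sending it. The helper name build_ingest_request is made up for this example.

```python
import json
import urllib.request

INGEST_URL = "https://agentaishield.com/api/v1/ingest"  # from the curl example


def build_ingest_request(api_key, event, use_bearer=False):
    """Build (but do not send) an authenticated ingest request."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["X-AAIS-Key"] = api_key
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(INGEST_URL, data=body, headers=headers, method="POST")


req = build_ingest_request(
    "aais_your_key_here",
    {"agent_id": "my-agent", "action_type": "llm_call"},
)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

urllib stores header names in capitalized form (X-aais-key). That is harmless, because HTTP header names are case-insensitive.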

What is the ingest endpoint and what payload does it accept?

The primary ingest endpoint is POST /api/v1/ingest. Only agent_id and action_type are required; the remaining fields are optional:

{
  "agent_id":    "my-support-bot",  // required
  "action_type": "llm_call",        // required (llm_call, tool_use, agent_response, custom, etc.)
  "model":       "gpt-4o",          // optional
  "provider":    "openai",          // optional
  "prompt_tokens":     12,          // optional
  "completion_tokens": 10,          // optional
  "latency_ms":  312,               // optional
  "action_status": "ok"             // optional (ok, error, blocked)
}

Additional optional fields: session_id, trace_id, metadata (any JSON object), tags.

For batch ingestion, use POST /api/v1/ingest/batch with an array of events (up to 100 per request).
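If you have more than 100 events to send, split them client-side before calling the batch endpoint. A minimal sketch of that chunking follows. Whether the request body is a bare array or a wrapped object is not specified above, so only the splitting is shown.

```python
from typing import Iterator, List

MAX_BATCH_SIZE = 100  # per-request cap stated above


def chunk_events(events: List[dict], size: int = MAX_BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of events, each at most size long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(events), size):
        yield events[start:start + size]


events = [{"agent_id": "my-support-bot", "action_type": "llm_call"} for _ in range(250)]
batches = list(chunk_events(events))
print([len(batch) for batch in batches])  # [100, 100, 50]
```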

How do webhooks work?

Webhooks deliver real-time event notifications to your systems when threats are detected.

Configure endpoints under Settings → Webhooks. Each delivery is HMAC-SHA256 signed using your webhook secret:

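const crypto = require('crypto');
// Sign the raw request bytes, not re-serialized JSON. req.rawBody is an assumption:
// capture it yourself, e.g. with the verify option of express.json().
const sig = Buffer.from(req.headers['x-aais-signature'] || '');
const expected = Buffer.from(
  crypto.createHmac('sha256', process.env.WEBHOOK_SECRET).update(req.rawBody).digest('hex')
);
// timingSafeEqual throws on unequal lengths, so compare lengths first.
if (sig.length !== expected.length || !crypto.timingSafeEqual(sig, expected)) {
  return res.status(401).end();
}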

Failed deliveries are retried up to 3 times with exponential backoff. Delivery logs are available in the dashboard under Settings → Webhooks.
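The same check in Python might look like the sketch below. It assumes the x-aais-signature header carries a hex-encoded HMAC-SHA256 digest, as the Node.js sample implies. Following the troubleshooting advice, it hashes the raw request bytes rather than re-serialized JSON. The secret and payload are placeholders.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True if signature_header matches the HMAC-SHA256 of raw_body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = (signature_header or "").encode("utf-8")
    # compare_digest runs in constant time to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received)


# Placeholder values for demonstration only.
secret = "example_webhook_secret"
body = b'{"event":"threat.detected"}'
good_sig = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()

print(verify_signature(body, good_sig, secret))         # True
print(verify_signature(body + b" ", good_sig, secret))  # False
```

In a web framework, pass the unparsed body bytes (for example request.get_data() in Flask) so that whitespace and key order match what was signed.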

What are the API rate limits?

Rate limits vary by plan:

Free: 50,000 events/month, 10 RPM
Starter: 500,000 events/month, 120 RPM
Business: 5,000,000 events/month, 500 RPM
Enterprise: Custom limits

When you exceed a rate limit, the API returns 429 Too Many Requests. The response includes a Retry-After header giving the number of seconds until the limit resets.
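A client can honor the Retry-After header with a small retry wrapper. The sketch below is transport-agnostic: send is a hypothetical callable that performs one request and returns a (status_code, headers) pair, and the attempt cap and default wait are arbitrary choices for this example.

```python
import time


def send_with_retry(send, max_attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status_code, headers_dict).
    Returns the last status code seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status != 429 or attempt == max_attempts:
            return status
        try:
            wait = float(headers.get("Retry-After", default_wait))
        except (TypeError, ValueError):
            wait = default_wait  # header missing or not a number of seconds
        sleep(max(wait, 0.0))


# Simulated transport: one 429 with Retry-After, then success.
responses = iter([(429, {"Retry-After": "2"}), (200, {})])
waits = []
status = send_with_retry(lambda: next(responses), sleep=waits.append)
print(status, waits)  # 200 [2.0]
```

The sleep parameter is injectable so the logic can be tested without real delays. The header lookup is case-sensitive here, so normalize header names if your HTTP client does not.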

Does AgentAIShield support OpenTelemetry?

Yes. The POST /api/v1/otel/traces endpoint accepts traces in OTLP/HTTP JSON format that follow the OpenTelemetry GenAI semantic conventions.

This means you can send traces from any OpenTelemetry-compatible SDK without vendor lock-in. The endpoint automatically maps GenAI span attributes to AAIS event fields.

Full native OTel support, including OTLP/gRPC, is planned for Q3 2026. See the roadmap for details.
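To make the payload shape concrete, here is a sketch that builds a single-span OTLP/JSON body by hand using GenAI attribute names from the OpenTelemetry semantic conventions. How AAIS maps these attributes to its own event fields is not described above. Using service.name to carry the agent name is a guess, and the scope name is a placeholder. In practice an OpenTelemetry SDK exporter would produce this for you.

```python
import json
import secrets
import time


def attr(key, value):
    """Encode one attribute in OTLP/JSON form."""
    if isinstance(value, bool):
        return {"key": key, "value": {"boolValue": value}}
    if isinstance(value, int):
        # 64-bit integers are written as decimal strings in OTLP/JSON.
        return {"key": key, "value": {"intValue": str(value)}}
    return {"key": key, "value": {"stringValue": str(value)}}


def build_trace(model, input_tokens, output_tokens, latency_ms):
    """Build an OTLP/JSON trace payload containing one GenAI chat span."""
    end_ns = time.time_ns()
    start_ns = end_ns - latency_ms * 1_000_000
    span = {
        "traceId": secrets.token_hex(16),  # 32 hex characters
        "spanId": secrets.token_hex(8),    # 16 hex characters
        "name": f"chat {model}",
        "kind": 3,  # SPAN_KIND_CLIENT
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
        "attributes": [
            attr("gen_ai.operation.name", "chat"),
            attr("gen_ai.request.model", model),
            attr("gen_ai.usage.input_tokens", input_tokens),
            attr("gen_ai.usage.output_tokens", output_tokens),
        ],
    }
    return {
        "resourceSpans": [{
            # Assumption: the agent name travels as service.name.
            "resource": {"attributes": [attr("service.name", "my-support-bot")]},
            "scopeSpans": [{"scope": {"name": "example-instrumentation"}, "spans": [span]}],
        }]
    }


payload = build_trace("gpt-4o", input_tokens=12, output_tokens=10, latency_ms=312)
print(json.dumps(payload, indent=2))
```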

Security

Threat detection, policies, and data protection

5 articles

What types of threats does AgentAIShield detect?

AgentAIShield detects 6 threat categories:

PII Extraction: SSNs, credit cards, emails, phone numbers, and healthcare data in outputs
Prompt Injection: Attempts to override system prompts or hijack agent instructions
Data Exfiltration: Suspicious patterns that suggest data is being extracted
Jailbreak: DAN attacks, roleplay exploits, and instruction override attempts
Policy Violations: Content that violates your custom-defined policies
Tool Abuse: Agents using tools in unexpected or unauthorized ways

How does the Red Team engine work?

The Red Team engine generates adversarial prompts specifically tailored to your agents' domain and capabilities.

It runs 70+ built-in attack prompts across all 6 threat categories, then generates context-aware attacks based on your agent's description, tools, and example interactions.

Results include a pass/fail for each attack, severity ratings, and a PDF report. You can also run the red team via CLI or GitHub Action on every PR.

What are Security Policies and how do I set them up?

Policies define what your agents are and aren't allowed to do. Go to Security → Policies to create one.

Each policy has rules like "block if PII score > 0.8" or "alert if latency > 5000ms". Policies are evaluated inline on every event.

Violation actions: log (record only), alert (send notification), block (reject the request).
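As a mental model for how rules and actions interact, the sketch below evaluates the two example rules against an event. It illustrates the concept and is not the platform's engine: the Rule structure, the pii_score field name, and the choice that the strictest triggered action wins are all assumptions.

```python
import operator
from dataclasses import dataclass

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}
SEVERITY = {"log": 0, "alert": 1, "block": 2}  # assumed ordering, strictest last


@dataclass(frozen=True)
class Rule:
    field: str        # event field to inspect (hypothetical names)
    op: str           # one of the keys in OPS
    threshold: float
    action: str       # "log", "alert", or "block"


def evaluate(event, rules):
    """Return the strictest action triggered by event, or None if no rule matches."""
    triggered = [
        rule.action
        for rule in rules
        if rule.field in event and OPS[rule.op](event[rule.field], rule.threshold)
    ]
    return max(triggered, key=SEVERITY.__getitem__, default=None)


rules = [
    Rule("pii_score", ">", 0.8, "block"),
    Rule("latency_ms", ">", 5000, "alert"),
]

print(evaluate({"pii_score": 0.93, "latency_ms": 6200}, rules))  # block
print(evaluate({"pii_score": 0.10, "latency_ms": 6200}, rules))  # alert
print(evaluate({"pii_score": 0.10, "latency_ms": 312}, rules))   # None
```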

How is my data stored and protected?

All data is encrypted in transit (TLS 1.3) and at rest (AES-256). We store event data in your organization's isolated partition.

We follow GDPR and CCPA requirements. You can request a data export or account deletion at any time from Settings → Privacy & Data.

Logs are retained for 30 days on Free, 90 days on Starter, 1 year on Business, and custom durations on Enterprise. We're working toward SOC 2 Type I certification in Q3 2026.

What is TrustShield and how accurate is it?

TrustShield is our hallucination detection system. It analyzes agent outputs in real time to detect fabrications, contradictions, overconfident claims, and factual inconsistencies.

It operates in source-free mode — no external knowledge base required. Instead, it uses statistical NLP patterns and logical consistency checks.

Use the POST /api/v1/trustshield/verify endpoint to check specific outputs on-demand, or enable it for automatic analysis on all ingested events.

Billing

Plans, pricing, and subscription management

4 articles

What plans does AgentAIShield offer?

Free: 3 agents, 50K events/month, 30-day retention. No credit card required.
Starter ($99/month): Unlimited agents, 500K events/month, all security features, email support.
Business ($299/month): 5M events/month, red team engine, compliance reports, priority support (4h SLA).
Enterprise (custom): Unlimited everything, dedicated CSM, custom SLA, on-premise option, SAML SSO.

Annual billing saves 20% on Starter and Business plans.

How do I upgrade or downgrade my plan?

Go to Settings → Billing in the dashboard. Click Upgrade Plan to see available plans.

Upgrades take effect immediately. Downgrades take effect at the end of your current billing period. Unused credit is prorated.

For Enterprise, contact [email protected].

What happens if I exceed my event limit?

At 80% of your monthly event limit, you'll receive an email alert. At 100%, new events are queued but not processed until you upgrade or the billing period resets.

No data is lost — events are logged but threat analysis is paused. Upgrade at any time to immediately resume processing.
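If you track your own event counts, you can mirror these thresholds client-side to warn before analysis pauses. The function below is a small sketch with hypothetical names; the 80% and 100% thresholds are the ones described above.

```python
def quota_status(events_used, monthly_limit):
    """Classify usage as "ok", "warning" (80% or more), or "paused" (100% or more)."""
    if monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    # Integer arithmetic avoids floating-point surprises at the exact thresholds.
    if events_used >= monthly_limit:
        return "paused"
    if events_used * 100 >= monthly_limit * 80:
        return "warning"
    return "ok"


STARTER_LIMIT = 500_000  # Starter plan monthly event limit

print(quota_status(120_000, STARTER_LIMIT))  # ok
print(quota_status(400_000, STARTER_LIMIT))  # warning
print(quota_status(500_000, STARTER_LIMIT))  # paused
```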

How do I cancel my subscription?

Go to Settings → Billing → Cancel Plan. Your subscription remains active until the end of the current billing period.

After cancellation, your account is downgraded to Free. Data is retained for 30 days before deletion.

We don't offer refunds for partial months, but we're happy to discuss exceptions for extenuating circumstances. Email [email protected].

Troubleshooting

Common issues and how to resolve them

5 articles

My events are not appearing in the dashboard

Check these in order:

Verify your API key is correct and has not expired (Settings → API Keys)
Confirm the X-AAIS-Key header is set (or use Authorization: Bearer aais_xxx)
Check the API response for error messages — a 200 OK confirms ingestion
Events may take up to 30 seconds to appear in the dashboard
Confirm you haven't exceeded your monthly event limit

If issues persist, contact [email protected] with your org ID and example request/response.

My Trust Score shows N/A or hasn't updated

The Trust Score requires at least 5 events to compute a meaningful score. N/A means the agent hasn't processed enough events yet.

If you have 5+ events and still see N/A, try refreshing the dashboard. Trust scores update asynchronously and may lag by 1–2 minutes during high load.

I'm getting 401 Unauthorized errors

Common causes:

API key has expired — create a new one under Settings → API Keys
Using wrong auth header — use X-AAIS-Key: aais_xxx or Authorization: Bearer aais_xxx for ingest endpoints
API key was revoked — check the keys list for a "Revoked" status
Whitespace or special characters in the key (copy it fresh from the dashboard)
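A quick local check can catch the formatting problems in the list above before you send a request. This sketch only inspects the string. It cannot tell whether a key is expired or revoked, and it assumes nothing about key format beyond the aais_ prefix.

```python
def check_api_key(raw_key):
    """Return a list of formatting problems found in raw_key (empty if none)."""
    problems = []
    key = raw_key.strip()
    if key != raw_key:
        problems.append("leading or trailing whitespace")
    if not key.startswith("aais_"):
        problems.append("missing aais_ prefix")
    if any(ch.isspace() for ch in key):
        problems.append("whitespace inside key")
    if not key.isascii() or not key.isprintable():
        problems.append("non-ASCII or non-printable characters")
    return problems


print(check_api_key("aais_abc123"))     # []
print(check_api_key(" aais_abc123\n"))  # ['leading or trailing whitespace']
print(check_api_key("abc123"))          # ['missing aais_ prefix']
```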

Webhooks are not being delivered to my endpoint

Check the delivery logs under Settings → Webhooks. Common issues:

Endpoint URL is not publicly accessible (localhost won't work — use ngrok for local testing)
Endpoint returns non-2xx status — AAIS retries up to 3 times with backoff
HMAC signature verification is failing — ensure you're verifying the raw request body, not the parsed JSON
Firewall blocking inbound requests from our IP ranges

How do I export my data or delete my account?

Go to Settings → Privacy & Data in the dashboard.

Export data: Download all your events, agents, and settings as a JSON archive. Takes up to 10 minutes for large datasets.
Delete account: Permanently removes all data after a 7-day grace period. This action cannot be undone.

For GDPR data subject requests, email [email protected]. We respond within 72 hours.