Content Moderation & PII Guardrails

Enterprise data-safety platform: dual-path PII detection (Microsoft Presidio + LLM judge), Lakera-style bidirectional guardrails, and SOC 2 / GDPR / HIPAA-ready audit trails.

99.94%

PII recall

<35 ms

P95 detection latency

60+

PII entity types

compliance frameworks

Composite scenario

Digital-bank guardrail review profile for AI support

This composite profile models 14 branches and 52 business scenarios where LLM agents must safely handle national IDs, card numbers, addresses, and health insurance IDs. The SkyAIApp guardrail replay benchmark shows:

data-egress violations

27 days

from kickoff to PIA approval

41%

drop in manual review load

“We used to keep 12 engineers around two homegrown DLP stacks. SkyAIApp folds Presidio + LLM judge + audit trail into one SDK — and policies are versioned and rolled out gradually.”

— Composite profile, Head of AI Platform

Challenge

Four unavoidable compliance gaps when shipping AI to production

Regex-only PII misses

Pure-regex pipelines miss 30%+ of non-Latin names, mixed addresses, and insurance IDs.
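The gap is easy to reproduce. The sketch below is illustrative only: the two patterns are hypothetical stand-ins, not the detectors that Presidio or SkyAIApp ship. It compares an ASCII-only name pattern with a Unicode-aware one and shows that even the Unicode-aware pattern misses names written in scripts without letter case.

```typescript
// Illustrative patterns only; not the detectors SkyAIApp or Presidio ship.
const asciiName = /\b[A-Z][a-z]+ [A-Z][a-z]+\b/;

// Built with the RegExp constructor so the Unicode property escapes are
// resolved at runtime, independent of the TypeScript compile target.
const unicodeName = new RegExp("\\p{Lu}\\p{Ll}+ \\p{Lu}\\p{Ll}+", "u");

const samples = {
  latin: "John Smith",
  accented: "Jos\u00e9 \u00c1lvarez", // José Álvarez
  cjk: "\u738b\u5c0f\u660e",          // a three-character Chinese name
};

// The ASCII pattern only recognizes unaccented Latin names.
const asciiHits = {
  latin: asciiName.test(samples.latin),       // true
  accented: asciiName.test(samples.accented), // false
  cjk: asciiName.test(samples.cjk),           // false
};

// The Unicode-aware pattern recovers accented names, but CJK characters
// have no upper/lower case, so a case-based pattern still misses them.
const unicodeHits = {
  latin: unicodeName.test(samples.latin),       // true
  accented: unicodeName.test(samples.accented), // true
  cjk: unicodeName.test(samples.cjk),           // false
};
```

Misses of this kind are what a named-entity model or a second-opinion judge is meant to catch.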

Prompt injection & privilege escalation

Agents get tricked into calling send_email, execute_sql, or unbounded transfers. WAFs don't see semantic attacks.

Concurrent multi-jurisdiction compliance

GDPR, CCPA, PIPL, HIPAA, and PCI-DSS each have their own retention and response windows. Hand-rolled glue rots fast.

Audit trail can't be trusted

Replaying an incident can't prove which model and policy version were active, which slows evidence preparation for security audits.

System architecture

Input → detection → policy engine → redaction & tool access → audited output. Every hop writes an immutable trace, replayable for 90 days.
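As a rough mental model, the flow can be pictured as a chain of stages that each transform the payload and append one frozen trace record. This is a minimal sketch, not SkyAIApp's implementation: `Stage`, `TraceEntry`, `runPipeline`, and the two toy stages are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: each stage transforms the text and records a trace entry.
type TraceEntry = { readonly stage: string; readonly note: string };
type StageResult = { text: string; note: string };
type Stage = { name: string; run: (text: string) => StageResult };

function runPipeline(
  input: string,
  stages: Stage[],
): { output: string; trace: readonly TraceEntry[] } {
  const trace: TraceEntry[] = [];
  let text = input;
  for (const stage of stages) {
    const result = stage.run(text);
    text = result.text;
    // Freeze each entry so later stages cannot rewrite earlier records.
    trace.push(Object.freeze({ stage: stage.name, note: result.note }));
  }
  return { output: text, trace: Object.freeze(trace) };
}

// Toy stages standing in for detection and redaction.
const ssnLike = /\d{3}-\d{2}-\d{4}/;
const demoStages: Stage[] = [
  {
    name: "detection",
    run: (t) => ({ text: t, note: ssnLike.test(t) ? "ssn-like match" : "clean" }),
  },
  {
    name: "redaction",
    run: (t) => ({ text: t.replace(/\d{3}-\d{2}-\d{4}/g, "[SSN]"), note: "pattern masked" }),
  },
];

const demo = runPipeline("My SSN is 123-45-6789", demoStages);
// demo.output is "My SSN is [SSN]"; demo.trace holds one entry per stage.
```

In-process freezing only illustrates the idea; durable immutability would come from the storage layer.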

Six-layer guardrail stack

Prompt-injection defense

Lakera-style classifier · prompt-shield v3

A classifier with under 8 ms latency scores every incoming message; messages that cross the threshold are routed to a read-only model or queued for human review.
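One way to picture the routing step is a two-tier threshold on the classifier score. The sketch below is an assumption, not the product's logic: the `Route` names, the two-threshold design, the 0.9 review cutoff, and the fail-closed handling of invalid scores are all hypothetical.

```typescript
// Hypothetical routing on an injection score in [0, 1].
type Route = "primary" | "read-only-model" | "human-review";

interface RoutingThresholds {
  readOnly: number; // at or above: downgrade to a model without write tools
  review: number;   // at or above: hold the message for a human
}

function routeByInjectionScore(score: number, t: RoutingThresholds): Route {
  // Assumption: fail closed when the classifier returns an unusable score.
  if (Number.isNaN(score) || score < 0 || score > 1) return "human-review";
  if (score >= t.review) return "human-review";
  if (score >= t.readOnly) return "read-only-model";
  return "primary";
}

// Example thresholds; the values are illustrative, not recommended defaults.
const thresholds: RoutingThresholds = { readOnly: 0.7, review: 0.9 };
const lowRisk = routeByInjectionScore(0.2, thresholds);   // "primary"
const suspicious = routeByInjectionScore(0.75, thresholds); // "read-only-model"
const likelyAttack = routeByInjectionScore(0.95, thresholds); // "human-review"
```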

Real-time PII detection (dual-path)

Microsoft Presidio Analyzer + LLM judge

Presidio NER + regex + checksums run on every token. Ambiguous high-impact cases get a fast second opinion from Claude Haiku 4.5 / Gemini 3 Flash.
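The pattern-plus-checksum idea can be sketched for card numbers. This is not Presidio's recognizer: `luhnValid`, `classifyCardCandidate`, and the rule that a checksum failure is escalated to the second-opinion judge are assumptions made for illustration.

```typescript
type Verdict = "pii" | "not-pii" | "escalate";

// Luhn checksum: double every second digit from the right and sum the digits.
function luhnValid(digits: string): boolean {
  if (!/^\d+$/.test(digits)) return false;
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

function classifyCardCandidate(candidate: string): Verdict {
  const digits = candidate.replace(/[\s-]/g, "");
  // Fast path 1: the shape does not look like a card number at all.
  if (!/^\d{13,19}$/.test(digits)) return "not-pii";
  // Fast path 2: shape and checksum agree, so treat it as a card number.
  if (luhnValid(digits)) return "pii";
  // Assumption: card-like shape with a failed checksum is ambiguous
  // (typo, order number, ...), so it goes to the slower judge.
  return "escalate";
}

const valid = classifyCardCandidate("4111 1111 1111 1111"); // "pii"
const ambiguous = classifyCardCandidate("4111 1111 1111 1112"); // "escalate"
const unrelated = classifyCardCandidate("order 42"); // "not-pii"
```

Keeping the judge off the common path is what lets a dual-path design stay within a tight latency budget.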

Configurable redaction

Mask · hash · FPE · tokenize · remove

Policy maps each entity type to a strategy — national IDs → format-preserving encryption, emails → domain-only mask, addresses → city-level downgrade.
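A policy of this shape can be sketched as a lookup from entity type to strategy. The names below (`Strategy`, `redactionPolicy`, `applyStrategy`) are hypothetical, and only the simple masking strategies are shown; format-preserving encryption is left out because it needs a vetted FF1-style implementation and managed keys.

```typescript
type Strategy = "email-domain-only" | "mask-tail4" | "remove";

// Hypothetical policy table; entity names mirror the ones used on this page.
const redactionPolicy: Record<string, Strategy> = {
  EMAIL: "email-domain-only",
  PHONE: "mask-tail4",
  PERSON: "remove",
};

function applyStrategy(entityType: string, value: string): string {
  // Assumption: unknown entity types fall back to the most conservative option.
  const strategy: Strategy = redactionPolicy[entityType] ?? "remove";
  switch (strategy) {
    case "email-domain-only": {
      const at = value.lastIndexOf("@");
      return at > 0 ? `***${value.slice(at)}` : "[EMAIL]";
    }
    case "mask-tail4": {
      // Replace every digit except the last four, keeping separators intact.
      const totalDigits = value.replace(/\D/g, "").length;
      let seen = 0;
      return value.replace(/\d/g, (d) => (++seen > totalDigits - 4 ? d : "*"));
    }
    case "remove":
      return `[${entityType}]`;
  }
}

const maskedEmail = applyStrategy("EMAIL", "alice@example.com"); // "***@example.com"
const maskedPhone = applyStrategy("PHONE", "+1 415 555 0123");   // "+* *** *** 0123"
const removedName = applyStrategy("PERSON", "Alice Example");    // "[PERSON]"
```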

Output-side content safety

Toxicity · bias · hallucination · policy-tag

Model replies pass through 4 classifiers before they are returned. Sensitive topics (self-harm, political, violent) trigger rewrite or refusal templates.
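When several classifiers score the same reply, their verdicts have to be merged into one action. A minimal sketch, assuming a "most severe action wins" rule; the `OutputAction` values, the severity ordering, and the sample thresholds are hypothetical rather than documented behavior.

```typescript
type OutputAction = "pass" | "warn" | "rewrite" | "refuse";

interface ClassifierScore {
  name: string;
  score: number;       // classifier output in [0, 1]
  threshold: number;   // at or above this value the classifier "hits"
  onHit: OutputAction; // action requested when it hits
}

// Assumed ordering: higher number means a more restrictive action.
const severity: Record<OutputAction, number> = { pass: 0, warn: 1, rewrite: 2, refuse: 3 };

function decideOutputAction(scores: ClassifierScore[]): OutputAction {
  let decision: OutputAction = "pass";
  for (const s of scores) {
    const hit = s.score >= s.threshold;
    if (hit && severity[s.onHit] > severity[decision]) decision = s.onHit;
  }
  return decision;
}

const replyScores: ClassifierScore[] = [
  { name: "toxicity", score: 0.62, threshold: 0.5, onHit: "rewrite" },
  { name: "hallucination", score: 0.8, threshold: 0.7, onHit: "warn" },
  { name: "bias", score: 0.1, threshold: 0.5, onHit: "rewrite" },
  { name: "policy-tag", score: 0.2, threshold: 0.6, onHit: "refuse" },
];

const finalAction = decideOutputAction(replyScores); // "rewrite"
```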

Zero-trust tool access

MCP-native scopes · per-call OPA policy

Every MCP tool call carries scopes and an OPA check. High-risk actions (writes, transfers, email send) require dual-approval or step-up auth.
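The per-call decision can be sketched as a scope check followed by a risk check. This is plain TypeScript standing in for a policy engine, not OPA or Rego, and `ToolCall`, `Caller`, `authorizeToolCall`, and the three-way `Decision` are hypothetical names.

```typescript
type Decision = "allow" | "deny" | "step-up";

interface ToolCall {
  tool: string;
  requiredScopes: string[];
  highRisk: boolean; // e.g. writes, transfers, email send
}

interface Caller {
  grantedScopes: ReadonlySet<string>;
  steppedUp: boolean; // true once step-up auth or dual approval has completed
}

function authorizeToolCall(call: ToolCall, caller: Caller): Decision {
  // Deny by default: every required scope must be explicitly granted.
  const missing = call.requiredScopes.filter((s) => !caller.grantedScopes.has(s));
  if (missing.length > 0) return "deny";
  // Scopes alone are not enough for high-risk actions.
  if (call.highRisk && !caller.steppedUp) return "step-up";
  return "allow";
}

const agent: Caller = { grantedScopes: new Set(["customer:read", "email:send"]), steppedUp: false };

const readCall = authorizeToolCall(
  { tool: "lookup_customer", requiredScopes: ["customer:read"], highRisk: false },
  agent,
); // "allow"
const emailCall = authorizeToolCall(
  { tool: "send_email", requiredScopes: ["email:send"], highRisk: true },
  agent,
); // "step-up"
const sqlCall = authorizeToolCall(
  { tool: "execute_sql", requiredScopes: ["db:write"], highRisk: true },
  agent,
); // "deny"
```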

Tamper-proof audit & replay

Append-only ledger · WORM storage · 90-day replay

Each trace pins policy version, model, PII hits, and actions taken. One-click export for regulators and DPOs feeds SOC 2 / DPIA evidence.
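A common way to make an append-only log tamper-evident is a hash chain, where each entry commits to the hash of the one before it. The sketch below illustrates that general technique with Node's built-in `crypto` module; it is not a description of how SkyAIApp stores traces, and `LedgerEntry`, `appendEntry`, and `verifyLedger` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

interface LedgerEntry {
  index: number;
  prevHash: string;
  payload: string; // e.g. serialized policy version, model, PII hits, actions
  hash: string;
}

const GENESIS = "GENESIS";

function entryHash(index: number, prevHash: string, payload: string): string {
  return createHash("sha256").update(`${index}|${prevHash}|${payload}`).digest("hex");
}

// Returns a new array; existing entries are never modified.
function appendEntry(ledger: readonly LedgerEntry[], payload: string): LedgerEntry[] {
  const prev = ledger.length > 0 ? ledger[ledger.length - 1] : undefined;
  const prevHash = prev ? prev.hash : GENESIS;
  const index = ledger.length;
  return [...ledger, { index, prevHash, payload, hash: entryHash(index, prevHash, payload) }];
}

// Recomputes every hash; any edited, dropped, or reordered entry breaks the chain.
function verifyLedger(ledger: readonly LedgerEntry[]): boolean {
  let prevHash = GENESIS;
  for (let i = 0; i < ledger.length; i++) {
    const e = ledger[i];
    if (e === undefined) return false;
    if (e.index !== i || e.prevHash !== prevHash) return false;
    if (e.hash !== entryHash(i, prevHash, e.payload)) return false;
    prevHash = e.hash;
  }
  return true;
}

let ledger: LedgerEntry[] = [];
ledger = appendEntry(ledger, JSON.stringify({ policyVersion: "pol_2026_05_q2", piiHits: 2 }));
ledger = appendEntry(ledger, JSON.stringify({ policyVersion: "pol_2026_05_q2", piiHits: 0 }));
const intact = verifyLedger(ledger); // true
```

A hash chain detects tampering but does not prevent it; write-once storage covers the prevention side.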

SDK integration

Bidirectional guardrails are on by default — you don't hand-wire Presidio and classifiers yourself.

guardrails.ts

import { SkyAI } from "@skyaiapp/sdk";

const sky = new SkyAI({ apiKey: process.env.SKYAIAPP_API_KEY });

// userInput is the raw end-user message supplied by your application.
const response = await sky.route({
  goal: "stability",
  messages: [{ role: "user", content: userInput }],

  // 1. Input-side guardrails
  guardrails: {
    promptInjection: { action: "block", threshold: 0.7 },
    pii: {
      detector: ["presidio", "llm-judge"],     // dual-path
      entities: ["PERSON", "SSN", "CREDIT_CARD", "PHONE", "ADDRESS"],
      action: "redact",                         // redact | mask | hash | fpe
      strategy: { SSN: "fpe", CREDIT_CARD: "fpe", PHONE: "mask-tail4" },
    },

    // 2. Output-side guardrails
    output: {
      toxicity: { action: "rewrite", threshold: 0.5 },
      hallucination: { action: "warn", citations: "required" },
    },

    // 3. Audit
    audit: { policyVersion: "pol_2026_05_q2", retentionDays: 90 },
  },

  // 4. Tool access (MCP-native)
  tools: [
    {
      name: "lookup_customer",
      mcpServer: "crm.internal",
      scopes: ["customer:read"],     // OPA check injected automatically
    },
  ],
});

console.log(response.guardrails.piiHits);      // entities found
console.log(response.guardrails.actionsTaken); // how each was handled
console.log(response.routing.traceId);         // audit-trail id

PII coverage (60+ entity types)

Identity

National ID
Passport
Driver's license
SSN
TIN

Financial

Card (Luhn)
IBAN
SWIFT/BIC
Crypto wallet
Phone

Health

Insurance ID
Medical record #
NHS Number
MRN
Prescriptions

Credentials

API keys
JWT
Private keys
AWS / Azure / GCP creds
SSH keys

Compliance framework coverage

Framework	Coverage	Evidence
SOC 2 readiness	CC6.1 access control; CC7.2 monitoring; CC8.1 change mgmt	Immutable traces + policy-version diffs
GDPR / PIPL / CCPA	Minimization; purpose limitation; DSAR rights	DSAR auto-lookup + 30-day export / delete flow
HIPAA	§164.312 technical safeguards; §164.514 de-identification	PHI fields FPE + BAA template
PCI-DSS v4.0	Req 3 storage protection; Req 10 logging	PAN tokenize + WORM audit

Modeled results

99.94%

PII recall

Dual-path detection beats pure regex by 30+ percentage points.

<35 ms

P95 detection latency

Guardrails run in parallel with the LLM — no perceptible end-to-end overhead.

100%

Audit coverage

Every model + tool call lands in an immutable ledger.

0

high-risk leakages

No national-ID / card-number egress in the replay sample.

Rollout cadence

Week 1

Discovery

Catalog sensitive fields, compliance scope, and current traffic shape.

Week 2

Shadow deploy

Mirror 5% traffic, calibrate PII / injection thresholds.
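Mirroring a fixed share of traffic is often done with deterministic hash bucketing, so the same request always lands in or out of the sample. A minimal sketch under that assumption; `fnv1a`, `inShadowSample`, and the use of a request ID as the key are illustrative choices, not the platform's sampler.

```typescript
// FNV-1a (32-bit) over UTF-16 code units; a simple non-cryptographic hash.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Maps each request ID to one of 10,000 buckets and samples the lowest ones.
function inShadowSample(requestId: string, percent: number): boolean {
  if (percent <= 0) return false;
  if (percent >= 100) return true;
  const bucket = fnv1a(requestId) % 10000;
  return bucket < percent * 100;
}

const mirrored = inShadowSample("req-8f3a", 5);
const mirroredAgain = inShadowSample("req-8f3a", 5); // same answer every time
```

Because buckets are compared against a growing cutoff, any request sampled at 5% stays in the sample when the percentage is raised, which keeps later ramp steps consistent.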

Week 3-4

Gradual rollout

Tenant- or surface-scoped ramp with false-positive monitoring.
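The ramp decision can be sketched as a gate on the observed false-positive rate. Everything here is an assumption for illustration: the step list, the 5% ceiling in the example, the one-step rollback, and the name `nextRampPercent`.

```typescript
// Hypothetical ramp steps, in percent of traffic for a tenant or surface.
const rampSteps: readonly number[] = [5, 25, 50, 100];

function nextRampPercent(
  current: number,
  flagged: number,        // items the guardrails flagged in this stage
  falsePositives: number, // flagged items reviewers marked as wrong
  maxFpRate: number,      // ceiling, e.g. 0.05 for 5%
): number {
  // No signal yet: hold the current exposure rather than guess.
  if (flagged === 0) return current;
  const fpRate = falsePositives / flagged;
  if (fpRate > maxFpRate) {
    // Too noisy: fall back one step, or stay put at the lowest step.
    const lower = rampSteps.slice().reverse().find((s) => s < current);
    return lower ?? current;
  }
  // Healthy: advance one step, or stay put at full rollout.
  const higher = rampSteps.find((s) => s > current);
  return higher ?? current;
}

const advance = nextRampPercent(25, 200, 4, 0.05);   // 2% false positives -> 50
const rollback = nextRampPercent(25, 200, 30, 0.05); // 15% false positives -> 5
```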

Week 5+

Audit & renewals

DPO export templates + SOC 2 / DPIA evidence packs auto-generated.

Enterprise integrations

Okta / Azure AD

SSO + SCIM

Microsoft Purview

DLP signal exchange

Splunk / Datadog

SIEM streaming

HashiCorp Vault

KMS / FPE keys

Ship AI to production without the compliance scramble.

Default SOC 2 / GDPR / HIPAA paths included. Pilot and launch review in 4 weeks.