
Content Moderation & PII Guardrails

Enterprise data-safety platform: dual-path PII detection (Microsoft Presidio + LLM judge), Lakera-style bidirectional guardrails, and SOC 2 / GDPR / HIPAA-ready audit trails.

99.94%
PII recall
<35ms
P95 detection latency
60+
PII entity types
4
compliance frameworks
Composite scenario

Digital-bank guardrail review profile for AI support

This composite profile models 14 branches and 52 business scenarios where LLM agents must safely handle national IDs, card numbers, addresses, and health insurance IDs. The SkyAIApp guardrail replay benchmark shows:

0
data-egress violations
27 days
from kickoff to PIA approval
41%
drop in manual review load

“We used to keep 12 engineers around two homegrown DLP stacks. SkyAIApp folds Presidio + LLM judge + audit trail into one SDK — and policies are versioned and rolled out gradually.”

— Composite profile, Head of AI Platform

Challenge

Four unavoidable compliance gaps when shipping AI to production

Regex-only PII misses

Pure-regex pipelines miss 30%+ of non-Latin names, mixed addresses, and insurance IDs.

Prompt injection & privilege escalation

Agents get tricked into calling send_email, execute_sql, or unbounded transfers. WAFs don't see semantic attacks.

Concurrent multi-jurisdiction compliance

GDPR, CCPA, PIPL, HIPAA, and PCI-DSS each have their own retention and response windows. Hand-rolled glue rots fast.

Audit trail can't be trusted

When an incident is replayed, nothing proves which model and policy version were active at the time, which slows evidence preparation for security audits.

System architecture

Input → detection → policy engine → redaction & tool access → audited output. Every hop writes an immutable trace, replayable for 90 days.

[Architecture diagram: Input Data (user input, API request, file upload) → Detection Layer (PII detection: phone, email, SSN/ID, credit card; content moderation: harmful, bias, political; toxicity score 0-100) → Policy Engine (rule config, thresholds, actions) → Redaction (masking, encryption, removal) → Access Control (RBAC, audit logs, zero trust) → Secure Output (clean data, audit record, alert). Compliance & certifications: SOC 2 Type II, GDPR, CCPA, HIPAA.]

Six-layer guardrail stack

01

Prompt-injection defense

Lakera-style classifier · prompt-shield v3

A classifier scores every incoming message in under 8 ms; messages that cross the threshold are routed to a read-only model or queued for human review.
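The routing rule can be sketched as a small pure function. This is an illustration under stated assumptions, not SkyAIApp's implementation: the type names, the function `routeByInjectionScore`, and the two-threshold split (0.7 and 0.9) are hypothetical. The source only says that threshold hits go to a read-only model or to human review.

```typescript
// Hypothetical sketch: names and threshold values are illustrative only.
type InjectionRoute = "primary-model" | "read-only-model" | "human-review";

interface InjectionThresholds {
  readOnly: number;    // at or above: downgrade to a model without write tools
  humanReview: number; // at or above: hold the message for a reviewer
}

function routeByInjectionScore(score: number, t: InjectionThresholds): InjectionRoute {
  // Fail closed on a malformed score (NaN or outside 0..1).
  if (!(score >= 0 && score <= 1)) return "human-review";
  if (score >= t.humanReview) return "human-review";
  if (score >= t.readOnly) return "read-only-model";
  return "primary-model";
}

const injectionThresholds: InjectionThresholds = { readOnly: 0.7, humanReview: 0.9 };
```

Failing closed on a malformed score is a design choice in this sketch: a broken classifier response then lands in the review queue instead of silently passing.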

02

Real-time PII detection (dual-path)

Microsoft Presidio Analyzer + LLM judge

Presidio NER + regex + checksums run on every token. Ambiguous high-impact cases get a fast second opinion from Claude Haiku 4.5 / Gemini 3 Flash.
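As a rough illustration of the fast path, the sketch below pairs a regex with the Luhn checksum for card numbers and defers anything ambiguous to a second opinion. It does not call Presidio or any model: the function names, the `CardVerdict` type, and the rule "digit pattern matches but checksum fails means ask the judge" are assumptions made for this example.

```typescript
// Hypothetical sketch of a regex + checksum fast path with a deferral outcome.
type CardVerdict = "pii" | "not-pii" | "needs-judge";

// Luhn checksum: double every second digit from the right, sum, test mod 10.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (d < 0 || d > 9) return false;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return digits.length > 0 && sum % 10 === 0;
}

// Regex pass: runs of 13 to 19 digits, optionally separated by spaces or hyphens.
function findCardCandidates(text: string): string[] {
  return text.match(/\b\d(?:[ -]?\d){12,18}\b/g) || [];
}

// Checksum pass: confirm, reject, or defer to the slower judge path.
function classifyCardCandidate(candidate: string): CardVerdict {
  const digits = candidate.replace(/[ -]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return "not-pii";
  return passesLuhn(digits) ? "pii" : "needs-judge";
}
```

For example, `classifyCardCandidate("4111 1111 1111 1111")` returns `"pii"`, while the same number with its last digit changed to 2 returns `"needs-judge"`, since it might be a mistyped card or an unrelated account number.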

03

Configurable redaction

Mask · hash · FPE · tokenize · remove

Policy maps each entity type to a strategy — national IDs → format-preserving encryption, emails → domain-only mask, addresses → city-level downgrade.
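A minimal sketch of an entity-to-strategy mapping follows. The strategy names, helper functions, and the `redactionPolicy` object are hypothetical and only mirror the examples in the text. Format-preserving encryption is deliberately not implemented: it is passed in as a function, on the assumption that a vetted implementation (for example NIST FF1) and managed keys exist elsewhere.

```typescript
// Hypothetical sketch: strategy names and helpers are illustrative only.
type RedactionStrategy = "mask-domain" | "mask-tail4" | "remove" | "fpe";
type FpeFn = (plaintext: string) => string;

// Keep only the domain: "ana@example.com" -> "***@example.com".
function maskEmailToDomain(email: string): string {
  const at = email.lastIndexOf("@");
  return at < 0 ? "***" : "***" + email.slice(at);
}

// Mask every digit except the last four, leaving separators in place:
// "+1 415-555-0134" -> "+* ***-***-0134".
function maskTail4(value: string): string {
  const totalDigits = value.replace(/\D/g, "").length;
  let seen = 0;
  return value.replace(/\d/g, (d) => (++seen > totalDigits - 4 ? d : "*"));
}

function redact(value: string, strategy: RedactionStrategy, fpe?: FpeFn): string {
  switch (strategy) {
    case "mask-domain":
      return maskEmailToDomain(value);
    case "mask-tail4":
      return maskTail4(value);
    case "remove":
      return "";
    case "fpe":
      // Injected, not sketched: FPE needs a reviewed cipher and key management.
      if (!fpe) throw new Error("no FPE implementation configured");
      return fpe(value);
  }
}

// Entity type -> strategy, in the spirit of the policy described above.
const redactionPolicy: Record<string, RedactionStrategy> = {
  EMAIL: "mask-domain",
  PHONE: "mask-tail4",
  NATIONAL_ID: "fpe",
};
```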

04

Output-side content safety

Toxicity · bias · hallucination · policy-tag

Model replies pass through 4 classifiers before return. Sensitive topics (self-harm, political, violent) trigger rewrite or refusal templates.

05

Zero-trust tool access

MCP-native scopes · per-call OPA policy

Every MCP tool call carries scopes and an OPA check. High-risk actions (writes, transfers, email send) require dual-approval or step-up auth.
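The per-call check can be pictured as follows. This is plain TypeScript standing in for a policy decision, not an OPA policy (those are written in Rego) and not the MCP wire format. All type and field names, and the reading of "dual-approval" as at least two recorded approvals, are assumptions for illustration.

```typescript
// Hypothetical sketch of a per-call authorization decision.
interface ToolCall {
  tool: string;
  requiredScopes: string[];
  highRisk: boolean; // writes, transfers, email send
}

interface CallerContext {
  grantedScopes: string[];
  steppedUp: boolean; // caller completed step-up authentication
  approvals: number;  // distinct human approvals recorded for this call
}

type ToolDecision = { allow: true } | { allow: false; reason: string };

function authorizeToolCall(call: ToolCall, ctx: CallerContext): ToolDecision {
  // Deny by default: every required scope must be explicitly granted.
  const missing = call.requiredScopes.filter((s) => ctx.grantedScopes.indexOf(s) === -1);
  if (missing.length > 0) {
    return { allow: false, reason: "missing scopes: " + missing.join(", ") };
  }
  // High-risk actions need step-up auth or two approvals on top of scopes.
  if (call.highRisk && !(ctx.steppedUp || ctx.approvals >= 2)) {
    return { allow: false, reason: "high-risk action needs step-up auth or dual approval" };
  }
  return { allow: true };
}
```

Returning a reason with every denial keeps the decision explainable when it is later written to an audit trace.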

06

Tamper-proof audit & replay

Append-only ledger · WORM storage · 90-day replay

Each trace pins policy version, model, PII hits, and actions taken. One-click export for regulators and DPOs feeds SOC 2 / DPIA evidence.
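One common way to make an append-only ledger tamper-evident is a hash chain, sketched below. This is an assumption about the general technique, not a description of SkyAIApp's storage. The record fields follow the text; every name is hypothetical. The hash here is FNV-1a (32-bit), a non-cryptographic stand-in chosen so the example has no dependencies; a real ledger would use a cryptographic hash such as SHA-256.

```typescript
// Hypothetical sketch of a hash-chained audit ledger.
interface TraceRecord {
  policyVersion: string;
  model: string;
  piiHits: string[];
  actions: string[];
}

interface LedgerEntry {
  record: TraceRecord;
  prevHash: string;
  hash: string;
}

// FNV-1a, 32-bit. NOT cryptographic: illustration only.
function fnv1a(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, kept within 32 bits.
    h = (h + ((h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24))) >>> 0;
  }
  return ("0000000" + (h >>> 0).toString(16)).slice(-8);
}

function entryHash(record: TraceRecord, prevHash: string): string {
  return fnv1a(prevHash + "|" + JSON.stringify(record));
}

// Returns a new ledger; existing entries are never modified.
function appendEntry(ledger: LedgerEntry[], record: TraceRecord): LedgerEntry[] {
  const prevHash = ledger.length > 0 ? ledger[ledger.length - 1].hash : "genesis";
  return ledger.concat([{ record, prevHash, hash: entryHash(record, prevHash) }]);
}

// Recompute every link; any edited, removed, or reordered entry breaks the chain.
function verifyLedger(ledger: LedgerEntry[]): boolean {
  let prev = "genesis";
  for (const entry of ledger) {
    if (entry.prevHash !== prev || entry.hash !== entryHash(entry.record, prev)) return false;
    prev = entry.hash;
  }
  return true;
}
```

Because each entry's hash covers the previous hash, changing one trace invalidates every entry after it, which is what lets a replay show that the pinned policy version and model were not altered afterwards.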

SDK integration

Bidirectional guardrails are on by default — you don't hand-wire Presidio and classifiers yourself.

guardrails.ts
import { SkyAI } from "@skyaiapp/sdk";

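const sky = new SkyAI({ apiKey: process.env.SKYAIAPP_API_KEY });

// Placeholder input for illustration; in practice this comes from the end user.
const userInput = "My card number is 4111 1111 1111 1111";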

const response = await sky.route({
  goal: "stability",
  messages: [{ role: "user", content: userInput }],

  // 1. Input-side guardrails
  guardrails: {
    promptInjection: { action: "block", threshold: 0.7 },
    pii: {
      detector: ["presidio", "llm-judge"],     // dual-path
      entities: ["PERSON", "SSN", "CREDIT_CARD", "PHONE", "ADDRESS"],
      action: "redact",                         // redact | mask | hash | fpe
      strategy: { SSN: "fpe", CREDIT_CARD: "fpe", PHONE: "mask-tail4" },
    },

    // 2. Output-side guardrails
    output: {
      toxicity: { action: "rewrite", threshold: 0.5 },
      hallucination: { action: "warn", citations: "required" },
    },

    // 3. Audit
    audit: { policyVersion: "pol_2026_05_q2", retentionDays: 90 },
  },

  // 4. Tool access (MCP-native)
  tools: [
    {
      name: "lookup_customer",
      mcpServer: "crm.internal",
      scopes: ["customer:read"],     // OPA check injected automatically
    },
  ],
});

console.log(response.guardrails.piiHits);      // entities found
console.log(response.guardrails.actionsTaken); // how each was handled
console.log(response.routing.traceId);         // audit-trail id

PII coverage (60+ entity types)

Identity

  • National ID
  • Passport
  • Driver's license
  • SSN
  • TIN

Financial

  • Card (Luhn)
  • IBAN
  • SWIFT/BIC
  • Crypto wallet
  • Phone

Health

  • Insurance ID
  • Medical record #
  • NHS Number
  • MRN
  • Prescriptions

Credentials

  • API keys
  • JWT
  • Private keys
  • AWS / Azure / GCP creds
  • SSH keys

Compliance framework coverage

Framework | Coverage | Evidence
SOC 2 readiness | CC6.1 access control; CC7.2 monitoring; CC8.1 change mgmt | Immutable traces + policy-version diffs
GDPR / PIPL / CCPA | Minimization; purpose limitation; DSAR rights | DSAR auto-lookup + 30-day export / delete flow
HIPAA | §164.312 technical safeguards; §164.514 de-identification | PHI fields FPE + BAA template
PCI-DSS v4.0 | Req 3 storage protection; Req 10 logging | PAN tokenize + WORM audit

Modeled results

99.94%
PII recall

Dual-path detection beats pure regex by 30+ percentage points.

<35ms
P95 detection latency

Guardrails run in parallel with the LLM — no perceptible end-to-end overhead.

100%
Audit coverage

Every model + tool call lands in an immutable ledger.

0
high-risk leakages

No national-ID / card-number egress in the replay sample.

Rollout cadence

Week 1

Discovery

Catalog sensitive fields, compliance scope, and current traffic shape.

Week 2

Shadow deploy

Mirror 5% of traffic and calibrate PII and injection thresholds.

Week 3-4

Gradual rollout

Tenant- or surface-scoped ramp with false-positive monitoring.
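A tenant-scoped ramp is often implemented with deterministic bucketing, so a tenant stays enrolled as the percentage grows. The sketch below assumes that approach; the function names, the simple string hash, and the use of a policy version as the salt are hypothetical choices for this example.

```typescript
// Hypothetical sketch of a deterministic, tenant-scoped percentage ramp.

// Map a tenant id to a stable bucket in [0, 100) using a simple string hash.
function bucketOf(tenantId: string, salt: string): number {
  const key = salt + ":" + tenantId;
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep within unsigned 32 bits
  }
  return h % 100;
}

// A tenant is enrolled once the ramp percentage passes its bucket.
function isEnrolled(tenantId: string, rampPercent: number, salt: string): boolean {
  return bucketOf(tenantId, salt) < rampPercent;
}

// Example: which of these tenants receive the new policy at a 25% ramp.
const rampSalt = "pol_2026_05_q2";
const enrolledAt25 = ["tenant-a", "tenant-b", "tenant-c"].filter((t) =>
  isEnrolled(t, 25, rampSalt)
);
```

Because the bucket depends only on the tenant id and the salt, raising the ramp from 25 to 50 only adds tenants and never removes one, which keeps false-positive monitoring comparable from one step to the next.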

Week 5+

Audit & renewals

DPO export templates + SOC 2 / DPIA evidence packs auto-generated.

Enterprise integrations

Okta / Azure AD

SSO + SCIM

Microsoft Purview

DLP signal exchange

Splunk / Datadog

SIEM streaming

HashiCorp Vault

KMS / FPE keys

Ship AI to production without the compliance scramble.

Default SOC 2 / GDPR / HIPAA paths included. Pilot and launch review in 4 weeks.
