Platform · The open core, deconstructed

Four surfaces.
One open core.

CloakPipe is built as four distinct product surfaces — the Rust proxy, the encrypted vault, the policy engine, and the audit layer. The proxy is open source under Apache 2.0. Everything above it is the commercial platform that closes enterprise deals in regulated industries. Each layer adds value on top of the last.

Surface 01 · Proxy

The Rust proxy.
Open source core.

Browse the repo

A Rust-native reverse proxy that intercepts AI API calls, pseudonymizes outbound data, and rehydrates inbound responses in real-time — under 50 ms p95, including the full detection pipeline. Eight Rust crates compiled to a single native binary. Your application changes exactly one line: the base URL.

OpenAI-compatible API

Change your base URL from api.openai.com to your CloakPipe endpoint. No other code changes. Works with LangChain, LlamaIndex, CrewAI, curl, and any OpenAI-compatible client.

Streaming SSE rehydration

LLMs stream token-by-token. CloakPipe maintains a sliding window, detects pseudonymized tokens mid-stream, looks up real values from the vault, and splices them in — without buffering or breaking the SSE contract.

Multi-provider routing

Route to OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, or any self-hosted model via vLLM or Ollama. Apply different masking policies per provider — strict for closed models, lighter or bypassed for self-hosted.

Native MCP server

A Model Context Protocol server exposing mask_text, mask_file, unmask_in_context, and scan_directory tools. AI agents can call CloakPipe directly as a tool from Claude Desktop, Cursor, or any MCP client.

Pluggable, tiered detection pipeline

Detection is deliberately commoditized — CloakPipe does not lock you into a single model. The proxy runs a tiered pipeline where each tier escalates only as needed: what regex catches deterministically, the neural models never see; what one model misses, the ensemble catches. OpenAI Privacy Filter scores 96% on synthetic benchmarks but Tonic.ai measured 18–65% on real-world EHR notes, call transcripts, and loan contracts. No single model catches everything — which is why the pipeline tiers.

DETECTION PIPELINE · TIERED · CUMULATIVE < 50 MS P95
Catch by escalation, not by one model
Each request flows through the tiers in order. Cheap, deterministic checks fire first; neural detection escalates only for unstructured text and custom entities.
Tier 1 → Tier 4 cumulative · < 50 ms p95
T1
Regex + checksum
Cards (Luhn) · IBAN (mod-97) · SSN range · ABA routing · email · URL · IP. Rust-native.
< 1 ms
T2
Privacy Filter
OpenAI 1.5B (50M active) on ONNX. Apache 2.0, 128K ctx. Names, addresses, dates, secrets.
30–50 ms
T3
GLiNER2-PII
300M params, multilingual, zero-shot. Fires for custom entity types — MRNs, case IDs, policy numbers.
40–80 ms
T4
Ensemble
Run multiple backends and merge for maximum recall. For when missing one entity is unacceptable.
opt-in
Surface 02 · Vault

Reversible by design.
You hold the keys.

Why reversibility matters

An encrypted tokenization vault that stores the mapping between real sensitive data and pseudonymized tokens. The vault is the source of truth — and the reason pseudonymization is reversible under policy, not a one-way redaction. Without a vault, data goes in and nothing useful comes out.

AES-256-GCM at rest

Every real value stored in the vault is encrypted with AES-256-GCM. Nothing real exists outside the vault — the clean prompt that reaches the provider contains zero original sensitive data.

Customer-managed keys

Bring your own keys via AWS KMS, GCP Cloud KMS, Azure Key Vault, or HashiCorp Vault Transit. Envelope encryption with customer-managed root keys — you never hand CloakPipe the keys to your customer's data.

Deterministic + format-preserving

The same input always produces the same token within a tenant's scope, preserving entity consistency across conversations and batches. Format-preserving FF1 (NIST SP 800-38G) keeps cards Luhn-valid and emails email-shaped.

Per-tenant isolation

Each customer gets their own vault namespace with their own encryption keys. No cross-tenant data access is possible at the cryptographic level.

TTL & session scoping

Tokens can expire after a single conversation, after a configurable TTL, or persist indefinitely for ongoing workflows. Scope the lifetime to the sensitivity of the workload.

Automatic key rotation

Encryption keys rotate on a configurable schedule without disrupting active tokens. Old tokens remain decryptable; new tokens use the latest key. No downtime, no migration.

Surface 03 · Policy

Code, not config.
Versioned in Git.

Every decision is logged

A policy engine that defines what gets masked, for which models, for which teams, and who can unmask. Policies are code, versioned in Git, and enforced automatically on every request — backed by OPA (Open Policy Agent) or Cedar for sub-millisecond authorization decisions.

Entity policies

"Mask all patient names when routing to external models." "Block financial amounts from reaching any provider." "Allow internal model calls unmasked." Per entity type, per action.

Provider policies

Maximum masking for OpenAI, Anthropic, and Google. Bypass for self-hosted vLLM. Mask only financial data for Venice TEE models. Per provider, per masking level.

Team / role policies

"Legal: mask everything, no exceptions. Engineering: mask PII, allow code. Data science: unmasked access to internal models only." Per RBAC role, mapped from SAML / OIDC.

Context-based access (CBAC)

Unmasking decided at runtime from who is asking, their role, the data's sensitivity, and the workflow context. A sales agent cannot unmask medical records; a supervising physician can — but only during an active case review.

Custom entity definitions

Define domain-specific entity types via regex, keyword lists, or NER labels: medical record numbers, case docket IDs, insurance policy numbers, internal employee IDs — whatever your domain requires.

Policy-as-code

Defined in YAML, backed by OPA or Cedar. Every change is a versioned commit; every evaluation is an audit event. Policies are testable, reviewable, and deployable through standard CI/CD.

Example policy — patient data for external models

# rules — evaluated top to bottom on every request - name: mask_patient_data_for_external_models when: provider: [openai, anthropic, google] entity_type: [PERSON, DIAGNOSIS, MEDICATION, MRN] action: pseudonymize # self-hosted models stay inside the perimeter — pass through - name: allow_internal_models_unmasked when: provider: [internal-vllm, self-hosted] action: passthrough # only clinicians may reverse clinical entities — CBAC enforced - name: restrict_unmask_to_physicians when: action: unmask entity_type: [DIAGNOSIS, MEDICATION] require_role: [physician, clinical_admin]
Surface 04 · Audit

Every decision
logged. Never raw.

See the compliance posture

A compliance and observability layer that records every privacy-relevant event, generates compliance evidence, and integrates with your existing monitoring stack. Audit logs never contain raw sensitive data — they record what types of data were processed and what actions were taken, not the values. The audit trail is itself privacy-safe.

What gets logged

Every request (timestamp, caller, source IP, destination provider), every detection event (entity types, confidence, model used), every masking action, every unmask request, and every policy evaluation — plus full latency metrics.

Never the raw values

The trail records that a DIAGNOSIS was masked for request req_8af9c2 — never the diagnosis itself. Evidence of control without becoming a second copy of the data you are protecting.

Compliance evidence on demand

Exportable evidence for HIPAA (PHI masked before processors), GDPR/DPDP (data minimization), SOC 2 Type II (access & encryption controls), EU AI Act (de-identification), and PCI-DSS (tokenized cardholder data).

OpenTelemetry-native

Structured traces, metrics, and logs from day one. Export to Datadog, Grafana, Splunk, Honeycomb, Prometheus, or any OTEL collector. Pre-built dashboards for detection rates, entity distribution, latency percentiles, and unmask patterns.

A privacy-safe trail, in practice

14:32:08 mask 7 entities · req_8af9c2 → gpt-5 healthcare-v3
14:32:09 detect T1 regex + T2 privacy-filter · F1 0.96 otel·trace
14:32:11 rehydrate 11 chunks · vault lookup 3.8 ms vault·prod
14:32:14 unmask deny role: sales · entity: MRN cbac-v1
14:32:18 unmask allow role: physician · DIAGNOSIS · case review cbac-v1
Architecture · The hot path

One request.
Under 50 ms.

Read the architecture docs

Every prompt, response, and tool call passes the same per-request hot path: authenticate, evaluate policy, detect, pseudonymize and write to the vault, forward a clean prompt, stream back, rehydrate, and emit an audit event. All of it in Rust, transparently, with a sub-50 ms p95 target.

REQUEST FLOW · PER-REQUEST · < 50 MS TARGET
Auth → Policy → Detect → Vault → Forward → Rehydrate → Audit
Built on Rust / Axum / Tower with Tokio streams. Detection escalates through tiers; the vault makes the round-trip reversible; the audit log closes every request.
  • 1 · Auth — JWT / mTLS verification at the proxy edge
  • 2 · Policy — OPA evaluation: allow / deny plus per-entity rules
  • 3 · Detection — T1 regex + checksum (<1ms) → T2 Privacy Filter (ONNX, 30–50ms) → T3 GLiNER2 for custom entities (optional)
  • 4 · Pseudonymize + vault write — deterministic tokens encrypted at rest
  • 5 · Forward — clean prompt to the LLM provider; zero real data leaves
  • 6 · Stream + rehydrate — detect tokens in the SSE stream, vault lookup, authorize, splice real values back
  • 7 · Audit — emit a privacy-safe event via OpenTelemetry, then return the response with real values restored

Core technology choices

Component Technology Why
Proxy runtime Rust / Axum / Tower Sub-millisecond overhead per request. Zero-cost abstractions. Memory safety without GC pauses.
ML inference ONNX Runtime (ort) Run Privacy Filter and GLiNER2 locally on CPU or GPU. No Python dependency in the hot path.
Tokenization HF tokenizers (Rust) Fast model-input preparation, shared across the detection pipeline.
Vault encryption AES-256-GCM · AES-SIV · FF1 GCM for general encryption, SIV for deterministic tokens, FF1 (NIST SP 800-38G) for format-preserving values.
Key management Vault Transit / cloud KMS Envelope encryption. Customer-managed root keys. Automatic rotation.
Token registry PostgreSQL (sqlx) Deterministic token lookup with per-tenant isolation. Proven at scale.
Policy engine OPA · Cedar OPA: industry standard, Rego DSL, sub-millisecond decisions. Cedar: typed alternative for compile-time guarantees.
Observability OpenTelemetry-rust Traces, metrics, and logs over Tokio streams, exportable to any OTEL collector.
Deployment · Five topologies

Laptop to air-gapped.

See what each tier includes

Same Rust binary. Same detection pipeline. Same vault encryption. Pick the topology that matches your security posture — from fully managed cloud to a fully offline air-gapped install with no network calls and no telemetry.

01
Managed cloud
Hosted at cloakpipe.co. Endpoint, dashboard, vault, audit logs all managed. Data processed in-transit only, never stored outside the vault. SOC 2 infrastructure.
SAAS
02
Docker
Single container or docker-compose on any Linux host. Customer controls all infrastructure, keys, and data. Open-source proxy plus a commercial license for platform features.
SINGLE BINARY
03
Kubernetes
Production Helm chart for K8s clusters. Horizontal autoscaling, rolling deploys, health checks — for teams running AI workloads on Kubernetes.
HELM · HPA
04
VPC / private
Deployed inside your AWS VPC, GCP VPC, or Azure VNet. No internet egress. CloakPipe engineering assists with deployment and configuration.
CUSTOMER CLOUD
05
Air-gapped
Fully offline. All detection models run via ONNX locally. No network calls. No telemetry. For defense, intelligence, and maximum-security healthcare environments.
OFFLINE · ONNX
Compliance · Posture

Built to be
deployable in audits.

Request compliance docs

CloakPipe helps your AI application meet regulatory requirements — and the product itself meets the standards needed to be deployable in regulated environments. Each framework maps to what it unlocks for your customers.

Framework CloakPipe status What it enables for customers
SOC 2 Type II In progress · Vanta Cite CloakPipe's report in your own audits. Required for enterprise procurement in healthcare, finance, and legal.
HIPAA BAA available Demonstrate PHI is masked before reaching model providers. Fulfills the HIPAA de-identification safe harbor.
GDPR / DPDP DPA template · Art. 25/32 Proof of data minimization. Answer data subject access and deletion requests from the audit trail. Data residency controls.
EU AI Act High-risk · Aug 2026 Demonstrate personal data is de-identified before high-risk AI processing, with a human-oversight audit trail.
PCI-DSS FPE tokenization Process payment-related queries without exposing card numbers. No cardholder data stored in plaintext.
ISO 27001 Planned Required for European and APAC enterprise procurement. ~70% control overlap with SOC 2.

Four surfaces.
One open core.