The trust layer · Pseudonymization, not redaction

Redaction destroys data.
Pseudonymization keeps it.

Every other tool replaces sensitive entities with dead tokens like [PERSON] — the original information is gone forever, and your application can never reassemble a useful answer. CloakPipe replaces them with deterministic, format-preserving fake entities the LLM can still reason about, stores the real values in an encrypted vault, then restores them on the way back. Reversible — but only under policy.

01 · The difference

Same input.
Three outcomes.

How the proxy does it

Detection-only tools — Microsoft Presidio, AWS Comprehend, the OpenAI Privacy Filter on its own — find sensitive data and stamp out a dead token. That works for anonymizing a training set. It does not work for a real-time AI application where the response has to come back with the real values intact. Watch what happens to a single clinical sentence.

Input
Original sensitive data
Other tools
Redacted — data lost
CloakPipe
Pseudonymized — utility kept
Dr. Sarah Chen
[PERSON]
Dr. PERSON_3
Amlodipine 10mg
[MEDICATION]
MEDICATION_7
s.chen@northshore.org
[EMAIL]
EMAIL_3@DOMAIN_1.org
555-0123
[PHONE]
PHONE_5
4532 1234 5678 9012
[CREDIT_CARD]
4532 8847 2231 5104
1600 Amphitheatre Pkwy
[ADDRESS]
742 ADDRESS_2 Ave
MRN-2024-88291
[MRN]
MRN-XXXX-TOKEN_12

The model never sees a single real value. It receives structurally valid text and reasons about it normally — “PERSON_3 should take MEDICATION_7 at 20mg.” CloakPipe restores the originals in the response: “Dr. Sarah Chen should take Amlodipine at 20mg.” Redaction throws that round trip away. Pseudonymization is the only approach that keeps it.

02 · The round trip

Detect. Mask.
Forward. Restore.

The full hot path

Every prompt your application makes passes through the Rust proxy. The whole loop — detection, vault write, forward, and rehydration — completes in under 50 ms p95, transparently. One line of code changes: the base URL.

01 · Detect

A tiered pipeline scans the outbound prompt locally. Regex and checksum validators catch structured identifiers in under a millisecond; ONNX models (OpenAI Privacy Filter, GLiNER2-PII) catch names, addresses, dates, and custom entities. No data leaves your infrastructure to be detected.

02 · Pseudonymize

Each detected entity is replaced with a deterministic, semantically valid fake. “Dr. Sarah Chen” becomes “Dr. PERSON_3” — the same token every time, so the LLM can track relationships across the entire conversation.

03 · Vault write

The real → fake mapping is written to an AES-256-GCM encrypted vault with customer-managed keys. The vault is the source of truth and the only place a real value ever exists outside your application.

04 · Forward clean prompt

The pseudonymized prompt is forwarded to any provider — OpenAI, Anthropic, Google, Bedrock, Azure, or a self-hosted model. The provider sees structurally valid text with zero real sensitive data.

05 · Rehydrate response

As the response streams back token-by-token, CloakPipe runs a sliding window over the SSE chunks, detects pseudonymized tokens, looks up the real values, and splices them in — without buffering or breaking the streaming contract.

06 · Audit & return

Every detection, mask, and unmask decision is logged as a privacy-safe audit event, then the response — with real values restored — is returned to your application. Compliance evidence is generated on the way out.

03 · Format-preserving

Fakes that pass
validation.

Inside the vault

A redacted token like [CREDIT_CARD] breaks the moment it hits a system that expects a card number. CloakPipe generates replacements that pass the same validation checks as the originals, using format-preserving encryption (FF1, NIST SP 800-38G). Downstream systems — and the model itself — keep working exactly as before.

Credit cards — Luhn-valid

A card number is replaced with another number that passes the Luhn checksum. Payment-related queries flow through your AI pipeline without exposing a single real PAN to the model provider.

IBANs — mod-97 valid

Bank account identifiers are regenerated to pass the mod-97 check digit. Any downstream validator that rejects malformed IBANs will accept the pseudonym just as it accepts the original.

Emails stay emails

s.chen@northshore.org becomes EMAIL_3@DOMAIN_1.org — still a syntactically valid address. Forms, regexes, and parsers that expect an email don't choke on a dead token.

Structured IDs keep their shape

Phone numbers stay phone-shaped, medical record numbers keep their prefix and length. The vault uses AES-SIV for deterministic encryption and FF1 for format preservation, chosen per entity type.

Format preservation is what makes pseudonymization safe to drop into production. The model — and any database, validator, or webhook that consumes its output — keeps working exactly as before, because every fake is indistinguishable in shape from the real value it stands in for.

04 · Deterministic consistency

Same input.
Same token.

Works across your stack

Tokenization is deterministic within a tenant's scope: the same value always produces the same token. That single property is what lets the LLM reason about relationships between entities instead of treating every mention as a brand-new unknown.

If “Dr. Sarah Chen” appears five times in a prompt, it maps to “Dr. PERSON_3” all five times. When the model answers, CloakPipe restores “Dr. Sarah Chen” in all five places. The model can correctly say “PERSON_3 prescribed MEDICATION_7 to PERSON_5” — and the relationship survives the round trip.

Consistency holds across conversations, sessions, and batch processing, not just a single message. Tokens can be scoped to a single conversation, expire after a configurable TTL, or persist indefinitely for long-running workflows — whichever your use case demands.

Relationships preserved

Because mapping is stable, the model keeps the graph intact: who prescribed what to whom, which party signed which clause, which account sent which transfer.

Session & TTL scoping

Tokens can live for one conversation, expire on a schedule, or persist for ongoing pipelines. You decide how long a pseudonym means the same thing.

05 · Reversibility under policy

This is not a feature.
It is the product.

Policy & CBAC

Without a vault, pseudonymization is one-way: data goes in, nothing comes back. With a vault, it is reversible under policy. That is the line between anonymization — which destroys utility — and privacy, which preserves utility while controlling access.

The vault stores every real → fake mapping, encrypted at rest with AES-256-GCM and per-tenant isolation. Unmasking is never automatic. Every unmask request runs through the policy engine first: the caller's identity, role, and request context are checked against rules defined in code and versioned in Git — context-based access control (CBAC).

A sales agent cannot unmask a medical record. A supervising physician can — but only during an active case review, and only for the entity types policy permits. Every decision, allowed or denied, is written to a privacy-safe audit trail that never contains a raw value.

Customer-managed keys

AES-256-GCM at rest with keys held in AWS KMS, GCP KMS, Azure Key Vault, or HashiCorp Vault Transit. CloakPipe never holds the keys to your customer's data, and keys rotate automatically without breaking existing tokens.

CBAC on every unmask

Who is asking, what is their role, how sensitive is the data, what is the workflow context — all evaluated at runtime via OPA or Cedar in sub-millisecond time before any real value is revealed.

Per-tenant isolation

Each customer gets their own vault namespace and their own encryption keys. Cross-tenant access is impossible at the cryptographic level, not just the application level.

Privacy-safe audit

Audit logs record what types of data were processed and what actions were taken — never the actual values. The trail proves compliance for HIPAA, GDPR, SOC 2, and PCI-DSS without becoming a new liability.

Redaction destroys data.
Pseudonymization keeps it.