Every other tool replaces sensitive entities with dead tokens like
[PERSON]
— the original information is gone forever, and your application can never reassemble a useful answer.
CloakPipe replaces them with deterministic, format-preserving fake entities the LLM can still reason about,
stores the real values in an encrypted vault, then restores them on the way back. Reversible — but only under policy.
Detection-only tools — Microsoft Presidio, AWS Comprehend, the OpenAI Privacy Filter on its own — find sensitive data and stamp out a dead token. That works for anonymizing a training set. It does not work for a real-time AI application where the response has to come back with the real values intact. Watch what happens to a single clinical sentence.
The model never sees a single real value. It receives structurally valid text and reasons about it normally — “PERSON_3 should take MEDICATION_7 at 20mg.” CloakPipe restores the originals in the response: “Dr. Sarah Chen should take Amlodipine at 20mg.” Redaction throws that round trip away. Pseudonymization is the only approach that keeps it.
Every prompt your application makes passes through the Rust proxy. The whole loop — detection, vault write, forward, and rehydration — completes in under 50 ms p95, transparently. One line of code changes: the base URL.
A tiered pipeline scans the outbound prompt locally. Regex and checksum validators catch structured identifiers in under a millisecond; ONNX models (OpenAI Privacy Filter, GLiNER2-PII) catch names, addresses, dates, and custom entities. No data leaves your infrastructure to be detected.
Each detected entity is replaced with a deterministic, semantically valid fake. “Dr. Sarah Chen” becomes “Dr. PERSON_3” — the same token every time, so the LLM can track relationships across the entire conversation.
The real → fake mapping is written to an AES-256-GCM encrypted vault with customer-managed keys. The vault is the source of truth and the only place a real value ever exists outside your application.
The pseudonymized prompt is forwarded to any provider — OpenAI, Anthropic, Google, Bedrock, Azure, or a self-hosted model. The provider sees structurally valid text with zero real sensitive data.
As the response streams back token-by-token, CloakPipe runs a sliding window over the SSE chunks, detects pseudonymized tokens, looks up the real values, and splices them in — without buffering or breaking the streaming contract.
Every detection, mask, and unmask decision is logged as a privacy-safe audit event, then the response — with real values restored — is returned to your application. Compliance evidence is generated on the way out.
A redacted token like [CREDIT_CARD]
breaks the moment it hits a system that expects a card number. CloakPipe generates replacements that
pass the same validation checks as the originals, using format-preserving encryption
(FF1, NIST SP 800-38G). Downstream systems — and the model itself — keep working exactly as before.
A card number is replaced with another number that passes the Luhn checksum. Payment-related queries flow through your AI pipeline without exposing a single real PAN to the model provider.
Bank account identifiers are regenerated to pass the mod-97 check digit. Any downstream validator that rejects malformed IBANs will accept the pseudonym just as it accepts the original.
s.chen@northshore.org becomes EMAIL_3@DOMAIN_1.org —
still a syntactically valid address. Forms, regexes, and parsers that expect an email don't choke on a
dead token.
Phone numbers stay phone-shaped, medical record numbers keep their prefix and length. The vault uses AES-SIV for deterministic encryption and FF1 for format preservation, chosen per entity type.
Format preservation is what makes pseudonymization safe to drop into production. The model — and any database, validator, or webhook that consumes its output — keeps working exactly as before, because every fake is indistinguishable in shape from the real value it stands in for.
Tokenization is deterministic within a tenant's scope: the same value always produces the same token. That single property is what lets the LLM reason about relationships between entities instead of treating every mention as a brand-new unknown.
If “Dr. Sarah Chen” appears five times in a prompt, it maps to “Dr. PERSON_3” all five times. When the model answers, CloakPipe restores “Dr. Sarah Chen” in all five places. The model can correctly say “PERSON_3 prescribed MEDICATION_7 to PERSON_5” — and the relationship survives the round trip.
Consistency holds across conversations, sessions, and batch processing, not just a single message. Tokens can be scoped to a single conversation, expire after a configurable TTL, or persist indefinitely for long-running workflows — whichever your use case demands.
Because mapping is stable, the model keeps the graph intact: who prescribed what to whom, which party signed which clause, which account sent which transfer.
Tokens can live for one conversation, expire on a schedule, or persist for ongoing pipelines. You decide how long a pseudonym means the same thing.
Without a vault, pseudonymization is one-way: data goes in, nothing comes back. With a vault, it is reversible under policy. That is the line between anonymization — which destroys utility — and privacy, which preserves utility while controlling access.
The vault stores every real → fake mapping, encrypted at rest with AES-256-GCM and per-tenant isolation. Unmasking is never automatic. Every unmask request runs through the policy engine first: the caller's identity, role, and request context are checked against rules defined in code and versioned in Git — context-based access control (CBAC).
A sales agent cannot unmask a medical record. A supervising physician can — but only during an active case review, and only for the entity types policy permits. Every decision, allowed or denied, is written to a privacy-safe audit trail that never contains a raw value.
AES-256-GCM at rest with keys held in AWS KMS, GCP KMS, Azure Key Vault, or HashiCorp Vault Transit. CloakPipe never holds the keys to your customer's data, and keys rotate automatically without breaking existing tokens.
Who is asking, what is their role, how sensitive is the data, what is the workflow context — all evaluated at runtime via OPA or Cedar in sub-millisecond time before any real value is revealed.
Each customer gets their own vault namespace and their own encryption keys. Cross-tenant access is impossible at the cryptographic level, not just the application level.
Audit logs record what types of data were processed and what actions were taken — never the actual values. The trail proves compliance for HIPAA, GDPR, SOC 2, and PCI-DSS without becoming a new liability.