
TL;DR
OpenAI shipped an open-weight PII redactor. Here is how to wire it into a real ingestion pipeline locally, fast, with zero leaks, and how it benchmarks against Presidio and a regex baseline.
OpenAI dropped Privacy Filter as an open-weight PII redactor a few weeks back. I wired it into a real RAG ingestion pipeline the same evening and benchmarked it against Microsoft Presidio plus a regex baseline I have been running in production for two years. The short version is that Privacy Filter caught roughly 12 percent more PII than Presidio with comparable latency once I tuned the runtime, and it caught nearly 40 percent more than the regex baseline. The longer version, including where it failed, is below.
The privacy story for LLM pipelines has been broken for a long time. The two production options have been hosted PII APIs, which mean shipping your raw documents to a third party, or rules-based tools like Presidio, which work but miss anything contextual. The hosted APIs add egress and break the audit story. The rules-based tools miss entity types that humans recognize easily, like a street address split across three lines or a name embedded in a meeting transcript.
An open-weight model that runs locally splits the difference. You get model-class recall without the hosted-API exposure. You can run it in the same VPC as your vector store, log every redaction decision for audit, and deterministically version the model the same way you version your other dependencies. For regulated industries that means GDPR-compliant ingestion stops being a flag-waving exercise and becomes a tractable engineering problem.
The catch is throughput. A model that runs locally only matters if it runs fast enough to fit in your ingestion budget. That is what I set out to measure.
Privacy Filter ships on Hugging Face. The base build is small enough to run on a single consumer GPU, which is the relevant constraint for most teams. I ran it on an L40S in our staging environment for the benchmarks, then moved the production deployment to a CPU-only instance to test the worst case.
Loading the model is straightforward.
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("openai/privacy-filter")
model = AutoModelForTokenClassification.from_pretrained(
    "openai/privacy-filter",
    torch_dtype=torch.float16,
).to("cuda")
model.eval()
For production, do not call the model directly. Wrap it in a redactor class that batches inputs, applies a confidence threshold, and emits a structured redaction record for audit. Every redaction event needs to be logged with the original span, the predicted entity type, the confidence, and the replacement token. That log is the audit trail your compliance team will ask for the first time someone files a data-subject request.
from dataclasses import dataclass
from typing import List

@dataclass
class RedactionEvent:
    original: str
    entity_type: str
    confidence: float
    replacement: str
    offset: int

class PrivacyFilter:
    def __init__(self, model, tokenizer, threshold: float = 0.85):
        self.model = model
        self.tokenizer = tokenizer
        self.threshold = threshold

    def redact(self, text: str) -> tuple[str, List[RedactionEvent]]:
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True).to("cuda")
        with torch.no_grad():
            logits = self.model(**inputs).logits
        # Decode spans, apply the confidence threshold, build
        # RedactionEvents, and return the redacted text alongside them.
        return self._apply(text, logits, inputs)
The full implementation is a few hundred lines once you handle batching, sliding windows for long documents, and the entity-type taxonomy. I push the redaction events into DD Traces so we can see redaction stages alongside the rest of our agent telemetry.
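The sliding-window part is worth sketching, because it is where most naive implementations silently drop PII on window boundaries. This is a minimal sketch under my own assumptions: the window and stride sizes, the `split_windows` and `redact_long` helpers, and the de-duplication rule are all illustrative, not part of the Privacy Filter release.

```python
# Sketch: sliding-window redaction for documents longer than the model's
# context. Overlapping windows ensure a span that straddles one boundary
# is seen whole by at least one window.

def split_windows(text: str, window: int = 2000, stride: int = 1600):
    """Yield (offset, chunk) pairs with overlap between windows."""
    start = 0
    while start < len(text):
        yield start, text[start:start + window]
        if start + window >= len(text):
            break
        start += stride

def redact_long(redactor, text: str):
    """Run the redactor per window and rebase events to document offsets."""
    events = []
    for offset, chunk in split_windows(text):
        _, chunk_events = redactor.redact(chunk)
        for ev in chunk_events:
            ev.offset += offset  # rebase to document coordinates
            events.append(ev)
    # The overlap region produces duplicate events; keep one per span.
    unique = {(ev.offset, ev.original): ev for ev in events}
    return sorted(unique.values(), key=lambda ev: ev.offset)
```

With a 2000-character window and a 1600-character stride, any span shorter than 400 characters is guaranteed to land fully inside at least one window; tune the overlap to the longest entity you expect.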
The pattern that makes this work in a real pipeline is pre-embed redaction. Redact before chunking, before embedding, before anything that would fan the raw text out to other systems. If a piece of PII makes it into your vector store, you will spend the next month trying to delete it cleanly. If it never makes it past ingestion, you have one place to audit and one place to fix.
Here is the ingestion shape I use.
async def ingest_document(doc_id: str, raw_text: str) -> None:
    redacted, events = privacy_filter.redact(raw_text)
    await audit_log.write(doc_id=doc_id, events=events)

    chunks = chunker.split(redacted)
    embeddings = await embedder.embed_batch([c.text for c in chunks])
    await vector_store.upsert([
        {
            "id": f"{doc_id}::{i}",
            "vector": emb,
            "metadata": {"doc_id": doc_id, "redaction_count": len(events)},
            "text": chunk.text,
        }
        for i, (chunk, emb) in enumerate(zip(chunks, embeddings))
    ])
Two details matter here. First, the audit log writes before the embeddings, so if the embedding step fails you still have a record of what was redacted. Second, the redaction count rides on the chunk metadata, which makes downstream debugging dramatically easier. When a retrieval surfaces a chunk and a user complains it looks weird, you can tell at a glance whether the weirdness is from redaction or from something upstream.
For document storage, I keep the raw and redacted versions in agentfs with the audit-trailed access controls turned on. The raw version stays in a quarantine bucket that only the redactor can read. The redacted version is what flows into the rest of the pipeline. If a regulator asks what was deleted and when, the answer is in one place.
I ran all three on a 5,000-document synthetic corpus that I built from a mix of public datasets plus generated examples for the entity types I care about most. Names, addresses, phone numbers, emails, government IDs, financial accounts, and dates of birth.
Recall on names: regex 31 percent, Presidio 76 percent, Privacy Filter 88 percent. The Privacy Filter advantage concentrates on names that appear without title or honorific, which is the case where pattern-matching tools have to fall back to dictionaries. The model gets context.
Recall on addresses: regex 42 percent, Presidio 71 percent, Privacy Filter 84 percent. The biggest gap is on multi-line addresses where the line breaks confuse rules-based tools. The model handles those fine.
Recall on government IDs: regex 91 percent, Presidio 93 percent, Privacy Filter 89 percent. This is the one place the regex baseline still wins. Government IDs have well-defined formats, and pattern matching is just better at high-precision extraction of fixed formats. I now run the Privacy Filter and a regex pass in series and union the results for ID-type entities.
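The series-then-union pass can be sketched in a few lines. The ID patterns and span format below are illustrative assumptions, not the patterns I run in production; real deployments need the formats for the jurisdictions they actually handle.

```python
import re

# Illustrative ID patterns only; expand per jurisdiction.
ID_PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def regex_id_spans(text: str):
    """High-precision pass for fixed-format ID entities."""
    spans = []
    for entity_type, pattern in ID_PATTERNS.items():
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), entity_type))
    return spans

def union_spans(model_spans, regex_spans):
    """Union the model and regex passes, dropping exact duplicates.

    Spans are (start, end, entity_type) tuples.
    """
    return sorted(set(model_spans) | set(regex_spans))
```

The union direction matters: for ID-type entities you want the redaction whenever either pass fires, because a missed ID is far more expensive than an over-redacted one.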
Latency on the L40S, batched at 32 documents: regex 8ms per doc, Presidio 22ms, Privacy Filter 41ms. On CPU only, batched at 8: regex 11ms, Presidio 38ms, Privacy Filter 280ms. CPU-only is workable for low-volume ingestion but not for anything real-time.
Precision is high across the board. False-positive redactions ran at roughly 2 percent for Privacy Filter, 4 percent for Presidio, and 0.5 percent for regex. The high false-positive rate on Presidio is mostly common nouns being flagged as proper names, which is the long-standing weakness of dictionary-driven systems.
Three failure modes worth flagging.
First, context-aware misses. Privacy Filter occasionally misses PII that is technically present but heavily abbreviated or obfuscated. A name like "J. M." with no surrounding context gets through about 30 percent of the time. The fix is a cheap regex pass for initials patterns layered on top of the model output.
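The initials pass is a one-liner plus a stoplist. This pattern and stoplist are a starting point I am assuming, not a complete rule; expect false positives on abbreviations you have not stoplisted yet.

```python
import re

# Catch bare initials like "J. M." or "J.M." that the model misses.
INITIALS = re.compile(r"\b[A-Z]\.(?:\s?[A-Z]\.)+")

# The pattern also matches common abbreviations; filter the known ones.
STOPLIST = {"U.S.", "U.K.", "E.U."}

def initials_spans(text: str):
    """Return (start, end) spans for likely-initials matches."""
    return [
        (m.start(), m.end())
        for m in INITIALS.finditer(text)
        if m.group() not in STOPLIST
    ]
```

In practice I only apply this pass near spans where the model already found a person entity, which keeps the abbreviation false positives from dominating.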
Second, multilingual edges. The model was trained primarily on English data and the recall drops noticeably on Spanish and Mandarin documents in my corpus. If you have multilingual content, run separate evals per language before relying on the redactor for compliance. I caught this only because we have a chunk of Spanish-language support tickets in our corpus, and an early version of the pipeline let several names through that human reviewers flagged.
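Until the per-language evals pass, one workable stopgap is to route by detected language and apply a stricter policy outside English. A minimal sketch, assuming a `detect_language` hook (langdetect, a fastText language ID model, or similar) and a policy table that is entirely my own invention:

```python
# Sketch: per-language redaction policy. The threshold values and the
# human_review flag are assumptions to illustrate the routing shape.
REDACTION_POLICY = {
    "en": {"threshold": 0.85, "human_review": False},
    # Recall is weaker outside English, so redact more aggressively
    # and sample for human review until per-language evals pass.
    "default": {"threshold": 0.70, "human_review": True},
}

def policy_for(language: str) -> dict:
    """Look up the redaction policy for a detected language code."""
    return REDACTION_POLICY.get(language, REDACTION_POLICY["default"])
```

Lowering the threshold trades precision for recall, which is usually the right trade when the model's recall for that language is unverified.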
Third, structured PII. The model handles natural language well and structured data badly. CSV files, JSON dumps, log lines with semi-structured fields. For those, I parse the structure first, redact each field that looks free-form, and pass the structured fields through a regex layer. Treating a CSV row as a single string and shoving it through the model gives unreliable results.
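The parse-first approach for CSVs looks roughly like this. The column allowlist and the redactor interface are assumptions for the sketch; the point is that only free-form columns ever reach the model.

```python
import csv
import io

# Assumption: these columns hold free-form prose; everything else is
# structured and goes to the regex layer instead.
FREE_TEXT_COLUMNS = {"notes", "description", "comment"}

def redact_csv(redactor, raw: str) -> str:
    """Parse the CSV, run only free-form columns through the model,
    and write the redacted rows back out."""
    reader = csv.DictReader(io.StringIO(raw))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for col in row:
            if col in FREE_TEXT_COLUMNS:
                row[col], _ = redactor.redact(row[col])
            # Structured columns pass through to the regex layer (not shown).
        writer.writerow(row)
    return out.getvalue()
```

The same shape applies to JSON: walk the tree, redact string leaves that look free-form, and regex the rest.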
Before you flip the switch, make sure you have all of these in place.
Logging. Every redaction event with original span, entity type, confidence, replacement, and document ID. This is non-negotiable for audit.
Versioning. The model checksum lives in your deploy artifact. When the model updates, the checksum changes, and your re-ingest pipeline knows to redo old documents.
Confidence threshold. Tunable per entity type, not global. Government IDs at 0.95, names at 0.80, addresses at 0.75 in my deployment. Tune against your own corpus.
Regression eval. A golden set of 200 real-or-realistic documents with hand-labeled redactions. CI runs the redactor against this set on every model bump and fails the build if recall drops more than 1 percent on any entity type.
Downstream verification. Periodically sample chunks out of the vector store and human-review them for missed PII. The model will miss things. The question is whether you find out from a human reviewer or from a regulator.
Quarantine. Raw documents go to a separate, access-restricted bucket. Only the redactor service has read access. The rest of the pipeline reads only redacted output.
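The regression-eval gate is small enough to sketch in full. This is a minimal version under my own assumptions: spans are (start, end, entity_type) tuples, and the baseline recalls come from the last accepted run.

```python
# Sketch of the CI recall gate: compute per-entity recall on the golden
# set and fail the build on a drop of more than one point.

def recall_by_type(golden: list, predicted: list) -> dict:
    """Recall per entity type; golden/predicted are
    (start, end, entity_type) tuples."""
    pred = set(predicted)
    out = {}
    for etype in {g[2] for g in golden}:
        relevant = [g for g in golden if g[2] == etype]
        hit = sum(1 for g in relevant if g in pred)
        out[etype] = hit / len(relevant)
    return out

def gate(current: dict, baseline: dict, max_drop: float = 0.01) -> list:
    """Return the entity types whose recall regressed past the allowed drop."""
    return [
        etype for etype, base in baseline.items()
        if current.get(etype, 0.0) < base - max_drop
    ]
```

CI fails if `gate` returns a non-empty list, and the list tells you exactly which entity type the new model checkpoint regressed on.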
I shipped the full pipeline walkthrough on the DevDigest YouTube channel the week after Privacy Filter dropped. The benchmark notebook is in the same repo as my eval harness. If you are running RAG against any document corpus that touches user data, this is the cheapest compliance upgrade I have shipped in the last year.