<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>Data Protection | Awesome Agents</title><link>https://awesomeagents.ai/tags/data-protection/</link><description>Your guide to AI models, agents, and the future of intelligence. Reviews, leaderboards, news, and tools - all in one place.</description><language>en-us</language><managingEditor>contact@awesomeagents.ai (Awesome Agents)</managingEditor><lastBuildDate>Wed, 22 Apr 2026 23:28:57 +0200</lastBuildDate><atom:link href="https://awesomeagents.ai/tags/data-protection/index.xml" rel="self" type="application/rss+xml"/><image><url>https://awesomeagents.ai/images/logo.png</url><title>Awesome Agents</title><link>https://awesomeagents.ai/</link></image><item><title>OpenAI Open-Sources Privacy Filter: 96% F1 PII Masker</title><link>https://awesomeagents.ai/news/openai-privacy-filter-on-device-pii/</link><pubDate>Wed, 22 Apr 2026 23:28:57 +0200</pubDate><guid>https://awesomeagents.ai/news/openai-privacy-filter-on-device-pii/</guid><description>&lt;p>OpenAI released Privacy Filter today under Apache 2.0, and the thing worth noting isn't the name. It is the shape. A 1.5-billion-parameter total, 50-million-active Mixture of Experts with a 128K context window, shipped as a bidirectional token classifier with WebGPU support via Transformers.js. In practice, that means an enterprise can run the full masking pass on text before it leaves a browser tab and hits any OpenAI endpoint, Azure tenant, or third-party API. That is a deliberate architectural position, and it answers a question the industry has been asking for two years.&lt;/p></description><content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>OpenAI released Privacy Filter today under Apache 2.0, and the thing worth noting isn't the name. It is the shape. 
A 1.5-billion-parameter total, 50-million-active Mixture of Experts with a 128K context window, shipped as a bidirectional token classifier with WebGPU support via Transformers.js. In practice, that means an enterprise can run the full masking pass on text before it leaves a browser tab and hits any OpenAI endpoint, Azure tenant, or third-party API. That is a deliberate architectural position, and it answers a question the industry has been asking for two years.</p>
<h2 id="the-landscape-before-and-after">The Landscape, Before And After</h2>
<table>
  <thead>
      <tr>
          <th>Approach</th>
          <th>Example</th>
          <th>Context-aware</th>
          <th>Runs on-device</th>
          <th>License</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Regex + pattern</td>
          <td>Microsoft Presidio</td>
          <td>No</td>
          <td>Yes</td>
          <td>MIT</td>
      </tr>
      <tr>
          <td>BERT token-classifier</td>
          <td><code>dslim/bert-base-NER</code></td>
          <td>Partial</td>
          <td>Yes</td>
          <td>MIT</td>
      </tr>
      <tr>
          <td>Hosted API call</td>
          <td>AWS Comprehend PII</td>
          <td>Yes</td>
          <td>No</td>
          <td>Paid service</td>
      </tr>
      <tr>
          <td>Hosted large model</td>
          <td>GPT-4 via API</td>
          <td>Yes</td>
          <td>No</td>
          <td>Paid service</td>
      </tr>
      <tr>
          <td><strong>Privacy Filter</strong></td>
          <td><code>openai/privacy-filter</code></td>
          <td>Yes</td>
          <td>Yes (browser)</td>
          <td><strong>Apache 2.0</strong></td>
      </tr>
  </tbody>
</table>
<p>The row that didn't previously exist is the one that matters. Regex tools are fast and local but confuse addresses with product codes. Large hosted models understand context but require sending the raw text off-device, which is the exact thing a privacy filter is meant to prevent. Privacy Filter is the first release this year that sits in the contextual-and-local cell of the matrix with a license you can ship in enterprise software.</p>
<div class="news-tldr">
<p><strong>Privacy Filter at a glance</strong></p>
<ul>
<li>1.5B total / 50M active (128 experts, top-4 routing)</li>
<li>128K context, non-autoregressive single-forward-pass inference</li>
<li>8 PII categories across 33 BIOES output classes</li>
<li>96% F1 on PII-Masking-300k (97.43 on the corrected set)</li>
<li>Apache 2.0, available on <a href="https://huggingface.co/openai/privacy-filter">HuggingFace</a> and <a href="https://github.com/openai/privacy-filter">GitHub</a></li>
<li>WebGPU support via Transformers.js - runs in the browser</li>
</ul>
</div>
<h2 id="how-it-actually-works">How It Actually Works</h2>
<p>The model card calls it a &quot;bidirectional token classifier,&quot; which understates the structural choices.</p>
<h3 id="the-architecture-choice">The Architecture Choice</h3>
<p>Eight transformer encoder blocks, pre-norm, d_model 640. Attention is grouped-query with rotary positional embeddings: 14 query heads, 2 KV heads, giving a group size of seven queries per KV head. The feedforward layers are sparse Mixture of Experts, 128 experts with top-4 routing per token. Every token sees four of the 128 experts, and the total active parameter count lands at 50M. The rest sits dormant in the checkpoint, which is why a 1.5B-weight download runs at 50M-weight latency.</p>
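<p>The total-versus-active arithmetic is easy to sanity-check. A minimal sketch, under the simplifying assumption (ours, not the model card's) that the expert FFN weights dominate the checkpoint:</p>

```javascript
// Back-of-envelope check on the 1.5B-total / 50M-active claim.
// Simplifying assumption (ours): expert FFN weights dominate the checkpoint,
// so shared weights (attention, embeddings) are ignored here.
function approxActiveParams({ totalParams, numExperts, topK }) {
  // With top-K routing, each token touches only topK of numExperts experts,
  // so roughly topK / numExperts of the expert weights fire per forward pass.
  return totalParams * (topK / numExperts);
}

const active = approxActiveParams({ totalParams: 1.5e9, numExperts: 128, topK: 4 });
console.log(active); // 46875000 - within rounding of the reported ~50M active
```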
<p><img src="/images/news/openai-privacy-filter-on-device-pii-chip.jpg" alt="Circuit board close-up">
<em>The 50M-active design is the load-bearing decision. A standard dense 1.5B PII tagger would saturate a laptop GPU under batch load; a 50M-active MoE with top-4 routing runs at interactive latency on a WebGPU target in a Chromium tab. The architecture was built to deploy, not to impress on a leaderboard.</em>
<small>Source: unsplash.com</small></p>
<h3 id="the-output-format">The Output Format</h3>
<p>This isn't a generative model. For each input token the model predicts one of 33 output classes: a background <code>O</code> plus 8 PII categories in BIOES span encoding (Begin / Inside / Outside / End / Single). Decoding uses a constrained Viterbi procedure with linear-chain transition scoring, which means the span boundaries are globally consistent rather than inferred greedily. The precision-recall tradeoff is tunable at inference time through the Viterbi parameters without retraining.</p>
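<p>To make the decoding step concrete, here is a toy constrained Viterbi over a single category's BIOES tags. The scores and the hard transition table are illustrative only; the shipped decoder covers all 33 classes with learned linear-chain transition scores.</p>

```javascript
// Toy constrained Viterbi over BIOES tags for ONE category. Structural
// constraints: a span opens with B (or is a single-token S), continues
// with I, and closes with E.
const TAGS = ["O", "B", "I", "E", "S"];
const ALLOWED = {
  O: ["O", "B", "S"],
  B: ["I", "E"],
  I: ["I", "E"],
  E: ["O", "B", "S"],
  S: ["O", "B", "S"],
};

function viterbiDecode(emissions) {
  // emissions: one { tag: score } object per token; missing tags score -Infinity.
  const n = emissions.length;
  const score = emissions.map(() => ({}));
  const back = emissions.map(() => ({}));
  // A sequence may not begin mid-span: the first tag must be O, B, or S.
  for (const t of TAGS) {
    score[0][t] = ["O", "B", "S"].includes(t) ? (emissions[0][t] ?? -Infinity) : -Infinity;
  }
  for (let i = 1; i < n; i++) {
    for (const t of TAGS) {
      let best = -Infinity, bestPrev = "O";
      for (const p of TAGS) {
        if (!ALLOWED[p].includes(t)) continue; // transition forbidden
        const s = score[i - 1][p] + (emissions[i][t] ?? -Infinity);
        if (s > best) { best = s; bestPrev = p; }
      }
      score[i][t] = best;
      back[i][t] = bestPrev;
    }
  }
  // A sequence may not end mid-span either: the final tag must be O, E, or S.
  let last = "O", bestFinal = -Infinity;
  for (const t of ["O", "E", "S"]) {
    if (score[n - 1][t] > bestFinal) { bestFinal = score[n - 1][t]; last = t; }
  }
  const path = [last];
  for (let i = n - 1; i > 0; i--) path.unshift(back[i][path[0]]);
  return path;
}

// Greedy per-token argmax here would emit O, I, E - an I with no opening B.
// The constrained decode repairs the span boundary globally.
const emissions = [{ O: 0.6, B: 0.4 }, { I: 1.5, O: 0.1 }, { E: 1.2, O: 0.3 }];
console.log(viterbiDecode(emissions)); // [ 'B', 'I', 'E' ]
```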
<p>The 8 categories are specific, and their specificity is a design statement: <code>account_number</code>, <code>private_address</code>, <code>private_email</code>, <code>private_person</code>, <code>private_phone</code>, <code>private_url</code>, <code>private_date</code>, <code>secret</code>. The last bucket, <code>secret</code>, absorbs passwords, API keys, and tokens - the category an enterprise compliance team cares about most in 2026, given the rate at which these leak through agent logs.</p>
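<p>The arithmetic behind the 33 classes is worth spelling out: eight categories, four positional tags each, plus the background <code>O</code>. A sketch (the <code>B-category</code> label naming is our illustration, not the documented label format):</p>

```javascript
// Recreating the 33-class output space from the 8 categories.
// 8 categories x 4 positional tags (B, I, E, S) + background O = 33.
const CATEGORIES = [
  "account_number", "private_address", "private_email", "private_person",
  "private_phone", "private_url", "private_date", "secret",
];
const labels = [
  "O",
  ...CATEGORIES.flatMap((c) => ["B", "I", "E", "S"].map((t) => `${t}-${c}`)),
];
console.log(labels.length); // 33
```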
<h3 id="the-browser-target">The Browser Target</h3>
<p>WebGPU via the Transformers.js runtime is the shipping target, not an afterthought. A three-line <code>pipeline(&quot;token-classification&quot;, &quot;openai/privacy-filter&quot;)</code> call runs the full masking inference locally with no outbound network request to OpenAI infrastructure. For an enterprise that wants to paste a customer service transcript into a summarisation tool, the filter redacts names, phone numbers, and API keys before the transcript ever leaves the device.</p>
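<p>A sketch of what that wiring might look like. The commented pipeline call mirrors the three-line usage described above; <code>maskSpans</code> and the <code>{ start, end, label }</code> span shape are our assumptions about the glue code, not the model's documented output format.</p>

```javascript
// Hypothetical glue code around the Transformers.js pipeline:
//
//   import { pipeline } from "@huggingface/transformers";
//   const tagger = await pipeline("token-classification", "openai/privacy-filter");
//   const spans = await tagger(transcript); // local WebGPU inference, no egress
//
// maskSpans and the span shape below are our own sketch of the redaction step.
function maskSpans(text, spans) {
  // Replace each detected span with a [LABEL] placeholder, working right to
  // left so earlier character offsets stay valid as the string changes length.
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let out = text;
  for (const { start, end, label } of ordered) {
    out = out.slice(0, start) + `[${label.toUpperCase()}]` + out.slice(end);
  }
  return out;
}

const transcript = "Call Maria at +1-555-0100 about ticket 4471.";
const spans = [
  { start: 5, end: 10, label: "private_person" },
  { start: 14, end: 25, label: "private_phone" },
];
console.log(maskSpans(transcript, spans));
// Call [PRIVATE_PERSON] at [PRIVATE_PHONE] about ticket 4471.
```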
<h2 id="what-it-does-not-tell-you">What It Does Not Tell You</h2>
<p>Three things the announcement glosses over.</p>
<h3 id="the-eight-categories-are-the-perimeter">The Eight Categories Are The Perimeter</h3>
<p>This isn't a universal PII detector. It's a detector for eight specific categories of data, trained on English-language corpora with &quot;selected multilingual robustness evaluation.&quot; It doesn't tag medical PHI as a separate class, it doesn't tag financial account information beyond generic <code>account_number</code>, and it doesn't tag biometric identifiers. An enterprise targeting HIPAA or PCI-DSS compliance isn't done when this model passes.</p>
<h3 id="the-4-that-slips">The 4% That Slips</h3>
<p>OpenAI itself reports a 96% F1 on the benchmark, which means, as a rough reading, four of every hundred labelled spans in the test set are missed or mis-categorised. In adversarial or out-of-distribution text - dialects the training set underrepresents, proper names written in non-Latin scripts, phone numbers in formats the tokenizer did not see often - the miss rate is higher. OpenAI's own model card states the filter &quot;may miss unusual identifiers&quot; and &quot;can over-redact short sentences.&quot; At that rate, a support queue carrying a billion labelled spans a day leaks 40 million of them. That is a floor, not a worst case.</p>
<h3 id="the-compliance-question">The Compliance Question</h3>
<p><img src="/images/news/openai-privacy-filter-on-device-pii-keyboard.jpg" alt="Keyboard close-up in dark light">
<em>OpenAI's own framing in the model card is worth quoting verbatim: the filter &quot;is not an anonymization tool, a compliance certification, or a substitute for policy review.&quot; The company is explicitly declining to position this as a regulatory checkbox. Enterprises deploying it still need legal review of the residual risk.</em>
<small>Source: unsplash.com</small></p>
<p>The model is positioned as a pre-processing layer, not a privacy guarantee. GDPR, HIPAA, and state-level privacy laws impose obligations that are process-level, not model-level: data retention schedules, subject access requests, consent flows, lawful basis for processing. Privacy Filter makes the pre-processing step better. It does nothing for the rest of that list.</p>
<h2 id="the-release-itself">The Release Itself</h2>
<p>The strategic note worth marking is that OpenAI released this under Apache 2.0 rather than under the open-weight-but-restricted license pattern the company has tended to use for its other open releases. No commercial use restriction, no redistribution clause, no delayed release for non-US customers. The closest precedent inside OpenAI is Whisper, released under MIT in 2022 and still the reference speech-to-text baseline for the industry. Privacy Filter is positioned to do the same work in text sanitization: ubiquitous, forkable, hard to displace.</p>
<p>What makes this release different from OpenAI's <a href="/news/openai-codex-enterprise-consulting-partners/">enterprise-Codex push</a> last week is that Privacy Filter does not require an OpenAI endpoint to deliver value. Run the classifier locally, keep the text on-premise, send nothing back to San Francisco. That is a model OpenAI is giving away to customers who have explicitly asked for exactly that. It is, in the most literal sense, a trust-building release.</p>
<hr>
<p>The model is small, permissively licensed, architecturally appropriate for the job, and honestly documented about what it can't do. The open question is whether enterprises actually wire it into their pipelines or whether it becomes a reference implementation that sits in a GitHub repository while the same customers keep pasting raw transcripts into chat. OpenAI has done its part. The next move is the integrators'.</p>
<p><strong>Sources:</strong></p>
<ul>
<li><a href="https://huggingface.co/openai/privacy-filter">OpenAI Privacy Filter model card on HuggingFace</a></li>
<li><a href="https://github.com/openai/privacy-filter">openai/privacy-filter repository on GitHub</a></li>
<li><a href="https://cdn.openai.com/pdf/c66281ed-b638-456a-8ce1-97e9f5264a90/OpenAI-Privacy-Filter-Model-Card.pdf">OpenAI Privacy Filter Model Card PDF</a> - OpenAI, 22 April 2026</li>
<li><a href="https://news.bloomberglaw.com/privacy-and-data-security/openai-releases-privacy-filter-model-to-redact-sensitive-data">OpenAI Releases Privacy Filter Model to Redact Sensitive Data</a> - Bloomberg Law, 22 April 2026</li>
<li><a href="https://www.tradingview.com/news/reuters.com,2026:newsml_FWN415196:0-openai-releases-openai-privacy-filter/">OpenAI Releases OpenAI Privacy Filter (Reuters wire)</a> - Reuters via TradingView, 22 April 2026</li>
<li><a href="https://decrypt.co/365139/openai-privacy-filter-open-source-pii-masking-model">OpenAI Just Open-Sourced a Tool That Scrubs Your Secrets Before ChatGPT Ever Sees Them</a> - Decrypt, 22 April 2026</li>
</ul>
]]></content:encoded><dc:creator>Elena Marchetti</dc:creator><category>News</category><media:content url="https://awesomeagents.ai/images/news/openai-privacy-filter-on-device-pii_hu_3190a2e2cf9807b4.jpg" medium="image" width="1200" height="675"/><media:thumbnail url="https://awesomeagents.ai/images/news/openai-privacy-filter-on-device-pii_hu_3190a2e2cf9807b4.jpg" width="1200" height="675"/></item></channel></rss>