Gate News, April 22: OpenAI has released Privacy Filter, an open-source language model designed to detect and redact personally identifiable information (PII) in text. The model runs locally and processes long documents in a single forward pass, supporting up to 128,000 tokens of context. It has 1.5 billion total parameters, of which 50 million are active. Privacy Filter identifies private names, addresses, email addresses, phone numbers, URLs, dates, account numbers, passwords, API keys, and other sensitive information.
The model is available under the Apache 2.0 license on Hugging Face and GitHub.
OpenAI stated that Privacy Filter is intended for use in privacy-preserving workflows such as training data preparation, indexing, logging, and content moderation.
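To make the logging use case concrete, here is a minimal sketch of the redaction step that follows detection. It is not Privacy Filter's API, which this report does not describe. The `detect_spans` function is a hypothetical stand-in that finds only email addresses with a regular expression, and the `(start, end, label)` span format and the `[LABEL]` placeholder style are assumptions made for illustration. In a real workflow, a PII detection model would supply the spans.

```python
import re
from typing import List, Tuple

# Assumed span format for this sketch: (start offset, end offset, label).
Span = Tuple[int, int, str]

# Hypothetical stand-in detector: a simple email regex, NOT Privacy Filter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def detect_spans(text: str) -> List[Span]:
    """Return character spans for email addresses found in the text."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL_RE.finditer(text)]


def redact(text: str, spans: List[Span]) -> str:
    """Replace each detected span with a bracketed label placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip spans that overlap one already redacted.
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    line = "Contact alice@example.com or bob@example.org today."
    print(redact(line, detect_spans(line)))
```

Keeping detection and redaction separate means the regex stand-in could be swapped for a model-based detector without changing how log lines or training records are rewritten.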