🛡️ Data Anonymizer – PII Detection & Redaction in the Browser
The Data Anonymizer is a client-side privacy utility that automatically detects and removes Personally Identifiable Information (PII) from text, JSON payloads, log files, API responses, CSV exports, and any other structured or unstructured content — entirely in your browser, with zero server communication.
Why Anonymize Data?
Modern software workflows constantly expose sensitive data in unexpected places: log files contain user emails and IP addresses, database exports include SSNs and credit card numbers, and API responses leak internal identifiers. Sharing these raw artifacts in GitHub issues, Slack threads, or with third-party support teams creates serious GDPR, HIPAA, and PCI-DSS liability. Anonymizing data before sharing removes the sensitive values from those artifacts without slowing down your workflow.
Four Anonymization Strategies
✂️ Redact
GDPR-safe
Replaces each detected PII value with a typed placeholder such as [EMAIL], [PHONE], or [SSN]. The output is completely de-identified and safe to share publicly. Ideal for support tickets and public bug reports.
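As a rough illustration of typed-placeholder redaction, the sketch below applies two simplified patterns in turn. The pattern table and the function name `redact` are assumptions made for this example; they are not the tool's actual regex library.

```javascript
// Hypothetical, simplified patterns for illustration only.
const REDACT_PATTERNS = [
  { type: "EMAIL", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { type: "SSN", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Replace every match with a typed placeholder such as [EMAIL] or [SSN].
function redact(text) {
  return REDACT_PATTERNS.reduce(
    (out, { type, regex }) => out.replace(regex, `[${type}]`),
    text
  );
}

console.log(redact("Contact jane@example.com, SSN 123-45-6789."));
// Contact [EMAIL], SSN [SSN].
```

Because the placeholder carries only the type, no part of the original value survives in the output.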
🎭 Pseudonymize
Pseudonymous only
Replaces each PII value with realistic-looking fake data of the same structural type: a real-looking email for an email, a valid-format phone number for a phone number. Using a fixed seed makes the output fully reproducible: the same input always produces the same fake values, preserving referential integrity across your dataset.
🔒 Hash (SHA-256)
GDPR-safe
Converts each PII value to a truncated SHA-256 fingerprint using the browser's native Web Crypto API. The original value cannot be recovered from the hash (one-way transformation), making this strategy suitable for compliance logging, audit trails, and cross-system correlation without exposing raw PII.
🏷️ Tokenize
GDPR-safe
Replaces each unique PII value with an opaque sequential token (TOKEN_001, TOKEN_002, …) and generates a Token Mapping Table shown below the output. The table can be exported as CSV for in-session re-identification and debugging without exposing raw values in the shared artifact.
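The token-plus-mapping idea can be sketched as follows. The names `tokenize` and `mappingToCsv`, the simplified email pattern, and the CSV column layout are all assumptions made for this example.

```javascript
// Simplified email pattern, used only for this illustration.
const TOKEN_EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace each unique match with TOKEN_001, TOKEN_002, ... and record the mapping.
function tokenize(text, regex) {
  const mapping = new Map(); // original value -> token
  const output = text.replace(regex, (match) => {
    if (!mapping.has(match)) {
      const n = String(mapping.size + 1).padStart(3, "0");
      mapping.set(match, `TOKEN_${n}`);
    }
    return mapping.get(match);
  });
  return { output, mapping };
}

// Serialize the mapping table as CSV, quoting originals in case they contain commas.
function mappingToCsv(mapping) {
  const quote = (s) => `"${s.replace(/"/g, '""')}"`;
  const rows = [...mapping].map(([original, token]) => `${token},${quote(original)}`);
  return ["token,original", ...rows].join("\n");
}

const tokenized = tokenize("a@x.io wrote to b@y.io, then a@x.io replied", TOKEN_EMAIL);
console.log(tokenized.output); // TOKEN_001 wrote to TOKEN_002, then TOKEN_001 replied
console.log(mappingToCsv(tokenized.mapping));
```

The shared artifact contains only tokens; the mapping table is the sole place where original values remain, so it should stay private.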
Built-in PII Detection Patterns
The tool ships with a comprehensive regex library that covers the most common PII types without any configuration:
- Email addresses — standard RFC 5321 format
- US phone numbers — with or without country code, various separators
- Social Security Numbers (SSN) — NNN-NN-NNNN format
- Credit card numbers — 13–16 digit sequences with common separators
- IPv4 and IPv6 addresses — full octet and abbreviated notation
- UUIDs — version-agnostic 8-4-4-4-12 hex format
- Dates — ISO 8601, US short (MM/DD/YYYY), EU short (DD.MM.YYYY)
- IBANs — international bank account numbers
- MAC addresses — colon- and hyphen-separated
- URLs — http:// and https:// prefixed links
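To give a feel for what such patterns look like, here is a sketch covering three of the listed types. These regular expressions are simplified assumptions written for this example, and the names `DETECTORS` and `detect` are hypothetical; the tool's real patterns may differ.

```javascript
// Illustrative patterns for three of the listed PII types (assumed, simplified).
const DETECTORS = {
  UUID: /\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/gi,
  MAC: /\b(?:[0-9a-f]{2}[:-]){5}[0-9a-f]{2}\b/gi,
  IPV4: /\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b/g,
};

// Return every match with its type and position, ordered by position in the text.
function detect(text) {
  const hits = [];
  for (const [type, regex] of Object.entries(DETECTORS)) {
    for (const m of text.matchAll(regex)) {
      hits.push({ type, value: m[0], index: m.index });
    }
  }
  return hits.sort((a, b) => a.index - b.index);
}

const found = detect(
  "host 10.0.0.7 mac 00:1A:2b:3c:4D:5e id 123e4567-e89b-12d3-a456-426614174000"
);
console.log(found.map((h) => `${h.type}=${h.value}`).join("\n"));
```

Keeping the type next to each match is what lets the strategies above choose a placeholder, a fake value, or a token per PII type.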
JSON & CSV Awareness
When you paste a valid JSON object, the tool automatically parses the structure and walks every string leaf value, applying PII replacement inside string values only — leaving keys, numbers, booleans, and nested structure intact. The output is re-serialized as pretty-printed JSON, ready to paste into documentation or a support ticket. For CSV data, every cell is scanned independently, preserving column headers and delimiters.
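The JSON walk described above can be sketched as a small recursive function. The name `anonymizeJson` and the `replaceFn` callback are assumptions for this example; any of the four strategies could be passed in as the callback.

```javascript
// Parse JSON, apply `replaceFn` to string leaf values only, and pretty-print the result.
function anonymizeJson(input, replaceFn) {
  const walk = (node) => {
    if (typeof node === "string") return replaceFn(node);
    if (Array.isArray(node)) return node.map(walk);
    if (node !== null && typeof node === "object") {
      // Keys are copied unchanged; only the values are visited.
      return Object.fromEntries(
        Object.entries(node).map(([key, value]) => [key, walk(value)])
      );
    }
    return node; // numbers, booleans, and null pass through untouched
  };
  return JSON.stringify(walk(JSON.parse(input)), null, 2);
}

// Usage with a simplified email replacer (an assumption for this example).
const maskEmails = (s) => s.replace(/[^\s@]+@[^\s@]+\.[^\s@]+/g, "[EMAIL]");
console.log(
  anonymizeJson('{"user":{"email":"jane@example.com","age":41},"ok":true}', maskEmails)
);
```

Working on the parsed structure rather than the raw text means a replacement can never break the JSON syntax or alter a key.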
Consistent Replacement & Seeded Pseudonymization
Enabling Consistent Replacement ensures that the same original value always maps to the same replacement throughout the entire document. For example, if one email address appears in ten different fields, all ten are replaced with the same fake email, preserving join relationships in relational datasets. The pseudonymization seed controls the fake-value generator: using the same seed across runs produces identical outputs, enabling reproducible demo datasets and regression tests.
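One way to combine a seed with consistent replacement is a seeded random generator plus a lookup table, as in the sketch below. The mulberry32 generator is a stand-in chosen for this example, and `createPseudonymizer`, the name list, and the email pattern are hypothetical; the tool's actual generator is not described here.

```javascript
// mulberry32: a small seeded PRNG, used here as a stand-in generator.
function mulberry32(seed) {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Simplified email pattern and name list, assumed for this illustration.
const PSEUDO_EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const FAKE_NAMES = ["alex", "sam", "kim", "lee", "pat", "max"];

function createPseudonymizer(seed) {
  const random = mulberry32(seed);
  const cache = new Map(); // original value -> fake value (consistent replacement)
  const fakeEmail = () => {
    const name = FAKE_NAMES[Math.floor(random() * FAKE_NAMES.length)];
    const suffix = Math.floor(random() * 10000);
    return `${name}${suffix}@example.com`;
  };
  return (text) =>
    text.replace(PSEUDO_EMAIL, (match) => {
      if (!cache.has(match)) cache.set(match, fakeEmail());
      return cache.get(match);
    });
}

const pseudonymize = createPseudonymizer(42);
console.log(pseudonymize("a@x.io, b@y.io, a@x.io"));
```

The cache gives consistency within a document, while the seed makes the sequence of fake values repeat across runs on the same input.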
Custom Regex Patterns
Beyond the built-in library, you can supply any JavaScript-compatible regular expression to target domain-specific identifiers such as internal account numbers (\bACCT-\d{8}\b), patient IDs (\bPAT-\d{6}\b), or proprietary reference formats. The custom pattern is validated in real time: a syntax error is reported immediately, before any processing starts.
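Validating a user-supplied pattern usually comes down to attempting to construct it and catching the resulting error. The sketch below assumes a helper named `compileCustomPattern` and a `[CUSTOM]` placeholder, both hypothetical.

```javascript
// Try to compile a user-supplied pattern; report the syntax error instead of throwing.
function compileCustomPattern(source) {
  try {
    return { ok: true, regex: new RegExp(source, "g") };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// A valid pattern: the account-number example from the text.
const acct = compileCustomPattern("\\bACCT-\\d{8}\\b");
console.log("Refund for ACCT-12345678".replace(acct.regex, "[CUSTOM]"));
// Refund for [CUSTOM]

// An invalid pattern (unclosed group) is caught before any text is processed.
console.log(compileCustomPattern("ACCT-(\\d{8}").ok); // false
```

Note the doubled backslashes: they are needed only because the pattern is written inside a JavaScript string literal. Text typed into an input field needs no extra escaping.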
Diff View & Detection Summary
Enable the Diff View toggle to see a line-by-line comparison between your original input and the anonymized output, with removed content highlighted in red and replacements in green. The Detection Summary panel shows a horizontal bar chart breaking down the count of each PII type found, giving you a quick privacy audit at a glance.
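The counting behind a summary like this can be quite small. The sketch below tallies detections by type and renders a plain-text bar per type; the hit objects and the function names `summarize` and `renderBars` are assumptions, and the real panel draws a graphical chart rather than text.

```javascript
// Count detections per PII type. Each hit is assumed to look like { type: "EMAIL", ... }.
function summarize(hits) {
  const counts = {};
  for (const { type } of hits) counts[type] = (counts[type] ?? 0) + 1;
  return counts;
}

// Render one text bar per type, most frequent first.
function renderBars(counts) {
  return Object.entries(counts)
    .sort(([, a], [, b]) => b - a)
    .map(([type, n]) => `${type.padEnd(6)} ${"#".repeat(n)} ${n}`)
    .join("\n");
}

const counts = summarize([{ type: "EMAIL" }, { type: "IPV4" }, { type: "EMAIL" }]);
console.log(renderBars(counts));
```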