🔤 Unicode Escape Converter – Encode & Decode Any Unicode Character
The Unicode Escape Converter is a free, browser-based tool that converts any Unicode text into escape sequences used by JavaScript, Python, HTML, CSS, and URL encoding — and decodes them back to readable text. Whether you are embedding emoji in JSON, escaping special characters in source code, or inspecting raw codepoints, this tool handles all major escape formats with a single click.
What Are Unicode Escape Sequences?
Unicode defines over 149,000 characters across scripts, symbols, emoji, and technical characters. Because many text formats and programming languages only accept ASCII input directly, Unicode characters are written as escape sequences: compact ASCII strings that represent the character's numeric codepoint. For example, the letter é (U+00E9) can be written as \u00E9 in JavaScript, &#xE9; in HTML, or \00E9 in CSS.
Supported Escape Formats
| Format | Syntax | Example (🌍 = U+1F30D) | Use Case |
|---|---|---|---|
| JavaScript ES6 | \u{XXXXX} | \u{1F30D} | Modern JS/TS, template literals |
| JavaScript ES5 | \uXXXX | \uD83C\uDF0D | Legacy JS, JSON strings |
| Python / C | \uXXXX / \UXXXXXXXX | \U0001F30D | Python 3, C/C++ wide strings |
| HTML Decimal | &#DDDDD; | &#127757; | HTML/XML content |
| HTML Hex | &#xXXXXX; | &#x1F30D; | HTML/XML, XHTML |
| CSS | \XXXXXX | \1F30D | CSS content property, font-face |
| URL Legacy | %uXXXX | %uD83C%uDF0D | Legacy IE, non-standard URLs |
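
The rows of this table can be reproduced with a few lines of JavaScript. The following is a minimal sketch, not the tool's actual source: the function name `escapeCodePoint` and the format keys (`"es6"`, `"es5"`, `"python"`, and so on) are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: the format keys ("es6", "es5", ...) are invented
// for this sketch and are not the tool's actual identifiers.
function escapeCodePoint(cp, format) {
  const hex = cp.toString(16).toUpperCase();

  // One 4-digit escape per UTF-16 code unit
  // (two escapes for code points above U+FFFF).
  const perCodeUnit = (prefix) => {
    const s = String.fromCodePoint(cp);
    let out = "";
    for (let i = 0; i < s.length; i++) {
      out += prefix + s.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0");
    }
    return out;
  };

  switch (format) {
    case "es6":
      return `\\u{${hex}}`;
    case "es5":
      return perCodeUnit("\\u");
    case "python":
      // \uXXXX covers the BMP; \UXXXXXXXX is needed above U+FFFF.
      return cp <= 0xffff
        ? `\\u${hex.padStart(4, "0")}`
        : `\\U${hex.padStart(8, "0")}`;
    case "html-dec":
      return `&#${cp};`;
    case "html-hex":
      return `&#x${hex};`;
    case "css":
      // CSS may need a trailing space if the next character is a hex digit.
      return `\\${hex}`;
    case "url-legacy":
      return perCodeUnit("%u");
    default:
      throw new Error(`Unknown format: ${format}`);
  }
}
```

For example, `escapeCodePoint(0x1F30D, "es5")` yields `\uD83C\uDF0D`, while `escapeCodePoint(0x1F30D, "html-dec")` yields `&#127757;`.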
Encoding Options Explained
Scope: Non-ASCII Only vs. All Characters
Non-ASCII only (the default) leaves printable ASCII characters (U+0020–U+007E) as-is and escapes only characters above U+007F. This produces the most readable output and is the usual convention for JavaScript, JSON, and HTML. All characters escapes every character, including ASCII letters and digits, which is useful for generating fully portable, escape-only strings in which no literal text appears.
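
The difference between the two scopes can be illustrated with a short sketch. It assumes ES6 `\u{...}` output for simplicity, and the option name `escapeAll` is hypothetical. How the tool itself treats ASCII control characters is not stated here, so this sketch simply passes every codepoint up to U+007F through unchanged in the default mode.

```javascript
// Sketch of the two scope options. `encodeText` and `escapeAll` are
// hypothetical names, not the tool's actual API.
function encodeText(text, { escapeAll = false } = {}) {
  let out = "";
  // for...of iterates by code point, so characters above U+FFFF stay whole.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const isAscii = cp <= 0x7f; // assumption: all ASCII passes through by default
    if (!escapeAll && isAscii) {
      out += ch;
    } else {
      out += `\\u{${cp.toString(16).toUpperCase()}}`;
    }
  }
  return out;
}
```

With the default scope, `encodeText("Café")` keeps `Caf` readable and escapes only the `é`; with `{ escapeAll: true }`, every character becomes an escape.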
Surrogate Pairs (JavaScript ES5)
JavaScript strings use UTF-16 internally, which encodes each character above U+FFFF as a surrogate pair: two 16-bit code units, a high surrogate (U+D800–U+DBFF) followed by a low surrogate (U+DC00–U+DFFF). In ES5 mode, the 🌍 emoji (U+1F30D) becomes \uD83C\uDF0D. ES6 introduced the cleaner \u{1F30D} syntax, which represents the full codepoint directly, so surrogate pairs no longer need to be written out by hand in modern code.
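
The surrogate pair arithmetic is defined by UTF-16 and can be sketched as follows. The function names `toSurrogatePair` and `fromSurrogatePair` are hypothetical; the tool may compute the pair differently, for instance by reading code units directly from the string.

```javascript
// Split a supplementary code point (U+10000..U+10FFFF) into its
// UTF-16 high and low surrogate code units.
function toSurrogatePair(cp) {
  if (cp < 0x10000 || cp > 0x10ffff) {
    throw new RangeError("Code point is not in the supplementary range");
  }
  const offset = cp - 0x10000;             // 20-bit value
  const high = 0xd800 + (offset >> 10);    // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);   // bottom 10 bits
  return [high, low];
}

// Inverse operation: recombine a surrogate pair into one code point.
function fromSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}
```

For U+1F30D the offset is 0xF30D, which gives the high surrogate 0xD83C and the low surrogate 0xDF0D, matching the ES5 example above.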
The Codepoint Inspector
Expand the Codepoint Inspector panel after encoding to view a per-character breakdown of your input. For each character you can see:
- Codepoint — the Unicode codepoint in U+XXXX notation
- Name / Block — the Unicode block the character belongs to (e.g., LATIN EXTENDED, EMOTICONS EMOJI)
- UTF-8 bytes — the raw hexadecimal bytes used to store the character in UTF-8 (shown as color-coded badges)
- UTF-16 — the code unit(s) used in JavaScript's internal string representation
- Escaped form — the exact escape sequence generated for the selected format
This panel is especially useful for debugging encoding issues, understanding why a character takes multiple bytes in UTF-8, or learning how Unicode works in practice.
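
A breakdown like this can be approximated with the native APIs alone. The sketch below is an illustration under assumptions, not the inspector's real code: `inspect` and its field names are hypothetical, and the Name / Block column is omitted because it requires Unicode data tables that are not part of the JavaScript standard library.

```javascript
// Hypothetical per-character breakdown using only built-in APIs.
function inspect(text) {
  const encoder = new TextEncoder(); // always encodes to UTF-8
  const toHex = (n, width) => n.toString(16).toUpperCase().padStart(width, "0");

  // Array.from iterates by code point, so an emoji is one entry, not two.
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    return {
      char: ch,
      codepoint: "U+" + toHex(cp, 4),
      utf8: Array.from(encoder.encode(ch), (byte) => toHex(byte, 2)),
      // ch.length is 1 for BMP characters and 2 for surrogate pairs.
      utf16: Array.from({ length: ch.length }, (_, i) => toHex(ch.charCodeAt(i), 4)),
    };
  });
}
```

Calling `inspect("é🌍")` returns two entries: é takes two UTF-8 bytes (C3 A9) and one UTF-16 code unit, while 🌍 takes four UTF-8 bytes (F0 9F 8C 8D) and two UTF-16 code units.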
Decoding Escape Sequences
Switch to Decode mode and paste any mix of escape sequences. The decoder automatically detects all supported formats in the input — you can mix JavaScript \uXXXX, HTML &#DDDDD;, and CSS \XXXXXX sequences in the same string and the tool will decode them all. Surrogate pairs encountered during decoding are automatically recombined into the correct single codepoint.
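
One way to decode mixed input is a single regular expression with one alternative per format. This is a rough sketch under stated assumptions rather than the tool's actual decoder: `ESCAPE_PATTERN` and `decodeEscapes` are hypothetical names, and the CSS branch greedily consumes up to six hex digits, so it can misread ordinary text that directly follows a short CSS escape.

```javascript
// One capture group per format, in this order:
// ES6 \u{...}, Python \UXXXXXXXX, ES5 \uXXXX, legacy %uXXXX,
// HTML hex, HTML decimal, CSS (with optional trailing space).
const ESCAPE_PATTERN =
  /\\u\{([0-9A-Fa-f]{1,6})\}|\\U([0-9A-Fa-f]{8})|\\u([0-9A-Fa-f]{4})|%u([0-9A-Fa-f]{4})|&#[xX]([0-9A-Fa-f]+);|&#([0-9]+);|\\([0-9A-Fa-f]{1,6}) ?/g;

function decodeEscapes(input) {
  return input.replace(
    ESCAPE_PATTERN,
    (match, es6, py, es5, url, htmlHex, htmlDec, css) => {
      const hex = es6 ?? py ?? es5 ?? url ?? htmlHex ?? css;
      const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(htmlDec, 10);
      // Leave out-of-range values untouched instead of throwing.
      if (cp > 0x10ffff) return match;
      // A decoded high surrogate followed by a decoded low surrogate ends up
      // adjacent in the result string, which UTF-16 treats as one character.
      return String.fromCodePoint(cp);
    }
  );
}
```

For instance, `decodeEscapes("Caf\\u00E9 &#127757; \\uD83C\\uDF0D")` returns `Café 🌍 🌍`; the two ES5 escapes recombine because JavaScript strings are sequences of UTF-16 code units.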
Common Use Cases
- JSON / API payloads — escape non-ASCII text in JSON string values for maximum compatibility with older parsers
- JavaScript source code — represent emoji or RTL text as escape sequences in string literals to avoid font rendering issues in editors
- HTML templates — encode special characters as numeric references to avoid charset issues in older browsers or XML processors
- CSS content — generate CSS content property values for icon fonts using \XXXXXX escape notation
- Security analysis — detect obfuscated JavaScript or HTML injection payloads encoded as Unicode escapes
- Internationalization (i18n) — inspect and convert locale strings, Devanagari, CJK, Arabic, or other non-Latin scripts before embedding them in resource files
Browser-Side Processing & Privacy
All conversions run entirely in your browser using native JavaScript APIs: String.prototype.codePointAt(), String.fromCodePoint(), and the TextEncoder API for accurate UTF-8 byte counts. Your input text is never sent to any server. Input is capped at 50,000 characters to prevent browser freezes on very large payloads.