About This Tool

🔤 Unicode Escape Converter – Encode & Decode Any Unicode Character

The Unicode Escape Converter is a free, browser-based tool that converts any Unicode text into escape sequences used by JavaScript, Python, HTML, CSS, and URL encoding — and decodes them back to readable text. Whether you are embedding emoji in JSON, escaping special characters in source code, or inspecting raw codepoints, this tool handles all major escape formats with a single click.

What Are Unicode Escape Sequences?

Unicode defines over 149,000 characters across scripts, symbols, emoji, and technical characters. Because many text formats and programming languages only accept ASCII input directly, Unicode characters are written as escape sequences: compact ASCII strings that represent the character's numeric codepoint. For example, the letter é (U+00E9) can be written as \u00E9 in JavaScript, &#xE9; in HTML, or \00E9 in CSS.

Supported Escape Formats

| Format | Syntax | Example (🌍 = U+1F30D) | Use Case |
|---|---|---|---|
| JavaScript ES6 | \u{XXXXX} | \u{1F30D} | Modern JS/TS, template literals |
| JavaScript ES5 | \uXXXX | \uD83C\uDF0D | Legacy JS, JSON strings |
| Python / C | \uXXXX / \UXXXXXXXX | \U0001F30D | Python 3, C/C++ wide strings |
| HTML Decimal | &#DDDDD; | &#127757; | HTML/XML content |
| HTML Hex | &#xXXXXX; | &#x1F30D; | HTML/XML, XHTML |
| CSS | \XXXXXX | \1F30D | CSS content property, font-face |
| URL Legacy | %uXXXX | %uD83C%uDF0D | Legacy IE, non-standard URLs |
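
As a rough illustration of the table above, the sketch below formats a single codepoint in each of the seven syntaxes. The function name escapeCodePoint and the property names on its result are hypothetical, and the padding rules (at least four hex digits for CSS, no padding for ES6 and HTML hex) are assumptions chosen to match the examples on this page, not the tool's confirmed behavior.

```javascript
// Hypothetical helper: format one codepoint in each syntax from the table.
// Padding choices are assumptions that reproduce the examples on this page.
function escapeCodePoint(cp) {
  const hex = cp.toString(16).toUpperCase();

  // Collect the UTF-16 code units: one for BMP characters,
  // two (a surrogate pair) for codepoints above U+FFFF.
  const s = String.fromCodePoint(cp);
  const units = [];
  for (let i = 0; i < s.length; i++) {
    units.push(s.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0"));
  }

  return {
    es6: "\\u{" + hex + "}",
    es5: units.map((u) => "\\u" + u).join(""),
    python: cp <= 0xffff
      ? "\\u" + hex.padStart(4, "0")
      : "\\U" + hex.padStart(8, "0"),
    htmlDecimal: "&#" + cp + ";",
    htmlHex: "&#x" + hex + ";",
    css: "\\" + hex.padStart(4, "0"),
    urlLegacy: units.map((u) => "%u" + u).join(""),
  };
}

console.log(escapeCodePoint(0x1f30d).es5);         // \uD83C\uDF0D
console.log(escapeCodePoint(0x1f30d).htmlDecimal); // &#127757;
console.log(escapeCodePoint(0xe9).css);            // \00E9
```

One caveat for real stylesheets: a CSS escape followed directly by another hex digit or a space needs a terminating space (or six-digit padding) so the following character is not read as part of the escape.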

Encoding Options Explained

Scope: Non-ASCII Only vs. All Characters

Non-ASCII only (the default) leaves printable ASCII characters (U+0020–U+007E) as-is and only escapes characters above U+007F. This produces the most readable output and is the usual practice for JavaScript, JSON, and HTML. All characters escapes every single character, including plain ASCII letters and digits, which is useful for generating fully portable escape-only strings where no literal text should appear in the output.
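
The difference between the two scopes can be sketched as follows. The function escapeText and its scope values "non-ascii" and "all" are hypothetical names, the output uses the ES6 format only, and the sketch assumes that everything at or below U+007F (including control characters) passes through unchanged in the default scope.

```javascript
// Hypothetical sketch of the two scope options, using ES6 \u{...} output.
// Assumption: in "non-ascii" scope, only codepoints above U+007F are escaped.
function escapeText(text, scope = "non-ascii") {
  let out = "";
  // for...of walks the string by codepoint, so emoji are not split in half.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (scope === "all" || cp > 0x7f) {
      out += "\\u{" + cp.toString(16).toUpperCase() + "}";
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(escapeText("Café 🌍"));  // Caf\u{E9} \u{1F30D}
console.log(escapeText("Hi", "all")); // \u{48}\u{69}
```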

Surrogate Pairs (JavaScript ES5)

JavaScript strings internally use UTF-16, which encodes each character above U+FFFF as a surrogate pair: two 16-bit code units, a high surrogate followed by a low surrogate. In ES5 mode, the 🌍 emoji (U+1F30D) becomes \uD83C\uDF0D. ES6 introduced the cleaner \u{1F30D} syntax that represents the full codepoint directly, making hand-written surrogate pairs unnecessary in modern code.
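
The arithmetic behind a surrogate pair is standard UTF-16 and can be sketched in a few lines. The function names toSurrogatePair and fromSurrogatePair are hypothetical and only illustrate the calculation; they are not taken from the tool.

```javascript
// Split a codepoint above U+FFFF into its UTF-16 high and low surrogates.
function toSurrogatePair(cp) {
  if (cp < 0x10000 || cp > 0x10ffff) {
    throw new RangeError("codepoint must be in U+10000..U+10FFFF");
  }
  const offset = cp - 0x10000;             // 20-bit value
  const high = 0xd800 + (offset >> 10);    // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);   // bottom 10 bits
  return [high, low];
}

// Recombine a high/low surrogate pair into the original codepoint.
function fromSurrogatePair(high, low) {
  return 0x10000 + ((high - 0xd800) << 10) + (low - 0xdc00);
}

const [high, low] = toSurrogatePair(0x1f30d);
console.log(high.toString(16), low.toString(16));              // d83c df0d
console.log(fromSurrogatePair(0xd83c, 0xdf0d).toString(16));   // 1f30d
console.log("\u{1F30D}" === "\uD83C\uDF0D");                   // true
```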

The Codepoint Inspector

Expand the Codepoint Inspector panel after encoding to view a per-character breakdown of your input. For each character you can see:

  • Codepoint — the Unicode codepoint in U+XXXX notation
  • Name / Block — the Unicode block the character belongs to (e.g., LATIN EXTENDED, EMOTICONS EMOJI)
  • UTF-8 bytes — the raw hexadecimal bytes used to store the character in UTF-8 (shown as color-coded badges)
  • UTF-16 — the code unit(s) used in JavaScript's internal string representation
  • Escaped form — the exact escape sequence generated for the selected format

This panel is especially useful for debugging encoding issues, understanding why a character takes multiple bytes in UTF-8, or learning how Unicode works in practice.
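
A per-character breakdown like the one described above can be approximated with the browser APIs this page mentions. The sketch below is an assumption about how such an inspector might be built, not the tool's actual code. The function name inspect and its field names are hypothetical, and the block name is omitted because it requires Unicode block data that is not built into JavaScript.

```javascript
// Hypothetical inspector: one record per codepoint in the input.
function inspect(text) {
  const encoder = new TextEncoder();
  const hex = (n, width) => n.toString(16).toUpperCase().padStart(width, "0");

  // Array.from iterates strings by codepoint, not by UTF-16 code unit.
  return Array.from(text, (ch) => {
    const utf16 = [];
    for (let i = 0; i < ch.length; i++) {
      utf16.push(hex(ch.charCodeAt(i), 4));
    }
    return {
      char: ch,
      codepoint: "U+" + hex(ch.codePointAt(0), 4),
      utf8: Array.from(encoder.encode(ch), (b) => hex(b, 2)),
      utf16,
    };
  });
}

const rows = inspect("é🌍");
console.log(rows[0].codepoint, rows[0].utf8.join(" ")); // U+00E9 C3 A9
console.log(rows[1].codepoint, rows[1].utf8.join(" ")); // U+1F30D F0 9F 8C 8D
console.log(rows[1].utf16.join(" "));                   // D83C DF0D
```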

Decoding Escape Sequences

Switch to Decode mode and paste any mix of escape sequences. The decoder automatically detects all supported formats in the input — you can mix JavaScript \uXXXX, HTML &#DDDDD;, and CSS \XXXXXX sequences in the same string and the tool will decode them all. Surrogate pairs encountered during decoding are automatically recombined into the correct single codepoint.
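
Mixed-format decoding of this kind can be sketched with a single regular expression that has one alternative per supported syntax. This is a minimal illustration under stated assumptions, not the tool's implementation. The names ESCAPE_PATTERN and decodeEscapes are hypothetical, and the sketch does not consume the optional whitespace that may terminate a CSS escape.

```javascript
// One alternative per format. Order matters: the bare CSS form goes last.
const ESCAPE_PATTERN =
  /\\u\{([0-9a-fA-F]{1,6})\}|\\u([0-9a-fA-F]{4})|\\U([0-9a-fA-F]{8})|&#x([0-9a-fA-F]{1,6});|&#([0-9]{1,7});|%u([0-9a-fA-F]{4})|\\([0-9a-fA-F]{1,6})/g;

function decodeEscapes(input) {
  return input.replace(
    ESCAPE_PATTERN,
    (match, es6, es5, py, htmlHex, htmlDec, url, css) => {
      // \uXXXX and %uXXXX are UTF-16 code units. Emitting them one by one
      // lets an adjacent high/low surrogate pair recombine automatically,
      // because JavaScript strings are sequences of UTF-16 code units.
      const unit = es5 ?? url;
      if (unit !== undefined) return String.fromCharCode(parseInt(unit, 16));

      // Every other format carries a full codepoint.
      const cp = htmlDec !== undefined
        ? parseInt(htmlDec, 10)
        : parseInt(es6 ?? py ?? htmlHex ?? css, 16);

      // Leave out-of-range values untouched instead of throwing.
      return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
    }
  );
}

console.log(decodeEscapes("Caf\\u00E9 &#127757; \\1F30D")); // Café 🌍 🌍
console.log(decodeEscapes("\\uD83C\\uDF0D") === "🌍");      // true
```

Because the bare CSS form matches a backslash followed by any hex digit, this sketch would also rewrite unrelated sequences such as \f or \d in pasted source code. A stricter decoder would need to know which format it is reading.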

Common Use Cases

  • JSON / API payloads — escape non-ASCII text in JSON string values for maximum compatibility with older parsers
  • JavaScript source code — represent emoji or RTL text as escape sequences in string literals to avoid font rendering issues in editors
  • HTML templates — encode special characters as numeric references to avoid charset issues in older browsers or XML processors
  • CSS content — generate CSS content property values for icon fonts using \XXXXXX escape notation
  • Security analysis — detect obfuscated JavaScript or HTML injection payloads encoded as Unicode escapes
  • Internationalization (i18n) — inspect and convert locale strings, Devanagari, CJK, Arabic, or other non-Latin scripts before embedding them in resource files

Browser-Side Processing & Privacy

All conversions run entirely in your browser using native JavaScript APIs: String.prototype.codePointAt(), String.fromCodePoint(), and the TextEncoder API for accurate UTF-8 byte counts. Your input text is never sent to any server. The tool supports inputs up to 50,000 characters to prevent browser freezes on very large payloads.

Frequently Asked Questions

Is the Unicode Escape Converter free?

Yes, the Unicode Escape Converter is completely free.

Can I use the Unicode Escape Converter offline?

Yes, you can install the web app as a PWA (Progressive Web App).

Is it safe to use Unicode Escape Converter?

Yes. Any data related to the Unicode Escape Converter is stored only in your browser, and only if storage is required. You can clear your browser cache to remove all stored data. We do not store any data on a server.

What is a Unicode escape sequence?

A Unicode escape sequence is a way to represent a Unicode character using only ASCII characters. Instead of writing 'é' directly, you write \u00E9 in JavaScript or &#xE9; in HTML. This is useful when embedding special characters, emoji, or non-ASCII text in source code, JSON, HTML, or CSS.

How does the Unicode Escape Converter work?

Enter any Unicode text in the input box, select the target escape format and options, and the tool instantly converts each character to the corresponding escape sequence. In Decode mode, paste escape sequences and the tool reconstructs the original Unicode text. All processing happens in your browser — nothing is sent to a server.

What escape formats are supported?

The tool supports JavaScript ES6 (\u{XXXXX}), JavaScript ES5 (\uXXXX with surrogate pairs for astral characters), Python/C (\uXXXX and \UXXXXXXXX), HTML decimal (&#DDDDD;), HTML hexadecimal (&#xXXXXX;), CSS (\XXXXXX), and legacy URL encoding (%uXXXX).

What is the difference between 'All characters' and 'Non-ASCII only' scope?

When 'All characters' is selected, every character — including plain ASCII letters and digits — is escaped. 'Non-ASCII only' (the default) leaves printable ASCII characters as-is and only escapes characters with codepoints above U+007F. The latter is more readable and is the standard practice for most use cases.

What are surrogate pairs and when do I need them?

Surrogate pairs are a two-code-unit encoding used in JavaScript ES5 and UTF-16 for characters above U+FFFF (like most emoji). For example, the 🌍 emoji (U+1F30D) becomes \uD83C\uDF0D as a surrogate pair. ES6+ and Python use a single escape for the full codepoint (\u{1F30D} or \U0001F30D), which is cleaner and preferred for modern code.

What does the Codepoint Inspector show?

The Codepoint Inspector breaks your input into individual Unicode characters and shows the codepoint (U+XXXX), UTF-8 byte sequence, UTF-16 encoding, and escaped form for each character. This is useful for debugging encoding issues, understanding multi-byte characters, or learning how Unicode works.