
About This Tool

🔢 Emoji Unicode Converter – Encode & Decode Emoji Code Points

The Emoji Unicode Converter is a bidirectional tool that translates between visible emoji characters and their underlying Unicode code point representations. Whether you are debugging emoji in a database, embedding emoji safely in HTML, CSS, or JavaScript, or simply exploring how the Unicode standard works, this tool provides instant conversions in eight different output formats — all without leaving your browser.

What Is a Unicode Code Point?

Every character in the Unicode standard is assigned a unique number called a code point, and each of the 3,600+ emoji is encoded as either one code point or a short sequence of them. Code points are written in U+ notation: for example, the grinning face emoji 😀 is U+1F600. When you need to include emoji in source code, HTML, or data transfer formats, you often cannot use the raw character; instead you use an escape sequence derived from the code point.
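As a minimal sketch of the idea above, the snippet below lists the code points of a string in U+ notation. The function name toUPlus is hypothetical and is not taken from the tool itself.

```javascript
// Hypothetical helper: list the code points of a string in U+ notation.
function toUPlus(text) {
  // Spreading a string iterates by code point, not by UTF-16 code unit,
  // so an emoji above U+FFFF arrives as one element.
  return [...text].map(
    (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

console.log(toUPlus("A\u{1F600}")); // ["U+0041", "U+1F600"]
```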

Supported Output Formats

| Format              | Example (😀 = U+1F600) | Use Case                             |
|---------------------|------------------------|--------------------------------------|
| U+ Notation         | U+1F600                | Unicode documentation, databases     |
| HTML Hex Entity     | &#x1F600;              | HTML pages, XML                      |
| HTML Decimal Entity | &#128512;              | Legacy HTML, email templates         |
| CSS Escape          | \1F600                 | CSS content property                 |
| JS ES6              | \u{1F600}              | Modern JavaScript/TypeScript strings |
| JS ES5 Surrogate    | \uD83D\uDE00           | Legacy JS engines, JSON              |
| Python              | \U0001F600             | Python 3 string literals             |
| Raw Hex             | 1F600                  | Low-level programming, fonts         |
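The eight formats can all be derived from the numeric code point. The following is a rough sketch of one way to do that; formatCodePoint and its property names are hypothetical, and the tool's own implementation may differ.

```javascript
// Hypothetical helper: format one code point in each of the eight notations.
function formatCodePoint(cp) {
  const hex = cp.toString(16).toUpperCase();
  // split("") cuts by UTF-16 code unit: one unit up to U+FFFF, two above it.
  const units = String.fromCodePoint(cp).split("");
  return {
    uPlus: "U+" + hex.padStart(4, "0"),
    htmlHex: "&#x" + hex + ";",
    htmlDecimal: "&#" + cp + ";",
    // In a stylesheet, add a trailing space if a hex digit follows the escape.
    css: "\\" + hex,
    jsEs6: "\\u{" + hex + "}",
    jsEs5: units
      .map((u) => "\\u" + u.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0"))
      .join(""),
    // Assumption: always emit the 8-digit \U form, which Python 3 accepts for any code point.
    python: "\\U" + hex.padStart(8, "0"),
    rawHex: hex,
  };
}

const grinning = formatCodePoint(0x1f600);
console.log(grinning.htmlDecimal); // &#128512;
console.log(grinning.jsEs5);       // \uD83D\uDE00
```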

ZWJ Sequences Explained

Some of the most complex emoji, such as family groups (👨‍👩‍👧‍👦), profession emoji (👩‍💻), and certain flags such as the rainbow flag (🏳️‍🌈), are built by joining simpler emoji with the Zero-Width Joiner character (U+200D, ZWJ). Country flags are not ZWJ sequences; they are pairs of regional indicator symbols. Visually a ZWJ sequence appears as a single glyph, but it is actually a sequence of multiple code points. The converter uses the Intl.Segmenter API to identify these clusters and shows each component emoji and its ZWJ connectors in the breakdown panel.
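A minimal sketch of this kind of breakdown, assuming a runtime that provides Intl.Segmenter (recent browsers and Node.js). The function name breakdown and the shape of its result are hypothetical, not the tool's actual code.

```javascript
// Hypothetical helper: split text into grapheme clusters, then list the
// code points inside each cluster and flag clusters that contain a ZWJ.
function breakdown(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), ({ segment }) => ({
    cluster: segment,
    codePoints: [...segment].map((ch) =>
      ch.codePointAt(0).toString(16).toUpperCase()
    ),
    hasZwj: segment.includes("\u200D"),
  }));
}

// Woman technologist: WOMAN (U+1F469) + ZWJ (U+200D) + LAPTOP (U+1F4BB).
const parts = breakdown("\u{1F469}\u200D\u{1F4BB}");
console.log(parts.length);        // 1
console.log(parts[0].codePoints); // ["1F469", "200D", "1F4BB"]
```

Segmenting by grapheme first keeps the joined emoji together as one table row, while the inner spread still exposes every component code point.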

Surrogate Pairs in JavaScript

JavaScript stores strings internally as UTF-16. Characters above U+FFFF, which include most emoji, cannot fit in a single 16-bit code unit and require a surrogate pair: two 16-bit values (a high surrogate in 0xD800–0xDBFF and a low surrogate in 0xDC00–0xDFFF) that together encode the full code point. The formula is:

high = Math.floor((codePoint - 0x10000) / 0x400) + 0xD800
low  = ((codePoint - 0x10000) % 0x400) + 0xDC00

Modern JavaScript (ES6+) avoids this complexity with the \u{XXXXX} syntax, which directly accepts the full code point. The converter outputs both representations so you can pick the right one for your target environment.
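The formula above can be wrapped in a small function and checked against the known pair for 😀. This is an illustrative sketch; the names toSurrogatePair and fromSurrogatePair are hypothetical.

```javascript
// Split a code point above U+FFFF into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("only code points U+10000..U+10FFFF need a pair");
  }
  const offset = codePoint - 0x10000;
  const high = Math.floor(offset / 0x400) + 0xd800;
  const low = (offset % 0x400) + 0xdc00;
  return [high, low];
}

// Inverse operation: rebuild the code point from a surrogate pair.
function fromSurrogatePair(high, low) {
  return (high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
}

const [high, low] = toSurrogatePair(0x1f600);
console.log(high.toString(16), low.toString(16));        // d83d de00
console.log(fromSurrogatePair(high, low).toString(16)); // 1f600
```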

UTF-8 Byte Encoding

UTF-8 uses a variable number of bytes per character, from one to four. Most emoji fall in the range U+1F000–U+1FAFF, which requires 4 bytes in UTF-8, with F0 as the first byte. For example, 😀 (U+1F600) is encoded as F0 9F 98 80. The per-emoji table shows the exact hex byte sequence for every character, which is useful when working with databases, file I/O, or network protocols that operate at the byte level.
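To make the byte layout concrete, here is a sketch that encodes a single code point by hand, following the standard UTF-8 bit patterns. The function name utf8Bytes is hypothetical; in practice the built-in TextEncoder produces the same bytes.

```javascript
// Hypothetical helper: encode one code point as an array of UTF-8 bytes.
function utf8Bytes(cp) {
  if (cp < 0 || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
    throw new RangeError("not a Unicode scalar value");
  }
  if (cp < 0x80) return [cp];                              // 1 byte: 0xxxxxxx
  if (cp < 0x800) return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)]; // 2 bytes
  if (cp < 0x10000) {
    return [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)]; // 3 bytes
  }
  return [                                                 // 4 bytes: 11110xxx ...
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

const toHex = (bytes) =>
  bytes.map((b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" ");

console.log(toHex(utf8Bytes(0x1f600))); // F0 9F 98 80
```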

Common Use Cases

  • Database debugging — convert emoji to code points to check whether your database column is using a UTF-8 4-byte charset (e.g., utf8mb4 in MySQL).
  • HTML templates — use HTML entity notation to safely embed emoji in HTML without worrying about file encoding.
  • CSS icons — paste the CSS escape into the content property of a ::before pseudo-element.
  • API payloads — sanitise emoji in JSON by replacing raw characters with \u escapes when serialising.
  • Accessibility & NLP — identify and strip emoji from text pipelines using the per-emoji table.
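For the API payload case, one possible approach is sketched below: serialise with JSON.stringify first, then replace every non-ASCII UTF-16 code unit in the output with a \uXXXX escape. The helper name toAsciiJson is hypothetical. The replacement runs on the serialised text because inserting backslash escapes into the data beforehand would cause the serialiser to escape the backslashes themselves.

```javascript
// Hypothetical helper: produce ASCII-only JSON by escaping non-ASCII code units.
function toAsciiJson(value) {
  // Without the "u" flag the regex matches individual UTF-16 code units,
  // so each half of a surrogate pair gets its own \uXXXX escape.
  return JSON.stringify(value).replace(
    /[\u0080-\uFFFF]/g,
    (unit) => "\\u" + unit.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const payload = toAsciiJson({ msg: "hi \u{1F600}" });
console.log(payload);                 // {"msg":"hi \uD83D\uDE00"}
console.log(JSON.parse(payload).msg); // hi 😀
```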

Limitations

Emoji names shown in the breakdown are derived from Unicode block ranges and a built-in lookup table. They give the Unicode block name (e.g., "EMOTICONS", "TRANSPORT & MAP SYMBOLS") rather than the full CLDR annotation (e.g., "grinning face") for every single character. For the full official CLDR name of any specific emoji, cross-reference with the Unicode Emoji Chart. Input is capped at 10,000 characters to maintain browser performance.

Frequently Asked Questions

Is the Emoji Unicode Converter free?

Yes, the Emoji Unicode Converter is totally free :)

Can I use the Emoji Unicode Converter offline?

Yes, you can install the web app as a PWA (progressive web app) and use it offline.

Is it safe to use Emoji Unicode Converter?

Yes. Any data related to the Emoji Unicode Converter is stored only in your browser (if storage is required). You can clear your browser cache to remove all stored data. We do not store any data on a server.

How does the Emoji Unicode Converter work?

Paste any emoji or mixed text into the input box and the tool instantly breaks it into its Unicode code points. You can switch between Emoji → Unicode and Unicode → Emoji directions, and pick your preferred output format (U+ notation, HTML entity, CSS escape, JS escape, Python escape, or raw hex). All processing happens locally in your browser.
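The Unicode → Emoji direction can be sketched in a few lines for U+ notation. This is an assumption about how such decoding might work, not the tool's actual code, and the name decodeUPlus is hypothetical.

```javascript
// Hypothetical helper: turn a list of U+XXXX tokens back into characters.
// Separators between tokens are ignored so that ZWJ sequences stay joined.
function decodeUPlus(input) {
  const codePoints = Array.from(
    input.matchAll(/U\+([0-9A-F]{4,6})/gi),
    (match) => parseInt(match[1], 16)
  );
  // String.fromCodePoint throws a RangeError for values above 0x10FFFF.
  return String.fromCodePoint(...codePoints);
}

console.log(decodeUPlus("U+1F600"));                 // 😀
console.log(decodeUPlus("U+1F469 U+200D U+1F4BB")); // 👩‍💻
```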

What output formats are supported?

The tool supports U+ notation (U+1F600), HTML hex entity (&#x1F600;), HTML decimal entity (&#128512;), CSS escape (\1F600), JavaScript ES6 escape (\u{1F600}), JavaScript ES5 surrogate pair (\uD83D\uDE00), Python unicode escape (\U0001F600), and raw hexadecimal (1F600).

What is a ZWJ sequence and how does the tool handle it?

A Zero-Width Joiner (ZWJ, U+200D) is an invisible character that combines multiple emoji into a single composite glyph — for example 👨‍👩‍👧‍👦 (family) or 👩‍💻 (woman technologist). The tool detects ZWJ sequences using the Intl.Segmenter API and breaks them into their individual components, showing each base emoji and every joiner.

Why do some emoji need surrogate pairs in JavaScript?

JavaScript's internal string encoding is UTF-16. Characters above U+FFFF (most emoji) cannot fit in a single 16-bit code unit, so they are represented as two 16-bit values called a surrogate pair. For example, 😀 (U+1F600) becomes \uD83D\uDE00. Modern JS can use \u{1F600} instead, which avoids the pair. The tool shows both representations.

How accurate are the emoji names shown in the table?

The tool uses a built-in lookup table covering the Unicode emoji ranges and their official CLDR short names. Common emoji families (faces, animals, food, symbols, flags) are identified by name. For very new or uncommon characters the tool falls back to the Unicode block name and code point.

Can I convert multiple emoji at once?

Yes. You can paste an entire paragraph of mixed text and emoji. The tool analyses each grapheme cluster (including complex skin-tone modified and ZWJ sequences), lists every emoji in a per-emoji table, and outputs the full converted string in your chosen format — all in one pass.