Text Entropy Calculator

About This Tool

🔢 Text Entropy Calculator – Measure Information Density

The Text Entropy Calculator applies Shannon entropy — the foundational concept of information theory — to any text you provide. It quantifies how much randomness or information content exists in a string, measured in bits per character. Whether you're evaluating password strength, auditing encryption output, or studying linguistics, this tool gives you an instant, accurate picture of your text's information density.

📐 The Shannon Entropy Formula

Claude Shannon introduced entropy in his 1948 paper "A Mathematical Theory of Communication" as a measure of the uncertainty of an information source. For discrete symbols, the formula is:

H(X) = -Σ [ p(xᵢ) × log₂(p(xᵢ)) ]

Where:

  • p(xᵢ) is the probability (relative frequency) of character xᵢ
  • The sum runs over all unique characters in the input
  • The result is in bits per character

For example, the string aaaa has zero entropy — every character is predictable. The string abcd has entropy log₂(4) = 2.0 bits/char — each character is equally likely.
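The formula can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code (which runs as JavaScript in the browser). The function name `shannon_entropy` and the choice to return 0.0 for empty input are assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    n = len(text)
    if n == 0:
        return 0.0  # assumption: empty input is reported as zero entropy
    counts = Counter(text)
    # H = sum over unique characters of -p * log2(p), with p = count / n
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


print(shannon_entropy("aaaa"))  # single repeated character
print(shannon_entropy("abcd"))  # four equally likely characters
```

The two calls reproduce the examples above: 0 bits/char for aaaa and 2 bits/char for abcd.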

📊 Derived Metrics Explained

| Metric | Formula | What It Tells You |
| --- | --- | --- |
| Shannon Entropy | H = -Σ p·log₂(p) | Average randomness per character |
| Total Entropy Bits | H × length | Total information content of the string |
| Max Possible Entropy | log₂(alphabet_size) | Upper bound for the detected character set |
| Normalised Entropy | H / log₂(alphabet) × 100 | How close to maximum randomness (percentage) |
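As a rough sketch, the four metrics can be derived from a single frequency count. The function name `entropy_metrics` is hypothetical, and the sketch assumes the "detected" alphabet is simply the set of distinct characters in the input. The tool may detect character sets differently, so an explicit `alphabet_size` can be passed instead.

```python
import math
from collections import Counter
from typing import Optional


def entropy_metrics(text: str, alphabet_size: Optional[int] = None) -> dict:
    """Compute entropy, total bits, max entropy and normalised entropy."""
    n = len(text)
    counts = Counter(text)
    h = sum(-(c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    # Assumption: fall back to the number of distinct characters seen.
    size = alphabet_size if alphabet_size is not None else len(counts)
    h_max = math.log2(size) if size > 1 else 0.0
    return {
        "entropy": h,                      # bits per character
        "total_bits": h * n,               # H x length
        "max_entropy": h_max,              # log2(alphabet_size)
        "normalised_pct": (h / h_max * 100) if h_max > 0 else 0.0,
    }


print(entropy_metrics("abcd"))
print(entropy_metrics("abcd", alphabet_size=64))
```

With the default alphabet, abcd is 100% normalised because all four observed characters are equally frequent. Against a 64-symbol alphabet the same string reaches only a third of the 6.0 bits/char maximum.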

🔐 Password Strength via Entropy

This tool rates password strength by total entropy bits (H × length), which accounts for both character diversity and password length:

  • Under 28 bits → Weak — easily brute-forced
  • 28–35 bits → Fair — acceptable for low-risk accounts
  • 36–59 bits → Strong — recommended for most use cases
  • 60+ bits → Very Strong — suitable for high-security credentials

Tip

These thresholds assume an attacker has no prior knowledge of the password's structure. Dictionary-based passwords may score high in entropy but remain vulnerable to targeted attacks.
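The thresholds above can be expressed as a small lookup. This is a sketch under assumptions: the names `strength_label` and `password_strength` are hypothetical, and because the listed ranges leave fractional values between 35 and 36 unassigned, the sketch treats everything below 36 bits as Fair.

```python
import math
from collections import Counter


def strength_label(total_bits: float) -> str:
    """Map total entropy bits to the rating bands listed above."""
    if total_bits < 28:
        return "Weak"
    if total_bits < 36:  # assumption: 35.x bits still counts as Fair
        return "Fair"
    if total_bits < 60:
        return "Strong"
    return "Very Strong"


def password_strength(password: str) -> tuple:
    """Return (total entropy bits, rating) for a password."""
    n = len(password)
    counts = Counter(password)
    h = sum(-(c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    total_bits = h * n
    return total_bits, strength_label(total_bits)


print(password_strength("aaaaaaaa"))          # (0.0, 'Weak')
print(password_strength("abcdefghijklmnop"))  # (64.0, 'Very Strong')
```

The second call shows the caveat from the tip in practice: an obvious alphabetical sequence scores 64 bits because the measure only looks at character frequencies, not at structure an attacker could guess.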

🔬 Analysis Modes

Character vs. Byte Analysis

Character mode treats each Unicode code point as one symbol, which is ideal for natural language text. Byte mode uses the raw UTF-8 byte values (0–255), which is appropriate for cryptographic assessments of binary data. In byte mode, multi-byte characters such as emoji and accented letters contribute several symbols each.
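The difference between the two modes comes down to which symbols are counted. The sketch below is illustrative only; the helper names `to_symbols` and `entropy` and the `mode` strings are assumptions, not the tool's API.

```python
import math
from collections import Counter


def entropy(symbols: list) -> float:
    """Shannon entropy in bits per symbol for any list of hashable symbols."""
    n = len(symbols)
    if n == 0:
        return 0.0
    return sum(-(c / n) * math.log2(c / n) for c in Counter(symbols).values())


def to_symbols(text: str, mode: str = "char") -> list:
    """Split text into Unicode code points ("char") or UTF-8 bytes ("byte")."""
    if mode == "char":
        return list(text)                  # one symbol per code point
    if mode == "byte":
        return list(text.encode("utf-8"))  # one symbol per byte value 0-255
    raise ValueError(f"unknown mode: {mode!r}")


sample = "h\u00e9\u00e9\U0001F600"  # 'h', two precomposed e-acute, one emoji
for mode in ("char", "byte"):
    syms = to_symbols(sample, mode)
    print(mode, len(syms), round(entropy(syms), 3))
```

The sample has 4 code points but 9 UTF-8 bytes (1 + 2 + 2 + 4), so the two modes see different symbol counts and report different entropy values for the same input.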

N-gram Entropy

Instead of analysing individual characters, n-gram entropy considers sequences of n characters at once. For example, abababab has a unigram entropy of 1.0 bit/char, but when split into non-overlapping bigrams (ab, ab, ab, ab) its bigram entropy is zero, revealing a repeating two-character pattern that unigram entropy misses. This is useful for detecting structured tokens or weak pseudo-random generators.
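A minimal n-gram sketch follows. Whether the tool counts overlapping or non-overlapping n-grams is not stated here, so the hypothetical `overlapping` flag exposes both; the function name `ngram_entropy` is likewise an assumption. The result is in bits per n-gram.

```python
import math
from collections import Counter


def ngram_entropy(text: str, n: int = 2, overlapping: bool = False) -> float:
    """Shannon entropy over n-character sequences, in bits per n-gram."""
    if n < 1:
        raise ValueError("n must be at least 1")
    step = 1 if overlapping else n
    # Trailing characters that do not fill a whole n-gram are dropped.
    grams = [text[i:i + n] for i in range(0, len(text) - n + 1, step)]
    total = len(grams)
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log2(c / total) for c in Counter(grams).values())


s = "abababab"
print(ngram_entropy(s, n=1))                    # unigrams: a and b equally likely
print(ngram_entropy(s, n=2))                    # non-overlapping: only "ab"
print(ngram_entropy(s, n=2, overlapping=True))  # overlapping: "ab" and "ba"
```

The choice matters: with non-overlapping bigrams the pattern collapses to a single symbol and the entropy is zero, while the overlapping (sliding-window) count sees both ab and ba and stays just below 1 bit.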

Chunked Analysis

For long documents or logs, chunked mode divides the text into fixed-size segments and computes entropy for each chunk independently. Visualising entropy across chunks helps identify regions of compressed data, encrypted blocks, or repeated boilerplate content.
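Chunked analysis can be sketched as a sliding slice over the text. The function name `chunk_entropies` and the default `chunk_size` of 64 are assumptions for illustration; the final chunk may be shorter than the others.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character (0.0 for empty input)."""
    n = len(text)
    if n == 0:
        return 0.0
    return sum(-(c / n) * math.log2(c / n) for c in Counter(text).values())


def chunk_entropies(text: str, chunk_size: int = 64) -> list:
    """Entropy of each consecutive fixed-size chunk of the text."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        shannon_entropy(text[i:i + chunk_size])
        for i in range(0, len(text), chunk_size)
    ]


# A repetitive region followed by a more varied one.
text = "a" * 8 + "abcdefgh"
print(chunk_entropies(text, chunk_size=8))  # [0.0, 3.0]
```

Plotting the returned list against the chunk index gives the kind of per-region view described above, where a jump from low to high values marks a change in content type.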

Comparative Mode

Paste two text samples side-by-side to instantly compare their entropy values, normalised scores, and character distributions. This is particularly useful when comparing candidate passwords or evaluating which version of a generated token is more random.
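A comparison of this kind could look like the sketch below. The names `summarise` and `compare` and the shape of the returned dictionaries are hypothetical, and normalisation again assumes the alphabet is the set of distinct characters in each sample.

```python
import math
from collections import Counter


def summarise(text: str) -> dict:
    """Entropy, normalised score and most frequent characters of one sample."""
    n = len(text)
    counts = Counter(text)
    h = sum(-(c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    h_max = math.log2(len(counts)) if len(counts) > 1 else 0.0
    return {
        "entropy": h,
        "normalised_pct": (h / h_max * 100) if h_max > 0 else 0.0,
        "top_chars": counts.most_common(3),
    }


def compare(a: str, b: str) -> dict:
    """Summaries for both samples plus the entropy difference (b minus a)."""
    sa, sb = summarise(a), summarise(b)
    return {"a": sa, "b": sb, "entropy_diff": sb["entropy"] - sa["entropy"]}


result = compare("aaaa", "abcd")
print(result["entropy_diff"])  # positive means the second sample scores higher
```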

🛠️ Practical Use Cases

  • Security audits — verify that generated API keys, session tokens, or passwords meet entropy thresholds
  • Data compression — text with low entropy compresses well; high entropy text is already dense
  • Cryptographic analysis — encrypted ciphertext should have entropy close to the maximum for its byte range
  • Linguistics research — compare information density across languages, writing systems, or genres
  • Education — demonstrate core concepts of information theory interactively
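The compression and ciphertext points can be checked empirically. The sketch below uses Python's standard `zlib` module and seeded pseudo-random bytes as a stand-in for ciphertext (an assumption: real ciphertext is not used here). The helper names `byte_entropy` and `compression_ratio` are hypothetical.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte."""
    n = len(data)
    if n == 0:
        return 0.0
    return sum(-(c / n) * math.log2(c / n) for c in Counter(data).values())


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


low = b"abab" * 1024                      # repetitive, low-entropy input
rng = random.Random(0)                    # seeded so the run is repeatable
high = bytes(rng.randrange(256) for _ in range(4096))  # stand-in for ciphertext

for name, data in (("low", low), ("high", high)):
    print(f"{name}: {byte_entropy(data):.2f} bits/byte, "
          f"ratio {compression_ratio(data):.2f}")
```

The repetitive input shrinks to a small fraction of its size, while the random bytes sit near the 8 bits/byte ceiling and do not compress. Note that a short sample cannot reach the ceiling: with n symbols the measured entropy is at most log₂(n).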

⚙️ Options Reference

  • Case Sensitivity — when disabled, A and a are treated as the same character, so you can analyse structural entropy independent of casing
  • Ignore Whitespace — exclude spaces and newlines to focus on meaningful character distribution
  • Character Set Filter — restrict analysis to All characters, ASCII only, or Alphanumeric only
  • Precision — configure the number of decimal places shown in the entropy result (0–10)
  • Custom Alphabet Size — override the detected alphabet size for normalisation (e.g., set to 64 for Base64-encoded strings)
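Most of these options amount to a preprocessing step before counting. The sketch below is one plausible reading, not the tool's implementation: the function names, parameter names, and `charset` values are hypothetical, and "Alphanumeric only" is assumed to mean ASCII letters and digits.

```python
def preprocess(text: str, case_sensitive: bool = True,
               ignore_whitespace: bool = False, charset: str = "all") -> str:
    """Apply the case, whitespace and character-set options to the input."""
    if not case_sensitive:
        text = text.lower()  # treat A and a as the same character
    if ignore_whitespace:
        text = "".join(ch for ch in text if not ch.isspace())
    if charset == "ascii":
        text = "".join(ch for ch in text if ord(ch) < 128)
    elif charset == "alnum":
        # Assumption: alphanumeric means ASCII letters and digits only.
        text = "".join(ch for ch in text if ch.isascii() and ch.isalnum())
    elif charset != "all":
        raise ValueError(f"unknown charset: {charset!r}")
    return text


def format_entropy(value: float, precision: int = 4) -> str:
    """Format an entropy value with 0 to 10 decimal places."""
    if not 0 <= precision <= 10:
        raise ValueError("precision must be between 0 and 10")
    return f"{value:.{precision}f}"


print(preprocess("Aa Bb", case_sensitive=False, ignore_whitespace=True))  # aabb
print(format_entropy(1.23456, precision=2))                               # 1.23
```

The cleaned string is what would then be passed to the entropy calculation, so each option changes the symbol counts rather than the formula.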
Privacy

All calculations run entirely in your browser. No text is ever sent to any server, making this tool safe for analysing passwords, private keys, and sensitive documents.

Frequently Asked Questions

Is the Text Entropy Calculator free?

Yes, the Text Entropy Calculator is totally free :)

Can I use the Text Entropy Calculator offline?

Yes, you can install the web app as a PWA (Progressive Web App).

Is it safe to use Text Entropy Calculator?

Yes. Any data related to the Text Entropy Calculator is stored only in your browser (if storage is required). You can clear your browser cache to remove all stored data. We do not store any data on a server.

What is Shannon entropy and how is it calculated?

Shannon entropy measures the average information content (randomness) in a text. It is calculated using H(X) = -Σ [p(xᵢ) × log₂(p(xᵢ))], where p(xᵢ) is the probability of each unique character. The result is expressed in bits per character — higher values mean more randomness or information density.

How does this calculator work?

Enter any text and the tool instantly computes its Shannon entropy by analysing character (or byte) frequencies. It shows the entropy value in bits per character, total entropy bits, normalised entropy percentage, a character frequency table, and a password strength rating — all calculated directly in your browser.

What does a high or low entropy value mean?

High entropy (close to the maximum) means characters are distributed uniformly, indicating random, diverse, or encrypted content. Low entropy means a few characters dominate, as in predictable, compressible, or structured text. For example, 'aaaaaaa' has an entropy of exactly 0, while a random password has entropy near the maximum for its character set.

How is password strength determined?

Password strength is based on total entropy bits (entropy per character × password length): under 28 bits is Weak, 28–35 bits is Fair, 36–59 bits is Strong, and 60 or more bits is Very Strong. These are commonly cited rule-of-thumb ranges rather than a formal standard.

What is normalised entropy?

Normalised entropy expresses the entropy as a percentage of the theoretical maximum for the detected alphabet size. For example, if the max possible entropy for your character set is 6.0 bits/char and the actual entropy is 4.5 bits/char, the normalised score is 75%. It helps compare texts using different alphabets on a common scale.

Is my text sent to any server?

No. All entropy calculations are performed entirely in your browser using JavaScript. Your text never leaves your device, ensuring complete privacy for passwords, sensitive documents, or confidential data.