The Complete Guide to Small Text and Unicode Subscripts

Small text generators swap ordinary letters and digits for look-alike Unicode characters: small caps, superscripts, and subscripts. This guide explains how these characters work under the hood, where they're useful, and where they'll let you down.


How small text works in Unicode

Unicode doesn't have a single "make it small" switch. Instead, dedicated code points scattered across several blocks represent smaller variants of letters and digits. The Superscripts and Subscripts block (U+2070–U+209F) covers most super/subscript digits and a handful of letters; superscript ¹, ², and ³ sit in Latin-1 Supplement for historical reasons. Additional characters live in IPA Extensions, Phonetic Extensions, Spacing Modifier Letters, and Latin Extended-C and -D.

The important thing to understand: these aren't styling instructions. Each small character is its own code point with its own identity. When you type H₂O, the ₂ isn't a styled 2. It's U+2082, SUBSCRIPT TWO, a completely separate character. This is what lets small text travel across platforms without any formatting support.
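A quick way to see this for yourself is to ask Python's standard unicodedata module what each character in a string actually is. The describe helper below is a hypothetical name used for illustration only.

```python
import unicodedata

def describe(text):
    """Return (code point, official Unicode name) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# The middle character of H₂O is a separate character, not a styled "2".
for code_point, name in describe("H₂O"):
    print(code_point, name)
# U+0048 LATIN CAPITAL LETTER H
# U+2082 SUBSCRIPT TWO
# U+004F LATIN CAPITAL LETTER O
```

Running the same helper on a plain "2" reports U+0032 DIGIT TWO, which confirms that the two characters share nothing but their appearance.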

The catch is that not every letter has a super, sub, or small-caps equivalent. Subscripts, for example, cover all ten digits but only 17 lowercase letters; b, c, d, f, g, q, w, y, and z have no subscript form at all. Generators typically leave unsupported characters unchanged or substitute the closest visual match.
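The "leave unsupported characters unchanged" behavior can be sketched with a translation table. This is a minimal illustration, not how any particular generator is implemented; the names PLAIN, SUBSCRIPT, and to_subscript are assumptions made for the example.

```python
# Plain characters that have a Unicode subscript form, and those forms.
# The two strings must line up position by position.
PLAIN = "0123456789aehijklmnoprstuvx"
SUBSCRIPT = "₀₁₂₃₄₅₆₇₈₉ₐₑₕᵢⱼₖₗₘₙₒₚᵣₛₜᵤᵥₓ"

# str.maketrans raises ValueError if the strings differ in length.
SUBSCRIPT_TABLE = str.maketrans(PLAIN, SUBSCRIPT)

def to_subscript(text):
    """Convert what can be converted; anything else passes through as-is."""
    return text.translate(SUBSCRIPT_TABLE)

print(to_subscript("x2 big"))  # ₓ₂ bᵢg  (b and g have no subscript form)
```

The mixed output in the last line is exactly what users see when a word contains letters outside the supported set.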


Small caps

Small caps are a typographic convention where lowercase letters appear as smaller versions of their uppercase forms. You'll see them in academic citations (author names in bibliographies), legal documents (defined terms and section titles), and stylistic text on social media.

Unicode small caps aren't true typographic small caps, which require font-level support. Instead, generators map lowercase letters to characters from IPA Extensions (U+0250–U+02AF), Phonetic Extensions (U+1D00–U+1D7F), and Latin Extended-D that happen to look like small uppercase letters. The result (ʟɪᴋᴇ ᴛʜɪs) is convincing at a glance but technically a string of phonetic symbols. Some letters (notably f, q, x) have weaker substitutions, or none at all, and may look slightly off depending on the font.
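The sketch below shows one possible mapping table and reports which block each substituted character comes from. The table is an assumption for illustration: it leaves q, s, and x untouched, and real generators make different choices. The block ranges are hard-coded from the Unicode code charts because Python's unicodedata module does not expose block names.

```python
import unicodedata

# One possible lowercase -> small capital table (q, s, x left unmapped).
SMALL_CAPS = {
    "a": "\u1d00", "b": "\u0299", "c": "\u1d04", "d": "\u1d05",
    "e": "\u1d07", "f": "\ua730", "g": "\u0262", "h": "\u029c",
    "i": "\u026a", "j": "\u1d0a", "k": "\u1d0b", "l": "\u029f",
    "m": "\u1d0d", "n": "\u0274", "o": "\u1d0f", "p": "\u1d18",
    "r": "\u0280", "t": "\u1d1b", "u": "\u1d1c", "v": "\u1d20",
    "w": "\u1d21", "y": "\u028f", "z": "\u1d22",
}
TABLE = str.maketrans(SMALL_CAPS)

def to_small_caps(text):
    """Map lowercase letters; capitals and unmapped letters stay as typed."""
    return text.translate(TABLE)

def block_of(ch):
    """Name the Unicode block for the ranges this table draws on."""
    cp = ord(ch)
    if cp <= 0x007F:
        return "Basic Latin"
    if 0x0250 <= cp <= 0x02AF:
        return "IPA Extensions"
    if 0x1D00 <= cp <= 0x1D7F:
        return "Phonetic Extensions"
    if 0xA720 <= cp <= 0xA7FF:
        return "Latin Extended-D"
    return "other"

print(to_small_caps("like this"))  # ʟɪᴋᴇ ᴛʜɪs

# A single short word can pull characters from four different blocks.
for ch in to_small_caps("fights"):
    print(f"U+{ord(ch):04X}", block_of(ch), unicodedata.name(ch))
```

Because the characters come from several blocks, a font that covers one block but not another will render part of a word correctly and part of it as fallback glyphs.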


Superscript and subscript

Unicode provides superscript forms for all ten digits (⁰¹²³⁴⁵⁶⁷⁸⁹) and a reasonable set of lowercase letters (ᵃᵇᶜᵈᵉᶠᵍʰⁱʲᵏˡᵐⁿᵒᵖ…). Subscript coverage is thinner: all digits (₀₁₂₃₄₅₆₇₈₉) are present, but only a handful of letters (ₐₑₕᵢⱼₖₗₘₙₒₚᵣₛₜᵤᵥₓ) have subscript code points.

Common uses include mathematical notation (x² + y²), chemical formulas (H₂O, CO₂), footnote markers in plain text, and phonetic transcription in linguistics. Because these are real characters, they paste cleanly into any text field — no markup required.
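For the two most common cases, chemical formulas and simple exponents, the conversion can be sketched in a few lines of Python. The function names chem and power are hypothetical, and the rules are deliberately naive: a digit run directly after a letter or closing parenthesis becomes a subscript, and digits after a caret become a superscript. Ionic charges, isotopes, and nested expressions are out of scope.

```python
import re

SUB_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
SUP_DIGITS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

def chem(formula):
    """Subscript digit runs that follow a letter or ')'; leading counts stay."""
    return re.sub(
        r"(?<=[A-Za-z)])[0-9]+",
        lambda m: m.group().translate(SUB_DIGITS),
        formula,
    )

def power(expr):
    """Turn ^ followed by digits into superscript digits, dropping the caret."""
    return re.sub(
        r"\^([0-9]+)",
        lambda m: m.group(1).translate(SUP_DIGITS),
        expr,
    )

print(chem("C6H12O6"))     # C₆H₁₂O₆
print(chem("2H2O"))        # 2H₂O
print(power("x^2 + y^2"))  # x² + y²
```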


Where they work and where they don't

Small text characters are standard Unicode, so they work anywhere Unicode is supported: social media bios and posts, messaging apps, forum comments, email subjects, and file names. That's the whole appeal — styled text without any rich-text editor.

The limitation is font coverage. Not every font includes glyphs for Phonetic Extensions or the full Superscripts and Subscripts block. On devices or platforms missing those glyphs, you'll see empty boxes (□) or fallback glyphs pulled from another font that don't match the surrounding text. Replacement characters (�) point to a different problem: the text was mangled by an encoding conversion somewhere along the way. Older operating systems and devices with a limited set of system fonts are the most common culprits.


Alternatives for the web

If you control the rendering environment (your own website, email template, or document), Unicode substitution is the wrong tool. HTML gives you <sup> and <sub> elements that work with any font. CSS font-variant: small-caps uses the font's own small-cap glyphs when it has them (browsers otherwise synthesize them by scaling down capitals), which looks far better than phonetic symbol substitutions.

Use Unicode small text when you can't control rendering: a Twitter bio, an Instagram caption, a plain-text README, or a chat message. Everywhere else, prefer semantic markup — it's more reliable, more accessible, and renders consistently.
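When the output is a web page or HTML email, the same inputs can be turned into markup instead of substituted characters. The Python sketch below assumes two hypothetical helpers, formula_to_html and small_caps_html, and uses the same naive "digits after a letter" rule as a typical generator; only html.escape and re.split come from the standard library.

```python
import html
import re

def formula_to_html(formula):
    """Wrap digit runs that follow a letter or ')' in <sub> elements."""
    # With one capture group, re.split alternates: text, digits, text, ...
    parts = re.split(r"(?<=[A-Za-z)])([0-9]+)", formula)
    return "".join(
        f"<sub>{part}</sub>" if index % 2 else html.escape(part)
        for index, part in enumerate(parts)
    )

def small_caps_html(text):
    """Leave the text untouched and let CSS handle the small caps."""
    return f'<span style="font-variant: small-caps">{html.escape(text)}</span>'

print(formula_to_html("H2O"))        # H<sub>2</sub>O
print(small_caps_html("Like this"))  # <span style="font-variant: small-caps">Like this</span>
```

The underlying text stays as ordinary letters and digits, so search, copy and paste, and assistive technology all see "H2O" and "Like this" rather than look-alike symbols.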


Troubleshooting

Some letters don't convert to small text — Not every letter has a Unicode superscript, subscript, or small-caps equivalent. The generator leaves unsupported characters in their original form. This is a Unicode limitation, not a bug.

Text looks wrong on some devices or platforms — The target device's font likely doesn't include glyphs for the Unicode blocks used. Try a different platform or accept that rendering varies. There's no fix you can apply from the sender's side.

Screen readers pronounce strange words instead of my text — Screen readers work from the underlying characters, not the visual appearance. Depending on the screen reader, a small-caps "HELLO" made from phonetic symbols may be announced character by character as "Latin letter small capital H, Latin letter small capital E…", mispronounced, or skipped entirely, none of which is intelligible to listeners. Avoid Unicode small text in contexts where accessibility matters.

Mixing small text with regular text looks inconsistent — Small-caps and superscript characters come from different Unicode blocks, and the system often draws them from different fallback fonts with different metrics. Line height, weight, and baseline can shift mid-word. This is inherent to the approach; for consistent typography, use CSS or proper typesetting tools instead.
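For the screen reader problem above, it can help to check text for small-text characters before publishing it somewhere accessibility matters. The following Python sketch flags characters by their official Unicode names. It is a heuristic under stated assumptions: the MARKERS list and the find_small_text name are invented for this example, and the name patterns will miss some characters and may flag legitimate phonetic notation.

```python
import unicodedata

# Substrings of official character names that usually indicate small text.
MARKERS = ("SMALL CAPITAL", "SUPERSCRIPT", "SUBSCRIPT", "MODIFIER LETTER SMALL")

def find_small_text(text):
    """Return (index, character, Unicode name) for each suspect character."""
    hits = []
    for index, ch in enumerate(text):
        name = unicodedata.name(ch, "")
        if any(marker in name for marker in MARKERS):
            hits.append((index, ch, name))
    return hits

for index, ch, name in find_small_text("ʜᴇʟʟᴏ H₂O"):
    print(index, ch, name)
```

The names printed here are the same ones a screen reader may fall back on when it announces the text character by character.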