ASCII / Unicode Explorer

Inspect pasted text, Unicode code points, UTF-8 bytes, UTF-16 code units, JavaScript escapes, and a full 128-row ASCII table without leaving the browser.

UTF-16 code units: 11
Unicode code points: 10
UTF-8 bytes: 16
Grapheme clusters: 9
Lines: 2
CRLF 0 · CR 0 · LF 1

Surrogate-pair sequence detected

At least one selected symbol uses two UTF-16 code units for one Unicode code point. This is why JavaScript string length can be larger than the visible symbol count.

Combining-mark sequence detected

A combining mark modifies the character before it instead of standing alone. One visible glyph can therefore span multiple code points even when the text looks like a single character.

Normalization difference detected

The original text changes under NFC normalization, so visually identical strings may compare differently until they are normalized to the same canonical form.

Selected code point: A

Assigned Unicode code point (name not included in pinned v1 dataset)

U+0041 · valid Unicode scalar value

Labels: ASCII, printable

ASCII subset

This value is inside the 7-bit ASCII range U+0000 through U+007F, so it is valid ASCII as well as Unicode.

Literal: A
U+ notation: U+0041
Hex: 0x41
Decimal: 65
Binary: 1000001
UTF-8 bytes: 0x41
UTF-16 code units: 0x0041
JS escape: \u0041
JS code point escape: \u{41}
HTML numeric reference (decimal): &#65;
HTML numeric reference (hex): &#x41;

Code point breakdown

One row per Unicode code point, not one row per UTF-16 index.

#   Preview          Name / status                                                         U+        Dec     UTF-16         UTF-8                Labels
1   A                Assigned Unicode code point (name not included in pinned v1 dataset)  U+0041    65      0x0041         0x41                 ASCII, printable
2   TAB              CHARACTER TABULATION                                                  U+0009    9       0x0009         0x09                 ASCII, control
3   C                Assigned Unicode code point (name not included in pinned v1 dataset)  U+0043    67      0x0043         0x43                 ASCII, printable
4   a                Assigned Unicode code point (name not included in pinned v1 dataset)  U+0061    97      0x0061         0x61                 ASCII, printable
5   f                Assigned Unicode code point (name not included in pinned v1 dataset)  U+0066    102     0x0066         0x66                 ASCII, printable
6   e                Assigned Unicode code point (name not included in pinned v1 dataset)  U+0065    101     0x0065         0x65                 ASCII, printable
7   COMBINING ACUTE  COMBINING ACUTE ACCENT                                                U+0301    769     0x0301         0xCC 0x81            printable, combining mark
8   LF               LINE FEED                                                             U+000A    10      0x000A         0x0A                 ASCII, control
9   ZWSP             ZERO WIDTH SPACE                                                      U+200B    8203    0x200B         0xE2 0x80 0x8B       invisible
10  😀               Assigned Unicode code point (name not included in pinned v1 dataset)  U+01F600  128512  0xD83D 0xDE00  0xF0 0x9F 0x98 0x80  printable

ASCII is exactly 128 values. Rows 0x80-0xFF are not ASCII.

Dec  Hex   Binary    Abbr     Name                       Preview
0    0x00  00000000  NUL      NULL                       NUL
1    0x01  00000001  SOH      START OF HEADING           SOH
2    0x02  00000010  STX      START OF TEXT              STX
3    0x03  00000011  ETX      END OF TEXT                ETX
4    0x04  00000100  EOT      END OF TRANSMISSION        EOT
5    0x05  00000101  ENQ      ENQUIRY                    ENQ
6    0x06  00000110  ACK      ACKNOWLEDGE                ACK
7    0x07  00000111  BEL      BELL                       BEL
8    0x08  00001000  BS       BACKSPACE                  BS
9    0x09  00001001  TAB      CHARACTER TABULATION       TAB
10   0x0A  00001010  LF       LINE FEED                  LF
11   0x0B  00001011  VT       LINE TABULATION            VT
12   0x0C  00001100  FF       FORM FEED                  FF
13   0x0D  00001101  CR       CARRIAGE RETURN            CR
14   0x0E  00001110  SO       SHIFT OUT                  SO
15   0x0F  00001111  SI       SHIFT IN                   SI
16   0x10  00010000  DLE      DATA LINK ESCAPE           DLE
17   0x11  00010001  DC1      DEVICE CONTROL ONE         DC1
18   0x12  00010010  DC2      DEVICE CONTROL TWO         DC2
19   0x13  00010011  DC3      DEVICE CONTROL THREE       DC3
20   0x14  00010100  DC4      DEVICE CONTROL FOUR        DC4
21   0x15  00010101  NAK      NEGATIVE ACKNOWLEDGE       NAK
22   0x16  00010110  SYN      SYNCHRONOUS IDLE           SYN
23   0x17  00010111  ETB      END OF TRANSMISSION BLOCK  ETB
24   0x18  00011000  CAN      CANCEL                     CAN
25   0x19  00011001  EM       END OF MEDIUM              EM
26   0x1A  00011010  SUB      SUBSTITUTE                 SUB
27   0x1B  00011011  ESC      ESCAPE                     ESC
28   0x1C  00011100  FS       FILE SEPARATOR             FS
29   0x1D  00011101  GS       GROUP SEPARATOR            GS
30   0x1E  00011110  RS       RECORD SEPARATOR           RS
31   0x1F  00011111  US       UNIT SEPARATOR             US
32   0x20  00100000  (space)  ASCII value                (space)
33   0x21  00100001  !        ASCII value                !
34   0x22  00100010  "        ASCII value                "
35   0x23  00100011  #        ASCII value                #
36   0x24  00100100  $        ASCII value                $
37   0x25  00100101  %        ASCII value                %
38   0x26  00100110  &        ASCII value                &
39   0x27  00100111  '        ASCII value                '
40   0x28  00101000  (        ASCII value                (
41   0x29  00101001  )        ASCII value                )
42   0x2A  00101010  *        ASCII value                *
43   0x2B  00101011  +        ASCII value                +
44   0x2C  00101100  ,        ASCII value                ,
45   0x2D  00101101  -        ASCII value                -
46   0x2E  00101110  .        ASCII value                .
47   0x2F  00101111  /        ASCII value                /
48   0x30  00110000  0        ASCII value                0
49   0x31  00110001  1        ASCII value                1
50   0x32  00110010  2        ASCII value                2
51   0x33  00110011  3        ASCII value                3
52   0x34  00110100  4        ASCII value                4
53   0x35  00110101  5        ASCII value                5
54   0x36  00110110  6        ASCII value                6
55   0x37  00110111  7        ASCII value                7
56   0x38  00111000  8        ASCII value                8
57   0x39  00111001  9        ASCII value                9
58   0x3A  00111010  :        ASCII value                :
59   0x3B  00111011  ;        ASCII value                ;
60   0x3C  00111100  <        ASCII value                <
61   0x3D  00111101  =        ASCII value                =
62   0x3E  00111110  >        ASCII value                >
63   0x3F  00111111  ?        ASCII value                ?
64   0x40  01000000  @        ASCII value                @
65   0x41  01000001  A        ASCII value                A
66   0x42  01000010  B        ASCII value                B
67   0x43  01000011  C        ASCII value                C
68   0x44  01000100  D        ASCII value                D
69   0x45  01000101  E        ASCII value                E
70   0x46  01000110  F        ASCII value                F
71   0x47  01000111  G        ASCII value                G
72   0x48  01001000  H        ASCII value                H
73   0x49  01001001  I        ASCII value                I
74   0x4A  01001010  J        ASCII value                J
75   0x4B  01001011  K        ASCII value                K
76   0x4C  01001100  L        ASCII value                L
77   0x4D  01001101  M        ASCII value                M
78   0x4E  01001110  N        ASCII value                N
79   0x4F  01001111  O        ASCII value                O
80   0x50  01010000  P        ASCII value                P
81   0x51  01010001  Q        ASCII value                Q
82   0x52  01010010  R        ASCII value                R
83   0x53  01010011  S        ASCII value                S
84   0x54  01010100  T        ASCII value                T
85   0x55  01010101  U        ASCII value                U
86   0x56  01010110  V        ASCII value                V
87   0x57  01010111  W        ASCII value                W
88   0x58  01011000  X        ASCII value                X
89   0x59  01011001  Y        ASCII value                Y
90   0x5A  01011010  Z        ASCII value                Z
91   0x5B  01011011  [        ASCII value                [
92   0x5C  01011100  \        ASCII value                \
93   0x5D  01011101  ]        ASCII value                ]
94   0x5E  01011110  ^        ASCII value                ^
95   0x5F  01011111  _        ASCII value                _
96   0x60  01100000  `        ASCII value                `
97   0x61  01100001  a        ASCII value                a
98   0x62  01100010  b        ASCII value                b
99   0x63  01100011  c        ASCII value                c
100  0x64  01100100  d        ASCII value                d
101  0x65  01100101  e        ASCII value                e
102  0x66  01100110  f        ASCII value                f
103  0x67  01100111  g        ASCII value                g
104  0x68  01101000  h        ASCII value                h
105  0x69  01101001  i        ASCII value                i
106  0x6A  01101010  j        ASCII value                j
107  0x6B  01101011  k        ASCII value                k
108  0x6C  01101100  l        ASCII value                l
109  0x6D  01101101  m        ASCII value                m
110  0x6E  01101110  n        ASCII value                n
111  0x6F  01101111  o        ASCII value                o
112  0x70  01110000  p        ASCII value                p
113  0x71  01110001  q        ASCII value                q
114  0x72  01110010  r        ASCII value                r
115  0x73  01110011  s        ASCII value                s
116  0x74  01110100  t        ASCII value                t
117  0x75  01110101  u        ASCII value                u
118  0x76  01110110  v        ASCII value                v
119  0x77  01110111  w        ASCII value                w
120  0x78  01111000  x        ASCII value                x
121  0x79  01111001  y        ASCII value                y
122  0x7A  01111010  z        ASCII value                z
123  0x7B  01111011  {        ASCII value                {
124  0x7C  01111100  |        ASCII value                |
125  0x7D  01111101  }        ASCII value                }
126  0x7E  01111110  ~        ASCII value                ~
127  0x7F  01111111  DEL      DELETE                     DEL

Unicode lookup data version: 16.0.0. This build ships a pinned local dataset of official Unicode names with no runtime fetches, while all status detection and byte calculations still run entirely in the browser.

ASCII / Unicode Explorer: Inspect Hidden Characters, UTF-8 Bytes, and the Full ASCII Table

ASCII and Unicode get mixed together constantly in debugging conversations, but they solve different problems. ASCII is a fixed 128-value character set from 0 to 127 (0x00-0x7F). Unicode is the much larger standard that assigns code points across modern scripts, symbols, emoji, controls, and formatting marks. UTF-8 and UTF-16 are encodings and storage forms for those Unicode code points, not separate character sets.

This tool focuses on inspection rather than lossy conversion. Paste text with hidden characters, enter a code point like U+200B or 0x41, and inspect the result locally in the browser. Toolzy breaks text down by Unicode code point, shows UTF-8 bytes and UTF-16 code units, and keeps a canonical 128-row ASCII table on the same page for quick reference.
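As a rough illustration of the code point input described above, the sketch below parses strings such as U+200B, 0x41, or a plain decimal into a number. The names `parseCodePoint` and `isScalarValue` are hypothetical helpers invented for this example, not Toolzy's actual code, and the accepted input formats are an assumption.

```javascript
// Hypothetical helper: parse "U+200B", "0x41", or a decimal string like "65"
// into a code point number. Returns null when the input is not a code point.
function parseCodePoint(input) {
  const text = input.trim();
  let value;
  if (/^U\+[0-9a-f]{1,6}$/i.test(text)) {
    value = parseInt(text.slice(2), 16);
  } else if (/^0x[0-9a-f]{1,6}$/i.test(text)) {
    value = parseInt(text.slice(2), 16);
  } else if (/^[0-9]{1,7}$/.test(text)) {
    value = parseInt(text, 10);
  } else {
    return null;
  }
  // Unicode code points end at U+10FFFF.
  return value <= 0x10ffff ? value : null;
}

// Surrogates (U+D800 through U+DFFF) are code points but not scalar values.
function isScalarValue(codePoint) {
  return codePoint < 0xd800 || codePoint > 0xdfff;
}

console.log(parseCodePoint("U+200B")); // 8203
console.log(parseCodePoint("0x41")); // 65
console.log(isScalarValue(0xd83d)); // false
```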

What this tool helps you debug

ASCII vs Unicode vs UTF-8 vs UTF-16

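These terms are related but not interchangeable: ASCII is a 128-value character set, Unicode is the standard that assigns code points, and UTF-8 and UTF-16 are encodings that turn those code points into bytes or 16-bit code units.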

That distinction matters in JavaScript because strings are sequences of UTF-16 code units. A character like A is one code point and one UTF-16 code unit. A character like 😀 is one code point but two UTF-16 code units. A visible symbol like é, when written as e followed by U+0301 COMBINING ACUTE ACCENT, is two code points but one grapheme cluster.
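To make the difference between a code point and its encodings concrete, here is a minimal sketch that prints the UTF-16 code units and UTF-8 bytes for a single code point. `encodeCodePoint` and `hex` are illustrative names, not part of any library. The sketch relies on the standard `String.fromCodePoint` and `TextEncoder` APIs.

```javascript
// Format a number as 0x-prefixed uppercase hex with a minimum width.
const hex = (n, width) =>
  "0x" + n.toString(16).toUpperCase().padStart(width, "0");

// Illustrative helper: show how one code point is stored in each encoding.
// Note: TextEncoder replaces a lone surrogate with U+FFFD, so pass scalar values.
function encodeCodePoint(codePoint) {
  const ch = String.fromCodePoint(codePoint);
  const utf16 = [];
  for (let i = 0; i < ch.length; i++) {
    utf16.push(hex(ch.charCodeAt(i), 4));
  }
  const utf8 = Array.from(new TextEncoder().encode(ch), (b) => hex(b, 2));
  return { utf16, utf8 };
}

console.log(encodeCodePoint(0x41)); // one code unit, one byte
console.log(encodeCodePoint(0x0301)); // one code unit, two bytes
console.log(encodeCodePoint(0x1f600)); // two code units, four bytes
```

The same code point therefore has a different size depending on which encoding you measure.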

Why hidden characters break real workflows

Many production bugs come from characters that are technically valid but visually hard to spot. A URL copied from chat may contain U+200B ZERO WIDTH SPACE. A CMS export may include U+00A0 NO-BREAK SPACE instead of a normal space. Logs may contain smart quotes, directional isolates, or replacement characters.

Those values can affect parsing, equality checks, sorting, wrapping, tokenization, and test fixtures. The point of an explorer is to show the raw reality of the string without silently normalizing or cleaning it up.
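A small scanner shows how such characters can be surfaced without altering the string. This is only a sketch: the `SUSPECTS` list is a short, non-exhaustive selection chosen for this example, and `findSuspects` is a hypothetical helper rather than the tool's real detection logic.

```javascript
// Illustrative, non-exhaustive list of characters that are easy to miss.
const SUSPECTS = new Map([
  [0x00a0, "NO-BREAK SPACE"],
  [0x200b, "ZERO WIDTH SPACE"],
  [0x200c, "ZERO WIDTH NON-JOINER"],
  [0x200d, "ZERO WIDTH JOINER"],
  [0x2018, "LEFT SINGLE QUOTATION MARK"],
  [0x2019, "RIGHT SINGLE QUOTATION MARK"],
  [0x201c, "LEFT DOUBLE QUOTATION MARK"],
  [0x201d, "RIGHT DOUBLE QUOTATION MARK"],
  [0x2066, "LEFT-TO-RIGHT ISOLATE"],
  [0x2067, "RIGHT-TO-LEFT ISOLATE"],
  [0x2068, "FIRST STRONG ISOLATE"],
  [0x2069, "POP DIRECTIONAL ISOLATE"],
  [0xfeff, "ZERO WIDTH NO-BREAK SPACE"],
  [0xfffd, "REPLACEMENT CHARACTER"],
]);

// Walk the string by code point and report each suspect with its UTF-16 index.
function findSuspects(text) {
  const hits = [];
  let index = 0;
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    if (SUSPECTS.has(codePoint)) {
      hits.push({ index, codePoint, name: SUSPECTS.get(codePoint) });
    }
    index += ch.length; // 1 or 2 UTF-16 code units
  }
  return hits;
}

const url = "https://example.com/\u200Bpath";
console.log(findSuspects(url)); // one hit: ZERO WIDTH SPACE at index 20
```

The function only reports what it finds. It leaves the input untouched, which matches the inspect-first approach described above.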

Surrogate pairs, combining marks, and string.length

If string.length feels wrong, it's usually measuring UTF-16 code units instead of the visible units you care about.

That is why this tool reports UTF-16 code units, Unicode code points, and grapheme clusters separately when the browser supports Intl.Segmenter.
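The three counts can be reproduced with standard APIs. The sketch below is one possible approach, not necessarily how the tool computes them. `countUnits` is an illustrative name, and the sample string is an assumption chosen to match the code point breakdown shown at the top of this page.

```javascript
// Count a string at three levels. graphemes stays null when the runtime
// does not provide Intl.Segmenter.
function countUnits(text) {
  const counts = {
    utf16CodeUnits: text.length,
    codePoints: Array.from(text).length, // string iteration is per code point
    graphemes: null,
  };
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    counts.graphemes = Array.from(segmenter.segment(text)).length;
  }
  return counts;
}

// A, TAB, C, a, f, e + COMBINING ACUTE ACCENT, LF, ZERO WIDTH SPACE, emoji
const sample = "A\tCafe\u0301\n\u200B\u{1F600}";
console.log(countUnits(sample));
```

Returning null instead of guessing keeps the fallback honest on runtimes without a segmenter.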

Using the ASCII table correctly

The ASCII table on this page is intentionally strict: exactly 128 rows, exactly 0x00-0x7F. Values from 0x80 to 0xFF are not ASCII, even if older docs call them "extended ASCII". Those values belong to an 8-bit code page such as ISO 8859-1 or Windows-1252, or to Unicode, depending on context.

If you need to answer questions like "what is ASCII 65" or "what is hex 0x09", the table covers that directly. If you need to answer "what hidden Unicode character is in this payload", use the inspector above it.
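For quick lookups in code, a table row can be derived from the numeric value alone. The following sketch uses a hypothetical `asciiRow` helper and mirrors the strict 0x00-0x7F rule. It does not include the control-character names, which would need a lookup table.

```javascript
// Illustrative helper: build the numeric columns of an ASCII table row.
// Returns null for anything outside the 7-bit range, since that is not ASCII.
function asciiRow(value) {
  if (!Number.isInteger(value) || value < 0 || value > 0x7f) {
    return null;
  }
  const isControl = value < 0x20 || value === 0x7f;
  return {
    dec: value,
    hex: "0x" + value.toString(16).toUpperCase().padStart(2, "0"),
    binary: value.toString(2).padStart(8, "0"),
    kind: isControl ? "control" : "printable",
    preview: isControl ? null : String.fromCharCode(value),
  };
}

console.log(asciiRow(65)); // hex 0x41, binary 01000001, preview A
console.log(asciiRow(0x09)); // dec 9, control, no preview
console.log(asciiRow(0x80)); // null
```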

Troubleshooting

Why does an emoji count as 2 in JavaScript? — JavaScript string length counts UTF-16 code units. Many emoji are one Unicode code point represented by a surrogate pair, so they take two UTF-16 code units.
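The surrogate pair for a code point above U+FFFF follows from simple arithmetic: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. The sketch below uses an illustrative `toSurrogatePair` helper to show this.

```javascript
// Compute the UTF-16 surrogate pair for a supplementary-plane code point.
// Returns null for code points that fit in a single code unit or are out of range.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) return null;
  const offset = codePoint - 0x10000; // 20-bit value
  const high = 0xd800 + (offset >> 10); // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff); // bottom 10 bits
  return [high, low];
}

const pair = toSurrogatePair(0x1f600);
console.log(pair.map((u) => u.toString(16).toUpperCase())); // [ 'D83D', 'DE00' ]
console.log("\u{1F600}".length); // 2
```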

Why do two strings look identical but compare differently? — They may use different Unicode sequences, such as a precomposed character versus a base letter plus combining mark, or a normal space versus a non-breaking space.
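A short example makes both cases visible. It uses the standard `String.prototype.normalize` method. `equalCanonical` is a hypothetical helper name, and whether to normalize at all is a decision for your application.

```javascript
const precomposed = "caf\u00E9"; // é as a single code point
const decomposed = "cafe\u0301"; // e followed by COMBINING ACUTE ACCENT

console.log(precomposed === decomposed); // false

// Compare after bringing both sides to the same canonical form.
function equalCanonical(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}
console.log(equalCanonical(precomposed, decomposed)); // true

// NFC does not turn NO-BREAK SPACE into a normal space.
// The compatibility form NFKC does, but it also changes other characters.
console.log("a\u00A0b".normalize("NFC") === "a b"); // false
console.log("a\u00A0b".normalize("NFKC") === "a b"); // true
```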

Why is UTF-8 unavailable for some pasted text? — The text may contain ill-formed UTF-16 such as a lone surrogate. JavaScript can hold that raw data, but valid UTF-8 requires Unicode scalar values.
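One way to check for this condition is to scan the UTF-16 code units for a high surrogate without a following low surrogate, or a low surrogate on its own. The sketch below does that with a manual loop. `findLoneSurrogates` is an illustrative name, and newer runtimes may also offer `String.prototype.isWellFormed` for a simple yes or no answer.

```javascript
// Return the UTF-16 indexes of surrogates that are not part of a valid pair.
function findLoneSurrogates(text) {
  const positions = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = text.charCodeAt(i + 1); // NaN at the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // well-formed pair, skip the low surrogate
      } else {
        positions.push(i); // high surrogate with no partner
      }
    } else if (unit >= 0xdc00 && unit <= 0xdfff) {
      positions.push(i); // low surrogate with no preceding high surrogate
    }
  }
  return positions;
}

console.log(findLoneSurrogates("ok \u{1F600}")); // []
console.log(findLoneSurrogates("bad \uD83D!")); // [ 4 ]
```

An empty result means the string contains only Unicode scalar values and can be encoded as UTF-8 without loss.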

Why is a code point valid even if it renders as a box or tofu? — Unicode validity and device font support are different things. A valid code point can still lack a glyph on the current system.

Why doesn't Toolzy show an official name for every assigned character yet? — This v1 ships with a pinned local name subset for ASCII and common debugging characters so the tool stays lightweight. Status detection still works for all code points without runtime fetches.