ASCII & Unicode Cheat Sheet

ASCII and Unicode reference with character codes, HTML entities, escape sequences, and encoding conversions. Searchable table.

63 entries 7 sections

Control

Code  Description          Use Case
0x00  Null character       \0 in C/JavaScript strings
0x07  Bell / alert         \a - terminal beep
0x08  Backspace            \b - move cursor back
0x09  Horizontal tab       \t - tab character
0x0A  Line feed (newline)  \n - Unix line ending
0x0D  Carriage return      \r - Windows uses \r\n
0x1B  Escape               \e or \x1b - terminal escape sequences
0x7F  Delete               ASCII delete character
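As a quick check of the control characters above, the following Python sketch maps each escape sequence to its ASCII code and normalizes Windows line endings. The variable names are illustrative only, and note that Python has no `\e` escape, so `\x1b` is used for Escape.

```python
# Escape sequences for the control characters, mapped to their ASCII codes.
controls = {
    "\0": 0x00,    # null
    "\a": 0x07,    # bell / alert
    "\b": 0x08,    # backspace
    "\t": 0x09,    # horizontal tab
    "\n": 0x0A,    # line feed
    "\r": 0x0D,    # carriage return
    "\x1b": 0x1B,  # escape (Python has no \e escape)
    "\x7f": 0x7F,  # delete
}

for char, code in controls.items():
    assert ord(char) == code

# Normalize Windows line endings (\r\n) to Unix line endings (\n).
windows_text = "line1\r\nline2\r\n"
unix_text = windows_text.replace("\r\n", "\n")
print(repr(unix_text))  # 'line1\nline2\n'
```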

Symbols

Code       Description                       Use Case
0x20       Space                             Regular space character
0x21       Exclamation mark                  ! → logical NOT in many languages
0x22       Double quote                      " → string delimiter
0x23       Number sign / hash                # → comment, hashtag, CSS id
0x24       Dollar sign                       $ → variable prefix (PHP, Bash, jQuery)
0x26       Ampersand                         & → bitwise/logical AND, HTML entity (&amp;)
0x27       Single quote / apostrophe         ' → string delimiter, char literal
0x28-0x29  Parentheses                       ( ) → grouping, function calls
0x2A       Asterisk                          * → multiplication, pointer, wildcard
0x2B       Plus sign                         + → addition, string concatenation
0x2C       Comma                             , → separator in lists, parameters
0x2D       Hyphen / minus                    - → subtraction, CLI flags
0x2E       Period / dot                      . → decimal, member access, file extension
0x2F       Forward slash                     / → division, path separator, regex
0x3A       Colon                             : → key-value separator (JSON, YAML)
0x3B       Semicolon                         ; → statement terminator
0x3C-0x3E  Less-than, equals, greater-than   <, =, > → comparison, HTML tags
0x40       At sign                           @ → email, decorators, mentions
0x5B-0x5D  Brackets, backslash               [ ] → arrays; \ → escape character
0x5E       Caret                             ^ → XOR, regex start-of-line
0x5F       Underscore                        _ → variable names (snake_case)
0x60       Backtick / grave accent           ` → template literals (JS), code (Markdown)
0x7B-0x7D  Braces, pipe                      { } → blocks, objects; | → pipe, OR
0x7E       Tilde                             ~ → home directory, bitwise NOT

Alphanumeric

Code       Description    Use Case
0x30-0x39  Digits 0-9     0123456789
0x41-0x5A  Uppercase A-Z  ABCDEFGHIJKLMNOPQRSTUVWXYZ
0x61-0x7A  Lowercase a-z  abcdefghijklmnopqrstuvwxyz
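Because digits and letters occupy contiguous ASCII ranges, simple arithmetic works on their codes. The sketch below assumes ASCII input only; `digit_value` and `toggle_case` are hypothetical helper names, not part of any library.

```python
def digit_value(ch: str) -> int:
    """Numeric value of an ASCII digit: '0' is 0x30, so subtract it."""
    if len(ch) != 1 or not "0" <= ch <= "9":
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) - ord("0")


def toggle_case(ch: str) -> str:
    """Flip the case of an ASCII letter by toggling bit 0x20.

    'A' is 0x41 and 'a' is 0x61, so the two cases differ only in that bit.
    Non-letters are returned unchanged.
    """
    if len(ch) == 1 and ("A" <= ch <= "Z" or "a" <= ch <= "z"):
        return chr(ord(ch) ^ 0x20)
    return ch


print(digit_value("7"))  # 7
print(toggle_case("A"))  # a
print(toggle_case("z"))  # Z
```

For real text, prefer `str.upper()`, `str.lower()`, and `int()`, which also handle non-ASCII input.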

Unicode

Code           Description                     Use Case
U+0000-007F    Basic Latin (ASCII compatible)  First 128 characters = ASCII
U+0080-00FF    Latin-1 Supplement              é ñ ü ö à ß - accented Latin chars
U+2000-206F    General Punctuation             — – … • ′ ″ (em dash, ellipsis, bullet)
U+2190-21FF    Arrows                          ← → ↑ ↓ ↔ ⇒ ⇐ ⇔
U+2200-22FF    Mathematical Operators          ∀ ∃ ∅ ∈ ∉ ∑ ∏ √ ∞ ≈ ≠ ≤ ≥
U+2600-26FF    Miscellaneous Symbols           ☀ ☁ ☂ ★ ☆ ♠ ♣ ♥ ♦ ☎ ✓ ✗
U+1F600-1F64F  Emoticons / Emoji               😀 😂 🤣 😍 🤔 👍 🎉
U+1F300-1F5FF  Misc Symbols & Pictographs      🌍 🔥 💡 📱 🔒 🔑 ⭐
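To look up the code point and official name of any character, Python's standard `unicodedata` module can help. This is a minimal sketch; `describe` is a hypothetical helper name, and the characters are written as escapes so the example does not depend on the file's encoding.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


print(describe("A"))           # U+0041 LATIN CAPITAL LETTER A
print(describe("\u00e9"))      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
print(describe("\u2192"))      # U+2192 RIGHTWARDS ARROW
print(describe("\U0001F600"))  # U+1F600 GRINNING FACE
```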

Encoding

Code                  Description                             Use Case
UTF-8                 Variable-width encoding (1-4 bytes)     A = 1 byte; é = 2 bytes; 🎉 = 4 bytes
UTF-16                Variable-width encoding (2 or 4 bytes)  Used internally by JavaScript and Java
UTF-32                Fixed-width encoding (4 bytes each)     Simple but wasteful - rarely used in files
ASCII                 Original 128 characters                 0-127 - subset of UTF-8
ISO-8859-1 (Latin-1)  8-bit Western European encoding         0-255 - matches the first 256 Unicode code points, but bytes 128-255 are encoded differently in UTF-8
BOM                   Byte Order Mark                         Optional UTF-8 BOM: EF BB BF
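The relationship between ASCII, Latin-1, and UTF-8 can be checked directly. The sketch below uses only Python's built-in `str.encode`. It shows that ASCII text is byte-identical in all three encodings, while a character above code point 127 takes one byte in Latin-1 and two in UTF-8.

```python
# ASCII text produces identical bytes in ASCII, Latin-1, and UTF-8.
assert "A".encode("ascii") == "A".encode("latin-1") == "A".encode("utf-8")

# Above code point 127 the encodings diverge (U+00E9 is e with acute accent).
latin1_bytes = "\u00e9".encode("latin-1")  # b'\xe9'     (1 byte)
utf8_bytes = "\u00e9".encode("utf-8")      # b'\xc3\xa9' (2 bytes)

# UTF-8 lengths for "A", U+00E9, and U+1F389 (party popper emoji).
utf8_lengths = [len(ch.encode("utf-8")) for ch in ("A", "\u00e9", "\U0001F389")]
print(utf8_lengths)  # [1, 2, 4]
```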

HTML Entities

Code                  Description                         Use Case
&amp;                 Ampersand                           <p>Tom &amp; Jerry</p>
&lt; &gt;             Less than / greater than            &lt;div&gt; → <div>
&nbsp;                Non-breaking space                  Prevents line break between words
&quot; &apos;         Double / single quote               &quot;Hello&quot; → "Hello"
&copy; &reg; &trade;  Copyright / registered / trademark  © ® ™
&mdash; &ndash;       Em dash / en dash                   — / –
&hellip;              Horizontal ellipsis                 … (three dots)
&#xHHHH; &#DDDD;      Unicode character by code point     Hex or decimal code point reference
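In code, entity escaping is usually left to a library rather than done by hand. As one illustration, Python's standard `html` module converts in both directions. The sample strings are made up for this sketch.

```python
import html

# Escape text before placing it inside HTML markup.
escaped = html.escape('<p title="x">Tom & Jerry</p>')
print(escaped)  # &lt;p title=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/p&gt;

# Decode named, decimal, and hex references back into characters.
decoded = html.unescape("&copy; &reg; &trade; &mdash; &hellip; &#169; &#xA9;")
print(decoded)
```

`html.escape` also escapes single quotes (as `&#x27;`) unless called with `quote=False`.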

Escape Sequences

Code       Description                     Use Case
\n         Newline (line feed)             "line1\nline2" → two lines
\t         Tab character                   "col1\tcol2" → tab-separated
\\         Literal backslash               "C:\\Users\\" → C:\Users\
\uXXXX     Unicode escape (4 hex digits)   \u00E9 → é; \u2764 → ❤
\u{XXXXX}  Extended unicode escape (ES6+)  \u{1F600} → 😀
\xHH       Hex escape (2 hex digits)       \x41 → A; \xFF → ÿ
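Most of these escapes behave the same way in Python string literals, which makes them easy to try out. One difference to keep in mind for this sketch: Python has no `\u{...}` form and instead uses `\U` followed by exactly eight hex digits for code points above U+FFFF.

```python
# \n and \t are single characters inside a string literal.
lines = "line1\nline2".splitlines()  # ['line1', 'line2']
columns = "col1\tcol2".split("\t")   # ['col1', 'col2']

# Each \\ in the literal yields one backslash in the resulting string.
path = "C:\\Users\\"
print(path)       # C:\Users\
print(len(path))  # 9

# Hex and Unicode escapes name characters by code point.
assert "\x41" == "A"
assert "\u00e9" == "\xe9" == chr(0xE9)
assert "\u2764" == chr(0x2764)        # heavy black heart
assert "\U0001F600" == chr(0x1F600)   # Python's form of JavaScript's \u{1F600}
```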

Frequently asked questions

What's the difference between ASCII and Unicode?

ASCII defines 128 characters (7-bit: English letters, digits, symbols, control characters). Unicode defines roughly 150,000 characters covering most of the world's writing systems. ASCII is a subset of Unicode: the first 128 Unicode code points are identical to ASCII.

What's the difference between UTF-8, UTF-16, and UTF-32?

They're different encodings of the same Unicode characters. UTF-8: 1-4 bytes per char, most efficient for English/Latin text, the web standard. UTF-16: 2 or 4 bytes, used by JavaScript/Java internally. UTF-32: always 4 bytes, simple but wasteful. Use UTF-8 for files and transmission.
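The size differences are easy to measure. This sketch encodes three sample characters with Python's built-in codecs and counts the bytes. The `-le` codec variants are used so that no byte order mark is added to the output, and `encoded_sizes` is a hypothetical helper name.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Byte length of text in UTF-8, UTF-16, and UTF-32 (little-endian, no BOM)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


print(encoded_sizes("A"))           # {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
print(encoded_sizes("\u00e9"))      # {'utf-8': 2, 'utf-16-le': 2, 'utf-32-le': 4}
print(encoded_sizes("\U0001F389"))  # {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```

The last line shows a UTF-16 surrogate pair: a code point above U+FFFF needs two 16-bit units.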

Why do I see garbled characters (mojibake)?

Character encoding mismatch. The text was encoded in one format (e.g., UTF-8) but decoded as another (e.g., Latin-1). Fix by ensuring consistent encoding everywhere: file encoding, HTTP Content-Type header, HTML meta charset, and database character set.
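The mismatch can be reproduced in a few lines. The sketch below encodes a sample word as UTF-8, decodes it as Latin-1 to produce mojibake, and then reverses the mistake. The repair step only works when the wrong decoding lost no bytes, which holds for Latin-1 but not for every encoding pair.

```python
original = "caf\u00e9"  # "cafe" with an acute accent on the e

# Wrong: UTF-8 bytes interpreted as Latin-1 turn the one accented letter into two characters.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)  # cafÃ©

# Repair: undo the wrong decode, then decode with the correct encoding.
repaired = garbled.encode("latin-1").decode("utf-8")
assert repaired == original
```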

What is a BOM and should I use one?

BOM (Byte Order Mark, U+FEFF) is an optional marker at the start of a file indicating encoding and byte order. For UTF-8, it's unnecessary and can cause issues (PHP errors, shell scripts failing). For UTF-16, it's important to indicate endianness. Best practice: use UTF-8 without BOM.
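When reading files that may or may not start with a UTF-8 BOM, Python's `utf-8-sig` codec removes the marker if present and otherwise behaves like plain UTF-8. The sample bytes in this sketch are constructed by hand for illustration.

```python
import codecs

# Bytes as an editor might save them: EF BB BF followed by the text.
data = codecs.BOM_UTF8 + "hello".encode("utf-8")
assert data[:3] == b"\xef\xbb\xbf"

# Plain UTF-8 decoding keeps the BOM as an invisible U+FEFF character.
with_bom = data.decode("utf-8")
print(repr(with_bom))     # '\ufeffhello'

# utf-8-sig strips a leading BOM; input without a BOM is left untouched.
without_bom = data.decode("utf-8-sig")
print(repr(without_bom))  # 'hello'
assert b"hello".decode("utf-8-sig") == "hello"
```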

How do emoji work?

Emoji are Unicode characters, many of them in blocks like U+1F600-1F64F. Code points in that range take 4 bytes in UTF-8 (older symbols such as ☀, U+2600, take 3). Modern emoji use sequences: skin tone modifiers (👋🏽 = wave + modifier), ZWJ sequences (👨‍👩‍👧 = man + ZWJ + woman + ZWJ + girl), and flag sequences (🇺🇸 = regional indicators U + S).
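Because one visible emoji can span several code points, string length often differs from what the eye sees. The sketch below builds the three kinds of sequence from their code points and counts code points and UTF-8 bytes. `flag` is a hypothetical helper that maps a two-letter country code to regional indicator symbols.

```python
ZWJ = "\u200d"  # zero width joiner

wave = "\U0001F44B" + "\U0001F3FD"  # waving hand + medium skin tone modifier
family = ZWJ.join(["\U0001F468", "\U0001F469", "\U0001F467"])  # man, woman, girl


def flag(country_code: str) -> str:
    """Map letters A-Z to regional indicator symbols U+1F1E6..U+1F1FF."""
    return "".join(chr(0x1F1E6 + ord(c) - ord("A")) for c in country_code.upper())


us = flag("US")  # U+1F1FA U+1F1F8

for name, seq in (("wave", wave), ("family", family), ("flag", us)):
    print(name, len(seq), "code points,", len(seq.encode("utf-8")), "UTF-8 bytes")
# wave 2 code points, 8 UTF-8 bytes
# family 5 code points, 18 UTF-8 bytes
# flag 2 code points, 8 UTF-8 bytes
```

Counting user-perceived characters (grapheme clusters) needs Unicode segmentation rules, which Python's standard library does not provide.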

How do I type special characters?

Windows: Alt codes (Alt+0169 for ©) or Win+. for emoji. Mac: Option key combos (Option+G for ©) or Ctrl+Cmd+Space for emoji. HTML: use named entities (&copy;) or numeric references (&#169;). Code: use Unicode escapes (\u00A9).
