ASCII & Unicode Cheat Sheet
ASCII and Unicode reference with character codes, HTML entities, escape sequences, and encoding conversions.
Control
| Code | Description | Use Case |
|---|---|---|
| 0 (0x00) | Null character | \0 in C/JavaScript strings |
| 7 (0x07) | Bell / alert | \a - terminal beep |
| 8 (0x08) | Backspace | \b - move cursor back |
| 9 (0x09) | Horizontal tab | \t - tab character |
| 10 (0x0A) | Line feed (newline) | \n - Unix line ending |
| 13 (0x0D) | Carriage return | \r - Windows line endings use \r\n |
| 27 (0x1B) | Escape | \e (where supported) or \x1b - terminal escape sequences |
| 127 (0x7F) | Delete | ASCII delete character |
Symbols
| Code | Description | Use Case |
|---|---|---|
| 32 | Space | Regular space character |
| 33 | Exclamation mark | ! - logical NOT in many languages |
| 34 | Double quote | " - string delimiter |
| 35 | Number sign / hash | # - comment, hashtag, CSS id |
| 36 | Dollar sign | $ - variable prefix (PHP, Bash, jQuery) |
| 38 | Ampersand | & - logical AND, HTML entity `&amp;` |
| 39 | Single quote / apostrophe | ' - string delimiter, char literal |
| 40-41 | Parentheses | ( ) - grouping, function calls |
| 42 | Asterisk | * - multiplication, pointer, wildcard |
| 43 | Plus sign | + - addition, string concatenation |
| 44 | Comma | , - separator in lists, parameters |
| 45 | Hyphen / minus | - - subtraction, CLI flags |
| 46 | Period / dot | . - decimal, member access, file extension |
| 47 | Forward slash | / - division, path separator, regex |
| 58 | Colon | : - key-value separator (JSON, YAML) |
| 59 | Semicolon | ; - statement terminator |
| 60-62 | Less-than, equals, greater-than | <, =, > - comparison, HTML tags |
| 64 | At sign | @ - email, decorators, mentions |
| 91-93 | Brackets, backslash | [ ] - arrays; \ - escape character |
| 94 | Caret | ^ - XOR, regex start-of-line |
| 95 | Underscore | _ - variable names (snake_case) |
| 96 | Backtick / grave accent | `` ` `` - template literals (JS), code (Markdown) |
| 123-125 | Braces, pipe | { } - blocks, objects; \| - pipe, OR |
| 126 | Tilde | ~ - home directory, bitwise NOT |
Alphanumeric
| Code | Description | Characters |
|---|---|---|
| 48-57 | Digits 0-9 | 0123456789 |
| 65-90 | Uppercase A-Z | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
| 97-122 | Lowercase a-z | abcdefghijklmnopqrstuvwxyz |
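The code ranges above lend themselves to simple arithmetic. The following is a minimal Python sketch, not part of the original sheet, showing how digits map to values by offset from `0` and how ASCII upper and lower case differ by a single bit. The helper names `digit_value` and `toggle_case` are illustrative only.

```python
def digit_value(ch):
    """Numeric value of an ASCII digit, computed as an offset from '0' (code 48)."""
    if not ("0" <= ch <= "9"):
        raise ValueError("expected a single ASCII digit")
    return ord(ch) - ord("0")


def toggle_case(ch):
    """Flip the case of an ASCII letter by toggling bit 5 (value 32).

    'A' is 65 and 'a' is 97, so the two cases differ by exactly 32.
    Non-letters are returned unchanged.
    """
    if not (ch.isascii() and ch.isalpha()):
        return ch
    return chr(ord(ch) ^ 0x20)


print(ord("0"), ord("A"), ord("a"))         # 48 65 97
print(digit_value("7"))                     # 7
print(toggle_case("A"), toggle_case("z"))   # a Z
```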
Unicode
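| Code Range | Block | Examples |
|---|---|---|
| U+0000-007F | Basic Latin (ASCII compatible) | First 128 characters = ASCII |
| U+0080-00FF | Latin-1 Supplement | é ñ ü ö - accented Latin chars |
| U+2000-206F | General Punctuation | … • ′ ″ (ellipsis, bullet, prime, double prime), plus the em dash (U+2014) and en dash (U+2013) |
| U+2190-21FF | Arrows | ← ↑ → ↓ ↔ ↕ ↖ ↗ |
| U+2200-22FF | Mathematical Operators | ∀ ∃ ∅ ∈ ∑ √ ∞ ∫ ≈ ≠ ≡ ≤ ≥ |
| U+2600-26FF | Miscellaneous Symbols | ☀ ☁ ☂ ★ ☆ ♠ ♣ ♥ ♦ ☎ ☑ ☕ |
| U+1F600-1F64F | Emoticons / Emoji | 😀 😂 😊 😍 😎 😢 (newer faces such as 🤣, U+1F923, sit in Supplemental Symbols and Pictographs) |
| U+1F300-1F5FF | Misc Symbols & Pictographs | 🌍 🔥 💡 📱 🎉 🔒 (⭐ is U+2B50, in Miscellaneous Symbols and Arrows) |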
Encoding
| Encoding | Description | Notes |
|---|---|---|
| UTF-8 | Variable-width encoding (1-4 bytes) | A = 1 byte; é = 2 bytes; 😀 = 4 bytes |
| UTF-16 | Variable-width encoding (2 or 4 bytes) | Used internally by JavaScript and Java |
| UTF-32 | Fixed-width encoding (4 bytes each) | Simple but wasteful - rarely used in files |
| ASCII | Original 128 characters | 0-127 - every ASCII file is also valid UTF-8 |
| ISO-8859-1 (Latin-1) | 8-bit Western European encoding | 0-255 - matches the first 256 Unicode code points, but bytes 128-255 are not valid UTF-8 on their own |
| BOM (U+FEFF) | Byte Order Mark | Optional in UTF-8, where it is encoded as EF BB BF |
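The byte counts in the table can be checked directly. The following is a small Python sketch (the helper name `encoded_sizes` is an illustration, not something from the original sheet). It uses the little-endian codec variants so that no BOM is included in the counts, and it also shows why Latin-1 bytes above 127 are not interchangeable with UTF-8.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes of `text` for each UTF encoding."""
    # The -le codec variants are used so that no BOM is counted.
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


samples = {
    "A": "A",
    "e-acute": "\u00e9",
    "grinning face": "\U0001F600",
}
for label, ch in samples.items():
    print(label, encoded_sizes(ch))

# The same character has different bytes in Latin-1 and UTF-8.
latin1_bytes = "\u00e9".encode("latin-1")   # single byte 0xE9
utf8_bytes = "\u00e9".encode("utf-8")       # two bytes 0xC3 0xA9
```

For `A` the sizes are 1, 2, and 4 bytes; for the emoji all three encodings need 4 bytes.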
HTML Entities
| Entity | Description | Example |
|---|---|---|
| `&amp;` | Ampersand & | `<p>Tom &amp; Jerry</p>` |
| `&lt;` `&gt;` | Less than / greater than | `&lt;div&gt;` → `<div>` |
| `&nbsp;` | Non-breaking space | Prevents line break between words |
| `&quot;` `&apos;` | Double / single quote | `&quot;Hello&quot;` → "Hello" |
| `&copy;` `&reg;` `&trade;` | Copyright / registered / trademark | © ® ™ |
| `&mdash;` `&ndash;` | Em dash / en dash | U+2014 / U+2013 |
| `&hellip;` | Horizontal ellipsis | … (three dots) |
| `&#xHHHH;` `&#DDDD;` | Unicode character by code point | Hex or decimal code point reference, e.g. `&#xA9;` or `&#169;` for © |
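In practice, entities are usually produced and decoded by a library rather than by hand. As a hedged illustration, the following sketch uses Python's standard `html` module; the sample strings are made up for the example.

```python
import html

# Escaping: make arbitrary text safe to embed in HTML.
raw = '<p>Tom & Jerry say "hi"</p>'
escaped = html.escape(raw)  # quote=True by default, so quotes are escaped too
print(escaped)

# Unescaping: named, decimal, and hex references all decode to the same character.
copyright_forms = html.unescape("&copy; &#169; &#xA9;")
print(copyright_forms)

# &nbsp; decodes to U+00A0, which is not the same as a regular space (U+0020).
nbsp = html.unescape("&nbsp;")
```

Escaping and then unescaping returns the original text, which makes for a quick sanity check when debugging double-encoded output such as `&amp;amp;`.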
Escape Sequences
| Sequence | Description | Example |
|---|---|---|
| `\n` | Newline (line feed) | `"line1\nline2"` → two lines |
| `\t` | Tab character | `"col1\tcol2"` → tab-separated |
| `\\` | Literal backslash | `"C:\\Users\\"` → `C:\Users\` |
| `\uXXXX` | Unicode escape (4 hex digits) | `\u00E9` → é; `\u2764` → ❤ |
| `\u{XXXXX}` | Extended unicode escape (ES6+) | `\u{1F600}` → 😀 |
| `\xXX` | Hex escape (2 hex digits) | `\x41` → A; `\xFF` → ÿ |
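Most of these sequences carry over to other languages with small differences. The sketch below shows the Python equivalents; note that Python has no `\u{...}` form and instead writes code points above U+FFFF with an 8-digit `\U` escape. The variable names are illustrative.

```python
newline = "line1\nline2"     # two lines when printed
tabbed = "col1\tcol2"        # tab-separated columns
path = "C:\\Users\\"         # each \\ is one literal backslash
e_acute = "\u00e9"           # 4-digit Unicode escape
heart = "\u2764"             # U+2764 HEAVY BLACK HEART
grin = "\U0001F600"          # Python's 8-digit form of JavaScript's \u{1F600}
hex_a = "\x41"               # 2-digit hex escape, same as "A"
hex_y = "\xff"               # in a str this is code point U+00FF

# A raw string is an alternative when a literal contains many backslashes.
# Here r"C:\Users" avoids doubling them.
raw_path = r"C:\Users"

print(newline)
print(path, len(path))
```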
Frequently asked questions
What's the difference between ASCII and Unicode?
ASCII defines 128 characters (7-bit: English letters, digits, symbols, control characters). Unicode defines roughly 150,000 characters, a count that grows with each version, covering the vast majority of writing systems in use. ASCII is a subset of Unicode: the first 128 Unicode code points are identical to ASCII.
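The subset relationship can be verified in a few lines. This is a minimal Python sketch, written for illustration: it decodes all 128 ASCII byte values as both ASCII and UTF-8 and compares the results.

```python
# All 128 ASCII byte values, 0x00 through 0x7F.
ascii_bytes = bytes(range(128))

as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")

# Decoding as ASCII or as UTF-8 gives the same text ...
same_text = as_ascii == as_utf8
# ... and each character's code point equals its byte value.
code_points_match = all(ord(ch) == i for i, ch in enumerate(as_ascii))

# The reverse does not hold: non-ASCII characters cannot be encoded as ASCII.
try:
    "\u00e9".encode("ascii")
    ascii_rejects_e_acute = False
except UnicodeEncodeError:
    ascii_rejects_e_acute = True

print(same_text, code_points_match, ascii_rejects_e_acute)  # True True True
```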
What's the difference between UTF-8, UTF-16, and UTF-32?
They're different encodings of the same Unicode characters. UTF-8: 1-4 bytes per char, most efficient for English/Latin text, the web standard. UTF-16: 2 or 4 bytes, used by JavaScript/Java internally. UTF-32: always 4 bytes, simple but wasteful. Use UTF-8 for files and transmission.
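The "2 or 4 bytes" of UTF-16 comes from surrogate pairs: code points above U+FFFF are split into two 16-bit units. The following Python sketch computes the units by hand and cross-checks them against the built-in codec. The function names are hypothetical, and the sketch does not validate lone surrogates.

```python
def utf16_units(ch):
    """Return the UTF-16 code units (as ints) for a single character."""
    cp = ord(ch)
    if cp < 0x10000:
        return [cp]                      # fits in one 16-bit unit
    offset = cp - 0x10000                # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return [high, low]


def utf16_units_via_codec(ch):
    """Same result, obtained from Python's own UTF-16 encoder."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


for ch in ("A", "\u00e9", "\U0001F600"):
    print(hex(ord(ch)), [hex(u) for u in utf16_units(ch)])
```

For U+1F600 this yields the pair D83D DE00, which is why JavaScript reports a string length of 2 for that single emoji.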
Why do I see garbled characters (mojibake)?
The cause is a character encoding mismatch: the text was encoded in one format (e.g., UTF-8) but decoded as another (e.g., Latin-1). Fix it by ensuring consistent encoding everywhere: file encoding, HTTP Content-Type header, HTML meta charset, and database character set.
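A mismatch like this is easy to reproduce. The Python sketch below, using a made-up sample word, encodes text as UTF-8, decodes it as Latin-1 to produce mojibake, and then reverses the mistake. The repair only works when no bytes were lost or replaced along the way.

```python
original = "caf\u00e9"                       # "café"

# Wrong: UTF-8 bytes interpreted as Latin-1. Each byte becomes one character.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)                               # the é shows up as two characters

# Repair: undo the wrong decode, then decode with the correct encoding.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)                  # True
```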
What is a BOM and should I use one?
BOM (Byte Order Mark, U+FEFF) is an optional marker at the start of a file indicating encoding and byte order. For UTF-8, it's unnecessary and can cause issues (PHP errors, shell scripts failing). For UTF-16, it's important to indicate endianness. Best practice: use UTF-8 without BOM.
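As an illustration of how a UTF-8 BOM leaks into data, the following Python sketch uses the standard `codecs` module and the `utf-8-sig` codec. The helper `strip_utf8_bom` is a hypothetical name for this example.

```python
import codecs

payload = "hello".encode("utf-8")
with_bom = codecs.BOM_UTF8 + payload         # EF BB BF followed by the text

# Plain "utf-8" keeps the BOM as an invisible U+FEFF at the start of the string.
leaky = with_bom.decode("utf-8")
# "utf-8-sig" strips a leading BOM if present and is harmless if it is absent.
clean = with_bom.decode("utf-8-sig")
also_clean = payload.decode("utf-8-sig")


def strip_utf8_bom(data):
    """Remove a leading UTF-8 BOM from raw bytes, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


print(repr(leaky), repr(clean))
```

The stray U+FEFF in `leaky` is the kind of character that breaks string comparisons and shebang lines without being visible in an editor.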
How do emoji work?
Emoji are Unicode characters in blocks such as Emoticons (U+1F600-1F64F). Code points in these blocks take 4 bytes in UTF-8. Many modern emoji are sequences of several code points: skin tone modifiers (👋🏽 = waving hand + skin tone modifier), ZWJ sequences (👨‍👩‍👧 = man + ZWJ + woman + ZWJ + girl), and flag sequences (🇺🇸 = regional indicators U + S).
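Because these sequences are several code points long, the length of a string is often larger than the number of emoji a reader sees. The Python sketch below builds the three examples from explicit code points; `flag_emoji` is a hypothetical helper that maps a two-letter country code onto regional indicator symbols, which start at U+1F1E6 for the letter A.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

wave = "\U0001F44B" + "\U0001F3FD"                            # waving hand + skin tone modifier
family = ZWJ.join(["\U0001F468", "\U0001F469", "\U0001F467"])  # man, woman, girl


def flag_emoji(country_code):
    """Build a flag sequence from a two-letter ASCII country code."""
    code = country_code.upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError("expected two ASCII letters")
    # Regional indicator symbols run from U+1F1E6 (A) to U+1F1FF (Z).
    return "".join(chr(0x1F1E6 + ord(c) - ord("A")) for c in code)


flag = flag_emoji("US")

for label, seq in (("wave", wave), ("family", family), ("flag", flag)):
    print(label, len(seq), "code points,", len(seq.encode("utf-8")), "UTF-8 bytes")
```

Counting what a user perceives as one character requires grapheme cluster segmentation, which Python's standard library does not provide.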
How do I type special characters?
Windows: Alt codes (Alt+0169 for ©) or Win+. for emoji. Mac: Option key combos (Option+G for ©) or Ctrl+Cmd+Space for emoji. HTML: use named entities (`&copy;`) or numeric references (`&#169;`). Code: use Unicode escapes (`\u00A9`).