ASCII & Unicode Cheat Sheet

ASCII and Unicode reference with character codes, HTML entities, escape sequences, and encoding conversions.

63 entries 7 sections

Control

Code	Description	Use Case
0x00 (NUL)	Null character	`\0 in C/JavaScript strings`
0x07 (BEL)	Bell / alert	`\a - terminal beep`
0x08 (BS)	Backspace	`\b - move cursor back`
0x09 (HT)	Horizontal tab	`\t - tab character`
0x0A (LF)	Line feed (newline)	`\n - Unix line ending`
0x0D (CR)	Carriage return	`\r - Windows uses \r\n`
0x1B (ESC)	Escape	`\e or \x1b - terminal escape sequences`
0x7F (DEL)	Delete	`ASCII delete character`
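
The codes in the table above can be checked with a minimal Python sketch (Python is used here only for illustration). Python has no `\e` escape, so ESC is written as `\x1b`; the dictionary name `controls` is hypothetical.

```python
# Map each control-character escape to its ASCII code.
# Python has no "\e" escape, so ESC is spelled "\x1b" and DEL "\x7f".
controls = {
    "NUL": "\0",
    "BEL": "\a",
    "BS": "\b",
    "HT": "\t",
    "LF": "\n",
    "CR": "\r",
    "ESC": "\x1b",
    "DEL": "\x7f",
}

codes = {name: ord(ch) for name, ch in controls.items()}

for name, code in codes.items():
    print(f"{name:<3} dec={code:>3} hex=0x{code:02X}")
```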

Symbols

Code	Description	Use Case
0x20	Space	`Regular space character`
0x21	Exclamation mark	`! → logical NOT in many languages`
0x22	Double quote	`" → string delimiter`
0x23	Number sign / hash	`# → comment, hashtag, CSS id`
0x24	Dollar sign	`$ → variable prefix (PHP, Bash, jQuery)`
0x26	Ampersand	`& → logical AND, HTML entity &amp;`
0x27	Single quote / apostrophe	`' → string delimiter, char literal`
0x28-0x29	Parentheses	`( ) → grouping, function calls`
0x2A	Asterisk	`* → multiplication, pointer, wildcard`
0x2B	Plus sign	`+ → addition, string concatenation`
0x2C	Comma	`, → separator in lists, parameters`
0x2D	Hyphen / minus	`- → subtraction, CLI flags`
0x2E	Period / dot	`. → decimal, member access, file extension`
0x2F	Forward slash	`/ → division, path separator, regex`
0x3A	Colon	`: → key-value separator (JSON, YAML)`
0x3B	Semicolon	`; → statement terminator`
0x3C-0x3E	Less-than, equals, greater-than	`<, =, > → comparison, HTML tags`
0x40	At sign	`@ → email, decorators, mentions`
0x5B-0x5D	Brackets, backslash	`[ ] → arrays; \ → escape character`
0x5E	Caret	`^ → XOR, regex start-of-line`
0x5F	Underscore	`_ → variable names (snake_case)`
0x60	Backtick / grave accent	`` ` → template literals (JS), code (Markdown) ``
0x7B-0x7D	Braces, pipe	`{ } → blocks, objects; | → pipe, OR`
0x7E	Tilde	`~ → home directory, bitwise NOT`

Alphanumeric

Code	Description	Use Case
0x30-0x39	Digits 0-9	`0123456789`
0x41-0x5A	Uppercase A-Z	`ABCDEFGHIJKLMNOPQRSTUVWXYZ`
0x61-0x7A	Lowercase a-z	`abcdefghijklmnopqrstuvwxyz`

Unicode

Range	Description	Use Case
U+0000-U+007F	Basic Latin (ASCII compatible)	`First 128 characters = ASCII`
U+0080-U+00FF	Latin-1 Supplement	`é ñ ü ö à ß - accented Latin chars`
U+2000-U+206F	General Punctuation	`U+2014 (em dash), U+2013 (en dash), … • ′ ″ (ellipsis, bullet, primes)`
U+2190-U+21FF	Arrows	`← → ↑ ↓ ↔ ⇒ ⇐ ⇔`
U+2200-U+22FF	Mathematical Operators	`∀ ∃ ∅ ∈ ∉ ∑ ∏ √ ∞ ≈ ≠ ≤ ≥`
U+2600-U+27BF	Miscellaneous Symbols, Dingbats	`☀ ☁ ☂ ★ ☆ ♠ ♣ ♥ ♦ ☎ ✓ ✗`
U+1F600-U+1F64F	Emoticons / Emoji	`😀 😂 🤣 😍 🤔 👍 🎉`
U+1F300-U+1F5FF	Misc Symbols & Pictographs	`🌍 🔥 💡 📱 🔒 🔑 ⭐`

Encoding

Name	Description	Use Case
UTF-8	Variable-width encoding (1-4 bytes)	`A = 1 byte; é = 2 bytes; 🎉 = 4 bytes`
UTF-16	Variable-width encoding (2 or 4 bytes)	`Used internally by JavaScript and Java`
UTF-32	Fixed-width encoding (4 bytes each)	`Simple but wasteful - rarely used in files`
ASCII	Original 128 characters	`0-127 - subset of UTF-8`
ISO-8859-1 (Latin-1)	8-bit Western European encoding	`0-255 - same as the first 256 Unicode code points, but bytes 128-255 differ from their UTF-8 encoding`
BOM	Byte Order Mark	`Optional UTF-8 BOM: EF BB BF`

HTML Entities

Entity	Description	Use Case
&amp;	Ampersand &	`<p>Tom &amp; Jerry</p>`
&lt; / &gt;	Less than / greater than	`&lt;div&gt; → <div>`
&nbsp;	Non-breaking space	`Prevents line break between words`
&quot; / &apos;	Double / single quote	`&quot;Hello&quot; → "Hello"`
&copy; / &reg; / &trade;	Copyright / registered / trademark	`© ® ™`
&mdash; / &ndash;	Em dash / en dash	`U+2014 / U+2013`
&hellip;	Horizontal ellipsis	`… (three dots)`
&#xHHHH; / &#DDDD;	Unicode character by code point	`Hex or decimal code point reference`
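
To see the entities above in action, here is a small sketch using Python's standard `html` module (one possible tool, chosen for illustration). It escapes reserved characters and decodes named, decimal, and hex references; the sample strings are made up.

```python
import html

# Escape reserved characters before placing text inside HTML.
escaped = html.escape('<p>Tom & Jerry say "hi"</p>')
print(escaped)  # &lt;p&gt;Tom &amp; Jerry say &quot;hi&quot;&lt;/p&gt;

# Decode named, decimal, and hex character references back to characters.
# Result: copyright sign three times, an ellipsis, then "<div>".
decoded = html.unescape("&copy; &#169; &#xA9; &hellip; &lt;div&gt;")
```

Note that `html.escape` also escapes quotes by default, which matters when the text ends up inside an attribute value.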

Escape Sequences

Sequence	Description	Use Case
\n	Newline (line feed)	`"line1\nline2" → two lines`
\t	Tab character	`"col1\tcol2" → tab-separated`
\\	Literal backslash	`"C:\\Users\\" → C:\Users\`
\uXXXX	Unicode escape (4 hex digits)	`\u00E9 → é; \u2764 → ❤`
\u{XXXXX}	Extended unicode escape (ES6+)	`\u{1F600} → 😀`
\xHH	Hex escape (2 hex digits)	`\x41 → A; \xFF → ÿ`
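
The table uses JavaScript syntax. As a rough comparison, the sketch below shows the same escapes in Python, where most forms are identical but `\u{1F600}` does not exist; code points above U+FFFF are written as `\U` plus exactly 8 hex digits. Variable names are illustrative.

```python
# Python equivalents of the escapes in the table above.
lines = "line1\nline2".splitlines()   # ['line1', 'line2']
cols = "col1\tcol2".split("\t")       # ['col1', 'col2']
path = "C:\\Users\\"                  # 9 characters: C:\Users\

e_acute = "\u00E9"        # 4 hex digits, U+00E9
heart = "\u2764"          # 4 hex digits, U+2764
grin = "\U0001F600"       # 8 hex digits; Python has no \u{...} form
letter_a = "\x41"         # 2 hex digits, U+0041
y_diaeresis = "\xFF"      # 2 hex digits, U+00FF

print(lines, cols, len(path), hex(ord(grin)))
```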

Frequently asked questions

What's the difference between ASCII and Unicode?

ASCII defines 128 characters (7-bit: English letters, digits, symbols, control characters). Unicode defines 150,000+ characters covering nearly all of the world's writing systems. ASCII is a subset of Unicode: the first 128 Unicode code points are identical to ASCII.
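
The subset relationship can be checked directly. This is a minimal Python sketch; the helper name `is_ascii_encodable` is hypothetical.

```python
# Every ASCII code (0-127) is the Unicode code point with the same number,
# and it encodes to the same single byte in both ASCII and UTF-8.
ascii_chars = [chr(i) for i in range(128)]
same_bytes = all(c.encode("ascii") == c.encode("utf-8") for c in ascii_chars)
print(same_bytes)  # True


def is_ascii_encodable(text: str) -> bool:
    """Return True if every character in text fits in 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(is_ascii_encodable("A"), is_ascii_encodable("\u00e9"))  # True False
```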

What's the difference between UTF-8, UTF-16, and UTF-32?

They're different encodings of the same Unicode characters. UTF-8: 1-4 bytes per char, most efficient for English/Latin text, the web standard. UTF-16: 2 or 4 bytes, used by JavaScript/Java internally. UTF-32: always 4 bytes, simple but wasteful. Use UTF-8 for files and transmission.
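
To make the size difference concrete, the following sketch counts the bytes each encoding needs for a few sample characters. The function name `byte_lengths` is illustrative, and the little-endian codec variants are used only so that no BOM is included in the count.

```python
def byte_lengths(ch: str) -> dict:
    """Bytes needed for one character in each encoding (no BOM)."""
    # The "-le" variants encode without a BOM, so only the character is counted.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# A (U+0041), e-acute (U+00E9), euro sign (U+20AC), party popper (U+1F389)
for ch in ("A", "\u00e9", "\u20ac", "\U0001F389"):
    print(f"U+{ord(ch):04X}", byte_lengths(ch))
```

UTF-8 grows from 1 to 4 bytes across these samples, UTF-16 stays at 2 until the character leaves the Basic Multilingual Plane, and UTF-32 is always 4.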

Why do I see garbled characters (mojibake)?

Character encoding mismatch. The text was encoded in one format (e.g., UTF-8) but decoded as another (e.g., Latin-1). Fix by ensuring consistent encoding everywhere: file encoding, HTTP Content-Type header, HTML meta charset, and database character set.
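
The mismatch is easy to reproduce. This sketch encodes a word as UTF-8, decodes it with the wrong codec, and then reverses the mistake. The reverse step is an assumption that only holds when the wrong decoding was lossless, as it is with Latin-1, which maps every byte to a character.

```python
original = "caf\u00e9"                 # "cafe" with an acute accent on the e
data = original.encode("utf-8")        # b'caf\xc3\xa9'

# Wrong decoder: each UTF-8 byte becomes its own Latin-1 character,
# so the 2-byte e-acute turns into two characters (U+00C3, U+00A9).
garbled = data.decode("latin-1")

# Undo the mistake by re-encoding with the wrong codec, then decoding correctly.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(original), len(garbled), repaired == original)  # 4 5 True
```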

What is a BOM and should I use one?

BOM (Byte Order Mark, U+FEFF) is an optional marker at the start of a file indicating encoding and byte order. For UTF-8, it's unnecessary and can cause issues (PHP errors, shell scripts failing). For UTF-16, it's important to indicate endianness. Best practice: use UTF-8 without BOM.
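
As an illustration of how a UTF-8 BOM can leak into data, the sketch below uses Python's standard `codecs` module. The `utf-8-sig` codec strips a leading BOM on decode, while plain `utf-8` keeps it as an invisible first character.

```python
import codecs

text = "hello"
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")   # starts with EF BB BF

plain_decode = with_bom.decode("utf-8")      # keeps U+FEFF as the first character
sig_decode = with_bom.decode("utf-8-sig")    # strips a leading BOM if present

# The generic "utf-16" codec writes a BOM so readers can detect endianness.
utf16 = text.encode("utf-16")
has_utf16_bom = utf16[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

print(with_bom[:3].hex(), len(plain_decode), len(sig_decode), has_utf16_bom)
```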

How do emoji work?

Most emoji are Unicode characters in supplementary-plane blocks like U+1F600-U+1F64F, which take 4 bytes each in UTF-8; older symbols such as ☀ (U+2600) take 3. Modern emoji use sequences: skin tone modifiers (👋🏽 = wave + modifier), ZWJ sequences (👨‍👩‍👧 = man + ZWJ + woman + ZWJ + girl), and flag sequences (🇺🇸 = regional indicators U + S).
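
The sequences described above can be taken apart code point by code point. In this sketch the helper `code_points` is a hypothetical name, and the emoji are written as escapes so the example does not depend on the editor's font or encoding.

```python
def code_points(text: str) -> list:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


wave = "\U0001F44B\U0001F3FD"                            # waving hand + skin tone modifier
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"    # man + ZWJ + woman + ZWJ + girl
flag_us = "\U0001F1FA\U0001F1F8"                         # regional indicators U + S

for name, seq in (("wave", wave), ("family", family), ("flag", flag_us)):
    # One visible glyph, several code points, even more UTF-8 bytes.
    print(name, code_points(seq), len(seq.encode("utf-8")), "bytes")
```

Each sequence renders as a single glyph where supported, but string length and byte counts reflect the individual code points.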

How do I type special characters?

Windows: Alt codes (Alt+0169 for ©) or Win+. for emoji. Mac: Option key combos (Option+G for ©) or Ctrl+Cmd+Space for emoji. HTML: use named entities (&copy;) or numeric references (&#169;). Code: use Unicode escapes (\u00A9).
