ISO-8859-1
ISO-8859-1 (Latin-1) is a legacy single-byte character encoding. Learn how it maps 256 code points, why browsers treat it as Windows-1252, and its entities.
ISO-8859-1 (named after the International Organization for Standardization, and also known as Latin-1) is a legacy single-byte character encoding. This page explains what it is, where you still encounter it, the surprising way browsers actually handle it, and the full character/entity reference.
Note that ISO-8859-1 is not the recommended encoding for modern pages. The current HTML standard requires documents to be encoded as UTF-8, which is the encoding you should use for every new document. ISO-8859-1 matters today mainly when you read or maintain older pages.
What ISO-8859-1 Is
ISO-8859-1 is a single-byte encoding: every character is stored in exactly one byte, so it can represent a maximum of 256 code points, numbered 0 to 255. Those 256 slots split into two halves:
- 0–127: identical to ASCII. The basic Latin letters A–Z and a–z, the digits 0–9, punctuation, the space, and control characters all live here.
- 128–255: the Latin-1 supplement, made up of accented letters (à, é, ñ, ü) and symbols such as ©, £, ¥, ½, and ÷. These cover most Western European languages.
Because it is single-byte, ISO-8859-1 cannot represent characters outside this set — there is no way to encode, for example, the euro sign €, Greek, Cyrillic, or any CJK script. That limitation is exactly why the multi-byte UTF-8 replaced it. For the bigger picture of how encodings relate, see HTML Character Sets.
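The one-byte-per-character mapping described above can be checked with a short Python sketch. It assumes Python's built-in `latin-1` codec, which implements true ISO-8859-1 rather than the Windows-1252 substitution browsers apply; the variable names are illustrative only.

```python
# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# Python's "latin-1" codec maps byte N directly to code point N.
text = all_bytes.decode("latin-1")
code_points = [ord(ch) for ch in text]

# The lower half matches ASCII byte for byte.
ascii_matches = all_bytes[:128].decode("ascii") == text[:128]

# Characters outside the 256 slots cannot be encoded at all.
try:
    "\u20ac".encode("latin-1")  # U+20AC is the euro sign
    euro_encodable = True
except UnicodeEncodeError:
    euro_encodable = False

print(len(text), code_points == list(range(256)), ascii_matches, euro_encodable)
```

Because byte value and code point coincide, decoding as ISO-8859-1 never fails, which is also why a wrong ISO-8859-1 guess produces garbled text instead of an error.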
Historical context
In the 1990s and early 2000s, ISO-8859-1 was the default fallback encoding for HTTP and HTML on the Western web, so a great many older pages were authored with it. You still encounter it today in legacy HTML files, databases, email headers, and HTTP responses that have not been migrated to UTF-8. Recognizing it helps you debug the classic "mojibake" problem, where accented characters render as garbled symbols because a file's actual bytes and its declared encoding disagree.
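The mojibake problem is easy to reproduce. The sketch below, using an illustrative sample word, encodes text as UTF-8 and then decodes the same bytes as ISO-8859-1, which is what happens when a file's bytes and its declared encoding disagree.

```python
original = "caf\u00e9"  # "café", written with an escape so the source file's encoding does not matter

# Saved as UTF-8, the accented letter becomes two bytes: 0xC3 0xA9.
utf8_bytes = original.encode("utf-8")

# A reader that assumes ISO-8859-1 turns each byte into its own character.
garbled = utf8_bytes.decode("latin-1")

# Undoing the wrong step recovers the text, provided no bytes were lost on the way.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired)
```

Seeing "Ã" followed by another symbol where an accented letter should be is a strong hint that UTF-8 bytes were read as a single-byte encoding.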
The Windows-1252 Gotcha
Here is the most common source of confusion. Per the WHATWG Encoding Standard, when a browser sees a document declared as charset=ISO-8859-1, it does not decode it as true ISO-8859-1. Instead it decodes it as Windows-1252.
The difference lies in the 128–159 range. In genuine ISO-8859-1 those positions are reserved for the C1 control characters, which are invisible and almost never appear in web content. Windows-1252 reuses most of that range for printable characters such as the euro sign (€), curly quotes (“ ”, ‘ ’), the em dash (—), and the trademark sign (™). Because real-world content labeled ISO-8859-1 so often contained those Windows characters, the standard mandates that the ISO-8859-1 label (and aliases such as latin1 and iso8859-1) be decoded as Windows-1252 in HTML.
The practical takeaway: a <meta charset="ISO-8859-1"> declaration and a <meta charset="windows-1252"> declaration behave identically in browsers. This is why curly quotes, euro signs, and other characters in the 128–159 range usually display correctly on legacy pages labeled ISO-8859-1, even though true ISO-8859-1 has no printable characters there.
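The difference between the two decodings can be seen outside the browser. This sketch assumes Python's `cp1252` codec as a stand-in for the browser's Windows-1252 decoder and `latin-1` for strict ISO-8859-1; the sample bytes are made up for illustration.

```python
# Bytes in the 128-159 range, as typically produced by Windows software.
raw = b"\x80 100 \x93quoted\x94 \x97 \x99"

as_windows_1252 = raw.decode("cp1252")  # roughly what a browser does
as_true_latin1 = raw.decode("latin-1")  # strict ISO-8859-1

# Under strict ISO-8859-1 the same bytes become invisible C1 control characters.
c1_controls = [hex(ord(ch)) for ch in as_true_latin1 if 0x80 <= ord(ch) <= 0x9F]

# From 160 upward the two encodings agree byte for byte.
upper = bytes(range(0xA0, 0x100))
upper_range_agrees = upper.decode("cp1252") == upper.decode("latin-1")

print(as_windows_1252)
print(c1_controls)
print(upper_range_agrees)
```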
Declaring the Character Encoding
Use <meta charset="UTF-8"> to declare the encoding of your HTML document, and place it inside the <head> section:
```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>Document</title>
  </head>
  <body>
    <!-- Your content here -->
  </body>
</html>
```

The placement matters. The HTML standard requires the <meta charset> declaration to appear within the first 1024 bytes of the document. The browser starts reading bytes before it knows the encoding, so the declaration has to come early enough for the browser to find it and re-interpret the rest of the page correctly. In the example above, <meta charset="UTF-8"> is the very first thing inside <head>, comfortably inside that window. To declare a legacy encoding instead, you would write <meta charset="ISO-8859-1"> (which, as noted above, the browser treats as Windows-1252).
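The 1024-byte rule can be approximated with a small check. This is a rough sketch, not the browser's actual prescan algorithm: the hypothetical helper `declared_charset` only recognizes the short `<meta charset=...>` form with `charset` as the first attribute, and it does not skip comments or handle the `http-equiv` variant.

```python
import re
from typing import Optional

# Simplified pattern: <meta charset="label"> with an optional quote and a
# terminating quote, whitespace, slash, or ">" so truncated labels do not match.
META_CHARSET = re.compile(
    rb'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)["\'\s/>]',
    re.IGNORECASE,
)

def declared_charset(document: bytes, window: int = 1024) -> Optional[str]:
    """Return the charset label found in the first `window` bytes, or None."""
    match = META_CHARSET.search(document[:window])
    return match.group(1).decode("ascii").lower() if match else None

# Declaration near the top: found.
page = b'<!DOCTYPE html>\n<html lang="en">\n<head>\n<meta charset="UTF-8">\n'

# Declaration pushed past the window by a long comment: not found.
late = b"<!DOCTYPE html><html><head><!--" + b"x" * 2000 + b'--><meta charset="UTF-8">'

print(declared_charset(page))
print(declared_charset(late))
```

A check like this can be useful in a build step to catch templates that place large inline scripts or comments ahead of the declaration.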
Reserved Characters in HTML
Some characters are reserved in HTML because they are used to make up the HTML language. For example, you cannot use the greater-than or less-than signs in your text, as the browser will try to interpret them as HTML. Use the entity name or entity number when you want to output any of the reserved characters.
See the list of the reserved characters in the table below:
| Character | Entity Number | Entity Name | Description |
|---|---|---|---|
| " | `&#34;` | `&quot;` | quotation mark |
| ' | `&#39;` | `&apos;` | apostrophe |
| & | `&#38;` | `&amp;` | ampersand |
| < | `&#60;` | `&lt;` | less-than |
| > | `&#62;` | `&gt;` | greater-than |
For the complete reference of named character references, see HTML Entities.
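When text is generated programmatically, the reserved characters are usually escaped by a library function instead of by hand. As one possible illustration, Python's standard `html` module does this; the sample string is invented.

```python
import html

raw_text = 'if a < b && b > c then say "hi"'

# html.escape replaces &, <, and >; with the default quote=True it also
# replaces " with &quot; and ' with &#x27;.
escaped = html.escape(raw_text)

# html.unescape reverses the process for named and numeric references.
round_trip = html.unescape(escaped)

print(escaped)
print(round_trip == raw_text)
```

Note that `html.escape` emits the hexadecimal reference `&#x27;` for the apostrophe, which is equivalent to the decimal `&#39;` listed in the table.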
ISO 8859-1 Symbols
| Character | Entity Number | Entity Name | Description |
|---|---|---|---|
|   | `&#160;` | `&nbsp;` | non-breaking space |
| ¡ | `&#161;` | `&iexcl;` | inverted exclamation mark |
| ¢ | `&#162;` | `&cent;` | cent |
| £ | `&#163;` | `&pound;` | pound |
| ¤ | `&#164;` | `&curren;` | currency |
| ¥ | `&#165;` | `&yen;` | yen |
| ¦ | `&#166;` | `&brvbar;` | broken vertical bar |
| § | `&#167;` | `&sect;` | section |
| ¨ | `&#168;` | `&uml;` | spacing diaeresis |
| © | `&#169;` | `&copy;` | copyright |
| ª | `&#170;` | `&ordf;` | feminine ordinal indicator |
| « | `&#171;` | `&laquo;` | angle quotation mark (left) |
| ¬ | `&#172;` | `&not;` | negation |
|   | `&#173;` | `&shy;` | soft hyphen |
| ® | `&#174;` | `&reg;` | registered trademark |
| ¯ | `&#175;` | `&macr;` | spacing macron |
| ° | `&#176;` | `&deg;` | degree |
| ± | `&#177;` | `&plusmn;` | plus-or-minus |
| ² | `&#178;` | `&sup2;` | superscript 2 |
| ³ | `&#179;` | `&sup3;` | superscript 3 |
| ´ | `&#180;` | `&acute;` | spacing acute |
| µ | `&#181;` | `&micro;` | micro |
| ¶ | `&#182;` | `&para;` | paragraph |
| · | `&#183;` | `&middot;` | middle dot |
| ¸ | `&#184;` | `&cedil;` | spacing cedilla |
| ¹ | `&#185;` | `&sup1;` | superscript 1 |
| º | `&#186;` | `&ordm;` | masculine ordinal indicator |
| » | `&#187;` | `&raquo;` | angle quotation mark (right) |
| ¼ | `&#188;` | `&frac14;` | fraction 1/4 |
| ½ | `&#189;` | `&frac12;` | fraction 1/2 |
| ¾ | `&#190;` | `&frac34;` | fraction 3/4 |
| ¿ | `&#191;` | `&iquest;` | inverted question mark |
| × | `&#215;` | `&times;` | multiplication |
| ÷ | `&#247;` | `&divide;` | division |
ISO 8859-1 Characters
| Character | Entity Number | Entity Name | Description |
|---|---|---|---|
| À | `&#192;` | `&Agrave;` | capital a, grave accent |
| Á | `&#193;` | `&Aacute;` | capital a, acute accent |
| Â | `&#194;` | `&Acirc;` | capital a, circumflex accent |
| Ã | `&#195;` | `&Atilde;` | capital a, tilde |
| Ä | `&#196;` | `&Auml;` | capital a, umlaut mark |
| Å | `&#197;` | `&Aring;` | capital a, ring |
| Æ | `&#198;` | `&AElig;` | capital ae |
| Ç | `&#199;` | `&Ccedil;` | capital c, cedilla |
| È | `&#200;` | `&Egrave;` | capital e, grave accent |
| É | `&#201;` | `&Eacute;` | capital e, acute accent |
| Ê | `&#202;` | `&Ecirc;` | capital e, circumflex accent |
| Ë | `&#203;` | `&Euml;` | capital e, umlaut mark |
| Ì | `&#204;` | `&Igrave;` | capital i, grave accent |
| Í | `&#205;` | `&Iacute;` | capital i, acute accent |
| Î | `&#206;` | `&Icirc;` | capital i, circumflex accent |
| Ï | `&#207;` | `&Iuml;` | capital i, umlaut mark |
| Ð | `&#208;` | `&ETH;` | capital eth, Icelandic |
| Ñ | `&#209;` | `&Ntilde;` | capital n, tilde |
| Ò | `&#210;` | `&Ograve;` | capital o, grave accent |
| Ó | `&#211;` | `&Oacute;` | capital o, acute accent |
| Ô | `&#212;` | `&Ocirc;` | capital o, circumflex accent |
| Õ | `&#213;` | `&Otilde;` | capital o, tilde |
| Ö | `&#214;` | `&Ouml;` | capital o, umlaut mark |
| Ø | `&#216;` | `&Oslash;` | capital o, slash |
| Ù | `&#217;` | `&Ugrave;` | capital u, grave accent |
| Ú | `&#218;` | `&Uacute;` | capital u, acute accent |
| Û | `&#219;` | `&Ucirc;` | capital u, circumflex accent |
| Ü | `&#220;` | `&Uuml;` | capital u, umlaut mark |
| Ý | `&#221;` | `&Yacute;` | capital y, acute accent |
| Þ | `&#222;` | `&THORN;` | capital THORN, Icelandic |
| ß | `&#223;` | `&szlig;` | small sharp s, German |
| à | `&#224;` | `&agrave;` | small a, grave accent |
| á | `&#225;` | `&aacute;` | small a, acute accent |
| â | `&#226;` | `&acirc;` | small a, circumflex accent |
| ã | `&#227;` | `&atilde;` | small a, tilde |
| ä | `&#228;` | `&auml;` | small a, umlaut mark |
| å | `&#229;` | `&aring;` | small a, ring |
| æ | `&#230;` | `&aelig;` | small ae |
| ç | `&#231;` | `&ccedil;` | small c, cedilla |
| è | `&#232;` | `&egrave;` | small e, grave accent |
| é | `&#233;` | `&eacute;` | small e, acute accent |
| ê | `&#234;` | `&ecirc;` | small e, circumflex accent |
| ë | `&#235;` | `&euml;` | small e, umlaut mark |
| ì | `&#236;` | `&igrave;` | small i, grave accent |
| í | `&#237;` | `&iacute;` | small i, acute accent |
| î | `&#238;` | `&icirc;` | small i, circumflex accent |
| ï | `&#239;` | `&iuml;` | small i, umlaut mark |
| ð | `&#240;` | `&eth;` | small eth, Icelandic |
| ñ | `&#241;` | `&ntilde;` | small n, tilde |
| ò | `&#242;` | `&ograve;` | small o, grave accent |
| ó | `&#243;` | `&oacute;` | small o, acute accent |
| ô | `&#244;` | `&ocirc;` | small o, circumflex accent |
| õ | `&#245;` | `&otilde;` | small o, tilde |
| ö | `&#246;` | `&ouml;` | small o, umlaut mark |
| ø | `&#248;` | `&oslash;` | small o, slash |
| ù | `&#249;` | `&ugrave;` | small u, grave accent |
| ú | `&#250;` | `&uacute;` | small u, acute accent |
| û | `&#251;` | `&ucirc;` | small u, circumflex accent |
| ü | `&#252;` | `&uuml;` | small u, umlaut mark |
| ý | `&#253;` | `&yacute;` | small y, acute accent |
| þ | `&#254;` | `&thorn;` | small thorn, Icelandic |
| ÿ | `&#255;` | `&yuml;` | small y, umlaut mark |
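The relationship between a character, its entity number, and its entity name can also be looked up programmatically. The sketch below relies on Python's standard `html.entities.codepoint2name` mapping; the helper `entity_row` is a hypothetical name introduced only for this example.

```python
import html
from html.entities import codepoint2name

def entity_row(code_point: int) -> str:
    """Build one table row: character, numeric entity, named entity."""
    name = codepoint2name[code_point]
    return f"| {chr(code_point)} | &#{code_point}; | &{name}; |"

# The entity number is simply the character's code point in decimal.
for cp in (169, 233, 247):
    print(entity_row(cp))

# Both forms decode to the same character.
same_character = html.unescape("&#233;") == html.unescape("&eacute;")
print(same_character)
```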
Variants of ISO-8859-1
ISO-8859-1 is only the first part of the larger ISO 8859 family. Each part keeps the ASCII lower half (0–127) but swaps the upper half (128–255) to cover a different group of languages or scripts. The most common parts are listed below.
| Character set | Description | Covers |
|---|---|---|
| ISO-8859-1 | Latin 1 | North America, Western Europe, Latin America, the Caribbean, Africa. |
| ISO-8859-2 | Latin 2 | Eastern Europe. |
| ISO-8859-3 | Latin 3 | SE Europe, Esperanto, miscellaneous others. |
| ISO-8859-4 | Latin 4 | Scandinavia/Baltics (and others not in ISO-8859-1). |
| ISO-8859-5 | Latin/Cyrillic | The languages that use a Cyrillic alphabet such as Bulgarian, Belarusian, Russian and Macedonian. |
| ISO-8859-6 | Latin/Arabic | The languages that use the Arabic alphabet. |
| ISO-8859-7 | Latin/Greek | The modern Greek language as well as mathematical symbols derived from the Greek. |
| ISO-8859-8 | Latin/Hebrew | The languages that use the Hebrew alphabet. |
| ISO-8859-9 | Latin/Turkish | The Turkish language. Same as ISO-8859-1 except Turkish characters replace Icelandic ones. |
| ISO-8859-10 | Latin/Nordic | The Nordic languages. |
| ISO-8859-15 | Latin 9 (Latin 0) | Similar to ISO-8859-1 but replaces some less common symbols with the euro sign and some other missing characters. |
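To see how the parts share the lower half but differ in the upper half, the same byte can be decoded under several of them. This sketch assumes Python's bundled ISO 8859 codecs; the chosen bytes are arbitrary examples.

```python
# (byte, codec) pairs: one upper-half byte read under different ISO 8859 parts.
examples = [
    (b"\xa4", "iso-8859-1"),   # currency sign
    (b"\xa4", "iso-8859-15"),  # euro sign takes its place
    (b"\xe1", "iso-8859-1"),   # a with acute accent
    (b"\xe1", "iso-8859-7"),   # Greek small alpha
    (b"\xe1", "iso-8859-5"),   # Cyrillic small es
    (b"\xfe", "iso-8859-1"),   # Icelandic small thorn
    (b"\xfe", "iso-8859-9"),   # Turkish small s with cedilla
]

decoded = [raw.decode(codec) for raw, codec in examples]
for (raw, codec), char in zip(examples, decoded):
    print(f"0x{raw.hex().upper()} as {codec}: {char}")

# The ASCII half is identical in every part.
ascii_same = all(b"Hello".decode(codec) == "Hello" for _, codec in examples)
print(ascii_same)
```

Because the bytes alone do not say which part was used, a document in any of these encodings depends on a correct declaration to be read properly.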
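When no encoding is declared, browsers try to detect one from the content and otherwise fall back to a locale-dependent legacy encoding (Windows-1252 in many Western locales), so the result is not guaranteed to be UTF-8. Legacy encodings like ISO-8859-1 are supported primarily for backward compatibility with older web pages. For new projects, always save files as UTF-8 and declare it explicitly to ensure full Unicode support and cross-platform consistency.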
See also HTML Character Sets, HTML ASCII, and HTML Entities.