How-to articles, tricks, and solutions about UNICODE

"Unicode Error 'unicodeescape' codec can't decode bytes...": cannot open text files in Python 3

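This error occurs when a file path is written as an ordinary string literal that contains backslashes, and Python reads a sequence such as \U (as in "C:\Users") as the start of a Unicode escape. Writing the path as a raw string (r"..."), doubling the backslashes, or using forward slashes avoids it.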
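As a minimal sketch of the three workarounds, the snippet below uses a made-up Windows path (the directory and file names are hypothetical). It also reproduces the error by compiling a string literal that lacks the raw prefix, rather than by opening a real file.

```python
# Hypothetical Windows path, written three safe ways.
raw_path = r"C:\Users\name\notes.txt"        # raw string: backslashes kept as-is
escaped_path = "C:\\Users\\name\\notes.txt"  # each backslash doubled
forward_path = "C:/Users/name/notes.txt"     # forward slashes also work on Windows

# Source text of the same literal WITHOUT the r prefix.
bad_source = r'"C:\Users\name\notes.txt"'

error_message = ""
try:
    # "\U" must be followed by 8 hex digits, so parsing this literal fails.
    compile(bad_source, "<example>", "eval")
except SyntaxError as exc:
    error_message = str(exc)

print(error_message)
```

The raw-string and doubled-backslash forms produce the same string; which one to use is mostly a matter of readability.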

Convert a Unicode string to a string in Python (containing extra symbols)

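You can use the .encode() method to convert a Unicode string (str) to bytes in a chosen encoding. If the target encoding cannot represent some of the extra symbols, pass an error handler such as "ignore" or "replace" to control what happens to them.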
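The following sketch shows a few common ways to call .encode() on a string with non-ASCII symbols. The sample text is invented for illustration, and the NFKD step is one possible way to keep base letters, not the only one.

```python
import unicodedata

text = "caf\u00e9 \u00a35"  # "café £5": contains two non-ASCII characters

# UTF-8 can represent every character, so nothing is lost.
utf8_bytes = text.encode("utf-8")

# ASCII cannot represent é or £; an error handler decides what happens.
ascii_replaced = text.encode("ascii", errors="replace")  # '?' per bad character
ascii_ignored = text.encode("ascii", errors="ignore")    # bad characters dropped

# Decompose accented letters first so the base letter survives the ASCII step.
decomposed = unicodedata.normalize("NFKD", text)
ascii_text = decomposed.encode("ascii", errors="ignore").decode("ascii")

print(utf8_bytes, ascii_replaced, ascii_ignored, ascii_text)
```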

How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?

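In PHP, utf8_decode() does not decode Unicode escape sequences; it converts a UTF-8 string to ISO-8859-1. To decode sequences such as "\u00ed", you can wrap the string in double quotes and pass it to json_decode(), or replace each sequence using preg_replace_callback() together with mb_convert_encoding().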

Unicode (UTF-8) reading and writing to files in Python

To read a file in Unicode (UTF-8) encoding in Python, you can use the built-in open() function, specifying the encoding as "utf-8".
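A minimal round-trip sketch follows. The file name and sample text are hypothetical, and a temporary directory is used so the example leaves nothing behind.

```python
import os
import tempfile

text = "na\u00efve caf\u00e9, \u65e5\u672c\u8a9e"  # sample text with non-ASCII characters

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")  # hypothetical file name

    # Write the text, encoding it as UTF-8 on the way out.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Read it back, decoding the bytes as UTF-8.
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()

    # Binary mode shows the raw bytes that were actually stored.
    with open(path, "rb") as f:
        raw_bytes = f.read()

print(round_trip == text, len(raw_bytes))
```

Passing the encoding explicitly on both calls avoids depending on the platform default, which differs between systems.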

Unicode character in PHP string

You can use Unicode characters in a PHP string by including the character directly in the string (with the source file saved as UTF-8), or, in PHP 7.0 and later, by using the \u{...} escape sequence inside a double-quoted string, with the character's code point in hexadecimal between the braces.

UnicodeDecodeError, invalid continuation byte

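In Python, a UnicodeDecodeError with the reason "invalid continuation byte" is raised when bytes are decoded as UTF-8 but contain a sequence that is not valid UTF-8, typically because the data was written in a different encoding such as Latin-1.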
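One way to trigger this error, sketched with made-up sample text, is to encode a string as Latin-1 and then decode the result as UTF-8:

```python
# Latin-1 stores "é" as the single byte 0xE9.
latin1_bytes = "caf\u00e9 au lait".encode("latin-1")

reason = ""
try:
    # In UTF-8, 0xE9 starts a three-byte sequence, but the next byte is a
    # plain space rather than a continuation byte, so decoding fails.
    latin1_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    reason = exc.reason

# Decoding with the encoding the data was actually written in succeeds.
recovered = latin1_bytes.decode("latin-1")

print(reason)
print(recovered)
```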

UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>

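This error occurs when bytes are decoded with a 'charmap'-based codec such as cp1252 (Windows-1252), which is commonly the default for open() on Windows, and the data contains a byte that has no character assigned in that code page. It usually means the file was written in another encoding, often UTF-8.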
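The sketch below reproduces the message without touching a file, by decoding UTF-8 bytes as cp1252. The sample text is invented; the closing curly quote is chosen because its UTF-8 form contains the byte 0x9D, which cp1252 leaves undefined.

```python
# UTF-8 bytes for text wrapped in curly quotes (U+201C and U+201D).
utf8_bytes = "\u201cquoted\u201d".encode("utf-8")

message = ""
try:
    # cp1252 has no character for byte 0x9D, so the charmap lookup fails.
    utf8_bytes.decode("cp1252")
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the correct encoding works.
decoded = utf8_bytes.decode("utf-8")

print(message)
print(decoded)
```

When reading a file, the equivalent fix is to pass the real encoding to open(), for example encoding="utf-8".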

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)

This error is raised when trying to encode a Unicode string using the ASCII codec, and the string contains a character that is not within the ASCII range (0-127).
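As a small illustration with invented sample text, the snippet below triggers the error with a non-breaking space (U+00A0) and shows two possible ways around it. Which one is appropriate depends on whether the output really has to be ASCII.

```python
text = "price:\u00a0100"  # contains a non-breaking space (U+00A0)

reason = ""
try:
    text.encode("ascii")  # U+00A0 is outside the ASCII range 0-127
except UnicodeEncodeError as exc:
    reason = exc.reason

# Option 1: encode with UTF-8, which can represent every character.
utf8_bytes = text.encode("utf-8")

# Option 2: replace the offending character before encoding as ASCII.
ascii_bytes = text.replace("\u00a0", " ").encode("ascii")

print(reason, utf8_bytes, ascii_bytes)
```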

What does the 'b' character do in front of a string literal?

The 'b' character in front of a string literal indicates that the string is a bytes literal.
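A short sketch of the difference between a bytes literal and an ordinary string literal in Python 3 (the variable names are only for illustration):

```python
data = b"abc"  # bytes literal: a sequence of integers in the range 0-255
text = "abc"   # ordinary literal: a str of Unicode characters

first_byte = data[0]  # indexing bytes gives an int
first_char = text[0]  # indexing str gives a one-character str

# Converting between the two always goes through an encoding.
as_text = data.decode("ascii")
as_bytes = text.encode("ascii")

print(type(data).__name__, type(text).__name__, first_byte, first_char)
```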