UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c
This error occurs when bytes that are being decoded as UTF-8 contain a sequence that is not valid UTF-8. It usually means the data was written in a different encoding. Here's an example of a Python script that could raise this error:
try:
    with open('file.txt', 'r', encoding='utf8') as f:
        data = f.read()
except UnicodeDecodeError as e:
    print(f'Error decoding file: {e}')
In this example, the script attempts to open and read a file named "file.txt" using the UTF-8 encoding. If the file contains a byte sequence that is not valid UTF-8, a UnicodeDecodeError is raised and the error message is printed.
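To see what the exception reports, here is a minimal sketch that reproduces the error without a file. The sample bytes are a made-up assumption (the word "cœur" as a Windows-1252 editor might save it), not something taken from a real "file.txt". It uses the standard `start`, `end`, `reason` and `object` attributes of UnicodeDecodeError to locate the offending byte:

```python
# Hypothetical sample data: 0x9c is "œ" in Windows-1252 but is not valid on its own in UTF-8.
raw = b"c\x9cur"

offset = None
reason = None
bad_byte = None

try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    # The exception records where decoding failed and why.
    offset = e.start
    reason = e.reason
    bad_byte = e.object[e.start:e.end]
    print(f"{reason} at offset {offset}: {bad_byte!r}")
```

Knowing the offset and the byte value often makes it easier to guess which encoding the file was really written in.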
To work around this, you can open the file in binary mode ('rb') and decode the bytes yourself with the utf-8 codec, passing 'ignore' as the error handler:
with open('file.txt', 'rb') as f:
    raw = f.read()
# With errors='ignore', undecodable bytes are dropped, so decode()
# does not raise UnicodeDecodeError and no try/except is needed.
data = raw.decode('utf-8', errors='ignore')
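This decodes the file as UTF-8 and silently drops any bytes that are not valid UTF-8, so the resulting text may be missing characters. If the file was actually saved in a different encoding, opening it with that encoding preserves the content instead.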
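The trade-off between error handlers can be compared side by side. This is only a sketch: the sample bytes are hypothetical, and the choice of cp1252 (Windows-1252) is an assumption about where a stray 0x9c byte might come from, not a diagnosis of any particular file.

```python
# Hypothetical sample data: "cœur" encoded as Windows-1252.
raw = b"c\x9cur"

# 'ignore' drops the invalid byte entirely (data is lost without a trace).
ignored = raw.decode("utf-8", errors="ignore")

# 'replace' substitutes U+FFFD, so the damage stays visible in the text.
replaced = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes were actually written in loses nothing.
# cp1252 is assumed here purely for illustration.
recoded = raw.decode("cp1252")

print(repr(ignored), repr(replaced), repr(recoded))
```

The same `errors` argument is accepted by `open()` in text mode, for example `open('file.txt', encoding='utf-8', errors='replace')`, which avoids the separate binary read.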