UnicodeDecodeError: 'utf8' codec can't decode byte 0x9c

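This error occurs when bytes that are being decoded as UTF-8 contain a sequence that is not valid UTF-8. Byte 0x9c is a continuation byte, so it is only valid as part of a multi-byte character and cannot appear on its own. Here's an example of a Python script that could hit this error and report it: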

try:
    with open('file.txt', 'r', encoding='utf8') as f:
        data = f.read()
except UnicodeDecodeError as e:
    print(f'Error decoding file: {e}')

In this example, the script attempts to open and read the contents of a file named "file.txt" using the UTF-8 encoding. If the file contains a byte sequence that is not valid UTF-8, the UnicodeDecodeError will be raised and the error message will be printed.
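To see the failure without needing a file on disk, the same decoding step can be applied to a bytes literal. This is a minimal sketch: the sample bytes are made up for illustration, and the attributes used (start, reason) are standard on UnicodeDecodeError.

```python
# Hypothetical sample data: 0x9c is not valid on its own in UTF-8.
raw = b"c\x9cur"

err = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    err = e
    # e.start is the index of the first offending byte in the input.
    print(f"Bad byte {raw[e.start]:#04x} at position {e.start}: {e.reason}")
```

The position reported by the exception can help locate the offending bytes in the original file.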


To work around this, you can open the file in binary mode ('rb') and decode the bytes yourself with the utf-8 codec, passing 'ignore' as the error handler:

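# Binary mode returns raw bytes, so nothing is decoded while reading.
with open('file.txt', 'rb') as f:
    raw = f.read()

# errors='ignore' skips invalid bytes instead of raising UnicodeDecodeError,
# so no try/except is needed around the decode call.
data = raw.decode('utf-8', errors='ignore')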

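This decodes the file as UTF-8 and silently drops any invalid bytes, so no exception is raised, but whatever characters those bytes represented are lost. If you would rather see where the problems were, use errors='replace' instead, which substitutes the replacement character U+FFFD for each invalid byte.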
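If the file was simply written in a different encoding, decoding it with that encoding keeps the data intact. The sketch below tries a list of candidate encodings before falling back to lossy decoding. The function name decode_with_fallback and the choice of cp1252 as the second candidate are assumptions made for illustration, not something the error itself tells you.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in order (hypothetical helper)."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


# Hypothetical sample: in cp1252, byte 0x9c is the character "œ".
text = decode_with_fallback(b"c\x9cur")
print(text)
```

Note that a single-byte encoding such as cp1252 accepts almost any input, so a successful decode does not prove the guess was right. Check the result against what the file is supposed to contain.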