Working with UTF-8 encoding in Python source

Here is a code snippet that encodes a Python string to UTF-8 bytes and decodes those bytes back to a string:

# Encode a string in UTF-8
string = "Hello, 世界"
encoded_string = string.encode("utf-8")
print(encoded_string) # b'Hello, \xe4\xb8\x96\xe7\x95\x8c'

# Decode a UTF-8 encoded string
decoded_string = encoded_string.decode("utf-8")
print(decoded_string) # Hello, 世界

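In this example, the encode() method converts a string (str) to a UTF-8 encoded byte sequence (bytes), and the decode() method converts that byte sequence back to a string. The non-ASCII characters 世 and 界 each take three bytes in UTF-8, which is why they appear as three \x escapes apiece in the printed bytes.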
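As a rough illustration of the same round trip, the sketch below compares character and byte counts and shows what happens when the bytes are cut off in the middle of a character. The variable names (text, data, lenient and so on) are illustrative only and do not come from the snippet above.

```python
text = "Hello, 世界"
data = text.encode("utf-8")

# Each ASCII character takes one byte; each of the two CJK characters takes three.
char_count = len(text)   # 9 characters
byte_count = len(data)   # 13 bytes

# Decoding with the same codec restores the original string.
round_trip_ok = data.decode("utf-8") == text

# Dropping the last byte cuts the final character in half, so strict
# decoding (the default) raises UnicodeDecodeError.
truncated = data[:-1]
try:
    truncated.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD for the undecodable bytes instead of raising.
lenient = truncated.decode("utf-8", errors="replace")

print(char_count, byte_count, round_trip_ok, strict_failed)
print(lenient)
```

Strict decoding is usually the safer default, since it surfaces corrupt input early; errors="replace" is an option when a lossy but readable result is acceptable.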


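You can also read and write files with UTF-8 encoding by passing encoding="utf-8" to open(). Passing it explicitly is worthwhile, because the default encoding otherwise depends on the platform and locale: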

with open("file.txt", "w", encoding="utf-8") as f:
    f.write("Hello, 世界")

with open("file.txt", "r", encoding="utf-8") as f:
    print(f.read()) # Hello, 世界
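To check that the file really contains UTF-8 bytes, one option is to read it back in binary mode and compare against the encoded string. This is a minimal sketch: the temporary directory and the path variable are assumptions added so the example does not touch real files.

```python
import os
import tempfile

text = "Hello, 世界"

# Work inside a temporary directory that is removed automatically afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.txt")  # hypothetical file location

    # Text mode with an explicit encoding: str is encoded to UTF-8 on write.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Binary mode skips decoding and returns the raw bytes stored on disk.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode with the same encoding decodes the bytes back to str.
    with open(path, "r", encoding="utf-8") as f:
        read_back = f.read()

print(raw)
print(read_back)
```

The bytes read in binary mode should match text.encode("utf-8"), and the text read back should equal the original string.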