Python CSV Files

Learn to read and write CSV files in Python using the built-in csv module, including csv.reader, csv.writer, DictReader, and DictWriter with practical examples.

CSV (Comma-Separated Values) is one of the most common formats for exchanging tabular data: most spreadsheet applications, databases, and data-science tools can read and write it. Python's built-in csv module handles the tricky parts for you: quoting fields that contain commas, coping with newline differences across operating systems, and mapping rows to dictionaries. You do not need to install anything; csv is part of the standard library.

This chapter covers reading CSV files, writing CSV files, working with DictReader and DictWriter, handling custom delimiters, and common gotchas to avoid.

What is a CSV File?

A CSV file is a plain-text file where each line represents one row of data and each field within a row is separated by a delimiter — usually a comma. Here is a minimal example:

name,age,city
Alice,30,New York
Bob,25,London

The first row is typically a header that names each column. Subsequent rows contain the actual data. If a field value itself contains a comma, the field is wrapped in double quotes:

name,bio
Alice,"Engineer, New York"

The csv module handles this quoting transparently so you do not need to parse it by hand.
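As a quick check of that behaviour, the sketch below parses the two example lines directly from a list of strings (csv.reader accepts any iterable of strings). The variable names are illustrative only.

```python
import csv

# The same two lines as the example above, held in a list instead of a file.
lines = ['name,bio', 'Alice,"Engineer, New York"']

rows = list(csv.reader(lines))

# The quoted field comes back as one value, with the quotes removed.
print(rows[1])  # ['Alice', 'Engineer, New York']
```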

Reading CSV Files with csv.reader

csv.reader turns an open file (or any iterable of strings) into an iterator that yields each row as a Python list.

Basic pattern — read a CSV file row by row

import csv

with open("people.csv", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

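The newline="" argument is important. The csv module does its own newline handling, so the file should be opened with Python's universal newline translation turned off. Without newline="", line breaks embedded inside quoted fields may not be interpreted correctly when reading, and on Windows an extra \r is added to each line ending when writing.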

Assuming people.csv contains the example data above, the output is:

['name', 'age', 'city']
['Alice', '30', 'New York']
['Bob', '25', 'London']

Notice that all values — including the number 30 — come back as strings. The csv module does not infer data types; convert them yourself when needed.
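Returning to the newline="" argument used above: one effect that can be observed on any platform is what happens to a line break inside a quoted field. The sketch below writes a small file with Windows-style line endings to a temporary directory (the file name notes.csv is made up for this illustration) and reads it both ways.

```python
import csv
import os
import tempfile

# Hypothetical sample file: the "note" field contains an embedded \r\n.
path = os.path.join(tempfile.mkdtemp(), "notes.csv")
with open(path, "wb") as f:
    f.write(b'name,note\r\nAlice,"line one\r\nline two"\r\n')

# With newline="", the csv module sees the original characters.
with open(path, newline="") as f:
    exact = list(csv.reader(f))

# Without it, Python translates \r\n to \n before csv sees the data.
with open(path) as f:
    translated = list(csv.reader(f))

print(repr(exact[1][1]))       # 'line one\r\nline two'
print(repr(translated[1][1]))  # 'line one\nline two'
```

Both calls return two rows here, but only the newline="" version preserves the field exactly as it was stored.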

Skipping the Header Row

When you only want the data rows and not the header, call next() on the reader once to consume the first line:

Skip the header row with next()

import csv

with open("people.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)          # consume and store the header
    print("Columns:", header)
    for row in reader:             # only data rows remain
        name, age, city = row
        print(f"{name} is {age} years old and lives in {city}.")

Output:

Columns: ['name', 'age', 'city']
Alice is 30 years old and lives in New York.
Bob is 25 years old and lives in London.

Loading All Rows into a List

If you need the entire file in memory at once, pass the reader to list():

import csv

with open("people.csv", newline="") as f:
    reader = csv.reader(f)
    rows = list(reader)

print(rows[0])   # header row
print(rows[1])   # first data row

Output:

['name', 'age', 'city']
['Alice', '30', 'New York']

Writing CSV Files with csv.writer

csv.writer writes rows to any file-like object, automatically quoting fields that contain the delimiter, double quotes, or newline characters.

Write rows to a new CSV file

import csv

rows = [
    ["product", "price", "quantity"],
    ["Apple", 1.2, 50],
    ["Banana", 0.5, 100],
    ["Cherry", 3.0, 30],
]

with open("inventory.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

After running this, inventory.csv contains:

product,price,quantity
Apple,1.2,50
Banana,0.5,100
Cherry,3.0,30

Use writer.writerow(row) to write a single row, or writer.writerows(rows) to write many at once. writerow accepts any iterable of values, and writerows accepts any iterable of such rows.
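To illustrate that rows do not have to be lists, here is a minimal sketch that writes a tuple with writerow and a generator of tuples with writerows. It writes to an in-memory io.StringIO buffer rather than a real file so the result can be inspected directly; the column names are invented for the example.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for an open file
writer = csv.writer(buffer)

# A tuple works as a single row.
writer.writerow(("id", "square"))

# A generator expression works as a sequence of rows.
writer.writerows((n, n * n) for n in range(1, 4))

text = buffer.getvalue()
print(text)
```

The buffer ends up with a header line followed by three data lines, each terminated by the csv module's default \r\n.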

Why newline="" Matters When Writing

On Windows, Python opens text files in a mode that translates \n to \r\n. The csv module also writes \r\n line endings by default. Together they produce \r\r\n — a blank line between every row when the file is opened in another program. Passing newline="" suppresses the extra translation and lets csv manage line endings on its own.
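The default line ending can be seen without touching the filesystem. The sketch below writes one row to an io.StringIO buffer, which performs no newline translation by default, and then repeats the write with the lineterminator formatting parameter set to "\n" for comparison.

```python
import csv
import io

# Default dialect: the csv module terminates each row with \r\n.
default_buf = io.StringIO()
csv.writer(default_buf).writerow(["a", "b"])
print(repr(default_buf.getvalue()))  # 'a,b\r\n'

# lineterminator overrides the row ending the writer emits.
unix_buf = io.StringIO()
csv.writer(unix_buf, lineterminator="\n").writerow(["a", "b"])
print(repr(unix_buf.getvalue()))     # 'a,b\n'
```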

Reading CSV Files with csv.DictReader

DictReader maps each row to a dictionary keyed by the column names in the header row (a plain dict in Python 3.8+, an OrderedDict in Python 3.6 and 3.7). This is the preferred approach when columns have meaningful names and you want to access them by name rather than by index.

Read a CSV file as a sequence of dictionaries

import csv

with open("people.csv", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row["name"], "—", row["city"])

Output:

Alice — New York
Bob — London

DictReader reads the first row automatically as the header. You can override this by passing a fieldnames argument:

import csv

# File has no header; provide field names explicitly
with open("data_no_header.csv", newline="") as f:
    reader = csv.DictReader(f, fieldnames=["name", "age", "city"])
    for row in reader:
        print(row)

The reader.fieldnames attribute always holds the list of column names in use, which is handy for introspection before processing rows.
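One way to use fieldnames for introspection is to check that a required column exists before processing any rows. The following is a small sketch using in-memory data; the required column name is an assumption made for the example.

```python
import csv
import io

raw = "name,age,city\nAlice,30,New York\n"
reader = csv.DictReader(io.StringIO(raw))

# Accessing fieldnames reads the header row if it has not been read yet.
columns = reader.fieldnames
print(columns)  # ['name', 'age', 'city']

# Hypothetical validation step: fail early if a column is missing.
if "city" not in columns:
    raise ValueError("expected a 'city' column")

# The header has been consumed, so the next row is the first data row.
first = next(reader)
print(first["city"])  # New York
```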

Writing CSV Files with csv.DictWriter

DictWriter is the counterpart to DictReader. You provide the column names up front, then write dictionaries — the writer maps each key to the correct column.

Write a list of dictionaries to a CSV file

import csv

people = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "London"},
]

fieldnames = ["name", "age", "city"]

with open("people_out.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()          # writes the column-name row
    writer.writerows(people)

The resulting file:

name,age,city
Alice,30,New York
Bob,25,London

writeheader() uses the fieldnames list you supplied at construction. Call it once before any writerow() calls.

Handling Extra or Missing Keys

By default, DictWriter raises a ValueError if a dictionary contains a key that is not in fieldnames. You can change this behaviour with the extrasaction parameter:

writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")

Conversely, if a dictionary is missing a key, the writer writes an empty string for that field unless you supply a restval default:

writer = csv.DictWriter(f, fieldnames=fieldnames, restval="N/A")
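The two parameters can be combined. The sketch below writes one dictionary with an extra key and one with a missing key to an in-memory buffer; the email key and its value are made up for the illustration.

```python
import csv
import io

fieldnames = ["name", "age", "city"]
buffer = io.StringIO()

writer = csv.DictWriter(
    buffer,
    fieldnames=fieldnames,
    extrasaction="ignore",  # silently drop keys not in fieldnames
    restval="N/A",          # fill in missing keys with this value
)
writer.writeheader()

# "email" is not in fieldnames, so it is ignored rather than raising ValueError.
writer.writerow({"name": "Alice", "age": 30, "city": "New York",
                 "email": "alice@example.com"})

# "city" is missing, so restval is written in its place.
writer.writerow({"name": "Bob", "age": 25})

text = buffer.getvalue()
print(text)
```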

Custom Delimiters and Quoting

Real-world CSV files are not always comma-delimited. Tab-separated values (TSV) files and pipe-delimited files are common. Use the delimiter parameter to handle them:

Read a tab-separated file

import csv

with open("scores.tsv", newline="") as f:
    reader = csv.reader(f, delimiter="\t")
    for row in reader:
        print(row)

Write a pipe-delimited file

import csv

with open("output.psv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="|")
    writer.writerow(["id", "name", "score"])
    writer.writerow([1, "Alice", 98])
    writer.writerow([2, "Bob", 87])

Output file:

id|name|score
1|Alice|98
2|Bob|87

Quoting Constants

The quoting parameter controls which fields get quoted in the output:

Constant               Value   Behaviour
csv.QUOTE_MINIMAL      0       Quote only fields that contain the delimiter, quotechar, or a newline (default)
csv.QUOTE_ALL          1       Quote every field
csv.QUOTE_NONNUMERIC   2       Quote all non-numeric fields; the reader converts unquoted fields to float
csv.QUOTE_NONE         3       Never quote; raise an error if the delimiter appears in a field and no escapechar is set
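The QUOTE_NONNUMERIC behaviour is the least obvious of these because it affects both directions. The sketch below writes one row to an in-memory buffer and reads it back with the same setting; the sample values are illustrative.

```python
import csv
import io

buffer = io.StringIO()

# Writing: the string is quoted, the numbers are not.
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(["Apple", 1.2, 50])
written = buffer.getvalue()
print(repr(written))  # '"Apple",1.2,50\r\n'

# Reading: unquoted fields are converted to float, including the integer.
buffer.seek(0)
reader = csv.reader(buffer, quoting=csv.QUOTE_NONNUMERIC)
row = next(reader)
print(row)  # ['Apple', 1.2, 50.0]
```

Note that the integer 50 comes back as the float 50.0, so this setting is not a lossless round trip for integer columns.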

Force every field to be quoted

import csv
import io

output = io.StringIO()
writer = csv.writer(output, quoting=csv.QUOTE_ALL)
writer.writerow(["name", "bio"])
writer.writerow(["Alice", "Engineer, New York"])
print(output.getvalue())

Output:

"name","bio"
"Alice","Engineer, New York"

Using io.StringIO for In-Memory CSV

When you do not need to touch the filesystem — for example, in tests or when processing CSV data received from an API — use io.StringIO as the file-like object:

Parse CSV from a string

import csv
import io

raw = "name,score\nAlice,95\nBob,87\n"

reader = csv.DictReader(io.StringIO(raw))
for row in reader:
    print(row["name"], "scored", row["score"])

Output:

Alice scored 95
Bob scored 87

Common Gotchas

All Values Are Strings

csv.reader and DictReader always return strings. Convert values explicitly:

age = int(row["age"])
price = float(row["price"])
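In practice the conversion usually happens inside the read loop. Here is a minimal sketch that totals the stock value of an inventory held in a string; the data mirrors the inventory example from earlier in the chapter, and the total variable is introduced only for this illustration.

```python
import csv
import io

raw = "product,price,quantity\nApple,1.2,50\nBanana,0.5,100\n"

total = 0.0
for row in csv.DictReader(io.StringIO(raw)):
    price = float(row["price"])        # "1.2" -> 1.2
    quantity = int(row["quantity"])    # "50"  -> 50
    total += price * quantity

print(f"Stock value: {total:.2f}")  # Stock value: 110.00
```

If a field might be empty or malformed, int() and float() raise ValueError, so real code may need to catch that or validate first.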

Encoding Issues

Open files with the correct encoding to avoid UnicodeDecodeError. UTF-8 is the most common encoding for modern CSV files, but files exported from Excel may use latin-1 or cp1252:

with open("data.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
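To see what a mismatch looks like, the sketch below creates a small cp1252-encoded file in a temporary directory (standing in for a legacy export; the file name is hypothetical), then reads it once with the matching encoding and once as UTF-8.

```python
import csv
import os
import tempfile

# Create a cp1252-encoded sample file containing a non-ASCII character.
path = os.path.join(tempfile.mkdtemp(), "legacy.csv")
with open(path, "w", newline="", encoding="cp1252") as f:
    csv.writer(f).writerow(["name", "café"])

# Reading with the matching encoding recovers the text.
with open(path, newline="", encoding="cp1252") as f:
    rows = list(csv.reader(f))
print(rows)  # [['name', 'café']]

# Reading the same bytes as UTF-8 fails on the accented character.
try:
    with open(path, newline="", encoding="utf-8") as f:
        list(csv.reader(f))
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
print(utf8_failed)  # True
```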

Blank Rows

If your CSV file contains blank lines between data rows, csv.reader yields empty lists [] for them. Filter them out:

import csv

with open("data.csv", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        if not row:        # skip blank lines
            continue
        print(row)

Error Handling

Wrap file operations in a try/except block to handle missing files and permission errors gracefully:

import csv

try:
    with open("data.csv", newline="") as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)
except FileNotFoundError:
    print("Error: data.csv was not found.")
except PermissionError:
    print("Error: no permission to read data.csv.")
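Problems can also come from the CSV content itself rather than the file. The csv module reports parse failures as csv.Error, and the reader's line_num attribute tells you how far it got. The sketch below is one possible way to handle this; it uses strict=True and a deliberately malformed second line (both chosen for this illustration) so that the error is raised reliably.

```python
import csv
import io

# The second line has a stray character after the closing quote.
raw = 'name,bio\nAlice,"Engineer"x\n'

# strict=True makes the reader raise csv.Error on malformed input.
reader = csv.reader(io.StringIO(raw), strict=True)

good_rows = []
error_message = None
try:
    for row in reader:
        good_rows.append(row)
except csv.Error as exc:
    # line_num is the number of source lines read so far.
    error_message = f"line {reader.line_num}: {exc}"

print(good_rows)       # [['name', 'bio']]
print(error_message)
```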

csv vs Pandas for Large Files

The csv module is ideal for:

  • Small to medium files (up to a few hundred MB)
  • Scripts that do not have Pandas installed
  • Situations where you need fine-grained control over reading and writing

For large datasets that need complex filtering or aggregation, the third-party pandas library provides pd.read_csv() and DataFrame.to_csv(), which are usually faster for whole-file analysis and offer many more features.

Putting It All Together

The following example reads a CSV file, filters rows based on a condition, and writes the filtered results to a new file:

Filter rows and write a new CSV file

import csv

input_file = "inventory.csv"
output_file = "expensive.csv"

with open(input_file, newline="") as infile, \
     open(output_file, "w", newline="") as outfile:

    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)

    writer.writeheader()
    for row in reader:
        if float(row["price"]) >= 1.0:
            writer.writerow(row)

print(f"Filtered rows written to {output_file}.")

This pattern — open both files in the same with block, stream rows from the reader to the writer — handles files of any size without loading everything into memory at once.

Practice

Which csv module class maps each CSV row to a dictionary keyed by column names?