
How to Use read_csv Parameters in pandas for Data Loading

Use pandas.read_csv() with parameters such as filepath_or_buffer to specify the file path, sep for the delimiter, header to choose the header row, and usecols to select columns. These parameters control how CSV data is read into a DataFrame.

📐

Syntax

The basic syntax of pandas.read_csv(), abridged to its most commonly used parameters, is:

python

pandas.read_csv(filepath_or_buffer, sep=',', header='infer', usecols=None, dtype=None, skiprows=None, nrows=None)

filepath_or_buffer: Path, URL, or file-like object that holds the CSV data.
sep: Character that separates columns, default is comma (,).
header: Row number to use as column names; the default 'infer' uses the first row (row 0) unless column names are passed explicitly.
usecols: List of column names or positions to read, or a callable that filters column names.
dtype: Data type for all columns, or a dict mapping column names to types.
skiprows: Number of lines to skip at the start of the file, or a list of 0-based line numbers to skip.
nrows: Number of data rows to read.

These parameters customize how the CSV file is loaded into a DataFrame.
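To make dtype and nrows more concrete, here is a minimal sketch. The sample data and column names are invented for illustration and do not come from any real file.

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data, invented for this sketch
csv_data = "id,score,label\n1,9.5,a\n2,7.0,b\n3,8.25,c\n"

df = pd.read_csv(
    StringIO(csv_data),
    dtype={"id": "int32", "score": "float64"},  # per-column types via a dict
    nrows=2,                                    # read only the first two data rows
)

print(df.dtypes)
print(len(df))  # 2
```

Passing a dict to dtype leaves any column that is not listed (label here) to pandas' normal type inference.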

💻

Example

This example shows how to read a CSV file with a semicolon separator, skip the first row, and select only specific columns.

python

import pandas as pd
from io import StringIO

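# The first line is a metadata note, not part of the table (added for illustration)
csv_data = '''Exported employee list
Name;Age;City;Salary
John;28;New York;70000
Anna;22;Los Angeles;80000
Mike;32;Chicago;65000'''

# Use StringIO to simulate a file object
file_like = StringIO(csv_data)

# skiprows=1 drops the metadata line, so the next line is read as the header
df = pd.read_csv(file_like, sep=';', skiprows=1, usecols=['Name', 'City'])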
print(df)

Output

   Name         City
0  John     New York
1  Anna  Los Angeles
2  Mike      Chicago

⚠️

Common Pitfalls

Common mistakes when using read_csv include:

Not setting the correct sep when the delimiter is not a comma.
Forgetting that header row numbers are counted after skiprows is applied: if skiprows removes the real header line, the first data row is promoted to column names.
Passing usecols column names that do not exist in the file, which raises a ValueError.
Not handling missing values or incorrect data types.

Always check your CSV file format before setting parameters.

python

import pandas as pd
from io import StringIO

csv_data = 'A|B|C\n1|2|3\n4|5|6'
file_like = StringIO(csv_data)

# Wrong: default sep=',' but file uses '|'
# No exception is raised; each line is read into a single column
df_wrong = pd.read_csv(file_like)
print(df_wrong.shape)
print(list(df_wrong.columns))

# Right: specify sep='|'
file_like.seek(0)  # reset pointer

df_right = pd.read_csv(file_like, sep='|')
print(df_right)

Output

(2, 1)
['A|B|C']
   A  B  C
0  1  2  3
1  4  5  6
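The remaining pitfalls from the list can be sketched the same way. This is a minimal illustration rather than a complete treatment: the sample data and the 'unknown' placeholder are invented, and na_values is a read_csv parameter that the syntax section above does not cover.

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data; 'unknown' stands in for a missing age
csv_data = "Name;Age\nJohn;28\nAnna;unknown\n"

# Pitfall: skiprows=1 removes the real header, so 'John;28' becomes the column names
df_shifted = pd.read_csv(StringIO(csv_data), sep=";", skiprows=1)
print(list(df_shifted.columns))  # ['John', '28']

# Pitfall: a usecols name that is not in the file raises ValueError
usecols_error = None
try:
    pd.read_csv(StringIO(csv_data), sep=";", usecols=["Name", "City"])
except ValueError as exc:
    usecols_error = exc
    print(f"ValueError: {exc}")

# Pitfall: unhandled placeholders leave Age as text; na_values turns them into NaN
df_na = pd.read_csv(StringIO(csv_data), sep=";", na_values=["unknown"])
print(df_na["Age"].isna().sum())  # 1
```

Once the placeholder is treated as missing, Age is parsed as a float column instead of text, so numeric operations work on it.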

📊

Quick Reference

Parameter	Description	Default
filepath_or_buffer	File path, URL, or file-like object to read	Required (no default)
sep	Delimiter character	','
header	Row number for column names	'infer' (row 0 unless names are passed)
usecols	Columns to read (list or callable)	None (all columns)
dtype	Data type for columns	None (infer)
skiprows	Number of lines to skip at start, or list of line numbers to skip	None
nrows	Number of rows to read	None (all rows)
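The table notes that usecols accepts a callable as well as a list. A minimal sketch of the callable form follows, assuming invented sample data and a hypothetical set of wanted names; pandas calls the function once per column name and keeps the columns for which it returns True.

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data, invented for this sketch
csv_data = (
    "Name,Age,City,Salary\n"
    "John,28,New York,70000\n"
    "Anna,22,Los Angeles,80000\n"
)

# Keep a column when its lowercased name is in the wanted set
wanted = {"name", "salary"}
df = pd.read_csv(StringIO(csv_data), usecols=lambda col: col.lower() in wanted)

print(list(df.columns))  # ['Name', 'Salary']
```

A callable is handy when the header casing is inconsistent across files, since a plain list of names must match the file exactly.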

✅

Key Takeaways

Specify the correct delimiter with the sep parameter to avoid parsing errors.

Use header to control which row is used as column names, especially when skipping rows.

Select only needed columns with usecols to save memory and speed up loading.

Always check your CSV file format before setting parameters to avoid common mistakes.

read_csv is flexible and powerful for loading CSV data into pandas DataFrames.