How to Create DataFrame from CSV in pandas - Simple Guide
To create a DataFrame from a CSV file in pandas, use the pd.read_csv('filename.csv') function. It reads the CSV file and returns its contents as a DataFrame, ready for analysis.
Syntax
The basic syntax to create a DataFrame from a CSV file is:
pd.read_csv(filepath_or_buffer, sep=',', header='infer', names=None, index_col=None)
Here:
- filepath_or_buffer: Path to the CSV file, or a file-like object.
- sep: Delimiter used in the file (default is a comma).
- header: Row number to use as column names (default is 'infer', which uses the first row when names is not given).
- names: List of column names to use when the file has no header row.
- index_col: Column(s) to set as the index.
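The parameters above can be combined in a single call. The following is a minimal sketch, not a definitive recipe: the semicolon-delimited data and the column names `id`, `name`, and `age` are made up for illustration, and `io.StringIO` stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical in-memory data: semicolon-delimited, with no header row
raw = "1;Alice;30\n2;Bob;25\n3;Charlie;35\n"

df = pd.read_csv(
    io.StringIO(raw),             # a buffer works in place of a file path
    sep=";",                      # the data uses semicolons, not commas
    header=None,                  # the first line is data, not column names
    names=["id", "name", "age"],  # illustrative column names
    index_col="id",               # use the id column as the index
)

print(df)
```

Because `id` becomes the index, the resulting DataFrame has only `name` and `age` as regular columns.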
```python
import pandas as pd

df = pd.read_csv('data.csv')
```
Example
This example shows how to read a CSV file named data.csv into a DataFrame and display its content.
```python
import pandas as pd

# Create a sample CSV file
csv_content = '''Name,Age,City
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago'''

with open('data.csv', 'w') as file:
    file.write(csv_content)

# Read the CSV file into a DataFrame
df = pd.read_csv('data.csv')

# Display the DataFrame
print(df)
```
Output
```
      Name  Age         City
0    Alice   30     New York
1      Bob   25  Los Angeles
2  Charlie   35      Chicago
```
Common Pitfalls
Common mistakes when creating DataFrames from CSV files include:
- Using the wrong file path or filename, which raises a FileNotFoundError.
- Using the wrong delimiter when the CSV uses tabs or semicolons instead of commas.
- Not specifying header=None when the file has no header row, which causes the first row of data to be treated as column names.
- Forgetting to handle encoding issues, which can cause errors with special characters.
```python
import pandas as pd

# Wrong way: assuming a comma delimiter when the file uses semicolons
# df = pd.read_csv('data_semicolon.csv')  # This will cause wrong parsing

# Right way: specify the correct delimiter
# df = pd.read_csv('data_semicolon.csv', sep=';')
```
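The missing-header pitfall can be shown the same way. This is a small sketch under assumed data: the headerless rows and the labels `Name` and `Age` are invented for the example, and `io.StringIO` is used so no file is needed.

```python
import io

import pandas as pd

# Hypothetical CSV content with no header row
raw = "Alice,30\nBob,25\nCharlie,35\n"

# Default header='infer': the first data row is consumed as column names
wrong = pd.read_csv(io.StringIO(raw))

# header=None keeps every row as data; names supplies the labels
right = pd.read_csv(io.StringIO(raw), header=None, names=["Name", "Age"])

print(list(wrong.columns), len(wrong))  # ['Alice', '30'] 2
print(list(right.columns), len(right))  # ['Name', 'Age'] 3
```

With the default settings one row of data is silently lost to the header, which is easy to miss on a large file.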
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| filepath_or_buffer | Path to the CSV file | Required |
| sep | Delimiter used in the file | ',' (comma) |
| header | Row number to use as column names | 'infer' |
| names | List of column names if no header | None |
| index_col | Column(s) to set as index | None |
| encoding | File encoding type | 'utf-8' |
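
As an illustration of the encoding parameter, the sketch below builds a small Latin-1 encoded payload in memory and reads it back. The sample names are invented, and `io.BytesIO` stands in for a file saved in that encoding; reading such bytes with the default UTF-8 decoding typically fails with a UnicodeDecodeError.

```python
import io

import pandas as pd

# Hypothetical data saved in Latin-1 rather than UTF-8
data = "Name,City\nJosé,Málaga\n".encode("latin-1")

# The default decoding is expected to reject the accented bytes
try:
    pd.read_csv(io.BytesIO(data))
except UnicodeDecodeError as exc:
    print("Default decoding failed:", exc)

# Passing the matching encoding reads the special characters correctly
df = pd.read_csv(io.BytesIO(data), encoding="latin-1")
print(df)
```

If the encoding of a file is unknown, it has to be determined from its source; guessing the wrong one can produce garbled characters without raising an error.
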
Key Takeaways
Use pd.read_csv('filename.csv') to load CSV data into a pandas DataFrame.
Specify the correct delimiter with the sep parameter if not a comma.
Set header=None if your CSV file has no header row.
Check the file path and encoding to avoid common errors.
Use index_col to set a column as the DataFrame index if needed.
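A missing file can be handled explicitly. The sketch below is one possible approach, not a pandas feature: `load_csv` is a hypothetical helper, and the file name is assumed not to exist.

```python
from pathlib import Path

import pandas as pd


def load_csv(path):
    """Hypothetical helper: return a DataFrame, or None if the file is missing."""
    try:
        return pd.read_csv(path)
    except FileNotFoundError:
        # Show the resolved path so a wrong working directory is easy to spot
        print(f"No such file: {Path(path).resolve()}")
        return None


# Assumes no file with this name exists in the working directory
result = load_csv("no_such_file_for_this_demo.csv")
```

Printing the resolved absolute path helps when the file exists but the script runs from a different working directory.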