Specifying column names and index in Pandas

Introduction
We specify column names and an index to organize data clearly and access it easily, much like labeling folders and files in real life. Typical situations include:
When you read data from a file that has no header row and want to name the columns yourself.
When creating a new table and you want to set meaningful column names and row labels.
When you want to change the default row numbers to something more descriptive.
When you want to select or sort data based on specific column names or index labels.
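The first and third situations can be sketched together as follows. This is a minimal illustration, not the only approach: the CSV text is hypothetical sample data held in memory with io.StringIO instead of a real file, and the column and month labels are chosen purely for the example.

```python
import io

import pandas as pd

# Hypothetical headerless CSV content, standing in for a real file on disk.
raw = io.StringIO("100,200\n300,400\n500,600\n")

# header=None tells pandas the first line is data, not column names;
# names= supplies the column labels ourselves.
df = pd.read_csv(raw, header=None, names=["Sales", "Profit"])

# Replace the default 0, 1, 2 row numbers with descriptive labels.
df.index = ["Jan", "Feb", "Mar"]

print(df)
```

Without header=None, pandas would treat the first data row as the header and silently lose it as data.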
Syntax
Pandas
pd.DataFrame(data, columns=[list_of_column_names], index=[list_of_index_labels])
For list-of-rows data, the columns list names the columns in order, so it needs one name per data column.
The index list sets the row labels, which can be numbers, strings, or other hashable values such as dates.
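As a small sketch of what these two arguments do, the labels can be read back from the DataFrame after it is built. The variable names below are illustrative only. The last lines show one behavior worth knowing: when the data is a dictionary, the keys already name the columns, so columns= selects and orders those keys instead of renaming them.

```python
import pandas as pd

df = pd.DataFrame(
    [[10, 20], [30, 40]],
    columns=["A", "B"],
    index=["row1", "row2"],
)

# The labels are stored on the DataFrame and can be inspected directly.
column_names = df.columns.tolist()
row_labels = df.index.tolist()
print(column_names)  # ['A', 'B']
print(row_labels)    # ['row1', 'row2']

# With dictionary data, columns= picks and orders existing keys.
from_dict = pd.DataFrame({"A": [10, 30], "B": [20, 40]}, columns=["B", "A"])
print(from_dict.columns.tolist())  # ['B', 'A']
```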
Examples
Create a DataFrame with two columns named 'A' and 'B' and rows labeled 'row1' and 'row2'.
Pandas
import pandas as pd

data = [[10, 20], [30, 40]]
df = pd.DataFrame(data, columns=['A', 'B'], index=['row1', 'row2'])
print(df)
Set column names but keep default row numbers (0, 1).
Pandas
import pandas as pd

# Create DataFrame with default index but custom columns
data = [[1, 2], [3, 4]]
df = pd.DataFrame(data, columns=['X', 'Y'])
print(df)
Set row labels to 'person1' and 'person2' while columns come from dictionary keys.
Pandas
import pandas as pd

# Create DataFrame with custom index only
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data, index=['person1', 'person2'])
print(df)
Sample Program
This creates a table with sales and profit data for three months labeled Jan, Feb, and Mar.
Pandas
import pandas as pd

# Data without headers
data = [[100, 200], [300, 400], [500, 600]]

# Specify column names and index labels
df = pd.DataFrame(data, columns=['Sales', 'Profit'], index=['Jan', 'Feb', 'Mar'])

print(df)
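Once the columns and rows are labeled, those labels can be used to select and sort data. The following sketch rebuilds the same sales table and shows a few common lookups; the variable names are illustrative.

```python
import pandas as pd

data = [[100, 200], [300, 400], [500, 600]]
df = pd.DataFrame(data, columns=["Sales", "Profit"], index=["Jan", "Feb", "Mar"])

# Select a single value by row label and column name.
feb_sales = df.loc["Feb", "Sales"]

# Select one column by name; the result is a Series keyed by the index labels.
profit = df["Profit"]

# Sort rows by the values in a named column, largest first.
by_profit = df.sort_values("Profit", ascending=False)

# Sort rows by index label (alphabetical for string labels).
by_label = df.sort_index()

print(feb_sales)
print(by_profit)
print(by_label)
```

Note that sort_index orders string labels alphabetically, so the months come out as Feb, Jan, Mar and not in calendar order.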
Important Notes
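If the number of column names does not match the number of data columns in list-based data, pandas raises a ValueError.
Index labels should be unique. pandas accepts duplicate labels, but selecting a duplicated label returns every matching row, which can be confusing.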
You can change column names later using df.columns = [...].
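The three notes above can be checked with a short sketch. It assumes a reasonably recent pandas version, where a column-count mismatch surfaces as ValueError; the data and label names are made up for illustration.

```python
import pandas as pd

data = [[1, 2], [3, 4]]

# 1. One column name for two data columns: pandas refuses to build the table.
try:
    pd.DataFrame(data, columns=["X"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True

# 2. Duplicate index labels are accepted, but a lookup returns every match.
dup = pd.DataFrame(data, columns=["X", "Y"], index=["a", "a"])
print(dup.index.is_unique)  # False
print(len(dup.loc["a"]))    # 2

# 3. Rename all columns at once, or only some of them with rename().
df = pd.DataFrame(data, columns=["X", "Y"])
df.columns = ["Left", "Right"]
df = df.rename(columns={"Left": "First"})
print(df.columns.tolist())  # ['First', 'Right']
```

Assigning to df.columns replaces every name and needs a list of the right length, while rename only touches the names listed in its mapping.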
Summary
Specify column names to label each data column clearly.
Set index labels to name rows for easier access and understanding.
Labeled columns and rows make data easier to organize and work with, much like labeled folders and files.