
Selecting columns by name in Pandas - Step-by-Step Execution

Concept Flow - Selecting columns by name
Start with DataFrame
Specify column name(s)
Access columns using df[column_name]
Return selected column(s) as Series or DataFrame
Use or display selected data
Start with a DataFrame, specify the column name(s) you want, access them using the DataFrame syntax, and get the selected data back.
Execution Sample
Pandas
import pandas as pd

df = pd.DataFrame({
  'Name': ['Alice', 'Bob'],
  'Age': [25, 30],
  'City': ['NY', 'LA']
})

selected = df['Age']
This code creates a DataFrame and selects the 'Age' column by its name.
Execution Table
Step | Action | Code | Result Type | Output
1 | Create DataFrame | df = pd.DataFrame({...}) | DataFrame | {'Name': ['Alice', 'Bob'], 'Age': [25, 30], 'City': ['NY', 'LA']}
2 | Select single column by name | selected = df['Age'] | Series | [25, 30]
3 | Select multiple columns by list | selected = df[['Name', 'City']] | DataFrame | {'Name': ['Alice', 'Bob'], 'City': ['NY', 'LA']}
4 | Attempt to select a non-existing column | selected = df['Salary'] | KeyError | KeyError: 'Salary'
5 | End | - | - | Selection complete or error raised
💡 Selection stops after returning the requested column(s) or raising an error if column not found.
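The steps in the execution table can be reproduced with a short script. This is a minimal sketch: the variable names `ages` and `name_city` are illustrative and not part of the original sample, which reuses `selected` for every step.

```python
import pandas as pd

# Step 1: build the same DataFrame as the execution sample
df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [25, 30],
    "City": ["NY", "LA"],
})

# Step 2: a single column label returns a Series
ages = df["Age"]
print(type(ages).__name__)       # Series
print(ages.tolist())             # [25, 30]

# Step 3: a list of column labels returns a DataFrame
name_city = df[["Name", "City"]]
print(type(name_city).__name__)  # DataFrame
print(list(name_city.columns))   # ['Name', 'City']

# Step 4: a label that is not a column raises KeyError
try:
    df["Salary"]
except KeyError as exc:
    print("KeyError raised for", exc)
```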
Variable Tracker
Variable | Start | After Step 1 | After Step 2 | After Step 3 | After Step 4 | Final
df | undefined | {Name, Age, City} | {Name, Age, City} | {Name, Age, City} | {Name, Age, City} | {Name, Age, City}
selected | undefined | undefined | [25, 30] | {Name: ['Alice', 'Bob'], City: ['NY', 'LA']} | unchanged (KeyError raised, assignment does not complete) | last successfully selected data
Key Moments - 2 Insights
Why does selecting a single column return a Series but selecting multiple columns returns a DataFrame?
Selecting with a single label, as in df['Age'], returns a Series, the one-dimensional pandas structure that holds one column of data. Selecting with a list of labels, as in df[['Name', 'City']], returns a DataFrame, because the result can hold several columns; this is true even when the list contains only one name. See execution_table rows 2 and 3.
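The return type depends on whether you pass a label or a list, not on how many columns come back. A small sketch, using the illustrative names `as_series` and `as_frame`:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [25, 30],
    "City": ["NY", "LA"],
})

as_series = df["Age"]    # single label -> Series (one-dimensional)
as_frame = df[["Age"]]   # one-element list -> DataFrame (two-dimensional)

print(type(as_series).__name__, as_series.shape)  # Series (2,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (2, 1)
```

Use the list form when later code expects a DataFrame, for example when it reads `.columns`.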
What happens if you try to select a column name that does not exist?
Pandas raises a KeyError because the column is not found in the DataFrame. This is shown in execution_table row 4.
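One way to deal with a possibly missing column is sketched below. The names `wanted` and `fallback` are hypothetical; the membership test on `df.columns` and `DataFrame.get` are standard pandas features. Note that when the lookup raises, the assignment never runs, so `selected` keeps its earlier value.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [25, 30],
    "City": ["NY", "LA"],
})

selected = df["Age"]

# The lookup fails before the assignment happens,
# so `selected` still refers to the 'Age' Series afterwards.
try:
    selected = df["Salary"]
except KeyError:
    pass

print(selected.tolist())  # [25, 30]

# Option 1: check that the column exists before selecting it
wanted = "Salary"
if wanted in df.columns:
    selected = df[wanted]
else:
    print(f"Column {wanted!r} not found; available: {list(df.columns)}")

# Option 2: DataFrame.get returns a default (None here) instead of raising
fallback = df.get("Salary")
print(fallback)  # None
```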
Visual Quiz - 3 Questions
Test your understanding
According to the execution table, what is the type of 'selected' after step 2?
A. Series
B. List
C. DataFrame
D. Dictionary
💡 Hint
Check the 'Result Type' column in execution_table row 2.
At which step does the code raise an error for selecting a non-existing column?
A. Step 2
B. Step 3
C. Step 4
D. Step 5
💡 Hint
Look for 'KeyError' in the 'Result Type' column in execution_table.
If you want to select columns 'Name' and 'Age' together, which code would you use?
A. df['Name', 'Age']
B. df[['Name', 'Age']]
C. df['Name' & 'Age']
D. df['Name'] + df['Age']
💡 Hint
See execution_table row 3 for selecting multiple columns.
Concept Snapshot
Selecting columns by name in pandas:
- Use df['col'] for a single column (returns a Series)
- Use df[['col1', 'col2']] for multiple columns (returns a DataFrame)
- Column names must exist, otherwise pandas raises a KeyError
- Useful for focusing on specific data in a DataFrame
Full Transcript
This visual execution shows how to select columns by name in pandas. We start with a DataFrame containing columns 'Name', 'Age', and 'City'. Selecting a single column like 'Age' returns a Series with that column's data. Selecting multiple columns using a list returns a DataFrame with those columns. If a column name does not exist, pandas raises a KeyError. The variable tracker shows how 'df' stays the same while 'selected' changes depending on the selection. Key moments clarify why single vs multiple column selection returns different types and what happens on errors. The quiz tests understanding of these steps and correct syntax for multiple column selection.