Selecting columns in Data Analysis Python - Step-by-Step Execution

Concept Flow - Selecting columns
Start with DataFrame → Choose column(s) by name → Extract column data → Use extracted data for analysis or display → End
Start with a table of data, pick one or more columns by their names, get those columns out, then use them for your work.
Execution Sample
import pandas as pd

df = pd.DataFrame({'A':[1,2], 'B':[3,4], 'C':[5,6]})
col = df['B']
cols = df[['A','C']]
Create a small table and select one column and then two columns from it.
Execution Table
| Step | Code Line | Action | Result |
|------|-----------|--------|--------|
| 1 | import pandas as pd | Load pandas library | pandas ready |
| 2 | df = pd.DataFrame({'A':[1,2], 'B':[3,4], 'C':[5,6]}) | Create DataFrame with columns A, B, C | df with 3 columns and 2 rows |
| 3 | col = df['B'] | Select column 'B' | col is Series: [3, 4] |
| 4 | cols = df[['A','C']] | Select columns 'A' and 'C' | cols is DataFrame with columns A and C |
| 5 | - | End of selection | Variables col and cols hold selected data |
💡 All requested columns selected, execution ends.
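The results in the Execution Table can be checked by running the same steps and inspecting what comes back. This is a minimal sketch: the DataFrame and the two selections are taken from the sample, while the inspection calls (`type`, `tolist`, `shape`) are added here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
col = df['B']            # Step 3: one column
cols = df[['A', 'C']]    # Step 4: two columns

print(type(col).__name__)    # Series
print(col.tolist())          # [3, 4]
print(type(cols).__name__)   # DataFrame
print(list(cols.columns))    # ['A', 'C']
print(cols.shape)            # (2, 2) -> 2 rows, 2 columns
```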
Variable Tracker
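| Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| df | undefined | {A: [1,2], B: [3,4], C: [5,6]} | {A: [1,2], B: [3,4], C: [5,6]} | {A: [1,2], B: [3,4], C: [5,6]} | {A: [1,2], B: [3,4], C: [5,6]} |
| col | undefined | undefined | [3, 4] | [3, 4] | [3, 4] |
| cols | undefined | undefined | undefined | {A: [1,2], C: [5,6]} | {A: [1,2], C: [5,6]} |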
Key Moments - 3 Insights
Why do we use single brackets for one column but double brackets for multiple columns?
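Single brackets with one column name select that column and return a Series (see Step 3). To select several columns, we pass a list of names inside the indexing brackets, which is why it looks like double brackets; the result is a DataFrame (see Step 4).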
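One way to see what the double brackets are doing is to build the list of names first and pass it in. The sketch below assumes the same df as the sample; the variable names `wanted`, `as_series`, and `as_frame` are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

wanted = ['A', 'C']      # an ordinary Python list of column names
cols = df[wanted]        # equivalent to df[['A', 'C']]

as_series = df['B']      # a single name gives a Series
as_frame = df[['B']]     # a one-item list gives a one-column DataFrame

print(type(as_series).__name__)   # Series
print(type(as_frame).__name__)    # DataFrame
print(as_frame.shape)             # (2, 1)
```

So the return type depends on whether a name or a list of names is passed, not on how many columns end up selected.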
What type of data is returned when selecting a single column?
A single column selection returns a pandas Series, which is like a list with labels (Step 3 shows col as Series).
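The "labels" are easy to see on the col from Step 3. A small sketch, assuming the sample df with its default row labels 0 and 1:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})
col = df['B']

print(col.name)          # B  (the Series keeps the column name)
print(list(col.index))   # [0, 1]  (row labels carried over from df)
print(col.loc[0])        # 3  (value at row label 0)
print(col.tolist())      # [3, 4]  (plain Python list, labels dropped)
```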
Can we select columns that do not exist in the DataFrame?
No. Selecting a column name that is not in the DataFrame raises a KeyError. The sample code only selects 'A', 'B', and 'C', all of which were created in Step 2.
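To see the error without stopping the program, the selection can be wrapped in try/except. This is only a sketch: the column name 'Z' and the `requested` list are hypothetical, chosen because they are not part of the sample df.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

caught = False
try:
    df['Z']                       # 'Z' is not a column in df
except KeyError as err:
    caught = True
    print('No such column:', err)

# Checking names against df.columns first avoids the exception.
requested = ['A', 'Z']            # hypothetical list of requested names
available = [name for name in requested if name in df.columns]
print(available)                  # ['A']
```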
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, what is the value of 'col' after Step 3?
A. [1, 2]
B. [5, 6]
C. [3, 4]
D. Undefined
💡 Hint
Check the 'Result' column for Step 3 in the Execution Table.
At which step does the variable 'cols' get assigned a value?
A. Step 3
B. Step 4
C. Step 2
D. Step 5
💡 Hint
Look for the 'cols' assignment in the 'Code Line' column of the Execution Table.
If we select df['A'] instead of df['B'], what would 'col' contain after Step 3?
A. [1, 2]
B. [5, 6]
C. [3, 4]
D. Error
💡 Hint
Refer to the Variable Tracker for the values of the columns in df.
Concept Snapshot
Selecting columns in pandas:
- Use df['col'] for one column (returns Series)
- Use df[['col1','col2']] for multiple columns (returns DataFrame)
- Column names must exist in df
- Result can be used for analysis or display
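As a quick illustration of the last point, a selected Series or DataFrame can be passed straight into a calculation. The sketch below uses `sum` and `mean` as examples; any other pandas method could stand in their place.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

total_b = df['B'].sum()          # Series -> single number: 3 + 4
means = df[['A', 'C']].mean()    # DataFrame -> one mean per column

print(total_b)       # 7
print(means['A'])    # 1.5
print(means['C'])    # 5.5
```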
Full Transcript
We start with a DataFrame that has columns A, B, and C. We select one column, 'B', using single brackets, which gives us a Series with values 3 and 4. Then we select two columns, 'A' and 'C', using double brackets, which gives us a smaller DataFrame with just those columns. Variables 'col' and 'cols' hold these selections. Single brackets return a Series; double brackets return a DataFrame. Selecting a column that does not exist raises a KeyError. This process helps us focus on specific parts of the data for analysis.