
Creating new columns in Pandas - Visual Walkthrough

Concept Flow - Creating new columns
Start with DataFrame
Define new column values
Assign new column to DataFrame
DataFrame updated with new column
End
We start with a DataFrame, define values for a new column, assign it, and update the DataFrame.
Execution Sample
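import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})  # start with a single column 'A'
df['B'] = df['A'] * 2                # double 'A' and assign the result as column 'B'
print(df)                            # display both columns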
This code creates a new column 'B' by doubling the values in column 'A'.
Execution Table
| Step | Action | DataFrame State | New Column 'B' Values |
|------|--------|-----------------|-----------------------|
| 1 | Create DataFrame with column 'A' | {'A': [1, 2, 3]} | N/A |
| 2 | Calculate new column 'B' as df['A'] * 2 | Same as step 1 | [2, 4, 6] |
| 3 | Assign new column 'B' to DataFrame | {'A': [1, 2, 3], 'B': [2, 4, 6]} | [2, 4, 6] |
| 4 | Print DataFrame | Displays both columns | [2, 4, 6] |
💡 New column 'B' added successfully; execution ends after printing DataFrame.
Variable Tracker
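| Variable | Start | After Step 1 | After Step 2 | After Step 3 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| df | undefined | {'A': [1, 2, 3]} | {'A': [1, 2, 3]} | {'A': [1, 2, 3], 'B': [2, 4, 6]} | {'A': [1, 2, 3], 'B': [2, 4, 6]} |
| df['B'] | undefined | undefined | undefined (values [2, 4, 6] computed, not yet assigned) | [2, 4, 6] | [2, 4, 6] |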
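The tracker can be reproduced by recording the column names after each step. This is a minimal sketch that assumes the same one-column DataFrame as the Execution Sample; the names `snapshots` and `computed` are illustrative and do not come from the original walkthrough.

```python
import pandas as pd

snapshots = []  # illustrative: column names recorded after each step

# Step 1: create the DataFrame with column 'A'.
df = pd.DataFrame({'A': [1, 2, 3]})
snapshots.append(list(df.columns))

# Step 2: compute the new values; this yields a separate Series.
computed = df['A'] * 2
snapshots.append(list(df.columns))

# Step 3: assign the Series, which adds column 'B' to df.
df['B'] = computed
snapshots.append(list(df.columns))

for step, cols in enumerate(snapshots, start=1):
    print(f"after step {step}: {cols}")
```

The column list changes only between steps 2 and 3, which is where the assignment happens.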
Key Moments - 2 Insights
Why does the new column 'B' appear only after assignment and not before?
Because calculating df['A'] * 2 creates a new Series but does not add it to df. The column exists only once that Series is assigned to df['B'] (compare steps 2 and 3 in the Execution Table).
Can we create a new column using a list of values directly?
Yes, as long as the list has exactly one value per DataFrame row. An assignment such as df['B'] = [2, 4, 6] works the same way as step 3.
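The second insight can be tried directly. The sketch below assumes the same three-row DataFrame as the Execution Sample and assigns a plain Python list instead of a calculated Series.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Three rows, three list values: the list is accepted as column 'B'.
df['B'] = [2, 4, 6]

print(df)
```

The list values are placed in row order, so the first value lands in the first row.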
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table at step 2, what are the values calculated for the new column 'B'?
A. N/A
B. [1, 2, 3]
C. [2, 4, 6]
D. [3, 6, 9]
💡 Hint
Check the "New Column 'B' Values" column in the Execution Table row for step 2.
At which step does the DataFrame actually get updated with the new column 'B'?
A. Step 1
B. Step 3
C. Step 2
D. Step 4
💡 Hint
Look at the "DataFrame State" column in the Execution Table to see when 'B' appears.
If we assign df['B'] = [10, 20], what will happen?
A. Error due to length mismatch
B. New column 'B' with values [10, 20] added
C. Column 'B' will have NaN for missing rows
D. Column 'B' will be empty
💡 Hint
Recall that the list length must match the number of DataFrame rows (see the Key Moments answer about list length).
Concept Snapshot
Creating new columns in pandas:
- Use df['new_col'] = values
- Values can be a Series, a list, or the result of a calculation
- A list must have exactly one value per DataFrame row
- New column appears after assignment
- Useful for adding derived data
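The points above can be sketched in one short script. The column names `from_calc`, `from_list`, `from_series`, and `bad` are hypothetical and chosen only for illustration. The sketch assumes the default integer index, so the assigned Series lines up with the DataFrame rows.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# New columns from a calculation, a list, and a Series (illustrative names).
df['from_calc'] = df['A'] + 10
df['from_list'] = ['x', 'y', 'z']
df['from_series'] = pd.Series([0.5, 1.5, 2.5])

# A list of the wrong length raises ValueError, and no column is added.
try:
    df['bad'] = [10, 20]
except ValueError as exc:
    print('ValueError:', exc)

print(list(df.columns))
```

Catching the error here is only for demonstration; in practice a length mismatch usually signals a bug in how the values were built.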
Full Transcript
We start with a DataFrame that has one column, 'A'. We calculate new values by doubling 'A', which produces a temporary Series. The DataFrame is updated to include the new column only after these values are assigned to df['B']. Printing then shows both columns. Beginners often wonder why the new column isn't visible before assignment; they also need to make sure that a list of new values has one entry per DataFrame row. Assignment of this kind makes it easy to add derived data as new columns.