How to Use assign in pandas: Add or Modify Columns Easily
Use the assign method in pandas to add new columns or modify existing ones in a DataFrame. It returns a new DataFrame with the changes, leaving the original unchanged. You pass column names as keywords and their values as expressions or functions.
Syntax
The assign method syntax is:
DataFrame.assign(**kwargs)
Here, each keyword is a column name, and its value is the new column's contents: a scalar, a list, a Series, or a function that takes the DataFrame and returns the values. The method returns a new DataFrame with these columns added or updated.
python
df.assign(new_col=value_expression)
Example
This example shows how to add a new column and modify an existing one using assign. The original DataFrame stays the same.
python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

# Add new column 'C' as sum of 'A' and 'B', and modify 'B' by doubling
new_df = df.assign(C=df['A'] + df['B'], B=lambda x: x['B'] * 2)

print("Original DataFrame:\n", df)
print("\nNew DataFrame with assign:\n", new_df)
Output
Original DataFrame:
    A  B
0  1  4
1  2  5
2  3  6

New DataFrame with assign:
    A   B  C
0  1   8  5
1  2  10  7
2  3  12  9
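The function form used for B in the example matters most when assign is called on an intermediate result that has no variable name of its own. The following is a minimal sketch of that situation. The filter step and the column name B_share are assumptions made up for illustration and are not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Hypothetical filter step: keep only rows where 'A' is greater than 1.
# The lambda receives the filtered frame, not the original df, so the
# share is computed only over the rows that survived the filter.
filtered = df[df['A'] > 1].assign(B_share=lambda x: x['B'] / x['B'].sum())

print(filtered)
```

Writing df['B'] / df['B'].sum() here instead would compute the share over all three original rows, which is probably not what a reader of the chain expects.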
Common Pitfalls
Common mistakes when using assign include:
- Expecting assign to modify the original DataFrame (it returns a new one).
- Passing positional arguments instead of keyword arguments.
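- Using column names that conflict with existing DataFrame methods, such as count. assign accepts them, but the column is then reachable only as df['count'], because df.count still refers to the method.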
Always assign the result back to a variable if you want to keep the changes.
python
import pandas as pd

df = pd.DataFrame({'A': [1, 2]})

# Wrong: This does not change df
# df.assign(B=[3, 4])

# Right: Assign result to a new variable or overwrite
new_df = df.assign(B=[3, 4])
print(new_df)
Output
   A  B
0  1  3
1  2  4
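The other two pitfalls can be sketched the same way. This is an illustrative example only: the column names 'total score' and count are made up for the demonstration, and unpacking a dictionary with ** is shown as one possible workaround for names that cannot be written as keywords.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2]})

# Positional arguments are rejected: assign accepts keyword arguments only.
try:
    df.assign([3, 4])
    positional_error = None
except TypeError as exc:
    positional_error = exc

# A name that is not a valid Python identifier cannot be written as a keyword,
# but it can be passed by unpacking a dictionary.
spaced = df.assign(**{'total score': [3, 4]})

# A name that matches a DataFrame method still works with assign,
# but attribute access keeps returning the method, so use brackets.
counted = df.assign(count=[10, 20])

print(type(positional_error).__name__)
print(list(spaced.columns))
print(counted['count'].tolist())
print(callable(counted.count))
```

Bracket access (counted['count']) always returns the column, so it is the safer habit whenever a column name might shadow a method.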
Quick Reference
Summary tips for using assign:
- Use keyword arguments to add or update columns.
- Values can be scalars, lists, Series, or functions.
- Functions receive the DataFrame and return the new column values.
- Original DataFrame is not changed; assign returns a new one.
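The tips above can be seen together in one call. This is a small sketch with made-up column names (scalar, from_list, from_series, from_func), one for each kind of value the list mentions.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

result = df.assign(
    scalar=0,                           # a scalar is repeated for every row
    from_list=[10, 20, 30],             # a list must match the number of rows
    from_series=pd.Series([7, 8, 9]),   # a Series is aligned on the index
    from_func=lambda x: x['A'] * 100,   # a function receives the DataFrame
)

print(result)
print(list(df.columns))  # the original still has only column 'A'
```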
Key Takeaways
The assign method adds or modifies columns and returns a new DataFrame.
Always assign the result of assign to a variable to keep changes.
You can pass functions to assign to compute new column values dynamically.
Original DataFrame remains unchanged after using assign.
Use keyword arguments with column names and values or functions.