How to Use map() in pandas for Data Transformation
In pandas, map() is used on a Series to replace or transform its values by mapping them through a dictionary, Series, or function. It helps quickly change values based on a mapping or apply a function element-wise.
Syntax
The map() method is called on a pandas Series. Its main argument, arg, can be:
- a dictionary that maps old values to new values,
- a Series, whose index is used to look up each value and whose values supply the replacements,
- or a function to apply to each element.
Example syntax:
```python
series.map(arg, na_action=None)
```
Here arg is the mapping or function, and na_action='ignore' leaves missing values (NaN) in place without passing them to the mapping.
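The Series form of arg is the least obvious of the three, so here is a minimal sketch of it. The names `codes` and `country_names` are made up for illustration; the point is that each value of the calling Series is looked up in the index of the mapping Series.

```python
import pandas as pd

# Values to translate
codes = pd.Series(["US", "FR", "JP", "US"])

# Lookup Series: the index holds the keys, the values hold the replacements
country_names = pd.Series(
    ["United States", "France"],
    index=["US", "FR"],
)

# Each value in `codes` is looked up in the index of `country_names`;
# "JP" has no matching index label, so it becomes NaN
full_names = codes.map(country_names)
print(full_names)
```

As with a dictionary, a value that has no matching label in the lookup Series comes back as NaN.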
Example
This example shows how to use map() with a dictionary to replace values, and with a function to transform values.
```python
import pandas as pd

# Create a Series
s = pd.Series(['cat', 'dog', 'rabbit', 'dog', 'cat'])

# Map using a dictionary
mapping_dict = {'cat': 'kitten', 'dog': 'puppy'}
mapped_series = s.map(mapping_dict)

# Map using a function
def add_exclamation(x):
    return x + '!'

mapped_func_series = s.map(add_exclamation)

print('Original Series:')
print(s)
print('\nMapped with dictionary:')
print(mapped_series)
print('\nMapped with function:')
print(mapped_func_series)
```
Output
```
Original Series:
0       cat
1       dog
2    rabbit
3       dog
4       cat
dtype: object

Mapped with dictionary:
0    kitten
1     puppy
2       NaN
3     puppy
4    kitten
dtype: object

Mapped with function:
0       cat!
1       dog!
2    rabbit!
3       dog!
4       cat!
dtype: object
```
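The na_action parameter from the syntax section is not exercised by the example above, so the following sketch may help. It assumes a hypothetical `prices` Series and a made-up `format_price` helper, and compares the default behaviour with na_action='ignore' when the argument is a function.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
prices = pd.Series([1.5, np.nan, 3.0])

def format_price(x):
    """Format a number as a dollar amount with two decimals."""
    return f"${x:.2f}"

# Default: the function is also called on NaN, producing the string "$nan"
formatted_all = prices.map(format_price)

# na_action="ignore": NaN is passed through without calling the function
formatted_skip = prices.map(format_price, na_action="ignore")

print(formatted_all.tolist())
print(formatted_skip.tolist())
```

With the default, the missing value is silently turned into a string, which can hide it from later isna() checks. With na_action='ignore' it stays missing.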
Common Pitfalls
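Common mistakes when using map() include:
- Passing a dictionary or Series to map() on a whole DataFrame. Dictionary and Series mappings work on a Series (a single column); DataFrame.map(), available from pandas 2.1 as the replacement for applymap(), accepts only a function.
- Not handling missing keys in the mapping dictionary, which results in NaN values.
- Confusing map() with apply(). apply() also works on both Series and DataFrames, but on a DataFrame it passes whole rows or columns to the function rather than individual elements.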
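To make the first and third pitfalls concrete, here is a small sketch with a hypothetical DataFrame (the column names are illustrative). It maps one column with a dictionary, then uses apply() for a row-wise computation that map() cannot express.

```python
import pandas as pd

df = pd.DataFrame({
    "animal": ["cat", "dog", "cat"],
    "weight_kg": [4.0, 20.0, 5.0],
})

# Dictionary mapping applies to a single column, which is a Series
df["sound"] = df["animal"].map({"cat": "meow", "dog": "woof"})

# apply() with axis=1 passes each row to the function, so it can combine
# several columns; map() only ever sees one value at a time
df["label"] = df.apply(
    lambda row: f"{row['animal']} ({row['weight_kg']:.0f} kg)", axis=1
)
print(df)
```

A reasonable rule of thumb is to reach for map() when one column is recoded value by value, and for apply() when the result depends on more than one column.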
Example of missing keys causing NaN:
```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'cherry'])
mapping = {'apple': 'red', 'banana': 'yellow'}

# Missing 'cherry' in mapping leads to NaN
result = s.map(mapping)
print(result)

# To avoid NaN, use fillna()
result_filled = result.fillna('unknown')
print(result_filled)
```
Output
```
0       red
1    yellow
2       NaN
dtype: object
0        red
1     yellow
2    unknown
dtype: object
```
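A fixed placeholder is not always what is wanted. As a sketch of two other options, reusing the same `s` and `mapping`, unmapped values can be kept as they are, or the default can be supplied during the lookup itself with dict.get().

```python
import pandas as pd

s = pd.Series(["apple", "banana", "cherry"])
mapping = {"apple": "red", "banana": "yellow"}

# Option 1: fall back to the original value where the lookup produced NaN
kept_original = s.map(mapping).fillna(s)

# Option 2: look up each value with dict.get and an explicit default
with_default = s.map(lambda x: mapping.get(x, "unknown"))

print(kept_original.tolist())
print(with_default.tolist())
```

Note that option 1 also refills any value that the dictionary deliberately maps to NaN, so option 2 is the safer choice when NaN is a legitimate mapping result.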
Quick Reference
| Usage | Description |
|---|---|
| Series.map(dict) | Replace values using a dictionary mapping |
| Series.map(function) | Apply a function to each element |
| Series.map(Series) | Map values based on matching index of another Series |
| na_action='ignore' | Skip mapping for missing values (NaN) |
| Result contains NaN | When keys are missing in the mapping dictionary |
Key Takeaways
- Use map() on a pandas Series to replace or transform values with a dictionary, Series, or function.
- Missing keys in a dictionary mapping result in NaN values; handle them with fillna() if needed.
- Dictionary and Series mappings work only on a Series; on a DataFrame, map a single column, or use DataFrame.map() (pandas 2.1+) with a function.
- Use na_action='ignore' to leave missing values untouched instead of passing them to the mapping.
- map() is different from apply(); map() is the simpler choice for element-wise mapping.