The pandas replace() method swaps specific values in a DataFrame or Series for new ones. It is useful for cleaning data (fixing typos, standardizing labels) or updating values.
replace() for value substitution in Data Analysis Python
import pandas as pd

# Create a DataFrame
data = {'Column1': ['A', 'B', 'C', 'A'], 'Column2': [1, 2, 3, 4]}
df = pd.DataFrame(data)

# General form: df.replace(to_replace, value, inplace=False)
# Parameters:
#   to_replace: value or list/dict of values to find
#   value: value or list/dict of values to replace with
#   inplace: if True, changes the original DataFrame; otherwise returns a new one

# Example call: replace every 'A' in the DataFrame with 'Z'
df_new = df.replace('A', 'Z')
print(df_new)
You can replace single values, lists of values, or use a dictionary to map old to new values.
By default, replace() returns a new DataFrame. Use inplace=True to modify the original.
import pandas as pd

data = {'Fruit': ['apple', 'banana', 'apple', 'orange']}
df = pd.DataFrame(data)

# Replace 'apple' with 'pear'
df_replaced = df.replace('apple', 'pear')
print(df_replaced)
import pandas as pd

data = {'Fruit': ['apple', 'banana', 'apple', 'orange']}
df = pd.DataFrame(data)

# Replace multiple values using a list
# (each item in the first list is replaced by the item at the same position in the second)
df_replaced = df.replace(['apple', 'banana'], ['pear', 'kiwi'])
print(df_replaced)
import pandas as pd

data = {'Fruit': ['apple', 'banana', 'apple', 'orange']}
df = pd.DataFrame(data)

# Replace using a dictionary (keys are old values, values are new values)
df_replaced = df.replace({'apple': 'pear', 'orange': 'grape'})
print(df_replaced)
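A flat dictionary like the one above is applied to every column. As a minimal sketch, assuming a hypothetical second column named Color that happens to contain the same string, a nested dictionary of the form {column: {old: new}} limits the replacement to one column:

```python
import pandas as pd

# Hypothetical data: 'orange' appears both as a fruit and as a color
df = pd.DataFrame({
    'Fruit': ['apple', 'orange'],
    'Color': ['red', 'orange'],
})

# Flat dict: 'orange' is replaced in every column
everywhere = df.replace({'orange': 'grape'})

# Nested dict: 'orange' is replaced only in the 'Fruit' column
fruit_only = df.replace({'Fruit': {'orange': 'grape'}})

print(everywhere)
print(fruit_only)
```

The nested form is worth considering whenever the same value means different things in different columns.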
import pandas as pd

data = {'Fruit': []}
df = pd.DataFrame(data)

# Replace on an empty DataFrame (no error; the result is still empty)
df_replaced = df.replace('apple', 'pear')
print(df_replaced)
This program creates a DataFrame with some misspelled fruit names. It then uses replace() with a dictionary to fix the typos. The original and corrected DataFrames are printed to show the change.
import pandas as pd

# Create a DataFrame with some fruit names and some typos
fruit_data = {'Fruit': ['apple', 'bananna', 'apple', 'oragne', 'banana']}
df = pd.DataFrame(fruit_data)

print('Original DataFrame:')
print(df)

# Replace misspelled fruits with correct names
corrections = {'bananna': 'banana', 'oragne': 'orange'}
df_corrected = df.replace(corrections)

print('\nDataFrame after replace():')
print(df_corrected)
Time complexity: typically O(n), where n is the number of elements, because each value is checked.
Space complexity: O(n) if inplace=False because it creates a new DataFrame copy.
Common mistake: forgetting to assign the result back or use inplace=True, so changes don't appear.
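The mistake and its two fixes can be sketched as follows. The variable names (unchanged, df_fixed, df_inplace) are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['apple', 'banana']})

# Mistake: the returned DataFrame is discarded, so df still contains 'apple'
df.replace('apple', 'pear')
unchanged = df['Fruit'].tolist()

# Fix 1: assign the returned DataFrame to a name
df_fixed = df.replace('apple', 'pear')

# Fix 2: modify a DataFrame in place (a copy is used here to keep df intact)
df_inplace = df.copy()
df_inplace.replace('apple', 'pear', inplace=True)

print(unchanged)
print(df_fixed)
print(df_inplace)
```

Assigning the result is usually the easier form to read and chain, while inplace=True avoids introducing a second name.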
Use replace() when you want to change specific values. Use map() or apply() for more complex transformations.
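One practical difference is how unmatched values are handled. The sketch below, using a made-up Series and mapping, shows that replace() leaves unmatched values alone, map() with a dictionary turns them into NaN, and apply() runs an arbitrary function on each value:

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'orange'])
mapping = {'apple': 'pear'}

# replace(): values not in the mapping are left as they are
replaced = s.replace(mapping)

# map(): values not in the mapping become NaN
mapped = s.map(mapping)

# apply(): calls a function on each value (here, the built-in len)
lengths = s.apply(len)

print(replaced)
print(mapped)
print(lengths)
```

So replace() suits partial substitutions, while map() with a dictionary expects the mapping to cover every value you want to keep.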
replace() changes specific values in your data easily.
You can replace single values, lists, or use dictionaries for multiple replacements.
Remember to assign the result or use inplace=True to keep changes.