
Adding and removing columns in Data Analysis Python

Introduction

Adding or removing columns changes which data is available for analysis. It lets you derive new information from existing values, focus on what matters, or clean up the dataset. Common situations include:

You want to create a new column based on existing data, like calculating age from birth year.
You need to remove columns that are not useful or have too many missing values.
You want to add a column to label data, like marking sales as 'high' or 'low'.
You want to simplify the dataset by keeping only relevant columns for your analysis.
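The four situations above can be sketched in pandas roughly as follows. This is only an illustration: the column names (birth_year, sales, notes), the reference year 2024, the threshold of 1000 for 'high' sales, and the 50% missing-value cutoff are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical data; every column name and value is made up for illustration.
df = pd.DataFrame({
    'name': ['Ann', 'Ben', 'Cy', 'Dee'],
    'birth_year': [1990, 1985, 2000, 1978],
    'sales': [1200, 800, 1500, 400],
    'notes': [None, None, None, 'call back'],
})

# 1. New column from existing data (2024 is an assumed reference year).
df['age'] = 2024 - df['birth_year']

# 2. Label column: 'high' where sales reach an assumed threshold of 1000.
df['sales_level'] = df['sales'].apply(lambda s: 'high' if s >= 1000 else 'low')

# 3. Remove columns where more than half of the values are missing.
missing_share = df.isna().mean()
mostly_missing = missing_share[missing_share > 0.5].index
df = df.drop(columns=mostly_missing)

# 4. Keep only the columns relevant to the analysis.
summary = df[['name', 'age', 'sales_level']]
print(summary)
```

Selecting a list of columns, as in step 4, is often simpler than dropping everything else when only a few columns are needed.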
Syntax
Data Analysis Python
import pandas as pd

# Adding a column
df['new_column'] = values

# Removing a column
df = df.drop('column_name', axis=1)

Use axis=1 to tell drop that the label refers to a column, not a row. The default, axis=0, drops rows.
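As a small illustration of what the axis argument changes, the sketch below drops a column and, for contrast, a row from a made-up DataFrame. pandas also accepts the columns= keyword, which is equivalent to passing the labels with axis=1 and can be easier to read. The column names A, B, and C are placeholders.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

# axis=1: the label refers to a column.
without_b = df.drop('B', axis=1)

# columns= is an equivalent spelling; a list drops several columns at once.
only_a = df.drop(columns=['B', 'C'])

# axis=0 (the default): the label refers to a row index value.
without_first_row = df.drop(0, axis=0)

print(list(without_b.columns))         # ['A', 'C']
print(list(only_a.columns))            # ['A']
print(list(without_first_row.index))   # [1]
```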

You can add a column by assigning a list (one value per row), a single value (repeated for every row), or the result of a calculation on existing columns.
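The three kinds of assignment can be seen side by side in the sketch below, which uses a small made-up DataFrame. A list must contain exactly one value per row; otherwise pandas raises a ValueError. The column names are assumptions chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({'price': [10, 20, 30]})

# A list: one value per row, in row order.
df['item'] = ['pen', 'book', 'bag']

# A single value: repeated for every row.
df['currency'] = 'USD'

# A calculation: computed element by element from an existing column.
df['double_price'] = df['price'] * 2

# A list with the wrong number of values is rejected.
try:
    df['bad'] = [1, 2]
except ValueError as err:
    print('Could not add column:', err)

print(df)
```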

Examples
This adds a new column 'C' which is the sum of columns 'A' and 'B'.
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Add column C as sum of A and B
df['C'] = df['A'] + df['B']
This removes the column 'B' from the DataFrame.
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
# Remove column B
df = df.drop('B', axis=1)
This adds a new column 'constant' with value 10 for every row.
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'A': [1, 2]})
# Add a column with the same value for all rows
df['constant'] = 10
Sample Program

This program creates a table with names, ages, and salaries. It adds a new column showing age after 5 years. Then it removes the salary column. Finally, it prints the updated table.

Data Analysis Python
import pandas as pd

# Create a simple DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000]
})

# Add a new column 'Age in 5 years'
df['Age in 5 years'] = df['Age'] + 5

# Remove the 'Salary' column
df = df.drop('Salary', axis=1)

print(df)
Important Notes

df.drop returns a new DataFrame and leaves the original unchanged unless you assign the result back (df = df.drop(...)) or pass inplace=True.
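A short sketch of this behavior, using a made-up two-column DataFrame, shows the difference between discarding the result, assigning it, and passing inplace=True:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# The result is discarded, so df still has column B.
df.drop('B', axis=1)
print(list(df.columns))          # ['A', 'B']

# Assigning the result keeps the new DataFrame; df itself is untouched.
reassigned = df.drop('B', axis=1)
print(list(reassigned.columns))  # ['A']

# inplace=True modifies df itself and returns None.
result = df.drop('B', axis=1, inplace=True)
print(result)                    # None
print(list(df.columns))          # ['A']
```

Assigning the result back is the form used throughout this lesson. It also avoids the common mistake of writing df = df.drop(..., inplace=True), which would set df to None.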

Adding columns with calculations helps create new insights from existing data.

Summary

You add columns to include new information or calculations.

You remove columns to clean or simplify your data.

Use df['new_col'] = ... to add and df.drop(..., axis=1) to remove columns.