
When to use apply vs vectorized operations in Pandas

Introduction

We use apply and vectorized operations to transform or analyze data in tables. Vectorized operations are usually faster and simpler, so prefer them whenever they can express the task. apply is useful when we need custom or complex logic that built-in operations cannot express.

You want to quickly add or multiply columns in a table (vectorized).
You need to apply a simple math operation to every value in a column (vectorized).
You want to run a custom function that is not built-in on each row or column (apply).
You have complex logic that depends on multiple columns and cannot be done with simple math (apply).
You want to transform data row by row or column by column with a specific rule (apply).
Syntax
Pandas
df['new_column'] = df['existing_column'].apply(function)
df['new_column'] = df['existing_column'] + 10  # vectorized example

apply runs a function once per element when called on a single column (a Series), or once per column or row when called on a whole DataFrame.

Vectorized operations use built-in fast math on whole columns at once.
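To make the two syntax lines above concrete, here is a minimal runnable sketch. The function add_ten and the sample data are hypothetical, chosen only for illustration; both approaches should produce the same column.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({'existing_column': [1, 2, 3]})


def add_ten(x):
    """Hypothetical helper: receives one value at a time."""
    return x + 10


# apply: pandas calls add_ten once per element
via_apply = df['existing_column'].apply(add_ten)

# vectorized: one expression over the whole column
via_vectorized = df['existing_column'] + 10

# Compare the two results value by value
print(via_apply.tolist() == via_vectorized.tolist())
```

When both forms give the same result, the vectorized one is usually the better choice because it is shorter and avoids a Python function call per value.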

Examples
This vectorized operation doubles every value in the column in a single step.
Pandas
df['double'] = df['value'] * 2
This produces the same result using apply and a custom function, which is called once per value.
Pandas
df['double'] = df['value'].apply(lambda x: x * 2)
Use apply with axis=1 to run a rule on each row. The function receives the whole row, so it can read several columns; this example reads only 'value'.
Pandas
df['category'] = df.apply(lambda row: 'high' if row['value'] > 10 else 'low', axis=1)
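The row-wise example above reads a single column. As a sketch of a rule that really depends on two columns, the snippet below compares 'value' with a hypothetical 'limit' column that is not part of the original example; the helper name label_row is also made up for illustration.

```python
import pandas as pd

# 'limit' is a hypothetical second column added for this sketch
df = pd.DataFrame({'value': [5, 15, 8, 20], 'limit': [10, 10, 5, 25]})


def label_row(row):
    # row is a Series holding one row; columns are read by name
    return 'high' if row['value'] > row['limit'] else 'low'


# axis=1 passes each row to label_row
df['category'] = df.apply(label_row, axis=1)
print(df)
```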
Sample Program

This code adds 10 to each number using a vectorized operation, which is fast and simple.

Then it uses apply with a function to label values as 'high' or 'low'.

Pandas
import pandas as pd

data = {'value': [5, 15, 8, 20]}
df = pd.DataFrame(data)

# Vectorized operation: add 10 to each value
added = df['value'] + 10
print('Vectorized result:')
print(added)

# Apply with a custom function: label high or low
labels = df['value'].apply(lambda x: 'high' if x > 10 else 'low')
print('\nApply result:')
print(labels)
Important Notes

Vectorized operations are usually faster because they process the whole column at once instead of calling a Python function for each value. Prefer them for simple math or built-in functions.
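One rough way to see the speed difference is to time both forms with the standard-library timeit module. This is only a sketch: the column size and repeat count are arbitrary assumptions, and the actual timings depend on the machine and the pandas version.

```python
import timeit

import pandas as pd

# Arbitrary size chosen for illustration
df = pd.DataFrame({'value': range(100_000)})

# Time ten runs of each approach
t_vec = timeit.timeit(lambda: df['value'] * 2, number=10)
t_apply = timeit.timeit(lambda: df['value'].apply(lambda x: x * 2), number=10)

print(f'vectorized: {t_vec:.4f}s  apply: {t_apply:.4f}s')
```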

Use apply when you need custom logic that vectorized operations cannot express, or cannot express clearly.
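Before reaching for apply, it can be worth checking whether a vectorized form exists. As an illustration, the 'high'/'low' labelling from the sample program can also be written with numpy.where. This is one possible alternative, not the approach the sample program itself uses.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': [5, 15, 8, 20]})

# apply: the lambda runs once per value
labels_apply = df['value'].apply(lambda x: 'high' if x > 10 else 'low')

# np.where evaluates the condition for the whole column at once
labels_where = pd.Series(
    np.where(df['value'] > 10, 'high', 'low'),
    index=df.index,
)

print(labels_apply.tolist() == labels_where.tolist())
```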

On a DataFrame, apply with axis=0 (the default) passes each column to the function, and axis=1 passes each row.
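The difference between the two axis values can be sketched with a small two-column table. The column names 'a' and 'b' are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=0 (default): the function receives each column,
# so the result has one entry per column
col_ranges = df.apply(lambda col: col.max() - col.min(), axis=0)

# axis=1: the function receives each row,
# so the result has one entry per row
row_sums = df.apply(lambda row: row['a'] + row['b'], axis=1)

print(col_ranges)
print(row_sums)
```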

Summary

Use vectorized operations for simple, fast math on columns.

Use apply for custom or complex functions on rows or columns.

apply is usually slower but more flexible than vectorized operations.