
Log transformation for skewed data in Data Analysis Python

Introduction

A log transformation compresses large values much more than small ones. This reduces right skew and often brings the distribution closer to normal, which makes the data easier to analyze and understand.

Use a log transformation when:
Your data has very large values and a long tail on the right side.
You want to reduce the effect of extreme values in your data.
You are preparing data for models that assume a normal distribution.
You are visualizing heavily skewed data and want to make patterns clearer.
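One way to check whether data fits the cases above is to compare the mean with the median and to measure skewness. The sketch below is illustrative only: the sample values are hypothetical, and the cutoff of 1 for "strongly skewed" is an assumed rule of thumb, not a fixed standard.

```python
import pandas as pd

# Hypothetical sample: mostly small values plus a few very large ones
values = pd.Series([12, 15, 18, 20, 22, 25, 30, 45, 400, 2500])

mean = values.mean()
median = values.median()
skewness = values.skew()  # sample skewness; positive means a longer right tail

print(f"mean={mean:.1f}, median={median:.1f}, skewness={skewness:.2f}")

# Assumed rule of thumb for this sketch: skewness above 1 is strongly right-skewed
is_right_skewed = skewness > 1
print("Consider a log transform:", is_right_skewed)
```

A mean far above the median is a quick sign that a few large values are pulling the distribution to the right.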
Syntax
Data Analysis Python
import numpy as np
log_data = np.log(original_data)

Use np.log() for natural logarithm (base e).

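Make sure the data has no zero or negative values before applying log. np.log() does not raise an error for them: it returns -inf for 0 and nan for negative numbers, and only emits a runtime warning.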
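The check for zero or negative values can be done before calling np.log(). The following is a minimal sketch; safe_log is a hypothetical helper name, not a NumPy function, and raising ValueError is just one possible way to handle invalid input.

```python
import numpy as np

def safe_log(data):
    """Hypothetical helper: natural log that rejects non-positive input."""
    arr = np.asarray(data, dtype=float)
    invalid = arr <= 0  # log is undefined for zero and negative values
    if np.any(invalid):
        raise ValueError(
            f"log is undefined for non-positive values: {arr[invalid].tolist()}"
        )
    return np.log(arr)

print(safe_log([1, 10, 100]))

try:
    safe_log([5, 0, -3])
except ValueError as err:
    print("Rejected:", err)
```

Failing early like this avoids -inf or nan values silently spreading into later calculations.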

Examples
This example shows log transformation on a list of positive numbers.
Data Analysis Python
import numpy as np
original_data = [1, 10, 100, 1000]
log_data = np.log(original_data)
print(log_data)
Log transformation also works on numbers between 0 and 1. Their logs are negative, and the log of 1 is 0.
Data Analysis Python
import numpy as np
original_data = [0.1, 1, 10, 100]
log_data = np.log(original_data)
print(log_data)
This example applies log transformation to small positive integers. The gaps between the results shrink as the numbers grow.
Data Analysis Python
import numpy as np
original_data = [1, 2, 3, 4, 5]
log_data = np.log(original_data)
print(log_data)
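The examples above use the natural logarithm. NumPy also provides np.log10() and np.log2() for other bases. The sketch below compares the three on the first example's data; changing the base rescales the result by a constant factor, so the shape of the transformed data stays the same.

```python
import numpy as np

data = np.array([1, 10, 100, 1000])

natural = np.log(data)    # base e
base10 = np.log10(data)   # base 10: counts powers of ten
base2 = np.log2(data)     # base 2

print(natural)
print(base10)
print(base2)

# Change of base: log10(x) equals ln(x) / ln(10)
same_shape = np.allclose(base10, natural / np.log(10))
print(same_shape)
```

Base 10 is often convenient when values span several orders of magnitude, because each step of 1 in the result is a factor of 10 in the original data.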
Sample Program

This program creates a skewed dataset, applies log transformation, prints both versions, and shows histograms to compare the effect.

Data Analysis Python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Create skewed data
original_data = pd.Series([1, 2, 2, 3, 5, 10, 50, 100, 500, 1000])

# Apply log transformation
log_data = np.log(original_data)

# Show original and transformed data
print('Original data:')
print(original_data.to_list())
print('\nLog transformed data:')
print(log_data.to_list())

# Plot to compare
plt.figure(figsize=(10,4))
plt.subplot(1,2,1)
plt.hist(original_data, bins=10, color='skyblue')
plt.title('Original Data Histogram')
plt.subplot(1,2,2)
plt.hist(log_data, bins=10, color='lightgreen')
plt.title('Log Transformed Data Histogram')
plt.tight_layout()
plt.show()
Important Notes

Log transformation cannot be applied to zero or negative values directly.

If data contains zeros, add a constant before applying log. Adding 1 is a common choice, and np.log1p(x) computes log(1 + x) directly.
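A minimal sketch of the add-1 approach, using made-up count data that includes a zero. np.log1p() and its inverse np.expm1() are standard NumPy functions; the variable names are only for illustration.

```python
import numpy as np

# Hypothetical data containing a zero
counts = np.array([0, 1, 9, 99, 999])

shifted = np.log1p(counts)    # same as np.log(1 + counts); maps 0 to 0
restored = np.expm1(shifted)  # inverse: exp(shifted) - 1

print(shifted)
print(restored)
```

Keep in mind that the added constant changes the result, especially for small values, so use the same constant when converting back to the original scale.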

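After transformation, the data is usually more symmetric, which suits statistical methods that assume roughly normal data. Results are then on the log scale and should be interpreted that way.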
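The effect of the transformation can be measured instead of judged by eye. This sketch reuses the data from the sample program and compares skewness before and after with the pandas Series.skew() method; values closer to 0 indicate a more symmetric distribution.

```python
import numpy as np
import pandas as pd

# Same skewed data as in the sample program
original_data = pd.Series([1, 2, 2, 3, 5, 10, 50, 100, 500, 1000])
log_data = np.log(original_data)

skew_before = original_data.skew()
skew_after = log_data.skew()

print(f"Skewness before: {skew_before:.2f}")
print(f"Skewness after:  {skew_after:.2f}")
```

The log-transformed skewness is lower here but not zero, so the transformation reduces skew without guaranteeing a normal distribution.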

Summary

Log transformation reduces right (positive) skewness in data.

It works only on positive values.

It often makes data closer to normal and easier to analyze.