Recall & Review
beginner
What is the purpose of applying a log transformation to data?
A log transformation helps to reduce skewness in data, making it more symmetric and easier to analyze with methods that assume normality.
beginner
When should you consider using a log transformation on your dataset?
When your data is right-skewed (long tail on the right) and strictly positive, applying a log transformation can pull in the tail and bring the distribution closer to symmetric.
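One way to check whether the transformation helped is to compare sample skewness before and after. The snippet below is a minimal sketch on made-up numbers; the `income` values are hypothetical and chosen only to produce a long right tail.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed sample: most values are modest, a few are very large.
income = pd.Series(
    [20_000, 25_000, 30_000, 32_000, 40_000, 45_000, 60_000, 250_000, 1_000_000]
)

skew_before = income.skew()          # sample skewness of the raw values
skew_after = np.log(income).skew()   # sample skewness after the log transform

print(f"skewness before: {skew_before:.2f}")
print(f"skewness after:  {skew_after:.2f}")
```

A positive skewness indicates a right tail; a value closer to zero after the transform suggests the distribution has become more symmetric.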
intermediate
What is a common issue you must handle before applying a log transformation?
The logarithm is undefined for zero and negative values, so you may need to shift the data first, for example by adding a constant large enough to make every value positive.
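The shift can be sketched as follows. This is an illustrative approach, not the only one: the `values` series is hypothetical, and moving the minimum to exactly 1 is an arbitrary choice that should be revisited for real data.

```python
import numpy as np
import pandas as pd

# Hypothetical column containing a negative value and a zero.
values = pd.Series([-5.0, 0.0, 3.0, 20.0, 400.0])

# Shift so the smallest value becomes 1 (assumed offset; any positive target works).
minimum = values.min()
shift = 1 - minimum if minimum <= 0 else 0
shifted = values + shift

# Every shifted value is now positive, so the log is finite everywhere.
log_values = np.log(shifted)
print(log_values)
```

Because the size of the shift changes the shape of the transformed data, it is worth recording the constant so the transformation can be reversed later.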
beginner
Show the Python code to apply a log transformation to a pandas DataFrame column named 'Income'.
import numpy as np
import pandas as pd

# Assuming df is your DataFrame and 'Income' has no negative values
# Add 1 so that zeros map to log(1) = 0 instead of -inf
df['Income_log'] = np.log(df['Income'] + 1)
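As a variation on the card's answer, NumPy also offers `np.log1p`, which computes log(1 + x) directly, and `np.expm1`, which reverses it. The sketch below builds a small hypothetical DataFrame to show the round trip; the values and the `Income_restored` column name are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the zero shows why an offset of 1 is used.
df = pd.DataFrame({"Income": [0.0, 1_500.0, 42_000.0, 1_000_000.0]})

# log1p(x) computes log(1 + x), so a zero income maps to 0.
df["Income_log"] = np.log1p(df["Income"])

# expm1(y) computes exp(y) - 1, which undoes log1p.
df["Income_restored"] = np.expm1(df["Income_log"])

print(df)
```

Being able to invert the transform is useful when results need to be reported back in the original units.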
intermediate
How does log transformation affect the scale of data?
It compresses large values more than small values, reducing the effect of extreme outliers and making the data scale more manageable.
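A short numeric sketch makes the compression concrete. On a log scale, equal ratios become equal differences, so a tenfold jump between large values takes up the same distance as a tenfold jump between small ones. The numbers below are hypothetical.

```python
import numpy as np

# Two hypothetical pairs, each a tenfold increase.
small_pair = np.array([10.0, 100.0])
large_pair = np.array([10_000.0, 100_000.0])

# On the raw scale the gaps differ by a factor of 1000.
raw_gap_small = small_pair[1] - small_pair[0]
raw_gap_large = large_pair[1] - large_pair[0]

# On the log10 scale both gaps are the same size.
log_gap_small = np.log10(small_pair[1]) - np.log10(small_pair[0])
log_gap_large = np.log10(large_pair[1]) - np.log10(large_pair[0])

print(raw_gap_small, raw_gap_large)
print(log_gap_small, log_gap_large)
```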
Why do we add a small constant before applying log transformation?
The logarithm is undefined for zero and negative values, so a constant large enough to make every value positive is added first.
What type of skewness is best handled by log transformation?
Log transformation reduces right skewness by compressing large values.
Which Python library is commonly used to apply log transformation?
NumPy provides the log function (np.log) used for log transformation.
What happens to outliers after log transformation?
Log transformation compresses large values, so high-end outliers sit closer to the rest of the data and have less influence.
Which of these is NOT a reason to use log transformation?
Handling zero or negative values is not a reason: log transformation cannot be applied to them directly without adjustment.
Explain why and how you would apply a log transformation to a skewed dataset.
Think about data shape and mathematical limits of log.
Describe the effect of log transformation on data distribution and outliers.
Consider how scale changes after transformation.