Challenge - 5 Problems
Log Transformation Mastery
❓ Predict Output
Difficulty: intermediate
Output of log transformation on skewed data
What does the following Python code print after applying a log transformation to a skewed pandas Series?
Data Analysis Python
import numpy as np
import pandas as pd

skewed_data = pd.Series([1, 10, 100, 1000, 10000])
log_data = np.log(skewed_data)
print(log_data.round(2))
💡 Hint
Remember that np.log() computes the natural logarithm (base e).
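Explanation
The natural log of 1 is 0.00, and the natural logs of 10, 100, 1000, and 10000 are approximately 2.30, 4.61, 6.91, and 9.21 respectively. Because the input is a pandas Series, the printed output also shows the index and ends with dtype: float64.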
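To make the base explicit, here is a minimal sketch comparing `np.log` (base e) with `np.log10` (base 10) on the same values. It uses a plain NumPy array rather than the Series from the question, and the variable names are illustrative only.

```python
import numpy as np

values = np.array([1, 10, 100, 1000, 10000], dtype=float)

natural = np.log(values)    # natural logarithm, base e
common = np.log10(values)   # common logarithm, base 10

print(natural.round(2))
print(common.round(2))

# The two differ only by a constant factor: ln(x) = log10(x) * ln(10)
print(np.allclose(natural, common * np.log(10)))
```

Each tenfold step in the input adds the same amount to the output: about 2.30 in base e and exactly 1 in base 10.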
❓ Data Output
Difficulty: intermediate
Effect of log transformation on skewness
Given a skewed dataset, which option shows the correct skewness values before and after applying a log transformation?
Data Analysis Python
import numpy as np
import pandas as pd
from scipy.stats import skew

skewed_data = pd.Series([1, 2, 2, 3, 100])
skew_before = skew(skewed_data)
skew_after = skew(np.log(skewed_data))
print(round(skew_before, 2), round(skew_after, 2))
💡 Hint
Log transformation reduces skewness by compressing large values.
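Explanation
With scipy's default (biased) estimator, the original data has a skewness of about 1.50. After the log transformation it drops to about 1.32, so the printed values are roughly 1.5 and 1.32. The reduction is modest because the single large value 100 still dominates a sample of only five points.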
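The skewness for this dataset can be checked by hand. The sketch below computes the biased moment coefficient g1 = m3 / m2^1.5, which is what `scipy.stats.skew` returns with its default `bias=True`. The helper name `moment_skewness` is hypothetical, introduced only for illustration.

```python
import numpy as np

def moment_skewness(x):
    """Biased (population) skewness: third central moment over m2**1.5."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    m2 = np.mean(dev ** 2)   # second central moment (population variance)
    m3 = np.mean(dev ** 3)   # third central moment
    return m3 / m2 ** 1.5

data = np.array([1, 2, 2, 3, 100])
skew_before = moment_skewness(data)
skew_after = moment_skewness(np.log(data))

print(round(skew_before, 2), round(skew_after, 2))
```

Bias-adjusted estimators, such as the one behind pandas `Series.skew()`, rescale this coefficient by a factor that depends on the sample size. They report larger values for a sample this small, so the estimator in use matters when comparing numbers.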
❓ Visualization
Difficulty: advanced
Identify the correct plot after log transformation
Which plot correctly shows the distribution of data before and after log transformation?
Data Analysis Python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

skewed_data = pd.Series([1, 2, 2, 3, 100])
log_data = np.log(skewed_data)

plt.figure(figsize=(8, 4))

plt.subplot(1, 2, 1)
plt.hist(skewed_data, bins=5, color='blue')
plt.title('Original Data')

plt.subplot(1, 2, 2)
plt.hist(log_data, bins=5, color='green')
plt.title('Log Transformed Data')

plt.tight_layout()
plt.show()
💡 Hint
Log transformation reduces right skewness.
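Explanation
The original histogram is strongly right-skewed: four values fall in the lowest bin and the large value 100 sits alone in the highest. After the log transformation the values spread across more of the range, although 100 still stands apart from the rest.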
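The bar heights of the two histograms can be inspected without drawing anything. This sketch uses `np.histogram` with five bins, on the assumption that it follows the same default equal-width binning as the `plt.hist` calls in the question. The variable names are illustrative.

```python
import numpy as np

skewed = np.array([1, 2, 2, 3, 100], dtype=float)
logged = np.log(skewed)

# Five equal-width bins between each array's min and max
counts_raw, edges_raw = np.histogram(skewed, bins=5)
counts_log, edges_log = np.histogram(logged, bins=5)

print("raw counts:", counts_raw)
print("log counts:", counts_log)
```

In the raw data the four small values share the first bin. After the log the small values occupy two bins, and the largest value remains alone in the last one.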
🧠 Conceptual
Difficulty: advanced
Why use log transformation on skewed data?
Which option best explains why log transformation is applied to skewed data in data science?
💡 Hint
Think about how skewness affects statistical models.
Explanation
Log transformation compresses large values far more than small ones, which reduces right skewness and helps models that assume approximately normal data.
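One way to see the compression is to compare the gaps between consecutive values before and after the transform. This is a minimal sketch on illustrative values that grow by a constant factor of 10; the names are assumptions, not part of the challenge.

```python
import numpy as np

values = np.array([1, 10, 100, 1000, 10000], dtype=float)

raw_gaps = np.diff(values)           # gaps grow tenfold at every step
log_gaps = np.diff(np.log(values))   # gaps become constant, ln(10) each

print(raw_gaps)
print(log_gaps.round(4))
```

Equal ratios in the raw data become equal differences on the log scale, which is why a long right tail is pulled in toward the bulk of the data.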
🔧 Debug
Difficulty: expert
Identify the error in log transformation code
What error will the following code raise when applying log transformation to data containing zero or negative values?
Data Analysis Python
import numpy as np
import pandas as pd

data = pd.Series([0, 1, 2, -1, 3])
log_data = np.log(data)
print(log_data)
💡 Hint
Logarithm of zero or negative numbers is undefined in real numbers.
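Explanation
No exception is raised. np.log(0) returns -inf with a "divide by zero" RuntimeWarning, and np.log of a negative value returns NaN with an "invalid value" RuntimeWarning. Neither case produces a ValueError.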
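The behaviour can be observed directly, along with one possible way to guard against it. The sketch below records the warnings NumPy emits under its default error settings, then takes the log only where the input is strictly positive. The masking approach and the variable names are illustrative assumptions, not the challenge's prescribed fix.

```python
import warnings
import numpy as np

data = np.array([0, 1, 2, -1, 3], dtype=float)

# Record the warnings instead of letting them print to stderr
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    raw_log = np.log(data)   # -inf for 0, nan for -1, no exception

warning_types = {w.category.__name__ for w in caught}
print(raw_log)
print(warning_types)

# Guarded version: log only the strictly positive entries, NaN elsewhere
positive_mask = data > 0
safe_log = np.full_like(data, np.nan)
safe_log[positive_mask] = np.log(data[positive_mask])
print(safe_log)
```

For non-negative data that contains zeros, `np.log1p`, which computes log(1 + x), is a common alternative. Negative values still need a separate decision, such as filtering them out or shifting the data.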