
Log transformation for skewed data in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of log transformation on skewed data
Which option lists the values printed by the following Python code, which applies a natural-log transformation to a skewed pandas Series?
Data Analysis Python
import numpy as np
import pandas as pd

skewed_data = pd.Series([1, 10, 100, 1000, 10000])
log_data = np.log(skewed_data)
print(log_data.round(2))
A. [0.69, 2.30, 4.61, 6.91, 9.21]
B. [0.00, 2.30, 4.61, 6.91, 9.21]
C. [1, 10, 100, 1000, 10000]
D. [0.00, 1.00, 2.00, 3.00, 4.00]
💡 Hint
Remember that np.log() computes the natural logarithm (base e).
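As a rough illustration of the hint, the sketch below compares NumPy's three logarithm functions on a small array. The array `values` is a made-up example and is not taken from the problem.

```python
import numpy as np

# Hypothetical sample values, chosen so that some results are round numbers
values = np.array([1.0, 8.0, 1000.0])

natural = np.log(values)    # natural logarithm (base e)
binary = np.log2(values)    # base-2 logarithm
common = np.log10(values)   # base-10 logarithm

# Change of base: log10(x) equals ln(x) / ln(10)
rebased = natural / np.log(10)

print(natural.round(2))
print(binary.round(2))
print(common.round(2))
```

Whichever base is used, the transformed values differ only by a constant factor, so the shape of the transformed distribution is the same.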
Data Output
intermediate
Effect of log transformation on skewness
Given a skewed dataset, which option shows the correct skewness values before and after applying a log transformation?
Data Analysis Python
import numpy as np
import pandas as pd
from scipy.stats import skew

skewed_data = pd.Series([1, 2, 2, 3, 100])
skew_before = skew(skewed_data)
skew_after = skew(np.log(skewed_data))
print(round(skew_before, 2), round(skew_after, 2))
A. 2.24 2.24
B. 1.50 1.50
C. 1.50 1.32
D. 0.00 0.00
💡 Hint
Log transformation reduces skewness by compressing large values.
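To make the hint concrete, here is a minimal sketch of the moment-based skewness, the quantity `scipy.stats.skew` reports with its default `bias=True` setting. The helper `moment_skew` and the array `doubling` are hypothetical names introduced only for this illustration.

```python
import numpy as np

def moment_skew(x):
    """Skewness as the third central moment divided by the second to the power 1.5."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)   # second central moment (biased variance)
    m3 = np.mean(centered ** 3)   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data: a doubling sequence, which has a long right tail
doubling = np.array([1, 2, 4, 8, 16, 32, 64])

skew_raw = moment_skew(doubling)
skew_log = moment_skew(np.log(doubling))

print(skew_raw, skew_log)
```

The logs of a doubling sequence are evenly spaced, so the transformed skewness comes out at essentially zero while the raw skewness is clearly positive. Real data rarely behaves this neatly; a log transformation usually reduces right skew without eliminating it.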
Visualization
advanced
Identify the correct plot after log transformation
Which description matches the two histograms produced by the following code, showing the data before and after the log transformation?
Data Analysis Python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

skewed_data = pd.Series([1, 2, 2, 3, 100])
log_data = np.log(skewed_data)

plt.figure(figsize=(8,4))
plt.subplot(1,2,1)
plt.hist(skewed_data, bins=5, color='blue')
plt.title('Original Data')
plt.subplot(1,2,2)
plt.hist(log_data, bins=5, color='green')
plt.title('Log Transformed Data')
plt.tight_layout()
plt.show()
A. Left plot is right-skewed; right plot is more symmetric
B. Left plot is symmetric; right plot is right-skewed
C. Both plots are symmetric
D. Both plots are left-skewed
💡 Hint
Log transformation reduces right skewness.
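The bin counts behind the two histograms can also be inspected without opening a plot window. The sketch below assumes that `plt.hist` with an integer `bins` argument splits the data range into equal-width bins in the same way `np.histogram` does; the variable names for the counts and edges are introduced here for illustration.

```python
import numpy as np

skewed_data = np.array([1, 2, 2, 3, 100])
log_data = np.log(skewed_data)

# np.histogram returns (counts, bin_edges); bins=5 splits the
# range from min to max into 5 equal-width bins
raw_counts, raw_edges = np.histogram(skewed_data, bins=5)
log_counts, log_edges = np.histogram(log_data, bins=5)

print("raw counts:", raw_counts)
print("log counts:", log_counts)
```

Comparing where the counts pile up in each array gives the same information as reading the bar heights off the two subplots.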
🧠 Conceptual
advanced
Why use log transformation on skewed data?
Which option best explains why log transformation is applied to skewed data in data science?
A. To reduce skewness and make data distribution more normal-like for better modeling
B. To increase the range of data values for better visualization
C. To convert categorical data into numerical data
D. To remove missing values from the dataset
💡 Hint
Think about how skewness affects statistical models.
🔧 Debug
expert
Identify the error in log transformation code
What happens when the following code applies a log transformation to data containing zero and negative values?
Data Analysis Python
import numpy as np
import pandas as pd

data = pd.Series([0, 1, 2, -1, 3])
log_data = np.log(data)
print(log_data)
A. No exception is raised; NumPy emits RuntimeWarnings and the output contains -inf for 0 and NaN for -1
B. ValueError: math domain error
C. TypeError: unsupported operand type(s) for log
D. Execution stops with RuntimeWarning: divide by zero encountered in log
💡 Hint
Logarithm of zero or negative numbers is undefined in real numbers.
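Two common ways of working around undefined logarithms are sketched below: filtering to strictly positive values, and using `np.log1p`, which computes log(1 + x) and is therefore defined at zero. Which one is appropriate depends on the data; the variable names `positive_log` and `shifted_log` are hypothetical.

```python
import numpy as np
import pandas as pd

data = pd.Series([0, 1, 2, -1, 3])

# np.log does not raise on this input; silence the warnings to inspect the raw result
with np.errstate(divide="ignore", invalid="ignore"):
    raw_log = np.log(data)

# Option 1: keep only strictly positive values before taking the log
positive_log = np.log(data[data > 0])

# Option 2: for non-negative data, log1p(x) = log(1 + x) maps 0 to 0
shifted_log = np.log1p(data[data >= 0])

print(raw_log)
print(positive_log)
print(shifted_log)
```

Note that `log1p` changes the transformation (it is not the same as `log` for small values), and neither option handles negative entries other than by dropping them.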