
Why statistics with NumPy matters - Challenge Your Understanding

Challenge - 5 Problems
Predict Output
intermediate
Calculate mean and median with NumPy
What is the output of this code that calculates the mean and median of a dataset using NumPy?
NumPy
import numpy as np

data = np.array([10, 20, 30, 40, 50])
mean_val = np.mean(data)
median_val = np.median(data)
print(f"Mean: {mean_val}, Median: {median_val}")
A. Mean: 30.0, Median: 30.0
B. Mean: 25.0, Median: 30.0
C. Mean: 30.0, Median: 25.0
D. Mean: 20.0, Median: 40.0
💡 Hint
Remember that mean is the average and median is the middle value when data is sorted.
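The difference between the two measures is easiest to see on skewed data. The following is a minimal sketch using a made-up array (not the quiz data) in which a single large value pulls the mean upward while the median stays at the middle value.

```python
import numpy as np

# Hypothetical sample with one outlier (illustrative values only)
sample = np.array([1, 2, 3, 4, 100])

sample_mean = np.mean(sample)      # sum of values divided by count: 110 / 5
sample_median = np.median(sample)  # middle value of the sorted array

print(f"Mean: {sample_mean}, Median: {sample_median}")
```

Here the mean is 22.0 while the median is 3.0, so the two only coincide when the data is roughly symmetric.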
data_output
intermediate
Variance and Standard Deviation Calculation
What does this NumPy code print for the variance and standard deviation?
NumPy
import numpy as np

values = np.array([2, 4, 4, 4, 5, 5, 7, 9])
variance = np.var(values)
std_dev = np.std(values)
print(f"Variance: {variance:.2f}, Standard Deviation: {std_dev:.2f}")
A. Variance: 3.50, Standard Deviation: 1.87
B. Variance: 2.00, Standard Deviation: 1.41
C. Variance: 4.00, Standard Deviation: 2.00
D. Variance: 5.00, Standard Deviation: 2.24
💡 Hint
Variance is the average of squared differences from the mean. Standard deviation is the square root of variance.
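The definition in the hint can be checked by hand. The sketch below uses hypothetical values (not the quiz data) to compute the variance as the mean of squared deviations and compare it with np.var and np.std. It also shows the ddof parameter, which switches from dividing by n to dividing by n - 1.

```python
import numpy as np

# Hypothetical values chosen for easy arithmetic (not the quiz data)
x = np.array([1, 2, 4, 9])

# Population variance by hand: mean of squared deviations from the mean
manual_var = np.mean((x - x.mean()) ** 2)
manual_std = np.sqrt(manual_var)

# np.var and np.std use ddof=0 by default, which matches the formula above
assert np.isclose(manual_var, np.var(x))
assert np.isclose(manual_std, np.std(x))

# ddof=1 divides by n - 1 instead of n (the sample variance)
sample_var = np.var(x, ddof=1)

print(manual_var, sample_var)
```

With these values the population variance is 38 / 4 = 9.5 and the sample variance is 38 / 3, so the choice of ddof matters most for small arrays.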
visualization
advanced
Histogram Visualization of Data Distribution
Which option shows the correct code to create a histogram of the data array using NumPy and Matplotlib?
NumPy
import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(loc=0, scale=1, size=1000)

# Choose the correct code to plot histogram
A
plt.scatter(data)
plt.title('Data Distribution')
plt.show()
B
plt.hist(data, bins=30, color='blue', edgecolor='black')
plt.title('Data Distribution')
plt.show()
C
plt.bar(data, bins=30)
plt.title('Data Distribution')
plt.show()
D
plt.plot(data, bins=30)
plt.title('Data Distribution')
plt.show()
💡 Hint
Histograms use plt.hist() to show frequency distribution.
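A histogram shows how many values fall into each bin. Those bin counts can be inspected without plotting by calling np.histogram. The sketch below assumes normally distributed data like that in the question, with a seeded generator added here only for reproducibility.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
samples = rng.normal(loc=0, scale=1, size=1000)

# counts[i] is the number of samples between edges[i] and edges[i + 1];
# the last bin also includes its right edge
counts, edges = np.histogram(samples, bins=30)

print(counts.sum())  # every sample lands in exactly one bin
print(len(edges))    # 30 bins are delimited by 31 edges
```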
🧠 Conceptual
advanced
Why Use NumPy for Statistics?
Which statement best explains why NumPy is preferred for statistical calculations on large datasets?
A. NumPy automatically creates visualizations for statistical data without extra code.
B. NumPy replaces the need for any other Python libraries for data analysis.
C. NumPy stores data in text files, which makes it easier to read and write statistical results.
D. NumPy runs on optimized, compiled C code and supports vectorized operations, which makes statistical calculations faster and more efficient.
💡 Hint
Think about speed and efficiency when working with numbers in Python.
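To make "vectorized" concrete, the sketch below computes the same mean twice over a hypothetical array: once with an explicit Python loop and once with a single NumPy call. The array size and variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical dataset: the integers 0 .. 99_999 stored as floats
values = np.arange(100_000, dtype=np.float64)

# Pure-Python approach: the interpreter visits one element at a time
total = 0.0
for v in values:
    total += v
loop_mean = total / len(values)

# Vectorized approach: a single call that runs in compiled code
vectorized_mean = np.mean(values)

print(loop_mean, vectorized_mean)
```

Both approaches give the same result. Timing them with the standard-library timeit module typically shows the vectorized call to be much faster, though the exact ratio depends on the machine and the array size.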
🔧 Debug
expert
Identify the Error in Statistical Calculation
What error does this code raise when calculating the mean of a list with a string element?
NumPy
import numpy as np

values = [10, 20, '30', 40]
mean_val = np.mean(values)
print(mean_val)
A. ValueError: could not convert string to float: '30'
B. SyntaxError: invalid syntax
C. No error, output is 25.0
D. TypeError, because NumPy cannot sum an array of strings (the exact message depends on the NumPy version)
💡 Hint
NumPy first converts the mixed list into an array with a single dtype. Check what that dtype is and whether its elements can be added together.
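When debugging a case like this, it helps to look at the array NumPy builds from the list before any arithmetic happens. The sketch below inspects the dtype and then shows one possible cleanup, converting each element with float(). The cleanup step is an assumption about the intended fix, not part of the original question.

```python
import numpy as np

values = [10, 20, '30', 40]

# A list mixing ints and strings becomes an array of strings
as_array = np.array(values)
print(as_array.dtype.kind)  # 'U' means a Unicode string dtype

# One possible cleanup: convert each element to float before averaging
cleaned = np.array([float(v) for v in values])
print(np.mean(cleaned))
```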