np.std() and np.var() for spread in NumPy - Cheat Sheet & Quick Revision

Recall & Review
beginner
What does np.std() calculate in a dataset?

np.std() calculates the standard deviation, which measures how spread out the numbers are from the average (mean). A small value means data points are close to the mean, and a large value means they are more spread out.
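As a rough illustration of "close to the mean" versus "more spread out", the sketch below compares two made-up arrays that share the same mean. The names tight and wide and their values are assumptions chosen for this example, not data from the original material.

```python
import numpy as np

# Two illustrative (made-up) datasets with the same mean but different spread.
tight = np.array([49, 50, 50, 51])
wide = np.array([10, 30, 70, 90])

tight_std = np.std(tight)  # values sit close to the mean -> small result
wide_std = np.std(wide)    # values sit far from the mean  -> large result

print("means:", tight.mean(), wide.mean())
print("standard deviations:", tight_std, wide_std)
```

Both arrays average 50, so the mean alone cannot tell them apart; the standard deviation can.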

beginner
What is the difference between np.var() and np.std()?

np.var() calculates variance, which is the average of squared differences from the mean. np.std() is the square root of variance. So, standard deviation is in the same units as the data, while variance is in squared units.
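The relationship between the two functions can be checked directly. The following is a minimal sketch; the array heights_cm is a hypothetical example in centimetres, picked only to make the units visible.

```python
import numpy as np

# Hypothetical measurements in centimetres (illustrative values only).
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])

variance = np.var(heights_cm)  # squared units: cm^2
std_dev = np.std(heights_cm)   # original units: cm

print("variance (cm^2):", variance)
print("standard deviation (cm):", std_dev)

# The standard deviation is the square root of the variance.
print(np.isclose(std_dev, np.sqrt(variance)))
```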

intermediate
How does changing the ddof parameter affect np.std() and np.var()?

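The ddof parameter stands for 'Delta Degrees of Freedom'. NumPy divides the sum of squared deviations by N - ddof, where N is the number of values. The default ddof=0 divides by N and gives the population standard deviation or variance. Setting ddof=1 divides by N-1 and gives the sample standard deviation or variance, which is the usual choice when the data is a sample drawn from a larger population.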
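To make the effect of ddof concrete, the sketch below computes the variance by hand with both divisors and compares the results with np.var(). The array sample is an illustrative assumption.

```python
import numpy as np

# Illustrative data (an assumption for this example).
sample = np.array([2.0, 4.0, 6.0, 8.0])
n = sample.size

# Sum of squared deviations from the mean.
sq_dev = np.sum((sample - sample.mean()) ** 2)

pop_var = np.var(sample)           # default ddof=0: divide by N
samp_var = np.var(sample, ddof=1)  # ddof=1: divide by N - 1

print("population variance:", pop_var, "manual:", sq_dev / n)
print("sample variance:", samp_var, "manual:", sq_dev / (n - 1))

# np.std() accepts the same parameter.
print("sample standard deviation:", np.std(sample, ddof=1))
```

Because N-1 is smaller than N, the ddof=1 result is always at least as large as the ddof=0 result, and the gap shrinks as the dataset grows.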

beginner
Why is standard deviation often preferred over variance to describe spread?

Standard deviation is preferred because it is in the same units as the original data, making it easier to understand and compare. Variance is in squared units, which can be harder to interpret.

beginner
If a dataset has values [2, 4, 4, 4, 5, 5, 7, 9], what is the standard deviation using np.std() with default settings?

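The mean is 40 / 8 = 5. The squared differences from the mean are 9, 1, 1, 1, 0, 0, 4 and 16, which sum to 32. With the default ddof=0 the divisor is N = 8, so the variance is 32 / 8 = 4. The standard deviation is the square root of 4, which is 2.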
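The same arithmetic can be reproduced step by step in NumPy. This is a sketch of the manual calculation next to the built-in calls, using the dataset from the question; the intermediate variable names are chosen for this example.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = data.mean()                 # 5.0
squared_diffs = (data - mean) ** 2
variance = squared_diffs.mean()    # 4.0 (divides by N, like ddof=0)
std_dev = np.sqrt(variance)        # 2.0

print(mean, variance, std_dev)

# The built-in functions give the same results with default settings.
print(np.var(data), np.std(data))
```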

What does np.var() measure in a dataset?
A. Average squared distance from the mean
B. Square root of the average distance from the mean
C. Median of the dataset
D. Sum of all data points
Which function returns a value in the same units as the original data?
A. np.std()
B. np.var()
C. np.mean()
D. np.sum()
What does setting ddof=1 do in np.std()?
A. Calculates population standard deviation
B. Ignores the first data point
C. Calculates median instead
D. Calculates sample standard deviation
If variance is 9, what is the standard deviation?
A. 9
B. 3
C. 81
D. 1
Why might variance be harder to interpret than standard deviation?
A. It is always smaller than standard deviation
B. It is not related to the mean
C. It is in squared units, not original units
D. It ignores outliers
Explain in your own words what np.std() and np.var() tell us about a dataset.
Think about how spread out the data points are and how these functions measure that.
Describe how changing the ddof parameter affects the calculation of spread using np.std() or np.var().
Consider why we sometimes divide by N-1 instead of N.