beginner

What does np.std() calculate in a dataset?

np.std() calculates the standard deviation, which measures how spread out the numbers are from the average (mean). A small value means data points are close to the mean, and a large value means they are more spread out.
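As a quick illustration of "close to the mean" versus "spread out", the sketch below compares two made-up arrays that share the same mean. The names `tight` and `wide` and their values are invented for this example.

```python
import numpy as np

# Two illustrative datasets (hypothetical values) with the same mean of 50.
tight = np.array([48, 49, 50, 51, 52])   # values cluster near the mean
wide = np.array([10, 30, 50, 70, 90])    # values sit far from the mean

tight_std = np.std(tight)
wide_std = np.std(wide)

print("means:", tight.mean(), wide.mean())
print("standard deviations:", tight_std, wide_std)
```

Both arrays average to 50, yet the second one has a much larger standard deviation because its values lie farther from that average.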

beginner

What is the difference between np.var() and np.std()?

np.var() calculates variance, which is the average of squared differences from the mean. np.std() is the square root of variance. So, standard deviation is in the same units as the data, while variance is in squared units.
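The relationship can be checked directly. This minimal sketch uses an invented array, `heights_cm`, and computes the variance by hand to compare it with `np.var()` and `np.std()`.

```python
import numpy as np

# Hypothetical measurements in centimetres.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])

variance = np.var(heights_cm)   # units: cm squared
std_dev = np.std(heights_cm)    # units: cm

# Variance written out by hand: the mean of squared differences from the mean.
manual_variance = np.mean((heights_cm - heights_cm.mean()) ** 2)

print(variance, manual_variance)   # the two should agree
print(std_dev, np.sqrt(variance))  # std is the square root of variance
```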

intermediate

How does changing the ddof parameter affect np.std() and np.var()?

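The ddof parameter stands for 'Delta Degrees of Freedom'. NumPy divides the sum of squared differences by N - ddof, where N is the number of values. The default ddof=0 divides by N and gives the population standard deviation or variance. Setting ddof=1 divides by N-1 and gives the sample standard deviation or variance, which is used when the data is a sample drawn from a larger population.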
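A small sketch of the effect of `ddof`, using a made-up four-value array named `sample`. The hand-written divisions by N and N-1 are there only to show which divisor each setting corresponds to.

```python
import numpy as np

# Hypothetical sample of four observations.
sample = np.array([2.0, 4.0, 6.0, 8.0])
n = sample.size

# Sum of squared differences from the mean, shared by both calculations.
sum_sq = np.sum((sample - sample.mean()) ** 2)

pop_var = np.var(sample)             # default ddof=0, divides by N
sample_var = np.var(sample, ddof=1)  # divides by N - 1

print(pop_var, sum_sq / n)           # population variance
print(sample_var, sum_sq / (n - 1))  # sample variance
print(np.std(sample), np.std(sample, ddof=1))
```

With ddof=1 the divisor is smaller, so the result is slightly larger. The gap shrinks as N grows.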

beginner

Why is standard deviation often preferred over variance to describe spread?

Standard deviation is preferred because it is in the same units as the original data, making it easier to understand and compare. Variance is in squared units, which can be harder to interpret.

beginner

If a dataset has values [2, 4, 4, 4, 5, 5, 7, 9], what standard deviation does np.std() return with default settings?

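The mean is 40 / 8 = 5. The squared differences from the mean are 9, 1, 1, 1, 0, 0, 4, and 16, which sum to 32. Dividing by N = 8 (the default ddof=0) gives a variance of 4. The standard deviation is the square root of 4, which is 2.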
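The arithmetic in this card can be reproduced step by step. The sketch below uses the card's own values. The intermediate variable names are chosen just for this example.

```python
import numpy as np

values = np.array([2, 4, 4, 4, 5, 5, 7, 9])

mean = values.mean()                   # 40 / 8
squared_diffs = (values - mean) ** 2   # squared distance of each value from the mean
variance = squared_diffs.mean()        # divide by N, matching the default ddof=0
std_dev = np.sqrt(variance)

print(mean, variance, std_dev)
print(np.var(values), np.std(values))  # NumPy's results for comparison
```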

What does np.var() measure in a dataset?

A. Average squared distance from the mean

B. Square root of the average distance from the mean

C. Median of the dataset

D. Sum of all data points

Which function returns a value in the same units as the original data?

A. np.std()

B. np.var()

C. np.mean()

D. np.sum()

What does setting ddof=1 do in np.std()?

A. Calculates population standard deviation

B. Ignores the first data point

C. Calculates median instead

D. Calculates sample standard deviation

If variance is 9, what is the standard deviation?

A. 9

B. 3

C. 81

D. 1

Why might variance be harder to interpret than standard deviation?

A. It is always smaller than standard deviation

B. It is not related to the mean

C. It is in squared units, not original units

D. It ignores outliers

Explain in your own words what np.std() and np.var() tell us about a dataset.

Describe how changing the ddof parameter affects the calculation of spread using np.std() or np.var().