Challenge - 5 Problems

❓ Predict Output

intermediate

Output of pandas cut with custom bins

What is the output of the following code snippet?

Data Analysis Python

import pandas as pd
import numpy as np

values = np.array([1, 5, 10, 15, 20])
bins = [0, 5, 10, 15]
categories = pd.cut(values, bins)
print(categories)

A.
[(0, 5], (0, 5], (5, 10], (10, 15], NaN]
Categories (3, interval[int64, right]): [(0, 5] < (5, 10] < (10, 15]]

B.
[(0, 5], NaN, (5, 10], (10, 15], NaN]
Categories (3, interval[int64, right]): [(0, 5] < (5, 10] < (10, 15]]

C.
[NaN, (0, 5], (5, 10], (10, 15], NaN]
Categories (3, interval[int64, right]): [(0, 5] < (5, 10] < (10, 15]]

D.
[NaN, (0, 5], (5, 10], (10, 15], (15, 20]]
Categories (4, interval[int64, right]): [(0, 5] < (5, 10] < (10, 15] < (15, 20]]
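
Interval notation can be hard to read at a glance. As a minimal sketch for checking an answer after attempting the question, the integer codes of the result can be printed instead: each code is the position of the bin a value landed in, and -1 marks a value that fell outside every bin. The variable name `codes` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([1, 5, 10, 15, 20])
bins = [0, 5, 10, 15]

categories = pd.cut(values, bins)

# One integer per input value: the index of its bin, or -1 if it is in no bin
codes = categories.codes
print(codes)
```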

❓ Data Output

intermediate

Number of bins created by qcut

Given the following code, how many unique bins will be created?

Data Analysis Python

import pandas as pd
import numpy as np

values = np.array([1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10])
categories = pd.qcut(values, 4)
unique_bins = categories.unique()
print(len(unique_bins))
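
To reason about the bin count rather than guess it, it can help to look at the quantile edges themselves. The sketch below assumes the same `values` as the question and uses the `retbins=True` argument of `pd.qcut`, which returns the computed edges alongside the categories. The name `edges` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# retbins=True returns the quantile edges in addition to the categories
categories, edges = pd.qcut(values, 4, retbins=True)

print(edges)  # the edges qcut computed from the data
print(pd.Series(categories).value_counts(sort=False))  # how many values per bin
```

If two quantile edges coincided, `pd.qcut` would raise an error unless `duplicates='drop'` were passed, so inspecting `edges` is a quick way to see whether the requested number of bins is achievable.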

❓ Visualization

advanced

Visualizing binning effect on data distribution

Which option shows the correct histogram with bins created by pd.cut for the data below?

Data Analysis Python

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)
data = np.random.normal(loc=50, scale=10, size=1000)
bins = [20, 40, 60, 80]
categories = pd.cut(data, bins)
plt.hist(data, bins=bins, edgecolor='black')
plt.title('Histogram with bins [20, 40, 60, 80]')
plt.show()

A. Histogram with 4 bars showing counts of data in intervals [20,40), [40,60), [60,80), [80,100)

B. Histogram with 3 bars showing counts of data in intervals (20,40], (40,60], (60,80]

C. Histogram with 3 bars showing counts of data in intervals [20,40), [40,60), [60,80)

D. Histogram with 4 bars showing counts of data in intervals (20,40], (40,60], (60,80], (80,100]
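
The snippet in this question passes the same edges to both `pd.cut` and `plt.hist`, and the two do not close their intervals on the same side: `plt.hist` bins through `numpy.histogram`, whose bins are closed on the left (with the last bin also closed on the right), while `pd.cut` defaults to right-closed intervals. A rough sketch for comparing the per-bin counts from both, without drawing anything, might look like this. The names `hist_counts` and `cut_counts` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
data = np.random.normal(loc=50, scale=10, size=1000)
bins = [20, 40, 60, 80]

# Counts as numpy.histogram (and therefore plt.hist) would bin them
hist_counts, hist_edges = np.histogram(data, bins=bins)

# Counts per pd.cut interval; values outside the edges become NaN and are dropped
cut_counts = pd.Series(pd.cut(data, bins)).value_counts(sort=False)

print(hist_counts)
print(cut_counts)
```

For continuous data such as this, a sample landing exactly on an edge is unlikely, so the two sets of counts would be expected to agree even though the interval notation differs.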

🧠 Conceptual

advanced

Effect of right parameter in pd.cut

What is the effect of setting right=False in pd.cut when binning data?

A. Bins include the right edge and exclude the left edge of intervals.

B. Bins include both edges of intervals.

C. Bins include the left edge and exclude the right edge of intervals.

D. Bins exclude both edges of intervals.
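
One way to check an answer here is to bin a value that sits exactly on an inner edge, once with the default and once with `right=False`, and compare the interval each call reports. This is only a sketch: the `edges` and `values` below are illustrative and not taken from the question.

```python
import pandas as pd

edges = [0, 5, 10]  # illustrative bin edges (assumption, not from the question)
values = [5]        # a value sitting exactly on the inner edge

default_side = pd.cut(values, edges)              # right=True is the default
left_closed = pd.cut(values, edges, right=False)  # the setting under test

# Each element is a pandas Interval; its printed brackets show which side is closed
print(default_side[0])
print(left_closed[0])
```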

🔧 Debug

expert

Identify the error in binning code

What error will the following code raise?

Data Analysis Python

import pandas as pd
values = [1, 2, 3, 4, 5]
bins = [0, 2, 4]
categories = pd.cut(values, bins, labels=['Low', 'Medium', 'High'])

A. No error, code runs successfully

B. TypeError: 'list' object is not callable

C. IndexError: list index out of range

D. ValueError: Bin labels must be one fewer than the number of bin edges
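
To confirm an answer after attempting the question, the call can be wrapped in a `try`/`except` that reports whatever is raised. This is a minimal sketch: the broad `except Exception` is deliberate so that any error type is captured, and the name `outcome` is introduced here for illustration.

```python
import pandas as pd

values = [1, 2, 3, 4, 5]
bins = [0, 2, 4]
labels = ['Low', 'Medium', 'High']

try:
    pd.cut(values, bins, labels=labels)
    outcome = "no error"
except Exception as exc:  # broad on purpose: the goal is to see what gets raised
    outcome = f"{type(exc).__name__}: {exc}"

print(outcome)
```

When comparing against the options, it helps to count how many intervals the edges in `bins` define and how many entries `labels` contains.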
