Challenge - 5 Problems

🎖️

Customer Segmentation Master

Get all challenges correct to earn this badge!

Test your skills under time pressure!

❓ Predict Output

intermediate

Output of KMeans clustering labels

Given the following code that performs KMeans clustering on customer data, what will be the output of print(labels)?

Data Analysis Python

from sklearn.cluster import KMeans
import numpy as np

X = np.array([[5, 200], [6, 220], [7, 210], [20, 800], [22, 850], [21, 830]])
kmeans = KMeans(n_clusters=2, random_state=42)
kmeans.fit(X)
labels = kmeans.labels_
print(labels)

A. [1 1 1 0 0 0]

B. [0 0 0 1 1 1]

C. [0 1 0 1 0 1]

D. [1 0 1 0 1 0]
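
When comparing candidate outputs like these, it can help to compare the groupings rather than the raw integers, since a cluster ID is only a name for a group. The sketch below uses a hypothetical helper, `canonical_labels` (not part of scikit-learn), that renumbers IDs in order of first appearance so two label arrays can be checked for describing the same partition.

```python
def canonical_labels(labels):
    """Renumber cluster IDs in order of first appearance.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    mapping = {}
    out = []
    for lab in labels:
        if lab not in mapping:
            # First time this ID is seen: give it the next free number.
            mapping[lab] = len(mapping)
        out.append(mapping[lab])
    return out


first = canonical_labels([1, 1, 1, 0, 0, 0])
second = canonical_labels([0, 0, 0, 1, 1, 1])
mixed = canonical_labels([0, 1, 0, 1, 0, 1])

print(first == second)  # same partition, different names
print(first == mixed)   # different partition
```

For real evaluations, a permutation-invariant metric such as `sklearn.metrics.adjusted_rand_score` serves the same purpose.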

❓ Data Output

intermediate

Number of customers in each segment

After segmenting customers using KMeans with 3 clusters, what is the count of customers in each cluster?

Data Analysis Python

import pandas as pd
from sklearn.cluster import KMeans

customers = pd.DataFrame({
    'Age': [25, 45, 35, 23, 52, 40, 60, 48],
    'Annual_Spend': [500, 1500, 800, 450, 2000, 1200, 2200, 1600]
})
kmeans = KMeans(n_clusters=3, random_state=0)
customers['Segment'] = kmeans.fit_predict(customers)
counts = customers['Segment'].value_counts().sort_index()
print(counts)

A.
0    3
1    2
2    3
Name: Segment, dtype: int64

B.
0    2
1    3
2    3
Name: Segment, dtype: int64

C.
0    3
1    3
2    2
Name: Segment, dtype: int64

D.
0    4
1    2
2    2
Name: Segment, dtype: int64

❓ Visualization

advanced

Identify the correct scatter plot of customer segments

Which scatter plot correctly shows customer segments after applying KMeans clustering on Age vs Annual Spend?

Data Analysis Python

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans

customers = pd.DataFrame({
    'Age': [22, 25, 47, 52, 46, 56, 55, 60],
    'Annual_Spend': [400, 500, 1500, 1600, 1400, 1700, 1800, 1900]
})
kmeans = KMeans(n_clusters=2, random_state=1)
customers['Segment'] = kmeans.fit_predict(customers)

plt.figure(figsize=(6,4))
plt.scatter(customers['Age'], customers['Annual_Spend'], c=customers['Segment'], cmap='viridis')
plt.xlabel('Age')
plt.ylabel('Annual Spend')
plt.title('Customer Segments')
plt.show()

A. Scatter plot showing random color distribution with no clear groups.

B. Scatter plot with all points in one color, no segmentation visible.

C. Scatter plot with two distinct groups: younger customers with lower spend and older customers with higher spend.

D. Scatter plot with three overlapping clusters mixed in colors.

🧠 Conceptual

advanced

Understanding silhouette score in customer segmentation

What does a silhouette score close to 1 indicate about the customer segments created by a clustering algorithm?

A. The clustering algorithm failed to converge.

B. Clusters overlap heavily and customers are poorly matched to clusters.

C. There are too many clusters causing overfitting.

D. Clusters are well separated and customers are well matched to their own cluster.
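
For reference, the silhouette coefficient of a single point is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its mean distance to the members of the nearest other cluster. The following is a minimal pure-Python sketch of that definition, assuming Euclidean distance and at least two clusters; `silhouette_mean` is a hypothetical name, and in practice `sklearn.metrics.silhouette_score` would normally be used instead.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_mean(points, labels):
    """Mean silhouette coefficient (illustrative sketch, needs >= 2 clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        # (dist(p, p) is 0, so dividing by len(own) - 1 excludes p itself).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b) or 1.0  # avoid 0/0 when all points coincide
        scores.append((b - a) / denom)
    return sum(scores) / len(scores)


points = [(5, 200), (6, 220), (7, 210), (20, 800), (22, 850), (21, 830)]
grouped = silhouette_mean(points, [0, 0, 0, 1, 1, 1])
alternating = silhouette_mean(points, [0, 1, 0, 1, 0, 1])
print(round(grouped, 3), round(alternating, 3))
```

Running the sketch on both labelings makes it easy to see how the score responds when points are assigned to a cluster whose members are far away.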

🔧 Debug

expert

Identify the error in customer segmentation code

What error will this code raise when trying to segment customers using KMeans?

Data Analysis Python

from sklearn.cluster import KMeans
import pandas as pd

customers = pd.DataFrame({
    'Age': [30, 40, 50],
    'Annual_Spend': [1000, 1500, 2000]
})

kmeans = KMeans(n_clusters=4)
kmeans.fit(customers)
labels = kmeans.labels_
print(labels)

A. ValueError: n_samples=3 should be >= n_clusters=4.

B. AttributeError: 'KMeans' object has no attribute 'labels_'

C. TypeError: fit() missing 1 required positional argument

D. No error, prints labels array of length 3.
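
Whatever the snippet above does, a common defensive pattern is to compare the requested cluster count with the number of rows before fitting. The sketch below assumes a hypothetical helper, `clamp_n_clusters`, that caps the request at the sample count; it is one possible policy, and raising an explicit error instead would be equally reasonable.

```python
def clamp_n_clusters(requested, n_samples):
    """Cap the requested cluster count at the number of samples.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    if requested < 1:
        raise ValueError("requested cluster count must be at least 1")
    if n_samples < 1:
        raise ValueError("need at least one sample to cluster")
    return min(requested, n_samples)


n_rows = 3  # stands in for len(customers) in the question's DataFrame
k = clamp_n_clusters(4, n_rows)
print(k)
# The result could then be passed on, e.g. KMeans(n_clusters=k).fit(customers)
```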
