SciPy, data, ~30 mins

K-means via scipy vs scikit-learn - Hands-On Comparison

K-means Clustering with SciPy and scikit-learn
📖 Scenario: You work as a data analyst for a small retail company. You want to group customers based on their shopping habits to create better marketing strategies. You will use K-means clustering to find groups of similar customers.
🎯 Goal: Build a simple K-means clustering model with both SciPy and scikit-learn. Compare how each library takes the data, runs the clustering, and exposes the cluster centers.
📋 What You'll Learn
Create a dataset of customer shopping data as a list of lists.
Set the number of clusters to 2 using a variable.
Use scipy.cluster.vq.kmeans to find cluster centers.
Use sklearn.cluster.KMeans to fit the same data and get cluster centers.
Print the cluster centers from both methods.
💡 Why This Matters
🌍 Real World
K-means clustering helps businesses group customers or products based on features to target marketing or improve services.
💼 Career
Data scientists and analysts often use clustering to find patterns in data without labels, helping in customer segmentation and recommendation systems.
1
Create the customer data
Create a variable called data that holds this exact list of lists: [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]].
Hint: Use a variable named data and assign the list exactly as shown.
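
As a minimal sketch of this step, the list can be written out directly. The NumPy shape check at the end is an optional extra that the exercise does not ask for, and data_array is a hypothetical name used only for illustration.

```python
import numpy as np

# Customer shopping data from this step: one inner list per customer.
# The exercise does not say what the two columns measure.
data = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]]

# Optional check (an addition, not part of the exercise):
# viewing the list as an array confirms 6 rows and 2 features.
data_array = np.array(data)
print(data_array.shape)  # (6, 2)
```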

2
Set the number of clusters
Create a variable called num_clusters and set it to 2.
Hint: Use a variable named num_clusters and assign the value 2.
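
A short sketch of this step follows. Only the num_clusters assignment is required. The range check is a hypothetical safeguard added for illustration, and n_customers simply restates the six rows of the dataset from step 1.

```python
# Number of groups K-means should look for (required by this step).
num_clusters = 2

# Hypothetical safeguard, not part of the exercise: K-means cannot
# form more clusters than there are data points.
n_customers = 6  # the dataset in step 1 has six rows
if not 1 <= num_clusters <= n_customers:
    raise ValueError("num_clusters must be between 1 and the number of customers")
```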

3
Run K-means with SciPy and scikit-learn
Import kmeans from scipy.cluster.vq and KMeans from sklearn.cluster. Call kmeans with data and num_clusters; it returns a tuple of the centroids (the codebook) and the distortion, so keep the first value as centroids_scipy. Then create a KMeans object with n_clusters=num_clusters, fit it to data, and read centroids_sklearn from its cluster_centers_ attribute.
Hint: Use the exact variable names and imports as shown. Remember to unpack the result of kmeans into centroids_scipy and a second value you can ignore.
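
The two calls described in this step might look like the sketch below. Three details are assumptions beyond the exercise text: wrapping data in np.array before passing it to SciPy (a precaution, since SciPy documents an array input), and the n_init=10 and random_state=0 arguments to KMeans, which are added only to make the scikit-learn run repeatable.

```python
import numpy as np
from scipy.cluster.vq import kmeans
from sklearn.cluster import KMeans

data = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]]
num_clusters = 2

# SciPy: kmeans returns (codebook, distortion). The codebook holds the
# centroids; the distortion is ignored here. np.array(...) is a precaution.
centroids_scipy, _ = kmeans(np.array(data), num_clusters)

# scikit-learn: build the estimator, fit it, then read the centers.
# n_init and random_state are assumptions added for reproducibility.
model = KMeans(n_clusters=num_clusters, n_init=10, random_state=0)
model.fit(data)
centroids_sklearn = model.cluster_centers_
```

SciPy hands back the centroids directly from a function call, while scikit-learn stores them on the fitted object. The rows may come back in a different order in each library, so compare them by value rather than by position.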

4
Print the cluster centers
Print the string "SciPy centroids:" followed by centroids_scipy. Then print the string "scikit-learn centroids:" followed by centroids_sklearn.
Hint: Use two print statements exactly as described to show the centroids.
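
Putting the four steps together, the finished exercise could look roughly like this. As in the earlier step, the np.array wrapper and the n_init and random_state arguments are assumptions added for robustness and repeatability, not requirements of the exercise.

```python
import numpy as np
from scipy.cluster.vq import kmeans
from sklearn.cluster import KMeans

# Steps 1 and 2: the data and the number of clusters.
data = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [8.0, 8.0], [1.0, 0.6], [9.0, 11.0]]
num_clusters = 2

# Step 3: run both implementations.
centroids_scipy, _ = kmeans(np.array(data), num_clusters)
centroids_sklearn = KMeans(
    n_clusters=num_clusters, n_init=10, random_state=0  # assumed settings
).fit(data).cluster_centers_

# Step 4: print the centers from each library.
print("SciPy centroids:", centroids_scipy)
print("scikit-learn centroids:", centroids_sklearn)
```

On this dataset both libraries should report one center near (1.17, 1.47) for the three low-valued customers and one near (7.33, 9.0) for the three high-valued ones, though the row order can differ between libraries and between runs.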