Challenge - 5 Problems

❓ Predict Output

intermediate

Shape of a NumPy array after sklearn train_test_split

What is the shape of X_train after running the following code?

NumPy

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
print(X_train.shape)

A. (7, 2)

B. (3, 2)

C. (10, 2)

D. (14, 2)
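
As a rough guide to how the split sizes come about: when only a fractional test_size is given, scikit-learn rounds the test count up and assigns the remaining rows to the training set. The sketch below reproduces that rounding rule with different numbers from the question. The helper name split_sizes is hypothetical and is not part of scikit-learn.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Hypothetical helper for illustration only; it mirrors the rounding
    rule described above and is not a scikit-learn function.
    """
    n_test = math.ceil(test_size * n_samples)  # test count is rounded up
    n_train = n_samples - n_test               # training set gets the rest
    return n_train, n_test


# 25 samples with a 0.3 test fraction: 7.5 rounds up to 8 test rows.
n_train, n_test = split_sizes(25, 0.3)
print(n_train, n_test)
```

The number of columns is unaffected by the split, since only rows are divided between the two sets.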

❓ Data Output

intermediate

Result of applying StandardScaler to a NumPy array

What is the output array after applying StandardScaler to the data below?

NumPy

import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([[1, 2], [3, 4], [5, 6]])
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)
print(np.round(scaled_data, 2))

A.
[[1. 1.]
 [3. 4.]
 [5. 6.]]

B.
[[-1.22 -1.22]
 [ 0.   0.  ]
 [ 1.22  1.22]]

C.
[[0. 0.]
 [0. 0.]
 [0. 0.]]

D.
[[-1.  -1.]
 [ 0.   0.]
 [ 1.   1.]]
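
StandardScaler standardizes each column by subtracting its mean and dividing by its population standard deviation (ddof=0). Here is a minimal NumPy sketch of that formula, using illustrative data that differs from the question. The variable names are assumptions made for this example.

```python
import numpy as np

# Illustrative data, not the array from the question.
data = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0], [6.0, 40.0]])

col_mean = data.mean(axis=0)   # per-column mean
col_std = data.std(axis=0)     # population std (ddof=0), the convention StandardScaler uses
scaled = (data - col_mean) / col_std

print(np.round(scaled, 2))
print(scaled.mean(axis=0), scaled.std(axis=0))
```

After scaling, every column should have a mean of about 0 and a standard deviation of about 1, whatever its original units were.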

🔧 Debug

advanced

Identify the error when using a NumPy array with sklearn LinearRegression

What error will this code raise when fitting the model?

NumPy

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])
model = LinearRegression()
model.fit(X, y)

A. ValueError: Expected 2D array, got 1D array instead

B. TypeError: unsupported operand type(s) for +: 'int' and 'str'

C. No error, model fits successfully

D. IndexError: index out of bounds
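
For background, scikit-learn estimators expect the feature matrix X to be 2D, with shape (n_samples, n_features), while the target y may be 1D. One common way to put a single feature stored as a 1D array into that layout is reshape(-1, 1). The sketch below only inspects shapes and uses illustrative values.

```python
import numpy as np

x_1d = np.array([10, 20, 30, 40])   # one feature, four samples, stored as 1D
X_2d = x_1d.reshape(-1, 1)          # -1 lets NumPy infer the number of rows

print(x_1d.ndim, x_1d.shape)        # 1 (4,)
print(X_2d.ndim, X_2d.shape)        # 2 (4, 1)
```

The reshaped array holds the same values, arranged as four rows of one column each.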

❓ Visualization

advanced

Interpret the plot of PCA components from NumPy data

Given the PCA plot produced by the code below, which statement is true about the data?

NumPy

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

np.random.seed(0)
data = np.dot(np.random.rand(2, 2), np.random.randn(2, 200)).T
pca = PCA(n_components=2)
components = pca.fit_transform(data)
plt.scatter(components[:, 0], components[:, 1])
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.title('PCA of Data')
plt.show()

A. PCA components are identical to original features.

B. The data has no variance along the first principal component.

C. The first principal component explains more variance than the second.

D. The second principal component explains more variance than the first.
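
To see how variance is distributed across principal components, the following sketch computes them with plain NumPy. It assumes PCA is obtained from the SVD of the mean-centered data (scikit-learn's PCA is also SVD-based, though its solver details may differ). The data and variable names are illustrative, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative correlated 2D data (not the data from the question).
mixing = np.array([[2.0, 0.5], [0.5, 0.3]])
data = rng.standard_normal((200, 2)) @ mixing

centered = data - data.mean(axis=0)
# np.linalg.svd returns singular values sorted in descending order.
_, singular_values, _ = np.linalg.svd(centered, full_matrices=False)

explained_variance = singular_values**2 / (len(data) - 1)
explained_ratio = explained_variance / explained_variance.sum()
print(explained_ratio)
```

With scikit-learn, the comparable quantity is exposed on a fitted PCA object as explained_variance_ratio_.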

🧠 Conceptual

expert

Why use NumPy arrays with scikit-learn instead of Python lists?

Which is the main reason scikit-learn prefers NumPy arrays over Python lists for input data?

A. NumPy arrays automatically normalize data before training.

B. Python lists cannot store numerical data.

C. Python lists are immutable and cannot be changed.

D. NumPy arrays provide efficient memory usage and fast numerical operations required by scikit-learn.
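
To make the difference between the two containers concrete, here is a small sketch comparing a Python list with a NumPy array. It shows only that the array keeps fixed-size values in one contiguous buffer and supports whole-array operations. No timings are claimed, and the values are illustrative.

```python
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.float64)

# Each float64 takes 8 bytes, stored back to back in one buffer.
print(arr.itemsize, arr.nbytes)      # 8 8000
print(arr.flags["C_CONTIGUOUS"])     # True

# Same computation: a Python-level loop vs a single vectorized expression.
loop_result = sum(v * v for v in values)
vector_result = float((arr * arr).sum())
print(loop_result == vector_result)
```

A list, by contrast, stores references to separate Python objects, so element-wise arithmetic has to go through the interpreter one item at a time.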
