Challenge - 5 Problems

Feature Engineering Mastery

🧠 Conceptual

intermediate

How do engineered features help machine learning models?

Which of the following best explains why engineered features improve the performance of machine learning models?

A. They make the dataset more complex without adding useful information.

B. They reduce the size of the dataset by removing rows.

C. They add new information that helps models find patterns more easily.

D. They always increase the number of features without improving accuracy.

❓ Data Output

intermediate

Output of feature scaling on data

The following code applies min-max scaling to the age column and prints the DataFrame. Which option shows the resulting column values?

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

data = pd.DataFrame({'age': [20, 30, 40, 50]})
scaler = MinMaxScaler()
data['age_scaled'] = scaler.fit_transform(data[['age']])
print(data)

A. Raises a TypeError due to wrong input format

B. {'age': [20, 30, 40, 50], 'age_scaled': [20, 30, 40, 50]}

C. {'age': [20, 30, 40, 50], 'age_scaled': [1.0, 0.6667, 0.3333, 0.0]}

D. {'age': [20, 30, 40, 50], 'age_scaled': [0.0, 0.3333, 0.6667, 1.0]}
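
For reference, min-max scaling maps each value x to (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. The sketch below applies that formula in plain Python to made-up values (not the quiz data). The helper name min_max_scale is hypothetical, and raising an error for a constant column is a choice made for this sketch only.

```python
def min_max_scale(values):
    """Rescale values to [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Sketch-only choice: the formula would divide by zero here.
        raise ValueError("min-max scaling is undefined for a constant column")
    return [(v - lo) / (hi - lo) for v in values]


# Made-up sample values, chosen only to illustrate the formula.
sample = [2.0, 4.0, 10.0]
scaled = min_max_scale(sample)
print(scaled)  # [0.0, 0.25, 1.0]
```

With its default feature_range of (0, 1), scikit-learn's MinMaxScaler applies the same formula to each column.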

❓ Visualization

advanced

Effect of polynomial features on data separation

The code below adds a polynomial feature to data that is not linearly separable and draws two plots. Which description best matches those plots?

import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
X = np.random.randn(100, 1)
# Label depends on |X|, so no single threshold on X separates the classes
y = (np.abs(X[:, 0]) > 1).astype(int)
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.scatter(X, y, c=y, cmap='bwr')
plt.title('Original Feature')
plt.xlabel('X')
plt.ylabel('Class')

plt.subplot(1, 2, 2)
plt.scatter(X_poly[:, 0], X_poly[:, 1], c=y, cmap='bwr')
plt.title('With Polynomial Feature')
plt.xlabel('X')
plt.ylabel('X squared')
plt.tight_layout()
plt.show()

A. Both plots show perfect linear separation of classes.

B. Left plot shows overlapping classes; right plot shows classes separated by a curve.

C. Right plot shows classes mixed more than left plot.

D. Left plot shows classes separated; right plot shows overlapping classes.
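
As a rough illustration of why a squared feature can help, the sketch below uses a small made-up dataset in which the class depends on |x| (an assumption of this sketch, not a statement about the quiz data). It searches for the best single threshold on x and on x squared. The helper best_threshold_accuracy is hypothetical.

```python
def best_threshold_accuracy(feature, labels):
    """Best accuracy of any rule 'predict 1 if feature > t' or its mirror."""
    best = 0.0
    # Candidate thresholds: below every point, then at each observed value.
    candidates = [min(feature) - 1.0] + sorted(set(feature))
    for t in candidates:
        hits = sum((f > t) == bool(y) for f, y in zip(feature, labels))
        acc = hits / len(labels)
        best = max(best, acc, 1.0 - acc)  # 1 - acc covers the mirrored rule
    return best


# Made-up points: class 1 when |x| > 1, class 0 otherwise.
xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
labels = [1 if abs(x) > 1 else 0 for x in xs]
squares = [x * x for x in xs]

acc_x = best_threshold_accuracy(xs, labels)
acc_sq = best_threshold_accuracy(squares, labels)
print(acc_x, acc_sq)
```

On this toy data the best threshold on x gets 5 of the 7 points right, while a single threshold on x squared classifies all 7 correctly.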

🔧 Debug

advanced

Identify the error in feature extraction code

What error, if any, will this code raise when it creates a new feature 'bmi_category' from the 'bmi' values?

import pandas as pd
data = pd.DataFrame({'bmi': [18, 22, 27, 31]})
def bmi_cat(bmi):
    if bmi < 18.5:
        return 'Underweight'
    elif bmi < 25:
        return 'Normal'
    elif bmi < 30:
        return 'Overweight'
    else:
        return 'Obese'
data['bmi_category'] = data['bmi'].apply(bmi_cat)

A. SyntaxError due to missing colon after else

B. TypeError because 'bmi' is not a number

C. KeyError because 'bmi_category' column does not exist

D. No error; code runs correctly
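
The if/elif chain in bmi_cat is a binning rule: each value gets the label of the first range whose upper bound it stays below. As a minimal sketch, the same rule can be written as a lookup over sorted cut points with the standard-library bisect module. The names CUTS, LABELS, and bmi_bin are hypothetical.

```python
from bisect import bisect_right

# Upper bounds of the first three ranges, taken from the bmi_cat function.
CUTS = [18.5, 25, 30]
LABELS = ['Underweight', 'Normal', 'Overweight', 'Obese']


def bmi_bin(bmi):
    """Return the label of the range that bmi falls into."""
    # bisect_right counts the cut points <= bmi, which is the label index;
    # a value equal to a cut point moves up, matching the strict '<' tests.
    return LABELS[bisect_right(CUTS, bmi)]


categories = [bmi_bin(b) for b in [18, 22, 27, 31]]
print(categories)
```

Keeping the cut points in one list makes the boundaries easier to audit than a chain of comparisons, at the cost of being a little less explicit.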

🚀 Application

expert

Choosing engineered features for time series forecasting

You have daily sales data and want to improve forecasting by adding engineered features. Which feature is most likely to improve the model?

A. Day of the week encoded as a number from 0 (Monday) to 6 (Sunday)

B. Random numbers generated for each day to add noise

C. Customer names as text strings

D. Unique ID numbers for each sale
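
For reference, the day-of-week encoding described in option A can be derived directly from a date. The sketch below uses Python's standard datetime module, whose date.weekday() returns 0 for Monday through 6 for Sunday. The dates and the add_day_of_week helper are made up for illustration.

```python
from datetime import date, timedelta


def add_day_of_week(rows):
    """Return copies of the rows with a 'day_of_week' field (0 = Monday)."""
    return [{**row, 'day_of_week': row['date'].weekday()} for row in rows]


# Made-up daily records; 2024-01-01 was a Monday.
start = date(2024, 1, 1)
rows = [{'date': start + timedelta(days=i)} for i in range(7)]
features = add_day_of_week(rows)
print([r['day_of_week'] for r in features])  # [0, 1, 2, 3, 4, 5, 6]
```

In pandas, Series.dt.dayofweek follows the same Monday = 0 convention.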
