
Why engineered features improve analysis in Data Analysis Python - Challenge Your Understanding

Challenge - 5 Problems
🧠 Conceptual
intermediate
How do engineered features help machine learning models?

Which of the following best explains why engineered features improve the performance of machine learning models?

A. They make the dataset more complex without adding useful information.
B. They reduce the size of the dataset by removing rows.
C. They add new information that helps models find patterns more easily.
D. They always increase the number of features without improving accuracy.
💡 Hint

Think about how new features can help models understand data better.
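
A toy case can make this concrete. The sketch below is only an illustration: the columns `a` and `b`, the `target`, and all values are hypothetical. Neither raw column correlates with the target on its own, but their product (an engineered interaction feature) does.

```python
import pandas as pd

# Hypothetical toy data: the target depends on the interaction of a and b
df = pd.DataFrame({
    "a": [-1, -1, 1, 1],
    "b": [-1, 1, -1, 1],
})
df["target"] = df["a"] * df["b"]

# Engineered feature: the product of the two raw columns
df["a_times_b"] = df["a"] * df["b"]

# Pearson correlation of each candidate feature with the target
correlations = df[["a", "b", "a_times_b"]].corrwith(df["target"])
print(correlations)
```

Here `a` and `b` each have zero correlation with the target, while `a_times_b` has a correlation of 1, so a linear model would only find the pattern once the engineered column is present.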

data_output
intermediate
Output of feature scaling on data

Given the following code, which scales a feature with min-max scaling, which option shows the column values of the printed DataFrame?

Data Analysis Python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

data = pd.DataFrame({'age': [20, 30, 40, 50]})
scaler = MinMaxScaler()
data['age_scaled'] = scaler.fit_transform(data[['age']])
print(data)
A. Raises a TypeError due to wrong input format
B. {'age': [20, 30, 40, 50], 'age_scaled': [20, 30, 40, 50]}
C. {'age': [20, 30, 40, 50], 'age_scaled': [1.0, 0.6667, 0.3333, 0.0]}
D. {'age': [20, 30, 40, 50], 'age_scaled': [0.0, 0.3333, 0.6667, 1.0]}
💡 Hint

Min-max scaling transforms values to a 0-1 range based on min and max.
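
The formula behind min-max scaling can be written out by hand. This is a minimal sketch on hypothetical values, not the quiz data: each value has the minimum subtracted and is then divided by the range, which is what `MinMaxScaler` computes with its default `feature_range=(0, 1)`.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
values = np.array([10.0, 15.0, 20.0, 30.0])

# Min-max scaling: subtract the minimum, then divide by the range (max - min)
scaled = (values - values.min()) / (values.max() - values.min())
print(scaled.tolist())  # [0.0, 0.25, 0.5, 1.0]
```

The smallest value always maps to 0 and the largest to 1, and the original ordering is preserved.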

visualization
advanced
Effect of polynomial features on data separation

Which statement best describes how the two plots produced by the code below show polynomial features helping to separate data that is not linearly separable?

Data Analysis Python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
X = np.random.randn(100, 1)
# Class 1 lies on both sides of 0, so no single threshold on X separates the classes
y = (np.abs(X[:, 0]) > 1).astype(int)
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)  # columns: X, X squared

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
# Draw the single feature on a number line; color marks the class
plt.scatter(X[:, 0], np.zeros(len(X)), c=y, cmap='bwr')
plt.title('Original Feature')
plt.xlabel('X')
plt.yticks([])

plt.subplot(1, 2, 2)
plt.scatter(X_poly[:, 0], X_poly[:, 1], c=y, cmap='bwr')
plt.title('With Polynomial Feature')
plt.xlabel('X')
plt.ylabel('X squared')
plt.tight_layout()
plt.show()
A. Both plots show perfect linear separation of classes.
B. Left plot shows interleaved classes that no single threshold on X separates; right plot shows the classes separated along the X squared axis.
C. Right plot shows classes mixed more than left plot.
D. Left plot shows classes separated; right plot shows overlapping classes.
💡 Hint

Polynomial features add curves that can separate data better than a straight line.
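
To see what the transformer actually adds, here is a small sketch on three hypothetical inputs. With `degree=2` and `include_bias=False`, a single input column yields two output columns: the original value and its square.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical inputs: two values far from zero and one near zero
X_small = np.array([[-2.0], [0.5], [2.0]])

poly_small = PolynomialFeatures(degree=2, include_bias=False)
X_small_poly = poly_small.fit_transform(X_small)

# Each row is [x, x squared]
print(X_small_poly.tolist())  # [[-2.0, 4.0], [0.5, 0.25], [2.0, 4.0]]
```

The inputs -2.0 and 2.0 sit on opposite sides of 0.5 in the original column but share the same squared value, so a single threshold on the squared column can group them together.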

🔧 Debug
advanced
Identify the error in feature extraction code

What error, if any, will this code raise when creating a new feature 'bmi_category' from the 'bmi' values?

import pandas as pd
data = pd.DataFrame({'bmi': [18, 22, 27, 31]})
def bmi_cat(bmi):
    if bmi < 18.5:
        return 'Underweight'
    elif bmi < 25:
        return 'Normal'
    elif bmi < 30:
        return 'Overweight'
    else:
        return 'Obese'
data['bmi_category'] = data['bmi'].apply(bmi_cat)
A. SyntaxError due to missing colon after else
B. TypeError because 'bmi' is not a number
C. KeyError because 'bmi_category' column does not exist
D. No error; code runs correctly
💡 Hint

Check the syntax of the if-else statements carefully.
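
The same kind of binning can also be expressed without a row-wise function. The sketch below is an alternative, not the method used in the question: it uses `pd.cut` on hypothetical values picked to sit on the category boundaries, with `right=False` so each bin includes its lower edge, matching the `<` comparisons above.

```python
import numpy as np
import pandas as pd

# Hypothetical BMI values chosen to land on or near the bin edges
bmi = pd.Series([17.0, 18.5, 24.9, 25.0, 30.0])

# right=False makes the bins [low, high), matching "bmi < 18.5", "bmi < 25", "bmi < 30"
categories = pd.cut(
    bmi,
    bins=[-np.inf, 18.5, 25, 30, np.inf],
    labels=["Underweight", "Normal", "Overweight", "Obese"],
    right=False,
)
print(categories.tolist())
# ['Underweight', 'Normal', 'Normal', 'Overweight', 'Obese']
```

The result is a categorical column, which keeps the category order and is usually cheaper to store than plain strings.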

🚀 Application
expert
Choosing engineered features for time series forecasting

You have daily sales data and want to improve forecasting by adding engineered features. Which feature is most likely to improve the model?

A. Day of the week encoded as a number from 0 (Monday) to 6 (Sunday)
B. Random numbers generated for each day to add noise
C. Customer names as text strings
D. Unique ID numbers for each sale
💡 Hint

Think about what patterns repeat regularly in daily sales.
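
Calendar features of this kind can be derived directly from a date column. This is a minimal sketch on hypothetical data: the column names `date`, `units_sold`, `day_of_week`, and `is_weekend` and all sales figures are made up for illustration.

```python
import pandas as pd

# Hypothetical daily sales for one week; 2024-01-01 is a Monday
sales = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=7, freq="D"),
    "units_sold": [120, 95, 101, 110, 150, 210, 180],
})

# pandas encodes Monday as 0 and Sunday as 6
sales["day_of_week"] = sales["date"].dt.dayofweek

# A second derived feature: flag Saturday and Sunday
sales["is_weekend"] = sales["day_of_week"] >= 5

print(sales["day_of_week"].tolist())  # [0, 1, 2, 3, 4, 5, 6]
```

Both columns depend only on the date, so they are known in advance for any future day being forecast.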