Which of the following best explains why engineered features improve the performance of machine learning models?
Think about how new features can help models understand data better.
Engineered features create new, meaningful information from existing data, helping models detect patterns that raw data alone might not reveal.
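As a small illustration of this idea, the sketch below derives a BMI feature from two raw measurements. The records and field names are hypothetical and invented for this example; only the standard formula (weight in kilograms divided by height in metres squared) is assumed.

```python
# Hypothetical raw records; the values are invented for illustration only.
people = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 90.0, "height_m": 1.80},
]

# Engineered feature: BMI combines both raw columns into one quantity
# that neither column expresses on its own.
for person in people:
    person["bmi"] = person["weight_kg"] / person["height_m"] ** 2

bmis = [round(person["bmi"], 2) for person in people]
print(bmis)
```

A model given only weight and height would have to learn this ratio itself; supplying it directly makes the pattern easier to pick up.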
Given the following code that scales a feature using min-max scaling, what is the output?
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

data = pd.DataFrame({'age': [20, 30, 40, 50]})
scaler = MinMaxScaler()
data['age_scaled'] = scaler.fit_transform(data[['age']])
print(data)
Min-max scaling maps each value to the 0-1 range using (x - min) / (max - min).
The minimum age, 20, becomes 0.0 and the maximum, 50, becomes 1.0; the values in between scale linearly, so 30 becomes about 0.333 and 40 about 0.667.
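The same arithmetic can be checked by hand with a minimal sketch in plain Python. The helper name `min_max_scale` is hypothetical, and its handling of a constant column is an assumption made for this example rather than a statement about any library.

```python
def min_max_scale(values):
    """Scale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

ages = [20, 30, 40, 50]
scaled = min_max_scale(ages)
print([round(s, 4) for s in scaled])  # [0.0, 0.3333, 0.6667, 1.0]
```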
Which plot best shows how adding polynomial features can help separate data that is not linearly separable?
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
X = np.random.randn(100, 1)
y = (X[:, 0] > 0).astype(int)

poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.scatter(X, y, c=y, cmap='bwr')
plt.title('Original Feature')
plt.xlabel('X')
plt.ylabel('Class')

plt.subplot(1, 2, 2)
plt.scatter(X_poly[:, 0], X_poly[:, 1], c=y, cmap='bwr')
plt.title('With Polynomial Feature')
plt.xlabel('X')
plt.ylabel('X squared')

plt.tight_layout()
plt.show()
Polynomial features add curves that can separate data better than a straight line.
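Adding a squared term lets a linear model draw a curved boundary in terms of the original feature, which helps when no single threshold on X separates the classes. In this particular code the labels are y = (X > 0), so the classes are already separated by the threshold X = 0; the squared feature pays off for labels such as |X| > 1, where a threshold on X squared separates the classes but no threshold on X does.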
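To see the effect numerically, the sketch below uses an assumed labelling rule, class 1 when |x| > 1, which differs from the rule in the quiz code. It compares the best accuracy a single-threshold rule can reach on x with what it can reach on x squared. The helper `best_threshold_accuracy` is hypothetical and written only for this illustration.

```python
import random

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(100)]
# Assumed labelling rule (not the one in the quiz code): class 1 when |x| > 1.
ys = [1 if abs(x) > 1.0 else 0 for x in xs]

def best_threshold_accuracy(feature, labels):
    """Best accuracy of any single-threshold rule on one feature, in either direction."""
    n = len(labels)
    best = 0.0
    for t in sorted(feature):
        # Rule: predict class 1 when the feature is at least t.
        correct = sum((f >= t) == bool(y) for f, y in zip(feature, labels))
        # Also consider the flipped rule (predict class 1 below t).
        best = max(best, correct / n, (n - correct) / n)
    return best

acc_raw = best_threshold_accuracy(xs, ys)
acc_squared = best_threshold_accuracy([x * x for x in xs], ys)
print(acc_raw, acc_squared)
```

On the raw feature the class-1 points sit on both sides of the class-0 points, so no single cut gets everything right; on the squared feature one cut near 1 does.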
What error will this code raise when trying to create a new feature 'bmi_category' based on 'bmi' values?
import pandas as pd

data = pd.DataFrame({'bmi': [18, 22, 27, 31]})

def bmi_cat(bmi):
    if bmi < 18.5:
        return 'Underweight'
    elif bmi < 25:
        return 'Normal'
    elif bmi < 30:
        return 'Overweight'
    else:
        return 'Obese'

data['bmi_category'] = data['bmi'].apply(bmi_cat)
Check the syntax of the if-else statements carefully.
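A missing colon after else causes a SyntaxError. In the listing as shown, however, else already has its colon, so once the function body is indented correctly the code runs without error and assigns Underweight, Normal, Overweight, and Obese to the four rows.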
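The missing-colon case can be checked without executing anything by handing the source text to Python's built-in `compile`. This is a sketch for illustration: the string below reuses the function from the question with the colon after `else` deliberately removed, and the file name `<bmi_example>` is a placeholder.

```python
# The if/elif/else chain from the question, with the colon after `else` removed on purpose.
broken_source = (
    "def bmi_cat(bmi):\n"
    "    if bmi < 18.5:\n"
    "        return 'Underweight'\n"
    "    elif bmi < 25:\n"
    "        return 'Normal'\n"
    "    elif bmi < 30:\n"
    "        return 'Overweight'\n"
    "    else\n"
    "        return 'Obese'\n"
)

try:
    # compile() parses the source; a syntax problem is reported before any code runs.
    compile(broken_source, "<bmi_example>", "exec")
    error_name = None
except SyntaxError as exc:
    error_name = type(exc).__name__

print(error_name)  # SyntaxError
```

Because the failure happens at parse time, none of the script runs, including the lines before the function definition.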
You have daily sales data and want to improve forecasting by adding engineered features. Which feature is most likely to improve the model?
Think about what patterns repeat regularly in daily sales.
Day of the week captures weekly seasonality, which is important for forecasting sales patterns.
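A minimal sketch of deriving this feature with the standard library follows. The sales figures and the extra `is_weekend` flag are hypothetical and added only for illustration.

```python
from datetime import date, timedelta

# Hypothetical daily sales series; the amounts are invented for illustration.
start = date(2024, 1, 1)  # a Monday
sales = [120, 115, 130, 128, 160, 210, 190, 118]

rows = []
for offset, amount in enumerate(sales):
    day = start + timedelta(days=offset)
    rows.append({
        "date": day,
        "sales": amount,
        "day_of_week": day.weekday(),      # 0 = Monday ... 6 = Sunday
        "is_weekend": day.weekday() >= 5,  # Saturday or Sunday
    })

for row in rows:
    print(row["date"], row["sales"], row["day_of_week"], row["is_weekend"])
```

With the day of the week as an explicit column, a model can learn a separate level for each weekday instead of having to infer the seven-day cycle from the raw date.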