Imagine you want to predict house prices. You have raw data like size in square feet and number of rooms. Why might creating new features like 'price per room' help the model?
Think about how combining simple data points can create more meaningful information.
Engineered features combine or transform raw data to highlight important relationships, helping models learn stronger patterns and make better predictions.
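As a minimal sketch of the idea, the ratio features below (the column names and values are hypothetical, chosen only for illustration) turn raw size and room counts into directly comparable signals:

```python
import pandas as pd

# Hypothetical raw housing data (illustrative values only)
df = pd.DataFrame({
    "price": [300_000, 450_000, 200_000],
    "sqft":  [1500, 2200, 1000],
    "rooms": [3, 4, 2],
})

# Engineered ratio features can expose relationships that the
# raw columns hide from a simple model
df["price_per_room"] = df["price"] / df["rooms"]
df["sqft_per_room"] = df["sqft"] / df["rooms"]
print(df)
```

A model given `price_per_room` no longer has to infer the price-size relationship from two separate columns.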
What is the output of the following Python code that scales a feature using min-max scaling?
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[10], [20], [30], [40], [50]])
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)
print(X_scaled.flatten())
Min-max scaling transforms values to a range between 0 and 1 based on min and max values.
The smallest value (10) maps to 0, the largest (50) maps to 1, and the rest are scaled linearly in between, giving [0, 0.25, 0.5, 0.75, 1].
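The same result can be reproduced by hand with the min-max formula x' = (x - min) / (max - min), which is a quick way to check what the scaler is doing:

```python
import numpy as np

X = np.array([10, 20, 30, 40, 50], dtype=float)

# Min-max scaling: x' = (x - min) / (max - min)
X_scaled = (X - X.min()) / (X.max() - X.min())
print(X_scaled)  # [0.   0.25 0.5  0.75 1.  ]
```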
You created polynomial features (squares and cubes) from your original data. Which model below is best suited to use these features effectively?
Think about how adding many polynomial features can cause overfitting and how regularization helps.
Regularized linear regression (e.g., ridge or lasso) controls the extra complexity introduced by many polynomial features, improving generalization.
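A minimal sketch of this pairing, using synthetic cubic data (the data-generating function and `alpha` value are illustrative assumptions): `PolynomialFeatures` expands the inputs, and the ridge penalty shrinks the resulting coefficients to limit overfitting.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: a cubic signal plus noise (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] ** 2 + rng.normal(0, 1, 50)

# Degree-3 polynomial expansion + L2 penalty: the regularization
# term shrinks coefficients, keeping the expanded feature set
# from overfitting
model = make_pipeline(PolynomialFeatures(degree=3), Ridge(alpha=1.0))
model.fit(X, y)
print(model.score(X, y))  # R^2 on the training data
```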
A model trained on raw features has 75% accuracy. After adding engineered features, accuracy rises to 85%. What does this improvement most likely indicate?
Think about what adding meaningful features does to model learning.
Higher accuracy usually means the engineered features exposed patterns the model could not extract from the raw data alone.
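One way to see such a jump, sketched on synthetic data (the interaction target and logistic-regression setup are illustrative assumptions, not from the question): when the label depends on a product of two inputs, a linear model fails on the raw columns but succeeds once the interaction is added as a feature.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic target that depends on an interaction x1*x2
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Raw features only: no linear boundary separates the classes
raw_acc = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)

# Engineered interaction feature makes the problem linear
X_tr_eng = np.column_stack([X_tr, X_tr[:, 0] * X_tr[:, 1]])
X_te_eng = np.column_stack([X_te, X_te[:, 0] * X_te[:, 1]])
eng_acc = LogisticRegression().fit(X_tr_eng, y_tr).score(X_te_eng, y_te)

print(raw_acc, eng_acc)
```

The raw-feature model hovers near chance, while the engineered-feature model separates the classes almost perfectly.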
Consider a model where adding many engineered features caused test accuracy to drop. Which reason below best explains this?
Think about how adding many features can sometimes confuse the model.
Too many irrelevant or noisy features can cause the model to fit the training data too closely (overfitting) and perform worse on new data.
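A quick illustration on synthetic data (k-NN is chosen here as an assumption for the demo, since its distance metric is especially sensitive to irrelevant dimensions): appending pure-noise features to two informative ones makes test accuracy drop.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Two informative features determine the label
rng = np.random.default_rng(0)
n = 300
X_info = rng.normal(size=(n, 2))
y = (X_info[:, 0] + X_info[:, 1] > 0).astype(int)

# 50 pure-noise features drown the signal in the distance
# metric that k-NN relies on
X_noisy = np.hstack([X_info, rng.normal(size=(n, 50))])

accs = {}
for name, X in [("informative only", X_info), ("with noise", X_noisy)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    accs[name] = KNeighborsClassifier().fit(X_tr, y_tr).score(X_te, y_te)
    print(name, accs[name])
```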