Recall & Review
beginner
What is feature engineering in data science?
Feature engineering is the process of creating new input variables or modifying existing ones to improve a machine learning model's performance.
beginner
Why is feature engineering important?
It helps models learn better by providing more meaningful or clearer information, often leading to improved accuracy and insights.
beginner
Name three common feature engineering techniques.
1. Creating new features by combining existing ones (e.g., ratios)
2. Encoding categorical variables (e.g., one-hot encoding)
3. Scaling or normalizing numerical features
beginner
How can you create a new feature in pandas?
You can create a new column by performing operations on existing columns, for example:
df['new_feature'] = df['A'] + df['B']
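A minimal sketch of this pattern on a toy DataFrame. The data values and the `a_per_b` column name are hypothetical, chosen only for illustration; the column names `A`, `B`, and `new_feature` follow the card above.

```python
import pandas as pd

# Hypothetical toy data; only the column names A and B come from the card.
df = pd.DataFrame({"A": [10, 20, 30], "B": [2, 4, 5]})

# Element-wise sum of two existing columns becomes a new column.
df["new_feature"] = df["A"] + df["B"]

# A ratio is another common way to combine existing columns.
df["a_per_b"] = df["A"] / df["B"]

print(df)
```

Both assignments are vectorized: pandas applies the operation row by row without an explicit loop.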
beginner
What is one-hot encoding and when do you use it?
One-hot encoding converts a categorical variable into multiple binary columns (0 or 1), one per category, so models can use categories as numbers without implying an order. Use it for categories that have no natural ranking, such as colors or city names.
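One way to do this in pandas is `pd.get_dummies`. The sketch below assumes a hypothetical `color` column; the data is made up for illustration.

```python
import pandas as pd

# Hypothetical categorical column with no natural order.
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Replace "color" with one 0/1 column per category.
# dtype=int requests integers instead of booleans.
encoded = pd.get_dummies(df, columns=["color"], dtype=int)

print(encoded)
```

Each row ends up with exactly one 1 across the new `color_*` columns, marking which category it belonged to.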
Which of the following is NOT a feature engineering technique?
Removing missing values is data cleaning, not feature engineering.
In pandas, how do you create a new feature 'total' by adding columns 'A' and 'B'?
df['total'] = df['A'] + df['B'] creates the new column by adding the two existing columns.
What does one-hot encoding do?
One-hot encoding creates one binary column for each category.
Why might you create a new feature like 'age_group' from 'age'?
Grouping continuous data into categories can help models capture patterns better.
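A hedged sketch of such binning with `pd.cut`. The bin edges and the labels `child`, `adult`, and `senior` are assumptions picked for illustration, not cut-offs taken from the quiz.

```python
import pandas as pd

# Hypothetical ages.
df = pd.DataFrame({"age": [5, 17, 18, 34, 65, 80]})

# Assumed bin edges; right=False makes each bin include its left edge:
# [0, 18), [18, 65), [65, 120)
bins = [0, 18, 65, 120]
labels = ["child", "adult", "senior"]

df["age_group"] = pd.cut(df["age"], bins=bins, labels=labels, right=False)

print(df)
```

The resulting `age_group` column is categorical, so it can be one-hot encoded before being passed to a model that expects numeric input.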
Which tool is commonly used to scale numerical features in a pandas workflow?
pandas itself has no dedicated scaler; StandardScaler from scikit-learn (sklearn.preprocessing) is commonly used to scale numerical features.
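A minimal sketch of scaling DataFrame columns with `StandardScaler`, assuming scikit-learn is installed. The column names and values are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric features on very different scales.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "income": [20000.0, 35000.0, 50000.0, 95000.0],
})

scaler = StandardScaler()

# fit_transform learns each column's mean and standard deviation,
# then returns a NumPy array of standardized values.
scaled = scaler.fit_transform(df)

# Wrap the array back into a DataFrame to keep the column names.
df_scaled = pd.DataFrame(scaled, columns=df.columns, index=df.index)

print(df_scaled)
```

After scaling, each column has mean 0 and unit (population) standard deviation. In a real project the scaler would usually be fitted on the training data only and then applied to the test data with `transform`.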
Explain what feature engineering is and why it matters in simple terms.
Think about how changing or adding data columns can help a model learn better.
Describe how you would create a new feature in a pandas DataFrame using existing columns.
Remember the syntax for adding a new column in pandas.