MLOps, DevOps, ~30 mins

Feature engineering pipelines in MLOps - Mini Project: Build & Apply

Feature Engineering Pipelines
📖 Scenario: You are working on a machine learning project. You need to prepare your data by creating a feature engineering pipeline. This pipeline will help you clean and transform your data automatically before training your model.
🎯 Goal: Build a simple feature engineering pipeline using scikit-learn that scales numeric features and encodes categorical features.
📋 What You'll Learn
Create a dataset dictionary with numeric and categorical features
Define a configuration variable for numeric feature names
Build a pipeline that scales numeric features and encodes categorical features
Print the transformed feature array
💡 Why This Matters
🌍 Real World
Feature engineering pipelines automate data preparation steps, making machine learning workflows faster and less error-prone.
💼 Career
Understanding how to build and use feature engineering pipelines is essential for MLOps engineers and data scientists to deploy reliable ML models.
1
Create the initial dataset dictionary
Create a dictionary called data with these exact entries: 'age': [25, 32, 47], 'salary': [50000, 60000, 80000], and 'department': ['sales', 'engineering', 'marketing'].
Hint:
Use a dictionary with keys 'age', 'salary', and 'department' and assign lists of values exactly as shown.
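
One possible way to write this step, shown as a minimal sketch. The values come straight from the step description; the comment is only illustrative.

```python
# Raw dataset: two numeric columns and one categorical column.
data = {
    'age': [25, 32, 47],
    'salary': [50000, 60000, 80000],
    'department': ['sales', 'engineering', 'marketing'],
}
```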

2
Define numeric feature names
Create a list called numeric_features containing the strings 'age' and 'salary'.
Hint:
Define a list named numeric_features with the exact two strings.
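
A minimal sketch of this step. Keeping the column names in one list means the pipeline can refer to them by name instead of repeating the strings.

```python
# Names of the columns that should be scaled rather than encoded.
numeric_features = ['age', 'salary']
```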

3
Build the feature engineering pipeline
Import ColumnTransformer from sklearn.compose, and StandardScaler and OneHotEncoder from sklearn.preprocessing. Create a ColumnTransformer called preprocessor that applies StandardScaler() to the columns listed in numeric_features and OneHotEncoder() to the 'department' column.
Hint:
Use ColumnTransformer with two transformers: one for numeric features using StandardScaler() and one for categorical feature 'department' using OneHotEncoder().
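
A sketch of how the preprocessor could be assembled. The transformer labels 'num' and 'cat' are assumptions chosen for illustration (any unique strings work), and numeric_features is repeated here so the snippet runs on its own.

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Repeated from step 2 so this snippet is self-contained.
numeric_features = ['age', 'salary']

# Each entry is (label, transformer, columns). The labels 'num' and 'cat'
# are illustrative names, not required by the exercise.
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(), ['department']),
    ]
)
```

Nothing is computed at this point. The scaler and encoder only learn their parameters when fit or fit_transform is called on actual data.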

4
Transform and print the processed features
Convert data to a pandas DataFrame called df. Use preprocessor.fit_transform(df) to transform the data and assign it to features. Print features.
Hint:
Use pandas.DataFrame to convert data. Then call preprocessor.fit_transform(df) and print the result.
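
Putting the four steps together, a complete run might look like the sketch below. It repeats the earlier definitions so it can be executed on its own; the 'num' and 'cat' labels are illustrative assumptions, as before.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Step 1: raw data.
data = {
    'age': [25, 32, 47],
    'salary': [50000, 60000, 80000],
    'department': ['sales', 'engineering', 'marketing'],
}

# Step 2: columns to scale.
numeric_features = ['age', 'salary']

# Step 3: scale numeric columns, one-hot encode the categorical column.
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(), ['department']),
    ]
)

# Step 4: ColumnTransformer selects columns by name, so it needs a DataFrame.
df = pd.DataFrame(data)
features = preprocessor.fit_transform(df)
print(features)
```

For this dataset the result has three rows and five columns: the two scaled numeric columns come first, followed by one indicator column per department.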