Feature union helps combine different sets of features into one. It makes it easy to use many types of information together for better predictions.
Feature union in ML Python
Introduction
Syntax
ML Python
from sklearn.pipeline import FeatureUnion

feature_union = FeatureUnion(transformer_list=[
    ('name1', transformer1),
    ('name2', transformer2),
    # ...
])
combined_features = feature_union.fit_transform(X)
transformer_list is a list of (name, transformer) pairs.
Each transformer should have fit and transform methods.
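To make the fit/transform requirement concrete, here is a minimal sketch of a custom transformer used inside a FeatureUnion. The class name RowSum and the sample array are hypothetical and exist only for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler


class RowSum(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: emits one column holding each row's sum."""

    def fit(self, X, y=None):
        # Nothing to learn here; fit only has to return self.
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # keepdims=True keeps the result 2D, with one row per input row.
        return X.sum(axis=1, keepdims=True)


X_demo = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # made-up data

union_demo = FeatureUnion(transformer_list=[
    ("scaled", StandardScaler()),
    ("row_sum", RowSum()),
])
features_demo = union_demo.fit_transform(X_demo)
print(features_demo.shape)  # 2 scaled columns + 1 row-sum column
```

Inheriting from TransformerMixin supplies fit_transform, so the class only needs to define fit and transform itself.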
Examples
ML Python
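from sklearn.pipeline import FeatureUnion
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# This only defines the union. Every transformer in a FeatureUnion receives
# the same input: PCA expects a numeric array and TfidfVectorizer expects raw
# text, so each branch needs its own column selection before fitting on mixed data.
union = FeatureUnion([
    ('pca', PCA(n_components=2)),
    ('tfidf', TfidfVectorizer())
])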
ML Python
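from sklearn.preprocessing import FunctionTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion

# Assumes the input is a pandas Series of strings.
union = FeatureUnion([
    # One column: the character length of each document
    ('length', FunctionTransformer(lambda x: x.apply(len).values.reshape(-1, 1))),
    # One column per vocabulary word: the word counts
    ('count', CountVectorizer())
])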
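As a rough illustration of how a length-plus-counts union behaves on real input, the sketch below fits one on a tiny pandas Series. The sample sentences and the helper name text_length are made up for this example.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import FunctionTransformer


def text_length(series):
    # One column: number of characters in each document.
    return series.apply(len).to_numpy().reshape(-1, 1)


docs = pd.Series(["red apple", "green apple", "red"])  # made-up sample data

text_union = FeatureUnion([
    ("length", FunctionTransformer(text_length)),
    ("count", CountVectorizer()),
])
text_features = text_union.fit_transform(docs)

# CountVectorizer returns a sparse matrix, so the combined result is sparse too.
print(text_features.shape)
print(text_features.toarray())
```

A named function is used instead of a lambda so that the fitted union can be pickled if needed.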
Sample Model
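This program loads the iris data, applies StandardScaler and PCA to the same input in parallel, and concatenates the two outputs side by side. The 4 scaled columns plus the 2 PCA components give 6 combined columns.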
ML Python
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

# Load sample data
iris = load_iris()
X = iris.data

# Define two feature transformers
scaler = StandardScaler()
pca = PCA(n_components=2)

# Combine features using FeatureUnion
union = FeatureUnion([
    ('scaled', scaler),
    ('pca', pca)
])

# Fit and transform data
X_combined = union.fit_transform(X)

print('Original shape:', X.shape)
print('Combined features shape:', X_combined.shape)
print('First 3 rows of combined features:')
print(X_combined[:3])
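One way to see what the union does is to run the same two transformers separately and join their outputs by hand. The sketch below assumes the same iris setup; the variable names are chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X_iris = load_iris().data

union_check = FeatureUnion([
    ("scaled", StandardScaler()),
    ("pca", PCA(n_components=2)),
])
X_union = union_check.fit_transform(X_iris)

# Run the same two transformers separately and join the columns by hand.
X_scaled = StandardScaler().fit_transform(X_iris)
X_pca = PCA(n_components=2).fit_transform(X_iris)
X_manual = np.hstack([X_scaled, X_pca])

print(X_union.shape)                   # 4 scaled columns + 2 PCA columns
print(np.allclose(X_union, X_manual))  # do both routes agree?
```

Note that PCA here is fitted on the raw data, not on the scaled data: the branches of a FeatureUnion run side by side, not one after another.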
Important Notes
FeatureUnion concatenates features horizontally (side by side).
Every transformer receives the same input and must output an array with the same number of rows as that input.
Use FeatureUnion inside pipelines to build complex workflows.
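The last note can be sketched as a Pipeline whose first step is a FeatureUnion and whose second step is a classifier. The choice of LogisticRegression, the train/test split, and the step names are assumptions made for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

X_all, y_all = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X_all, y_all, test_size=0.25, random_state=0, stratify=y_all
)

model = Pipeline([
    # Step 1: build the combined feature matrix.
    ("features", FeatureUnion([
        ("scaled", StandardScaler()),
        ("pca", PCA(n_components=2)),
    ])),
    # Step 2: train a classifier on the combined features.
    ("clf", LogisticRegression(max_iter=1000)),
])

model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print("Test accuracy:", accuracy)
```

Because the union lives inside the pipeline, its transformers are fitted on the training data only and then reused unchanged on the test data.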
Summary
FeatureUnion combines multiple feature sets into one.
It helps use different data types or extraction methods together.
It keeps feature extraction steps organized in a single object that can be fitted and reused as one pipeline step.
Practice
1. What is the main purpose of using FeatureUnion in machine learning? (easy)
Solution
Step 1: Understand FeatureUnion's role
FeatureUnion is used to combine different feature extraction methods so their outputs join into one feature set.
Step 2: Compare with other options
Splitting data, feature selection, and model averaging are different tasks not done by FeatureUnion.
Final Answer:
To combine multiple feature extraction methods into a single feature set -> Option A
Quick Check:
FeatureUnion = Combine features [OK]
Hint: FeatureUnion joins features, not data splits or models [OK]
Common Mistakes:
- Confusing FeatureUnion with data splitting
- Thinking it selects features instead of combining
- Mixing it up with model ensemble methods
2. Which of the following is the correct way to create a FeatureUnion with two transformers named 'tf1' and 'tf2'? (easy)
Solution
Step 1: Recall FeatureUnion syntax
FeatureUnion expects a list of tuples, each tuple with a name and a transformer.
Step 2: Check each option
FeatureUnion([('tf1', transformer1), ('tf2', transformer2)]) uses a list of tuples correctly. The other options use the wrong data structure or omit the list.
Final Answer:
FeatureUnion([('tf1', transformer1), ('tf2', transformer2)]) -> Option C
Quick Check:
FeatureUnion needs list of (name, transformer) tuples [OK]
Hint: Use list of (name, transformer) tuples for FeatureUnion [OK]
Common Mistakes:
- Passing a dictionary instead of list of tuples
- Passing transformers without names
- Passing transformers as separate arguments
3. Given the code below, what will be the shape of X_transformed? (medium)
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
union = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=1))
])
X_transformed = union.fit_transform(X)
Solution
Step 1: Analyze each transformer output
StandardScaler keeps the original shape (2 samples, 3 features), so its output shape is (2, 3). PCA with n_components=1 outputs (2, 1).
Step 2: Combine outputs with FeatureUnion
FeatureUnion concatenates outputs horizontally: (2, 3) + (2, 1) = (2, 4).
Final Answer:
(2, 4) -> Option D
Quick Check:
Concatenate (2,3) and (2,1) = (2,4) [OK]
Hint: FeatureUnion concatenates horizontally, sum feature counts [OK]
Common Mistakes:
- Assuming PCA output replaces original features
- Thinking FeatureUnion stacks vertically
- Ignoring output shapes of individual transformers
4. You wrote this code but get an error:
union = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=3))
])
X_transformed = union.fit_transform([[1, 2], [3, 4], [5, 6]])
What is the likely cause of the error? (medium)
Solution
Step 1: Check input data shape
The input X = [[1,2],[3,4],[5,6]] has shape (3, 2): 3 samples and 2 features.
Step 2: Analyze PCA configuration
PCA(n_components=3) requests 3 components, but PCA can return at most min(n_samples, n_features) = min(3, 2) = 2, so fitting raises a ValueError.
Final Answer:
PCA cannot have n_components greater than input features -> Option A
Quick Check:
PCA n_components ≤ min(samples, features) [OK]
Hint: Check PCA n_components ≤ min(number of samples, number of features) [OK]
Common Mistakes:
- Assuming StandardScaler needs 3D input
- Thinking FeatureUnion needs fit_predict
- Believing input must be DataFrame
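The failure in this question can be reproduced with a short sketch like the one below, which also shows one possible fix. The variable names are illustrative, and the exact wording of the error message depends on the installed scikit-learn version.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X_small = [[1, 2], [3, 4], [5, 6]]  # 3 samples, 2 features

bad_union = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=3)),  # more than min(3, 2) = 2
])

try:
    bad_union.fit_transform(X_small)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

# Lowering n_components to a valid value avoids the error.
good_union = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
])
good_shape = good_union.fit_transform(X_small).shape
print(good_shape)  # 2 scaled columns + 2 PCA columns
```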
5. You want to combine text and numeric features for a model. You have a TfidfVectorizer for text and StandardScaler for numeric data. How do you use FeatureUnion to prepare the data correctly? (hard)
Solution
Step 1: Understand data types and transformers
Text and numeric data need different preprocessing. TfidfVectorizer works on text, StandardScaler on numeric features.
Step 2: Use ColumnTransformer with FeatureUnion
FeatureUnion passes the same input to every transformer, so each transformer must first be restricted to its own columns, for example by wrapping it in a ColumnTransformer. FeatureUnion then concatenates the text and numeric outputs.
Final Answer:
Use FeatureUnion with transformers for text and numeric, each applied to their columns via ColumnTransformer -> Option B
Quick Check:
Separate preprocessing per data type, then combine [OK]
Hint: Preprocess each data type separately, then combine features [OK]
Common Mistakes:
- Applying wrong transformer to wrong data type
- Skipping column selection before FeatureUnion
- Trying to combine raw data without preprocessing
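The answer to this question could look roughly like the sketch below. The DataFrame and its column names "review" and "price" are hypothetical, and wrapping each branch in its own ColumnTransformer is only one possible way to select columns.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

# Made-up data; the column names "review" and "price" are hypothetical.
df = pd.DataFrame({
    "review": ["good value", "too expensive", "good quality"],
    "price": [10.0, 95.0, 40.0],
})

# Each branch selects its own column before transforming it.
# A plain string selects a 1D column (needed by TfidfVectorizer);
# a list selects a 2D block (needed by StandardScaler).
text_branch = ColumnTransformer([("tfidf", TfidfVectorizer(), "review")])
numeric_branch = ColumnTransformer([("scale", StandardScaler(), ["price"])])

mixed_union = FeatureUnion([
    ("text", text_branch),
    ("numeric", numeric_branch),
])
mixed_features = mixed_union.fit_transform(df)
print(mixed_features.shape)  # 5 TF-IDF columns + 1 scaled price column
```

A single ColumnTransformer holding both entries would produce the same columns without FeatureUnion; the union is kept here because the question asks for it.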
