Recall & Review
beginner
What is a Feature Union in machine learning?
Feature Union is a technique that combines multiple feature extraction processes into one. It joins different sets of features side by side to create a bigger feature set for a model.
beginner
Why use Feature Union instead of just one feature extractor?
Using Feature Union lets you combine different types of features that each capture different information about the same data. A richer, more varied feature set can help the model learn better.
intermediate
How does Feature Union work internally?
Feature Union runs each feature extractor separately on the input data, then stacks their outputs horizontally (side by side) to form one combined feature matrix.
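The horizontal stacking described above can be checked by hand. The following is a minimal sketch, assuming made-up data and the illustrative step names "scale" and "pca"; it compares the FeatureUnion output with the result of running the same transformers separately and joining them with np.hstack.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

# Illustrative data (an assumption): 4 samples, 3 features.
X = np.array([[1.0, 2.0, 0.5],
              [4.0, 1.0, 3.0],
              [7.0, 8.0, 2.0],
              [2.0, 6.0, 9.0]])

# Each transformer is fit on the same X, independently of the other.
union = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
])
combined = union.fit_transform(X)

# Run the same transformers separately and stack the results by hand.
scaled = StandardScaler().fit_transform(X)
reduced = PCA(n_components=2).fit_transform(X)
manual = np.hstack([scaled, reduced])

print(combined.shape)                 # (4, 5): 3 scaled columns + 2 PCA columns
print(np.allclose(combined, manual))  # True
```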
intermediate
Give a simple example of using Feature Union in Python with scikit-learn.
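FeatureUnion takes a list of (name, transformer) pairs, for example FeatureUnion([('text', CountVectorizer()), ('num', StandardScaler())]). Every transformer receives the same input, so each branch must first select the data it can handle (the text column for CountVectorizer, the numeric columns for StandardScaler), for instance by wrapping it in a Pipeline with a column-selection step.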
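Because FeatureUnion passes the same input to every transformer, a text vectorizer and a numeric scaler cannot both consume a mixed table directly. One possible way to make the example runnable is sketched below, assuming a pandas DataFrame with hypothetical columns "review" and "length" and two small helper functions (select_text, select_numeric) that are not part of scikit-learn.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Hypothetical data: one text column and one numeric column.
df = pd.DataFrame({
    "review": ["good movie", "bad movie", "good plot"],
    "length": [10.0, 20.0, 30.0],
})


def select_text(frame):
    # CountVectorizer expects a 1-D sequence of strings.
    return frame["review"]


def select_numeric(frame):
    # StandardScaler expects 2-D input, hence the double brackets.
    return frame[["length"]]


union = FeatureUnion([
    ("text", Pipeline([
        ("select", FunctionTransformer(select_text)),
        ("counts", CountVectorizer()),
    ])),
    ("num", Pipeline([
        ("select", FunctionTransformer(select_numeric)),
        ("scale", StandardScaler()),
    ])),
])

# The result is a sparse matrix because CountVectorizer returns one.
features = union.fit_transform(df)
print(features.shape)  # (3, 5): 4 vocabulary words + 1 scaled numeric column
```

Each branch is a small Pipeline, so the selection step and the transformer stay together and are fitted as one unit.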
intermediate
What is the difference between Feature Union and Pipeline in scikit-learn?
Feature Union combines features from parallel transformers side by side. Pipeline applies transformers sequentially, one after another, transforming data step by step.
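The contrast shows up in the output shapes. This sketch assumes random illustrative data and uses the same two transformers in both arrangements; the helper make_steps is hypothetical and only exists to create fresh transformer instances for each object.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data (an assumption): 6 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))


def make_steps():
    # Fresh, unfitted transformers so the two objects share no state.
    return [("scale", StandardScaler()), ("pca", PCA(n_components=2))]


# Pipeline: PCA receives the scaled data, so only PCA's 2 columns remain.
sequential = Pipeline(make_steps()).fit_transform(X)

# FeatureUnion: both transformers receive the raw X,
# and their outputs are joined: 4 scaled columns + 2 PCA columns.
parallel = FeatureUnion(make_steps()).fit_transform(X)

print(sequential.shape)  # (6, 2)
print(parallel.shape)    # (6, 6)
```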
What does Feature Union do with the outputs of multiple feature extractors?
A. Stacks them side by side to create a combined feature set
B. Adds them together element-wise
C. Selects the best output and discards others
D. Runs them one after another in sequence
Feature Union stacks outputs horizontally, combining features side by side.
Which scikit-learn class is used to perform Feature Union?
A. FeatureUnion
B. Pipeline
C. GridSearchCV
D. StandardScaler
FeatureUnion is the class designed to combine multiple feature extractors.
Feature Union is most useful when:
A. You want to apply transformations sequentially
B. You want to reduce the number of features
C. You want to combine different types of features from the same data
D. You want to train multiple models separately
Feature Union helps combine different feature sets extracted in parallel.
What is the main difference between Feature Union and Pipeline?
A. Feature Union is for model training; Pipeline is for data cleaning
B. Feature Union applies steps sequentially; Pipeline combines features side by side
C. Feature Union is only for numeric data; Pipeline is only for text data
D. Feature Union combines features side by side; Pipeline applies steps one after another
Feature Union merges features horizontally; Pipeline chains transformations sequentially, each step feeding the next.
If you want to combine text features and numeric features for a model, which technique is best?
A. Grid Search
B. Feature Union
C. Cross Validation
D. Principal Component Analysis
Feature Union allows combining different feature types into one feature set.
Explain in your own words what Feature Union is and why it is useful in machine learning.
Think about how combining different views of data can help a model.
Describe the difference between Feature Union and Pipeline in scikit-learn.
Consider how data flows through each.
Practice
1. What is the main purpose of using FeatureUnion in machine learning?
easy
A. To combine multiple feature extraction methods into a single feature set
B. To split data into training and testing sets
C. To reduce the number of features by selecting the best ones
D. To train multiple models and average their predictions
Solution
Step 1: Understand FeatureUnion's role
FeatureUnion is used to combine different feature extraction methods so their outputs join into one feature set.
Step 2: Compare with other options
Splitting data, feature selection, and model averaging are different tasks not done by FeatureUnion.
Final Answer:
To combine multiple feature extraction methods into a single feature set -> Option A
Quick Check:
FeatureUnion = Combine features [OK]
Hint: FeatureUnion joins features; it does not split data or combine models [OK]
Common Mistakes:
Confusing FeatureUnion with data splitting
Thinking it selects features instead of combining
Mixing it up with model ensemble methods
2. Which of the following is the correct way to create a FeatureUnion with two transformers named 'tf1' and 'tf2'?
easy
A. FeatureUnion(tf1=transformer1, tf2=transformer2)
B. FeatureUnion({'tf1': transformer1, 'tf2': transformer2})
C. FeatureUnion([('tf1', transformer1), ('tf2', transformer2)])
D. FeatureUnion(transformer1, transformer2)
Solution
Step 1: Recall FeatureUnion syntax
FeatureUnion expects a list of tuples, each tuple with a name and a transformer.
Step 2: Check each option
FeatureUnion([('tf1', transformer1), ('tf2', transformer2)]) uses a list of tuples correctly. Options A, B, and D pass keyword arguments, a dictionary, or bare positional transformers instead of a list of (name, transformer) tuples.
Final Answer:
FeatureUnion([('tf1', transformer1), ('tf2', transformer2)]) -> Option C
Quick Check:
FeatureUnion needs list of (name, transformer) tuples [OK]
Hint: Use list of (name, transformer) tuples for FeatureUnion [OK]
Common Mistakes:
Passing a dictionary instead of list of tuples
Passing transformers without names
Passing transformers as separate arguments
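As a short illustration of the accepted syntax, the sketch below builds a union from a list of (name, transformer) tuples and, for comparison, uses scikit-learn's make_union helper, which takes bare transformers and generates the names itself. The concrete transformers chosen here are assumptions standing in for transformer1 and transformer2.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion, make_union
from sklearn.preprocessing import StandardScaler

# Explicit form: a list of (name, transformer) tuples.
explicit = FeatureUnion([
    ("tf1", StandardScaler()),
    ("tf2", PCA(n_components=1)),
])

# Helper form: names are derived from the lowercased class names.
auto_named = make_union(StandardScaler(), PCA(n_components=1))

explicit_names = [name for name, _ in explicit.transformer_list]
auto_names = [name for name, _ in auto_named.transformer_list]

print(explicit_names)  # ['tf1', 'tf2']
print(auto_names)      # ['standardscaler', 'pca']
```

Explicit names are usually the better choice when the steps will be referenced later, for example in a parameter grid.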
3. Given the code below, what will be the shape of X_transformed?
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])
union = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=1))
])
X_transformed = union.fit_transform(X)
medium
A. (2, 1)
B. (2, 3)
C. (2, 2)
D. (2, 4)
Solution
Step 1: Analyze each transformer output
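StandardScaler keeps the original shape (2 samples, 3 features), so its output shape is (2, 3). PCA with n_components=1 outputs (2, 1).
Step 2: Combine the outputs
FeatureUnion stacks the two outputs side by side, giving 3 + 1 = 4 columns and a final shape of (2, 4).
Final Answer:
(2, 4) -> Option D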
4. A FeatureUnion containing StandardScaler and PCA(n_components=3) is fit on X = [[1, 2], [3, 4], [5, 6]] and raises an error. What is the most likely cause?
A. PCA cannot have n_components greater than input features
B. StandardScaler requires 3D input, but input is 2D
C. FeatureUnion requires transformers to have fit_predict method
D. Input data must be a pandas DataFrame, not a list
Solution
Step 1: Check input data shape
The input X = [[1,2],[3,4],[5,6]] has shape (3, 2), meaning 2 features.
Step 2: Analyze PCA configuration
PCA(n_components=3) requests 3 components, but n_components cannot exceed min(n_samples, n_features) = min(3, 2) = 2, so fitting raises a ValueError.
Final Answer:
PCA cannot have n_components greater than input features -> Option A
Quick Check:
PCA n_components ≤ features [OK]
Hint: Check PCA n_components ≤ number of features [OK]
Common Mistakes:
Assuming StandardScaler needs 3D input
Thinking FeatureUnion needs fit_predict
Believing input must be DataFrame
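The failure can be reproduced with a short script. The original question's code is not shown, so this is a reconstruction from the solution text: the data and PCA(n_components=3) come from the solution, while the step names and the presence of StandardScaler are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X = np.array([[1, 2], [3, 4], [5, 6]])  # shape (3, 2): 3 samples, 2 features

# Assumed reconstruction of the failing union.
broken = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=3)),  # more components than features
])

try:
    broken.fit_transform(X)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)  # message text varies between scikit-learn versions

# Keeping n_components within min(n_samples, n_features) = 2 avoids the error.
fixed = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
]).fit_transform(X)

print(fixed.shape)  # (3, 4): 2 scaled columns + 2 PCA columns
```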
5. You want to combine text and numeric features for a model. You have a TfidfVectorizer for text and StandardScaler for numeric data. How do you use FeatureUnion to prepare the data correctly?
hard
A. Apply TfidfVectorizer and StandardScaler separately, then add their outputs manually
B. Use FeatureUnion with transformers for text and numeric, each applied to their columns via ColumnTransformer
C. Use FeatureUnion directly on raw data without preprocessing
D. Use StandardScaler on text data and TfidfVectorizer on numeric data
Solution
Step 1: Understand data types and transformers
Text and numeric data need different preprocessing. TfidfVectorizer works on text, StandardScaler on numeric features.
Step 2: Use ColumnTransformer with FeatureUnion
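Use ColumnTransformer to route the text column to TfidfVectorizer and the numeric columns to StandardScaler. It concatenates the outputs side by side, as FeatureUnion does. A plain FeatureUnion passes every column to every transformer, so each branch would need its own column-selection step.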
Final Answer:
Use FeatureUnion with transformers for text and numeric, each applied to their columns via ColumnTransformer -> Option B
Quick Check:
Separate preprocessing per data type, then combine [OK]
Hint: Preprocess each data type separately, then combine features [OK]
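A minimal sketch of the per-column approach follows, assuming a pandas DataFrame with hypothetical columns "comment", "price", and "rating". It uses ColumnTransformer on its own, which both selects the columns and joins the transformed outputs.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

# Hypothetical data: one text column and two numeric columns.
df = pd.DataFrame({
    "comment": ["fast shipping", "slow shipping", "fast refund"],
    "price": [10.0, 20.0, 30.0],
    "rating": [4.0, 2.0, 5.0],
})

preprocess = ColumnTransformer(
    [
        # A plain string selects a 1-D column, which TfidfVectorizer expects.
        ("text", TfidfVectorizer(), "comment"),
        # A list selects a 2-D block, which StandardScaler expects.
        ("num", StandardScaler(), ["price", "rating"]),
    ],
    sparse_threshold=0,  # always return a dense array for easy inspection
)

features = preprocess.fit_transform(df)
print(features.shape)  # (3, 6): 4 TF-IDF columns + 2 scaled numeric columns
```

The combined matrix can then be passed to a model, or the ColumnTransformer can be placed as the first step of a Pipeline so that preprocessing is fitted on training data only.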