Practice

1. What is the main purpose of using FeatureUnion in machine learning?

easy

A. To combine multiple feature extraction methods into a single feature set

B. To split data into training and testing sets

C. To reduce the number of features by selecting the best ones

D. To train multiple models and average their predictions

Solution

Step 1: Understand FeatureUnion's role
FeatureUnion is used to combine different feature extraction methods so their outputs join into one feature set.
Step 2: Compare with other options
Splitting data, selecting features, and averaging model predictions are separate tasks; FeatureUnion does none of them.
Final Answer:
To combine multiple feature extraction methods into a single feature set -> Option A
Quick Check:
FeatureUnion = Combine features [OK]

Hint: FeatureUnion joins features, not data splits or models [OK]

Common Mistakes:

Confusing FeatureUnion with data splitting
Thinking it selects features instead of combining
Mixing it up with model ensemble methods
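
To make the "combine, not select" point concrete, here is a minimal sketch. The iris dataset and the choice of PCA and SelectKBest as the two branches are assumptions made for illustration; they are not part of the question. Note that SelectKBest is just one branch here: the union itself only places the branch outputs side by side.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import FeatureUnion

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features

# Two independent feature extractors, each fitted on the same input.
union = FeatureUnion([
    ("pca", PCA(n_components=2)),   # contributes 2 projected features
    ("best", SelectKBest(k=1)),     # contributes 1 original feature
])

# y is forwarded to each branch's fit; SelectKBest uses it for scoring.
X_combined = union.fit_transform(X, y)
print(X_combined.shape)  # (150, 3)
```

The output has 2 + 1 = 3 columns: nothing was split, averaged, or dropped by the union itself.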

2. Which of the following is the correct way to create a FeatureUnion with two transformers named 'tf1' and 'tf2'?

easy

A. FeatureUnion(tf1=transformer1, tf2=transformer2)

B. FeatureUnion({'tf1': transformer1, 'tf2': transformer2})

C. FeatureUnion([('tf1', transformer1), ('tf2', transformer2)])

D. FeatureUnion(transformer1, transformer2)

Solution

Step 1: Recall FeatureUnion syntax
FeatureUnion expects a list of tuples, each tuple with a name and a transformer.
Step 2: Check each option
Option C passes a list of (name, transformer) tuples, which is what FeatureUnion expects. Option A uses keyword arguments, Option B a dictionary, and Option D bare positional transformers without names.
Final Answer:
FeatureUnion([('tf1', transformer1), ('tf2', transformer2)]) -> Option C
Quick Check:
FeatureUnion needs list of (name, transformer) tuples [OK]

Hint: Use list of (name, transformer) tuples for FeatureUnion [OK]

Common Mistakes:

Passing a dictionary instead of list of tuples
Passing transformers without names
Passing transformers as separate arguments
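
A short sketch of the constructor form described above. StandardScaler and PCA stand in for the quiz's transformer1 and transformer2, which is an assumption; any transformers would do. The make_union helper is shown as a convenience for cases where you do not care about the names.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion, make_union
from sklearn.preprocessing import StandardScaler

# Stand-ins for the quiz's transformer1 and transformer2 (assumed choices).
transformer1 = StandardScaler()
transformer2 = PCA(n_components=1)

# Explicit names: a list of (name, transformer) tuples.
union = FeatureUnion([("tf1", transformer1), ("tf2", transformer2)])
names = [name for name, _ in union.transformer_list]
print(names)  # ['tf1', 'tf2']

# make_union builds the same structure and derives names from the class names.
auto_union = make_union(StandardScaler(), PCA(n_components=1))
auto_names = [name for name, _ in auto_union.transformer_list]
print(auto_names)  # ['standardscaler', 'pca']
```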

3. Given the code below, what will be the shape of X_transformed?

from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

union = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=1))
])

X_transformed = union.fit_transform(X)

medium

A. (2, 1)

B. (2, 3)

C. (2, 2)

D. (2, 4)

Solution

Step 1: Analyze each transformer output
StandardScaler keeps original shape (2 samples, 3 features) so output shape is (2,3). PCA with n_components=1 outputs (2,1).
Step 2: Combine outputs with FeatureUnion
FeatureUnion concatenates outputs horizontally: (2,3) + (2,1) = (2,4).
Final Answer:
(2, 4) -> Option D
Quick Check:
Concatenate (2,3) and (2,1) = (2,4) [OK]

Hint: FeatureUnion concatenates horizontally, sum feature counts [OK]

Common Mistakes:

Assuming PCA output replaces original features
Thinking FeatureUnion stacks vertically
Ignoring output shapes of individual transformers
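
One way to check the reasoning is to run each transformer on its own and compare shapes with the union's output. This sketch reuses the data and transformers from the question; the variable names scaled, reduced, and stacked are introduced here for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X = np.array([[1, 2, 3], [4, 5, 6]])

# Run each branch separately to see what it contributes.
scaled = StandardScaler().fit_transform(X)      # same width as the input
reduced = PCA(n_components=1).fit_transform(X)  # one component per sample
stacked = np.hstack([scaled, reduced])          # manual side-by-side join

union = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=1)),
])
X_transformed = union.fit_transform(X)

print(scaled.shape, reduced.shape, X_transformed.shape)  # (2, 3) (2, 1) (2, 4)
```

The row count stays at 2 throughout; only the column counts add up.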

4. You wrote this code but get an error:

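from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

union = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=3))
])

X_transformed = union.fit_transform([[1, 2], [3, 4], [5, 6]])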

What is the likely cause of the error?

medium

A. PCA cannot have n_components greater than input features

B. StandardScaler requires 3D input, but input is 2D

C. FeatureUnion requires transformers to have fit_predict method

D. Input data must be a pandas DataFrame, not a list

Solution

Step 1: Check input data shape
The input X = [[1,2],[3,4],[5,6]] has shape (3, 2), meaning 2 features.
Step 2: Analyze PCA configuration
PCA(n_components=3) requests 3 components, but n_components cannot exceed min(n_samples, n_features) = min(3, 2) = 2, so fitting raises a ValueError.
Final Answer:
PCA cannot have n_components greater than input features -> Option A
Quick Check:
PCA n_components ≤ features [OK]

Hint: Check PCA n_components ≤ number of features [OK]

Common Mistakes:

Assuming StandardScaler needs 3D input
Thinking FeatureUnion needs fit_predict
Believing input must be DataFrame
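
The failure and one possible fix can be reproduced as follows. This is a sketch: build_union is a helper invented for this example, and capping n_components at min(n_samples, n_features) is one reasonable fix, not the only one.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X = [[1, 2], [3, 4], [5, 6]]  # 3 samples, 2 features


def build_union(n_components):
    """Helper for this sketch only: the quiz's union with a configurable PCA size."""
    return FeatureUnion([
        ("scale", StandardScaler()),
        ("pca", PCA(n_components=n_components)),
    ])


# Asking for 3 components from 2 features fails when PCA is fitted.
try:
    build_union(3).fit_transform(X)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Cap the request at min(n_samples, n_features), which is 2 here.
max_components = min(len(X), len(X[0]))
X_fixed = build_union(max_components).fit_transform(X)
print(X_fixed.shape)  # (3, 4)
```

With the cap in place, the union returns 2 scaled columns plus 2 principal components.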

5. You want to combine text and numeric features for a model. You have a TfidfVectorizer for text and StandardScaler for numeric data. How do you use FeatureUnion to prepare the data correctly?

hard

A. Apply TfidfVectorizer and StandardScaler separately, then add their outputs manually

B. Use FeatureUnion with transformers for text and numeric, each applied to their columns via ColumnTransformer

C. Use FeatureUnion directly on raw data without preprocessing

D. Use StandardScaler on text data and TfidfVectorizer on numeric data

Solution

Step 1: Understand data types and transformers
Text and numeric data need different preprocessing. TfidfVectorizer works on text, StandardScaler on numeric features.
Step 2: Use ColumnTransformer with FeatureUnion
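Route each transformer to its own columns, then merge the outputs side by side. ColumnTransformer does the routing and already concatenates the results the way FeatureUnion does; with FeatureUnion alone, each branch needs a column-selection step before its transformer.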
Final Answer:
Use FeatureUnion with transformers for text and numeric, each applied to their columns via ColumnTransformer -> Option B
Quick Check:
Separate preprocessing per data type, then combine [OK]

Hint: Preprocess each data type separately, then combine features [OK]

Common Mistakes:

Applying wrong transformer to wrong data type
Skipping column selection before FeatureUnion
Trying to combine raw data without preprocessing
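
The two routes to per-column preprocessing can be sketched like this. The DataFrame, its column names (review, price, rating), and the helper functions select_text and select_numeric are all made up for illustration. Both variants should produce the same number of columns: one per TF-IDF term plus one per numeric feature.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Toy data; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "review": ["good product", "bad product", "good value"],
    "price": [10.0, 25.0, 40.0],
    "rating": [4.0, 2.0, 5.0],
})

# Variant 1: ColumnTransformer routes columns and concatenates the results.
preprocess = ColumnTransformer([
    ("text", TfidfVectorizer(), "review"),           # string selector: 1-D column of documents
    ("num", StandardScaler(), ["price", "rating"]),  # list selector: 2-D numeric block
])
X_ct = preprocess.fit_transform(df)


# Variant 2: FeatureUnion, with a column-selection step at the start of each branch.
def select_text(frame):
    return frame["review"]


def select_numeric(frame):
    return frame[["price", "rating"]]


union = FeatureUnion([
    ("text", Pipeline([
        ("select", FunctionTransformer(select_text)),
        ("tfidf", TfidfVectorizer()),
    ])),
    ("num", Pipeline([
        ("select", FunctionTransformer(select_numeric)),
        ("scale", StandardScaler()),
    ])),
])
X_fu = union.fit_transform(df)

# 4 distinct terms (bad, good, product, value) + 2 numeric columns = 6.
print(X_ct.shape, X_fu.shape)  # (3, 6) (3, 6)
```

ColumnTransformer is usually the shorter option for tabular data, while the FeatureUnion form is useful when a branch needs several steps of its own.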
