Feature union in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What does FeatureUnion do in a machine learning pipeline?

Imagine you want to combine different sets of features extracted from the same data before training a model. What is the main purpose of using FeatureUnion in this context?

A. It combines multiple feature extraction processes by concatenating their outputs into a single feature set.
B. It selects the best single feature extractor among many to use for the model.
C. It reduces the dimensionality of features by applying PCA on the combined features.
D. It splits the dataset into multiple parts to train separate models independently.
💡 Hint

Think about how you can use different ways to get features and then join them together before training.

Predict Output
intermediate
Output shape after FeatureUnion transformation

Given the following code, what is the shape of X_transformed?

from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

fu = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=1))
])

X_transformed = fu.fit_transform(X)
A. (3, 4)
B. (3, 2)
C. (3, 3)
D. (3, 1)
💡 Hint

Consider the output dimensions of each transformer and how FeatureUnion combines them.

Model Choice
advanced
Choosing transformers for FeatureUnion

You want to build a FeatureUnion that combines text features and numeric features for a classification task. Which combination of transformers is most appropriate?

A. Use PCA for text features and OneHotEncoder for numeric features.
B. Use CountVectorizer for numeric features and MinMaxScaler for text features.
C. Use TfidfVectorizer for text and StandardScaler for numeric features.
D. Use LabelEncoder for text and Normalizer for numeric features.
💡 Hint

Think about which transformers are designed for text and which for numeric data.

Hyperparameter
advanced
Effect of n_jobs parameter in FeatureUnion

What is the effect of setting n_jobs=-1 in a FeatureUnion?

A. It disables parallelism and runs transformers sequentially.
B. It runs all transformers in parallel using all available CPU cores.
C. It limits the number of features to one per transformer.
D. It automatically tunes hyperparameters of each transformer.
💡 Hint

Consider what n_jobs=-1 usually means in scikit-learn.

🔧 Debug
expert
Why does this FeatureUnion pipeline raise a ValueError?

Consider this code snippet:

from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6]])

fu = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=3))
])

X_transformed = fu.fit_transform(X)

Why does this code raise a ValueError?

A. The input array X must be 1-dimensional for FeatureUnion.
B. StandardScaler cannot process arrays with fewer than 3 rows.
C. FeatureUnion requires all transformers to output the same number of features.
D. PCA is asked to extract more components (3) than the number of features (2) in X.
💡 Hint

Check the shape of X and the n_components parameter of PCA.

Practice

1. What is the main purpose of using FeatureUnion in machine learning?
easy
A. To combine multiple feature extraction methods into a single feature set
B. To split data into training and testing sets
C. To reduce the number of features by selecting the best ones
D. To train multiple models and average their predictions

Solution

  1. Step 1: Understand FeatureUnion's role

    FeatureUnion is used to combine different feature extraction methods so their outputs join into one feature set.
  2. Step 2: Compare with other options

    Splitting data, feature selection, and model averaging are different tasks not done by FeatureUnion.
  3. Final Answer:

    To combine multiple feature extraction methods into a single feature set -> Option A
  4. Quick Check:

    FeatureUnion = Combine features [OK]
Hint: FeatureUnion joins features, not data splits or models [OK]
Common Mistakes:
  • Confusing FeatureUnion with data splitting
  • Thinking it selects features instead of combining
  • Mixing it up with model ensemble methods
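To make "combining" concrete, here is a minimal sketch. The toy data and the choice of StandardScaler and PCA are assumptions made for illustration, not part of the exercise. It checks that the FeatureUnion output matches fitting each transformer separately on the same data and joining the results column-wise.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

# Toy data: 4 samples, 3 features (values chosen only for illustration).
X = np.array([[1.0, 2.0, 0.5],
              [4.0, 1.0, 3.0],
              [7.0, 8.0, 2.0],
              [2.0, 6.0, 9.0]])

# Each transformer sees the same X; the outputs are joined column-wise.
union = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
])
combined = union.fit_transform(X)

# Fit the same transformers separately and join their outputs by hand.
scaled = StandardScaler().fit_transform(X)
reduced = PCA(n_components=2).fit_transform(X)
manual = np.hstack([scaled, reduced])

print(combined.shape)                 # 3 scaled columns + 2 PCA columns
print(np.allclose(combined, manual))  # same values as the manual join
```

Note that no rows are split off and no features are dropped: every transformer contributes all of its output columns.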
2. Which of the following is the correct way to create a FeatureUnion with two transformers named 'tf1' and 'tf2'?
easy
A. FeatureUnion(tf1=transformer1, tf2=transformer2)
B. FeatureUnion({'tf1': transformer1, 'tf2': transformer2})
C. FeatureUnion([('tf1', transformer1), ('tf2', transformer2)])
D. FeatureUnion(transformer1, transformer2)

Solution

  1. Step 1: Recall FeatureUnion syntax

    FeatureUnion expects a list of tuples, each tuple with a name and a transformer.
  2. Step 2: Check each option

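    FeatureUnion([('tf1', transformer1), ('tf2', transformer2)]) uses a list of tuples correctly. Options A, B, and D use keyword arguments, a dictionary, and bare positional arguments respectively, none of which is the expected list of (name, transformer) tuples.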
  3. Final Answer:

    FeatureUnion([('tf1', transformer1), ('tf2', transformer2)]) -> Option C
  4. Quick Check:

    FeatureUnion needs list of (name, transformer) tuples [OK]
Hint: Use list of (name, transformer) tuples for FeatureUnion [OK]
Common Mistakes:
  • Passing a dictionary instead of list of tuples
  • Passing transformers without names
  • Passing transformers as separate arguments
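The constructor form can be sketched as follows. transformer1 and transformer2 are the placeholder names from the question; StandardScaler and PCA are stand-ins assumed here only so the snippet runs. The sketch also shows make_union, a scikit-learn helper that generates the names for you.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion, make_union
from sklearn.preprocessing import StandardScaler

# Placeholder transformers (assumed stand-ins for transformer1/transformer2).
transformer1 = StandardScaler()
transformer2 = PCA(n_components=1)

# Explicit names: a list of (name, transformer) tuples.
union = FeatureUnion([("tf1", transformer1), ("tf2", transformer2)])
names = [name for name, _ in union.transformer_list]
print(names)

# make_union builds the same kind of object and derives the names
# from the lowercased class names of the transformers.
auto_union = make_union(StandardScaler(), PCA(n_components=1))
auto_names = [name for name, _ in auto_union.transformer_list]
print(auto_names)

# Keyword arguments (the form in option A) are rejected by the constructor.
try:
    FeatureUnion(tf1=transformer1, tf2=transformer2)
    keyword_form_rejected = False
except TypeError:
    keyword_form_rejected = True
print(keyword_form_rejected)
```

Explicit names are worth the extra typing when you later need to address a step, for example in a parameter grid.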
3. Given the code below, what will be the shape of X_transformed?
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

union = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=1))
])

X_transformed = union.fit_transform(X)
medium
A. (2, 1)
B. (2, 3)
C. (2, 2)
D. (2, 4)

Solution

  1. Step 1: Analyze each transformer output

    StandardScaler keeps original shape (2 samples, 3 features) so output shape is (2,3). PCA with n_components=1 outputs (2,1).
  2. Step 2: Combine outputs with FeatureUnion

    FeatureUnion concatenates outputs horizontally: (2,3) + (2,1) = (2,4).
  3. Final Answer:

    (2, 4) -> Option D
  4. Quick Check:

    Concatenate (2,3) and (2,1) = (2,4) [OK]
Hint: FeatureUnion concatenates horizontally, sum feature counts [OK]
Common Mistakes:
  • Assuming PCA output replaces original features
  • Thinking FeatureUnion stacks vertically
  • Ignoring output shapes of individual transformers
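The shape arithmetic in the solution can be checked directly. This sketch reuses the data and transformers from the question and additionally fits each transformer on its own (an extra step added here for illustration) so the individual shapes are visible.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X = np.array([[1, 2, 3], [4, 5, 6]])  # 2 samples, 3 features

# Output shape of each transformer on its own.
scaled_shape = StandardScaler().fit_transform(X).shape
pca_shape = PCA(n_components=1).fit_transform(X).shape

union = FeatureUnion([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=1)),
])
X_transformed = union.fit_transform(X)

print(scaled_shape)         # (2, 3)
print(pca_shape)            # (2, 1)
print(X_transformed.shape)  # (2, 4): rows unchanged, columns summed
```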
4. You wrote this code but get an error:
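from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

union = FeatureUnion([
    ('scale', StandardScaler()),
    ('pca', PCA(n_components=3))
])

X_transformed = union.fit_transform([[1, 2], [3, 4], [5, 6]])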
What is the likely cause of the error?
medium
A. PCA cannot have n_components greater than input features
B. StandardScaler requires 3D input, but input is 2D
C. FeatureUnion requires transformers to have fit_predict method
D. Input data must be a pandas DataFrame, not a list

Solution

  1. Step 1: Check input data shape

    The input X = [[1,2],[3,4],[5,6]] has shape (3, 2), meaning 2 features.
  2. Step 2: Analyze PCA configuration

    PCA(n_components=3) requests 3 components, but n_components cannot exceed min(n_samples, n_features) = min(3, 2) = 2, so fitting raises a ValueError.
  3. Final Answer:

    PCA cannot have n_components greater than input features -> Option A
  4. Quick Check:

    PCA n_components ≤ features [OK]
Hint: Check PCA n_components ≤ number of features [OK]
Common Mistakes:
  • Assuming StandardScaler needs 3D input
  • Thinking FeatureUnion needs fit_predict
  • Believing input must be DataFrame
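One way to reproduce the failure and then repair it is sketched below. The helper build_union and the choice to cap n_components at min(X.shape) are assumptions made for this illustration; the exercise itself only asks for the cause.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X = np.array([[1, 2], [3, 4], [5, 6]])  # shape (3, 2): only 2 features


def build_union(n_components):
    """Hypothetical helper: the union from the question with a chosen PCA size."""
    return FeatureUnion([
        ("scale", StandardScaler()),
        ("pca", PCA(n_components=n_components)),
    ])


# Asking for 3 components from 2 features fails inside PCA's fit step.
try:
    build_union(3).fit_transform(X)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Capping n_components at min(n_samples, n_features) avoids the error.
max_components = min(X.shape)
fixed = build_union(max_components).fit_transform(X)
print(fixed.shape)  # 2 scaled columns + 2 PCA columns
```

The error comes from PCA, not from FeatureUnion: the union only passes X along and concatenates whatever each transformer returns.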
5. You want to combine text and numeric features for a model. You have a TfidfVectorizer for text and StandardScaler for numeric data. How do you use FeatureUnion to prepare the data correctly?
hard
A. Apply TfidfVectorizer and StandardScaler separately, then add their outputs manually
B. Use FeatureUnion with transformers for text and numeric, each applied to their columns via ColumnTransformer
C. Use FeatureUnion directly on raw data without preprocessing
D. Use StandardScaler on text data and TfidfVectorizer on numeric data

Solution

  1. Step 1: Understand data types and transformers

    Text and numeric data need different preprocessing. TfidfVectorizer works on text, StandardScaler on numeric features.
  2. Step 2: Use ColumnTransformer with FeatureUnion

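    Route each transformer to its own columns with ColumnTransformer, then merge the branches with FeatureUnion. Note that ColumnTransformer also concatenates the outputs of its transformers by itself, so a single ColumnTransformer is often enough.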
  3. Final Answer:

    Use FeatureUnion with transformers for text and numeric, each applied to their columns via ColumnTransformer -> Option B
  4. Quick Check:

    Separate preprocessing per data type, then combine [OK]
Hint: Preprocess each data type separately, then combine features [OK]
Common Mistakes:
  • Applying wrong transformer to wrong data type
  • Skipping column selection before FeatureUnion
  • Trying to combine raw data without preprocessing
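A minimal sketch of this setup follows. The DataFrame and its column names (review, length, rating) are hypothetical, and sparse_threshold=0 is set only to force dense output so the arrays are easy to compare. It builds the union-of-ColumnTransformers form described in the answer, then a single ColumnTransformer, and checks that both give the same features.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

# Hypothetical data: one text column and two numeric columns.
df = pd.DataFrame({
    "review": ["good movie", "bad movie", "great plot"],
    "length": [120.0, 95.0, 150.0],
    "rating": [4.0, 2.0, 5.0],
})

# Each branch selects its own columns; a scalar column name hands the
# vectorizer a 1-D series of strings, a list hands the scaler a 2-D block.
text_branch = ColumnTransformer(
    [("tfidf", TfidfVectorizer(), "review")], sparse_threshold=0
)
numeric_branch = ColumnTransformer(
    [("scale", StandardScaler(), ["length", "rating"])]
)
union = FeatureUnion([("text", text_branch), ("numeric", numeric_branch)])
union_features = union.fit_transform(df)

# A single ColumnTransformer does the selection and the concatenation.
single = ColumnTransformer(
    [
        ("tfidf", TfidfVectorizer(), "review"),
        ("scale", StandardScaler(), ["length", "rating"]),
    ],
    sparse_threshold=0,
)
single_features = single.fit_transform(df)

n_terms = len(single.named_transformers_["tfidf"].vocabulary_)
print(n_terms)               # distinct words found in the review column
print(union_features.shape)  # (rows, n_terms + 2 numeric columns)
print(np.allclose(union_features, single_features))
```

Applying the union directly to the raw DataFrame without column selection would send the text column to StandardScaler and the numbers to TfidfVectorizer, which is the mistake the question warns about.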