
Feature union in ML Python

Introduction

Feature union combines the outputs of several feature transformers into one feature matrix. Every transformer is applied to the same input, and the resulting columns are joined side by side so a model can use all of them together.

Use it in situations like these:
When your data mixes types such as text and numbers and you want features from both. Each transformer must then pick out the columns it needs, because every transformer receives the full input.
When you want to try different ways of extracting features and combine the results.
When you want to build a pipeline that runs several feature extraction steps in parallel.
When a model may benefit from several complementary views of the same data.
When you want to keep your code clean by combining features in one step.
Syntax
ML Python
from sklearn.pipeline import FeatureUnion

feature_union = FeatureUnion(transformer_list=[
    ('name1', transformer1),
    ('name2', transformer2),
    # ...
])

combined_features = feature_union.fit_transform(X)

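transformer_list is a list of (name, transformer) pairs. Names must be unique; they identify each transformer when you set parameters, using the name__parameter form.

Each transformer must implement fit and transform. Calling fit_transform fits every transformer on X and returns their outputs concatenated column-wise.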
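To make the "fit and transform" requirement concrete, here is a minimal sketch of a hand-written transformer used inside a FeatureUnion. The RowSum class and the toy array are hypothetical and exist only for illustration; they are not part of scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler


class RowSum(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: emits one column holding each row's sum."""

    def fit(self, X, y=None):
        # Nothing to learn; return self so calls can be chained.
        return self

    def transform(self, X):
        # Keep the output 2-D with one row per input row.
        return np.asarray(X).sum(axis=1, keepdims=True)


X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

union = FeatureUnion(transformer_list=[
    ("scaled", StandardScaler()),
    ("row_sum", RowSum()),
])

combined = union.fit_transform(X)
print(combined.shape)  # 2 scaled columns + 1 row-sum column
```

The row-sum column ends up last because the output columns follow the order of transformer_list.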

Examples
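This example combines PCA on numeric columns with TF-IDF on a text column. FeatureUnion passes the same input to every transformer, so each branch first selects its own columns. The input is assumed to be a pandas DataFrame, and the column names below are placeholders.
ML Python
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Column selectors; 'num1', 'num2', 'num3' and 'text' are placeholder names.
select_numeric = FunctionTransformer(lambda df: df[['num1', 'num2', 'num3']])
select_text = FunctionTransformer(lambda df: df['text'])

union = FeatureUnion([
    ('pca', make_pipeline(select_numeric, PCA(n_components=2))),
    ('tfidf', make_pipeline(select_text, TfidfVectorizer()))
])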
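FeatureUnion hands the same input to every transformer, so numeric and text columns have to be routed to the right transformer somehow. One option, which this tutorial does not otherwise use, is scikit-learn's ColumnTransformer. The sketch below is an illustration under assumptions: the DataFrame and its column names are made up.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy data; the column names are invented for this sketch.
df = pd.DataFrame({
    "height": [1.0, 2.0, 3.0, 4.0],
    "width": [2.0, 1.0, 4.0, 3.0],
    "depth": [0.5, 0.1, 0.9, 0.3],
    "review": ["good product", "bad product", "good value", "bad value"],
})

mixed = ColumnTransformer([
    # PCA sees only the numeric columns (a list selects a 2-D block).
    ("pca", PCA(n_components=2), ["height", "width", "depth"]),
    # A text vectorizer expects a 1-D column, so the name is a plain string.
    ("tfidf", TfidfVectorizer(), "review"),
])

features = mixed.fit_transform(df)
print(features.shape)  # 2 PCA columns + one column per vocabulary word
```

Like FeatureUnion, ColumnTransformer joins the outputs side by side; the difference is that each transformer receives only the columns named for it.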
This example combines a custom text-length feature with a count vectorizer. The input is assumed to be a pandas Series of strings.
ML Python
from sklearn.preprocessing import FunctionTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion

union = FeatureUnion([
    # reshape(-1, 1) turns the lengths into a single 2-D column
    ('length', FunctionTransformer(lambda x: x.apply(len).to_numpy().reshape(-1, 1))),
    ('count', CountVectorizer())
])
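As a quick check of how the length-plus-counts idea behaves, the following sketch runs a similar union on three short strings. The sample texts and the text_length helper are assumptions added for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import FunctionTransformer


def text_length(series):
    # One column holding the character count of each document.
    return series.str.len().to_numpy().reshape(-1, 1)


texts = pd.Series(["red apple", "green apple pie", "pie"])

union = FeatureUnion([
    ("length", FunctionTransformer(text_length)),
    ("count", CountVectorizer()),
])

X = union.fit_transform(texts)  # sparse, because CountVectorizer is sparse
print(X.shape)                  # 1 length column + 4 vocabulary columns
print(X.toarray()[:, 0])        # the length column comes first
```

Using a named function instead of a lambda also keeps the union picklable, which matters if you save the fitted object.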
Sample Model

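This program loads the iris data and applies two transformers to it in parallel: StandardScaler and PCA. Both receive the original data (PCA is not applied to the scaled values), and their outputs are placed side by side, giving 4 scaled columns plus 2 principal components.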

ML Python
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

# Load sample data
iris = load_iris()
X = iris.data

# Define two feature transformers
scaler = StandardScaler()
pca = PCA(n_components=2)

# Combine features using FeatureUnion
union = FeatureUnion([
    ('scaled', scaler),
    ('pca', pca)
])

# Fit and transform data
X_combined = union.fit_transform(X)

print('Original shape:', X.shape)
print('Combined features shape:', X_combined.shape)
print('First 3 rows of combined features:')
print(X_combined[:3])
Important Notes

FeatureUnion concatenates features horizontally (side by side).

All transformers must output arrays with the same number of rows as the input. If any transformer returns a sparse matrix, the combined result is sparse.
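The side-by-side layout can be checked by hand. This sketch, written under the assumption that both transformers are deterministic on this data, runs each transformer separately on the iris data, stacks the results with NumPy, and compares them with the FeatureUnion output.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X = load_iris().data

union = FeatureUnion([
    ("scaled", StandardScaler()),
    ("pca", PCA(n_components=2)),
])
X_combined = union.fit_transform(X)

# Run each transformer on its own and stack the results by hand.
X_scaled = StandardScaler().fit_transform(X)
X_pca = PCA(n_components=2).fit_transform(X)
manual = np.hstack([X_scaled, X_pca])

print(X_combined.shape)  # 4 scaled columns followed by 2 PCA columns
print(np.allclose(X_combined[:, :4], X_scaled))  # scaled block matches
print(manual.shape == X_combined.shape)
```

The column order always follows the order of the transformer list, so the first four columns are the scaled features.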

Use FeatureUnion inside pipelines to build complex workflows.
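One possible way to place a FeatureUnion inside a pipeline is sketched below. The choice of SelectKBest, LogisticRegression, and the train/test split settings are assumptions made for this illustration, not a recommended configuration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Two views of the scaled data: 2 principal components + the best raw feature.
features = FeatureUnion([
    ("pca", PCA(n_components=2)),
    ("best", SelectKBest(k=1)),
])

model = Pipeline([
    ("scale", StandardScaler()),
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])

model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print("Test accuracy:", score)
```

Because the union is a pipeline step, it is fitted only on the training data, and the same fitted transformers are reused when scoring the test data.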

Summary

FeatureUnion combines multiple feature sets into one.

It lets you use several extraction methods together; every transformer receives the same input.

It keeps feature extraction in one named, reusable step that fits inside a pipeline.