
Principal Component Analysis (PCA) in ML Python

Introduction
PCA helps us simplify complex data by turning many features into fewer ones while keeping the most important information.
When you have many features and want to reduce them to understand data better.
Before training a model to speed up learning and reduce noise.
To visualize high-dimensional data in 2D or 3D plots.
When you want to remove redundant or correlated features.
To compress data while keeping most of its meaning.
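The visualization use case above can be sketched like this, assuming scikit-learn and matplotlib are installed (the output filename is just an example):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs anywhere
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
X_2d = PCA(n_components=2).fit_transform(iris.data)  # 4 features -> 2

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=iris.target)
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.savefig("iris_pca.png")
```

Each point is one flower, colored by species, plotted using only the 2 new features PCA produced.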
Syntax
ML Python
from sklearn.decomposition import PCA

pca = PCA(n_components=k)
X_reduced = pca.fit_transform(X)
n_components=k means you keep the k main components after reduction.
fit_transform learns the main directions (the principal components) from your data and projects the data onto them.
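If you split your data, call fit_transform on the training set only and reuse the learned directions on new data with transform. A minimal sketch with random data (the array names are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 10))  # 100 samples, 10 features
X_test = rng.normal(size=(20, 10))

pca = PCA(n_components=3)
X_train_red = pca.fit_transform(X_train)  # learn directions from training data
X_test_red = pca.transform(X_test)        # apply the same directions to new data

print(X_train_red.shape, X_test_red.shape)  # (100, 3) (20, 3)
```

This keeps the test data from influencing the directions PCA chooses.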
Examples
Keep the 2 main components of the original data.
ML Python
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
Keep enough features to explain 95% of the data variance.
ML Python
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
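With a fractional n_components, scikit-learn keeps the smallest number of components whose explained variance sums to at least that fraction. You can inspect the choice afterwards; shown here on the Iris data:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print(pca.n_components_)  # number of components actually kept
```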
Keep all components and see how much of the variance each one explains.
ML Python
pca = PCA()
X_reduced = pca.fit_transform(X)
print(pca.explained_variance_ratio_)
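The per-component ratios are often easier to read as a running total; a short sketch using numpy's cumsum:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA().fit(X)

cumulative = np.cumsum(pca.explained_variance_ratio_)
for i, total in enumerate(cumulative, start=1):
    print(f"{i} components explain {total:.1%} of the variance")
```

The running total helps you pick the smallest number of components that keeps enough information.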
Sample Program
This code loads the Iris dataset, applies PCA to reduce from 4 features to 2, and prints the results.
ML Python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Load sample data
iris = load_iris()
X = iris.data

# Create PCA to keep 2 components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Show shape before and after
print(f'Original shape: {X.shape}')
print(f'Reduced shape: {X_reduced.shape}')

# Show explained variance ratio
print('Explained variance ratio:', pca.explained_variance_ratio_)

# Show first 5 transformed samples
print('First 5 samples after PCA:')
print(X_reduced[:5])
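To get a rough sense of how much information the 2 components keep, you can map them back to the original 4 features with inverse_transform and measure the error (a sketch, not part of the original program):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

X_back = pca.inverse_transform(X_reduced)  # approximate reconstruction
error = np.mean((X - X_back) ** 2)
print(f"Mean squared reconstruction error: {error:.4f}")
```

A small error means the dropped components carried little information.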
Important Notes
PCA is sensitive to feature scale; standardize your features (for example with StandardScaler) before applying it.
The components are new features that are linear combinations of the original ones.
Explained variance ratio shows how much of the data's variance each component captures.
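The scaling note above can be sketched with a Pipeline that standardizes features before PCA (a common pattern, not the only way to scale):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = load_iris().data
pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
X_reduced = pipeline.fit_transform(X)

print(X_reduced.shape)  # (150, 2)
```

The pipeline applies the same scaling and projection together, so you cannot forget the scaling step when transforming new data.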
Summary
PCA reduces many features into fewer main features while keeping important info.
Use PCA to simplify data, speed up models, or visualize high-dimensional data.
The explained variance ratio tells how much each new feature matters.