
Anomaly detection basics in ML Python

Introduction
Anomaly detection finds data points that differ from the usual pattern in a dataset. It is useful for spotting problems or rare events early, for example:
Detecting fraud in credit card transactions
Finding unusual activity in network security
Spotting defects in manufacturing products
Monitoring health data for abnormal signs
Identifying errors in sensor readings
Syntax
ML Python
from sklearn.ensemble import IsolationForest
# data and new_data are placeholders for 2D arrays of shape (n_samples, n_features)
model = IsolationForest(contamination=0.1)
model.fit(data)
predictions = model.predict(new_data)
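The 'contamination' parameter tells the model what fraction of the training data to treat as anomalies. It accepts a float in the range (0, 0.5] or 'auto', which is the default.
The predict method returns -1 for anomalies and 1 for normal points.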
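Because the labels are just 1 and -1, a boolean mask is a handy way to pull out the flagged rows. The sketch below is only an illustration: the arrays data and new_data are synthetic stand-ins, and the point [8, 8] is a made-up outlier.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical data: 200 normal points near the origin, plus new points to check
rng = np.random.RandomState(0)
data = rng.randn(200, 2)
new_data = np.array([[0.1, -0.2], [0.3, 0.0], [8.0, 8.0]])

model = IsolationForest(contamination=0.1, random_state=0)
model.fit(data)
predictions = model.predict(new_data)  # array of 1 and -1

# Boolean mask: True where the model flagged an anomaly
is_anomaly = predictions == -1
anomalies = new_data[is_anomaly]
print("Flagged rows:", anomalies)
```

The same mask can be used to count anomalies with is_anomaly.sum() or to filter a matching array of IDs.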
Examples
This example trains an Isolation Forest that assumes about 5% of the training data are anomalies, then labels each point in the test data.
ML Python
from sklearn.ensemble import IsolationForest
model = IsolationForest(contamination=0.05)
model.fit(X_train)
preds = model.predict(X_test)
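One way to see what 'contamination' does is to check how many training points the fitted model itself flags. The sketch below uses synthetic data (this X_train is made up for illustration). The flagged share of the training set should land close to the value passed in, because scikit-learn picks the decision threshold from the training scores.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_train = rng.randn(1000, 2)  # synthetic stand-in for real training data

model = IsolationForest(contamination=0.05, random_state=0)
model.fit(X_train)

# Share of the training points that the fitted model labels as anomalies
train_preds = model.predict(X_train)
flagged_share = np.mean(train_preds == -1)
print(f"Flagged share of training data: {flagged_share:.3f}")
```

The share flagged in separate test data can differ, since the threshold is fixed at training time.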
This example uses a One-Class SVM. The nu parameter is an upper bound on the fraction of training points treated as outliers, so nu=0.1 allows up to about 10%.
ML Python
from sklearn.svm import OneClassSVM
model = OneClassSVM(nu=0.1, kernel='rbf')
model.fit(X_train)
preds = model.predict(X_test)
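A runnable version of the One-Class SVM snippet might look like the following. The data is synthetic and the two far-away test points are chosen purely for illustration; gamma='scale' is written out explicitly here, although it is already the scikit-learn default.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_train = rng.randn(200, 2)  # synthetic "normal" data
X_test = np.vstack([
    rng.randn(5, 2),                 # points drawn like the training data
    [[10.0, 10.0], [-12.0, 9.0]],    # made-up far-away points
])

# nu bounds the share of training points allowed to fall outside the boundary
model = OneClassSVM(nu=0.1, kernel='rbf', gamma='scale')
model.fit(X_train)
preds = model.predict(X_test)

for point, pred in zip(X_test, preds):
    label = 'Anomaly' if pred == -1 else 'Normal'
    print(point, label)
```

One-Class SVM is sensitive to feature scale, so real data with mixed units usually benefits from scaling before fitting.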
Sample Program
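This code trains an Isolation Forest on normal data, then tests it on 20 new normal points plus two clear anomalies at [10, 10] and [15, 15]. It prints whether each point is labeled normal or an anomaly. Because contamination=0.1 sets the threshold so that about 10% of the training data counts as anomalous, a few normal-looking test points may also be flagged.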
ML Python
import numpy as np
from sklearn.ensemble import IsolationForest

# Create sample data: mostly normal points around 0, some anomalies far away
np.random.seed(42)
X_train = np.random.randn(100, 2)
X_test = np.vstack([np.random.randn(20, 2), np.array([[10, 10], [15, 15]])])

# Train Isolation Forest
model = IsolationForest(contamination=0.1, random_state=42)
model.fit(X_train)

# Predict anomalies
predictions = model.predict(X_test)

# Print results
for i, pred in enumerate(predictions):
    label = 'Anomaly' if pred == -1 else 'Normal'
    print(f'Point {i}: {X_test[i]} is {label}')
Important Notes
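Many anomaly detection models need an estimate of the expected share of anomalies: 'contamination' for Isolation Forest, nu for One-Class SVM.
Isolation Forest needs no labeled anomalies and has few settings to tune, which makes it a common first choice.
In predictions, anomalies are marked with -1 and normal points with 1.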
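If guessing the anomaly share up front feels arbitrary, one option is to look at raw scores and rank points instead of relying only on the -1 and 1 labels. The sketch below uses scikit-learn's score_samples method, where lower scores mean more abnormal. The data is synthetic, and showing the three lowest scores is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
X_train = rng.randn(100, 2)
X_test = np.vstack([rng.randn(20, 2), [[10.0, 10.0], [15.0, 15.0]]])

# contamination is left at its default ('auto'); it does not affect score_samples
model = IsolationForest(random_state=42)
model.fit(X_train)

# One score per row; lower means more abnormal
scores = model.score_samples(X_test)

# Indices of the three most abnormal test points, most abnormal first
most_abnormal = np.argsort(scores)[:3]
for i in most_abnormal:
    print(f"Point {i}: {X_test[i]} score={scores[i]:.3f}")
```

Ranking by score lets you review the most suspicious points first and choose a cutoff afterwards.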
Summary
Anomaly detection finds unusual data points that differ from normal patterns.
Isolation Forest is a simple and effective method to detect anomalies.
You train the model on normal data and then predict whether new points are normal or anomalies.