What Is Classification in Machine Learning? A Python Example with sklearn

In machine learning, classification is the task of predicting which category or class an input belongs to. Using Python's sklearn library, you can train a model on labeled data and then use it to assign new data points to classes.

⚙️

How It Works

Classification is like sorting mail into different boxes based on the address. The machine learning model learns from examples where the correct box (class) is already known. It studies patterns in the input features, such as colors, shapes, or numeric measurements, to decide which box new mail should go into.

In Python, libraries like sklearn provide tools to create these models easily. You pass a set of labeled examples (the training data) to the model's fit method, and it learns rules that separate the classes. Later, you pass new data to predict, and the model applies those rules to assign a class.
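To make the idea of "learning rules from labeled examples" concrete, here is a minimal sketch that trains a decision tree on a tiny, made-up fruit dataset and prints the rules it found. The feature names, values, and labels are hypothetical and chosen only for illustration; a decision tree is used here because its rules are easy to read, not because the article prescribes it.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: each row is [weight in grams, smoothness score 0-10]
X_train = [[150, 8], [170, 9], [140, 7], [130, 3], [120, 2], [110, 4]]
# The known class (the "box") for each training example
y_train = ["apple", "apple", "apple", "orange", "orange", "orange"]

# Learn rules that separate the two classes
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Print the learned rules as readable text
print(export_text(model, feature_names=["weight_g", "smoothness"]))

# Classify two new, unseen items
predictions = model.predict([[160, 8], [115, 3]])
print(predictions)
```

The printed tree shows the threshold the model picked to split the classes, which is the "rule" it later applies to new data.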

💻

Example

This example shows how to use sklearn to classify iris flowers into species based on their measurements.

python

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load iris dataset
iris = load_iris()
X = iris.data  # features
y = iris.target  # labels

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create and train the model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Check accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")

Output

Accuracy: 1.00
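
Once trained, the same kind of model can classify a single new flower. The sketch below repeats the training steps so it runs on its own, then predicts the species for one measurement. The values in new_flower are an assumption picked for illustration; predict_proba is used to show how confident the model is in each class.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Train the same model as in the example above
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Hypothetical new flower: sepal length, sepal width, petal length, petal width (cm)
new_flower = [[5.1, 3.5, 1.4, 0.2]]

# predict returns a class index; target_names maps it to a species name
class_index = model.predict(new_flower)[0]
species = iris.target_names[class_index]
print(f"Predicted species: {species}")

# predict_proba returns one probability per class, summing to 1
probabilities = model.predict_proba(new_flower)[0]
for name, p in zip(iris.target_names, probabilities):
    print(f"  {name}: {p:.3f}")
```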

🎯

When to Use

Use classification when you want to assign items to categories. For example:

Detecting whether an email is spam
Recognizing handwritten digits
Classifying types of flowers or animals
Medical diagnosis based on symptoms

Classification helps automate decisions where the output is a discrete label or category rather than a continuous number.
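
As an illustration of one of the use cases listed above, the following sketch recognizes handwritten digits using the small digits dataset bundled with sklearn. The choice of a support vector classifier with default settings and the 70/30 split are assumptions made for this example, not a recommendation.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load 8x8 grayscale digit images, flattened to 64 features each
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=42
)

# Train a support vector classifier with default settings
model = SVC()
model.fit(X_train, y_train)

# Measure accuracy on images the model has not seen
digit_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Digit accuracy: {digit_accuracy:.2f}")
```

Here the classes are the ten digits 0 to 9, so this is a multi-class problem, unlike spam detection, which has only two classes.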

✅

Key Points

Classification predicts categories or classes for data points.
sklearn offers easy-to-use tools for building classification models in Python.
Models learn from labeled examples to make predictions on new data.
Common algorithms include logistic regression, decision trees, and support vector machines.
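
The algorithms named above share the same fit and predict interface in sklearn, so swapping one for another is a small change. A rough sketch, assuming the iris dataset and 5-fold cross-validation as the comparison method (both choices are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Three common classifiers behind the same estimator interface
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=42),
    "support vector machine": SVC(),
}

# Score each model with 5-fold cross-validation and report the mean accuracy
mean_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(f"{name}: {mean_scores[name]:.3f}")
```

Cross-validation averages accuracy over several train/test splits, which gives a steadier comparison than a single split on a dataset this small.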

✅

Key Takeaways

Classification assigns input data to predefined categories based on learned patterns.

Python's sklearn library simplifies building and using classification models.

Training requires labeled data to teach the model how to classify.

Classification is useful for tasks like spam detection, image recognition, and medical diagnosis.