What is Classification in Machine Learning in Python with sklearn
classification is the task of predicting which category or class an input belongs to. Using Python's sklearn library, you can train models that learn from labeled data to classify new data points into classes.How It Works
Classification is like sorting mail into different boxes based on the address. The machine learning model learns from examples where the correct box (class) is already known. It studies patterns in the data, such as colors, shapes, or numbers, to decide which box new mail should go into.
In Python, libraries like sklearn provide tools to create these models easily. You give the model a set of labeled examples (training data), and it finds rules to separate the classes. Later, when you give it new data, it uses those rules to predict the class.
Example
This example shows how to use sklearn to classify iris flowers into species based on their measurements.
from sklearn.datasets import load_iris from sklearn.model_selection import train_test_split from sklearn.linear_model import LogisticRegression from sklearn.metrics import accuracy_score # Load iris dataset iris = load_iris() X = iris.data # features y = iris.target # labels # Split data into training and testing sets X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) # Create and train the model model = LogisticRegression(max_iter=200) model.fit(X_train, y_train) # Predict on test data predictions = model.predict(X_test) # Check accuracy accuracy = accuracy_score(y_test, predictions) print(f"Accuracy: {accuracy:.2f}")
When to Use
Use classification when you want to assign items into categories. For example:
- Detecting if an email is spam or not
- Recognizing handwritten digits
- Classifying types of flowers or animals
- Medical diagnosis based on symptoms
Classification helps automate decisions where the output is a label or category.
Key Points
- Classification predicts categories or classes for data points.
sklearnin Python offers easy tools to build classification models.- Models learn from labeled examples to make predictions on new data.
- Common algorithms include logistic regression, decision trees, and support vector machines.