What is Supervised Learning in Python with sklearn
scikit-learn to make predictions. The model is trained on input-output pairs and then predicts outputs for new inputs.How It Works
Supervised learning is like teaching a child with flashcards. You show the model examples with the correct answers, so it learns the pattern between inputs and outputs. For example, if you show pictures of fruits labeled as 'apple' or 'banana', the model learns to recognize features that define each fruit.
In Python, libraries like scikit-learn help you create these models easily. You give the model a dataset with inputs (features) and the correct answers (labels). The model studies this data to find rules or patterns. Later, when you give it new inputs, it uses what it learned to predict the correct output.
Example
This example shows how to train a simple supervised learning model using scikit-learn to classify iris flowers based on their features.
from sklearn.datasets import load_iris from sklearn.model_selection import train_test_split from sklearn.tree import DecisionTreeClassifier from sklearn.metrics import accuracy_score # Load iris dataset iris = load_iris() X = iris.data # features y = iris.target # labels # Split data into training and testing sets X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42) # Create a decision tree classifier model = DecisionTreeClassifier(random_state=42) # Train the model model.fit(X_train, y_train) # Predict on test data predictions = model.predict(X_test) # Calculate accuracy accuracy = accuracy_score(y_test, predictions) print(f"Accuracy: {accuracy:.2f}")
When to Use
Use supervised learning when you have clear examples with known answers and want to predict outcomes for new data. It works well for tasks like spam detection in emails, recognizing handwritten digits, or predicting house prices based on features.
It is best when you have labeled data and want the model to learn the relationship between inputs and outputs to make accurate predictions.
Key Points
- Supervised learning uses labeled data to train models.
- It predicts outputs for new inputs based on learned patterns.
scikit-learnis a popular Python library for supervised learning.- Common algorithms include decision trees, support vector machines, and linear regression.
- It is useful for classification and regression problems.