
How to Use Logistic Regression with sklearn in Python

Use LogisticRegression from sklearn.linear_model in three steps: create a model instance, train it on labeled data with fit(), and classify new samples with predict(). During fit() the model learns one weight per input feature; predict() then uses those weights to assign a class label to each sample.
📐

Syntax

The basic syntax to use logistic regression in sklearn involves importing the class, creating an instance, fitting the model to data, and making predictions.

  • LogisticRegression(): Creates the logistic regression model.
  • fit(X_train, y_train): Trains the model on features X_train and labels y_train.
  • predict(X_test): Predicts labels for new data X_test.
python
from sklearn.linear_model import LogisticRegression

# X_train, y_train, and X_test stand for your own feature and label arrays
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
💻

Example

This example trains a logistic regression model on the iris dataset, reduced to a binary problem (class 0 vs. the rest), and measures the accuracy of its predictions on a held-out test split.

python
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X = iris.data
y = (iris.target == 0).astype(int)  # Binary classification: class 0 vs others

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create and train model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Predict
predictions = model.predict(X_test)

# Evaluate
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
Output
Accuracy: 1.00
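
The example above only looks at hard class labels. As a minimal sketch, reusing the same binary iris setup, the snippet below puts predict() and predict_proba() side by side. The variable names labels, proba, and labels_from_proba are illustrative and not part of the original example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same binary setup as the example: iris class 0 vs. the rest
iris = load_iris()
X = iris.data
y = (iris.target == 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

labels = model.predict(X_test)       # hard class labels, shape (n_samples,)
proba = model.predict_proba(X_test)  # one column per class, shape (n_samples, 2)

# The columns of predict_proba follow the order of model.classes_
print(model.classes_)
print(proba[:3].round(3))

# Picking the most probable class reproduces the labels from predict()
labels_from_proba = model.classes_[np.argmax(proba, axis=1)]
print(np.array_equal(labels, labels_from_proba))
```

If a decision threshold other than 0.5 is needed, one option is to compare proba[:, 1] against that threshold directly instead of calling predict().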
⚠️

Common Pitfalls

Common mistakes when using logistic regression in sklearn include:

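  • Not scaling features that span very different ranges, which can slow solver convergence and makes the default L2 regularization penalize features unevenly.
  • Leaving max_iter at its default of 100 when the solver needs more iterations; sklearn then emits a ConvergenceWarning and the fit may stop short of the optimum.
  • Confusing predict(), which returns class labels, with predict_proba(), which returns per-class probabilities.
  • Choosing a solver that does not suit a multi-class problem: 'liblinear' handles multi-class only through one-vs-rest, while the default 'lbfgs' fits a multinomial model directly. In recent scikit-learn releases the multi_class option is deprecated.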
python
from sklearn.linear_model import LogisticRegression

# X_train and y_train stand for your own training arrays

# Risky: the default max_iter=100 can be too low for some data
model = LogisticRegression()
model.fit(X_train, y_train)  # May emit a ConvergenceWarning

# Better: raise max_iter (and consider scaling the features)
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
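
The scaling and max_iter pitfalls are closely related, so it can help to see them together. The sketch below is one possible way to compare an unscaled fit with a scaled pipeline. It assumes the breast cancer dataset bundled with sklearn, which is not used elsewhere in this article, because its features span very different ranges. The names raw_model, scaled_model, and raw_warned are illustrative.

```python
import warnings

from sklearn.datasets import load_breast_cancer
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a dataset with widely varying feature ranges, chosen for illustration
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Unscaled fit with the default max_iter; record any ConvergenceWarning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    raw_model = LogisticRegression()
    raw_model.fit(X_train, y_train)
raw_warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)

# Scaled fit: StandardScaler is fitted on the training data only,
# then applied automatically whenever the pipeline predicts
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_train, y_train)
scaled_iters = scaled_model.named_steps["logisticregression"].n_iter_[0]

print(f"Unscaled fit raised ConvergenceWarning: {raw_warned}")
print(f"Unscaled iterations used: {raw_model.n_iter_[0]}")
print(f"Scaled iterations used:   {scaled_iters}")
print(f"Scaled test accuracy:     {scaled_model.score(X_test, y_test):.2f}")
```

Wrapping the scaler and the model in a pipeline keeps the test data out of the scaling statistics, which is why this sketch avoids calling StandardScaler on the full dataset before splitting.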
📊

Quick Reference

Method/Parameter | Description
LogisticRegression() | Create a logistic regression model instance
fit(X, y) | Train the model on features X and labels y
predict(X) | Predict class labels for X
predict_proba(X) | Predict class probabilities for X
max_iter | Maximum number of solver iterations (default 100)
solver | Optimization algorithm (e.g., 'lbfgs', 'liblinear'; default 'lbfgs')
multi_class | 'auto', 'ovr', or 'multinomial' for multi-class handling (deprecated in recent scikit-learn releases)
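
For multi-class data, a minimal sketch looks like the following. It assumes a recent scikit-learn release, where the default 'lbfgs' solver handles more than two classes without any extra option, so multi_class is left unset. The higher max_iter value is a cautious choice for this sketch, not a recommendation from the original article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Keep all three iris classes instead of reducing to a binary problem
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Default solver ('lbfgs'); multi_class is intentionally not passed
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)

print(model.classes_)  # the class labels seen during fit
print(proba.shape)     # (n_test_samples, n_classes)
print(f"Accuracy: {model.score(X_test, y_test):.2f}")
```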

Key Takeaways

  • Import LogisticRegression from sklearn.linear_model to create the model.
  • Always fit the model with training data before predicting.
  • Increase max_iter if the model does not converge.
  • Use predict() for class labels and predict_proba() for probabilities.
  • Scale features if convergence is slow or feature ranges vary widely.