ML Python programming · ~5 mins

Decision tree classifier in ML Python

Introduction

A decision tree classifier sorts data into groups by asking a series of simple questions about its features. It is easy to understand and easy to explain to others. Typical situations where it is a good fit:

When you want to classify emails as spam or not spam based on their content.
When you need to decide if a loan application should be approved or rejected.
When sorting fruits into types based on color, size, and shape.
When predicting if a patient has a disease based on symptoms.
When you want a simple model that you can explain to others easily.
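The spam case above can be sketched in a few lines. This is a minimal example with made-up data: two hypothetical features per email (number of links, and whether the subject contains the word "free"), not a real spam dataset.

```python
from sklearn.tree import DecisionTreeClassifier

# Made-up data: classify emails as spam (1) or not spam (0)
# Features: [number of links, subject contains "free" (1 = yes, 0 = no)]
X = [[0, 0], [1, 0], [5, 1], [7, 1], [0, 1], [6, 0]]
y = [0, 0, 1, 1, 0, 1]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Classify a new email with 4 links and "free" in the subject
print(model.predict([[4, 1]]))
```

The tree learns a rule from the link count alone here, since that feature separates the two classes perfectly in this toy data.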
Syntax
ML Python
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(criterion='gini', max_depth=None, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

criterion sets how the quality of a split is measured: 'gini' (Gini impurity, the default) or 'entropy' (information gain).

max_depth limits how deep the tree can grow; the default None lets it grow until every leaf is pure, which can overfit.

random_state makes results reproducible, since ties between equally good splits are otherwise broken at random.
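A quick way to see max_depth in action is to compare a capped tree with an unrestricted one. This sketch uses the iris dataset as a stand-in for X_train and y_train:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# An unrestricted tree vs. one capped at 2 levels
deep = DecisionTreeClassifier(random_state=42).fit(X, y)
shallow = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)

print(deep.get_depth())     # depth the unrestricted tree actually reached
print(shallow.get_depth())  # never exceeds max_depth
```

get_depth() reports how deep the fitted tree actually grew, which is useful when tuning max_depth.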

Examples
This creates a tree using information gain (entropy) and limits the tree depth to 3 levels.
ML Python
model = DecisionTreeClassifier(criterion='entropy', max_depth=3)
This creates a tree with default settings and fits it to training data.
ML Python
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)
This predicts the class labels for new data using the trained tree.
ML Python
predictions = model.predict(X_test)
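Besides hard labels, a trained tree can also report class probabilities through predict_proba (the fraction of training samples of each class in the leaf a sample lands in). A self-contained sketch on the iris data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# One row per test sample, one column per class; each row sums to 1
probabilities = model.predict_proba(X_test)
print(probabilities[0])
```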
Sample Program

This program trains a decision tree on the iris flower data to classify flower types. It then tests the model and prints the accuracy and predictions.

ML Python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load example data
iris = load_iris()
X, y = iris.data, iris.target

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create and train the decision tree classifier
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)

print(f"Accuracy: {accuracy:.2f}")
print(f"Predictions: {predictions}")
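The rules a fitted tree has learned can be printed as readable text with scikit-learn's export_text helper. A sketch on the same iris data, capped at depth 2 to keep the printout short:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
model = DecisionTreeClassifier(max_depth=2, random_state=42)
model.fit(iris.data, iris.target)

# Print the learned if/else rules with the real feature names
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```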
Important Notes

Decision trees can overfit if they grow too deep; controlling max_depth helps prevent this.
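To see the overfitting risk concretely, compare training and test accuracy of an unrestricted tree. This is a sketch on the iris data; the exact test score depends on the split:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

deep = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)

# An unrestricted tree typically memorizes the training set
print(deep.score(X_train, y_train))  # near-perfect on training data
print(deep.score(X_test, y_test))    # usually lower on unseen data
```

A large gap between the two scores is the classic sign of overfitting; lowering max_depth narrows it.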

Decision trees can handle both numerical and categorical features, but scikit-learn's implementation expects numeric input, so categorical features must be encoded as numbers first.
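A minimal sketch of encoding a categorical column before training, using made-up fruit data and OrdinalEncoder (one-hot encoding is often the safer general choice, since ordinal codes impose an artificial order on the categories):

```python
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

# Made-up fruit data with a single categorical "color" column
X_raw = [["red"], ["green"], ["yellow"], ["red"]]
y = [0, 1, 2, 0]  # made-up fruit type labels

encoder = OrdinalEncoder()
X = encoder.fit_transform(X_raw)  # strings -> numeric codes

model = DecisionTreeClassifier(random_state=0).fit(X, y)

# New samples must go through the same encoder before prediction
print(model.predict(encoder.transform([["green"]])))
```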

Decision trees are easy to visualize and explain to others.

Summary

Decision tree classifiers split data by asking simple questions to classify it.

They are easy to use and understand, making them great for beginners.

Control tree depth to balance accuracy and avoid overfitting.