A decision tree classifier makes predictions by splitting data with a series of simple yes/no questions. It is easy to understand and use for sorting examples into groups.
Decision Tree Classifier in Python
Introduction
Use a decision tree classifier in situations like these:
When you want to classify emails as spam or not spam based on their content.
When you need to decide if a loan application should be approved or rejected.
When sorting fruits into types based on color, size, and shape.
When predicting if a patient has a disease based on symptoms.
When you want a simple model that you can explain to others easily.
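As a quick illustration of the fruit-sorting use case above, here is a minimal sketch. The feature values and labels are made up for demonstration; each fruit is described by a color score, a size in centimeters, and a roundness score.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy fruit data: [color_score, size_cm, roundness] -- made-up values
X = [
    [0.9, 7.0, 0.95],   # apple
    [0.8, 7.5, 0.90],   # apple
    [0.3, 12.0, 0.40],  # banana
    [0.2, 13.0, 0.35],  # banana
]
y = ["apple", "apple", "banana", "banana"]

model = DecisionTreeClassifier(random_state=42)
model.fit(X, y)

# Classify a new fruit measurement
print(model.predict([[0.85, 7.2, 0.92]]))
```

The tree learns a single split that separates the two fruits, and the new measurement falls on the apple side of that split.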
Syntax
Python
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(criterion='gini', max_depth=None, random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
criterion decides how to measure the quality of a split (common options: 'gini' or 'entropy').
max_depth limits how deep the tree can grow to avoid overfitting.
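To see these two parameters in action, here is a minimal sketch on a tiny made-up dataset, comparing a depth-limited entropy tree with an unrestricted gini tree:

```python
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up dataset: two features, two classes
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]]
y = [0, 0, 0, 1, 1, 1]

# Entropy criterion with a depth cap of 1
shallow = DecisionTreeClassifier(criterion='entropy', max_depth=1, random_state=42)
shallow.fit(X, y)

# Default gini criterion, no depth limit
deep = DecisionTreeClassifier(criterion='gini', random_state=42)
deep.fit(X, y)

print(shallow.get_depth())  # capped at 1
print(deep.get_depth())     # grows as deep as needed to separate the classes
```

The capped tree stops after one split even though the data is not yet perfectly separated, while the unrestricted tree keeps splitting until every leaf is pure.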
Examples
This creates a tree using information gain (entropy) and limits the tree depth to 3 levels.
Python
model = DecisionTreeClassifier(criterion='entropy', max_depth=3)
This creates a tree with default settings and fits it to training data.
Python
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

This predicts the class labels for new data using the trained tree.
Python
predictions = model.predict(X_test)
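Besides hard class labels, a trained tree can also return class probabilities with predict_proba. Here is a self-contained sketch on the iris data (the max_depth value is an arbitrary choice for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X, y)

# Probability estimates: one row per sample, one column per class
proba = model.predict_proba(X[:2])
print(proba.shape)  # (2, 3) -- iris has 3 classes
```

Each row sums to 1, and the column with the highest probability is the class that predict would return.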
Sample Program
This program trains a decision tree on the iris flower data to classify flower types. It then tests the model and prints the accuracy and predictions.
Python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load example data
iris = load_iris()
X, y = iris.data, iris.target

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create and train the decision tree classifier
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Predict on test data
predictions = model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
print(f"Predictions: {predictions}")
Important Notes
Decision trees can overfit if they grow too deep; controlling max_depth helps prevent this.
They can handle both numerical and categorical features, though scikit-learn's implementation requires categorical features to be encoded as numbers first.
Decision trees are easy to visualize and explain to others.
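For example, the learned rules can be printed as plain text with scikit-learn's export_text. This sketch reuses the iris data from the sample program (the max_depth value is an arbitrary choice to keep the printout short):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
model = DecisionTreeClassifier(max_depth=2, random_state=42)
model.fit(iris.data, iris.target)

# Print the tree's decision rules as indented text
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```

The printout shows each split as a threshold test on a feature, with the predicted class at every leaf, which makes it easy to walk a non-technical audience through the model's reasoning.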
Summary
Decision tree classifiers split data by asking simple questions to classify it.
They are easy to use and understand, making them great for beginners.
Control tree depth to balance accuracy and avoid overfitting.