ML Python, ~20 mins

Model interpretability (SHAP, LIME) in ML Python - ML Experiment: Train & Evaluate

Experiment - Model interpretability (SHAP, LIME)
Problem: You have trained a classification model on the Iris dataset. The model achieves good accuracy, but you want to understand which features influence its predictions the most.
Current Metrics: Training accuracy: 95%, Validation accuracy: 93%
Issue: The model works well but is a 'black box'. You cannot explain why it makes certain predictions.
Your Task
Use SHAP and LIME to explain the model's predictions and identify the most important features influencing the output.
Use the existing trained model without changing its architecture or training process.
Use SHAP and LIME libraries for interpretability.
Explain at least one prediction with each method.
Solution
ML Python
import numpy as np
import shap
import lime
import lime.lime_tabular
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import matplotlib.pyplot as plt

# Load data
iris = load_iris()
X, y = iris.data, iris.target
feature_names = iris.feature_names

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# SHAP explanation
explainer_shap = shap.TreeExplainer(model)
shap_values = explainer_shap.shap_values(X_test)

# For multiclass models, older SHAP releases return a list of per-class
# arrays, while newer releases return a single (samples, features, classes)
# array; normalize to a list indexed by class so the code works with both.
if isinstance(shap_values, list):
    shap_values_per_class = shap_values
else:
    shap_values_per_class = [shap_values[:, :, c] for c in range(shap_values.shape[2])]

# Plot SHAP summary for class 0
shap.summary_plot(shap_values_per_class[0], X_test, feature_names=feature_names, show=False)
plt.title('SHAP Summary Plot for Class 0')
plt.tight_layout()
plt.savefig('shap_summary.png')
plt.close()

# Explain one prediction with SHAP
idx = 0
# expected_value may be a scalar, list, or array depending on the SHAP
# version; flatten it so indexing class 0 is safe either way.
expected_value_class0 = np.ravel(explainer_shap.expected_value)[0]
shap.force_plot(expected_value_class0, shap_values_per_class[0][idx], X_test[idx],
                feature_names=feature_names, matplotlib=True, show=False)
plt.title('SHAP Force Plot for One Prediction')
plt.tight_layout()
plt.savefig('shap_force_plot.png')
plt.close()

# LIME explanation
explainer_lime = lime.lime_tabular.LimeTabularExplainer(
    X_train,
    feature_names=feature_names,
    class_names=iris.target_names,
    discretize_continuous=True,
)

exp = explainer_lime.explain_instance(X_test[idx], model.predict_proba, num_features=4)

# Show LIME explanation as list
lime_exp_list = exp.as_list()

# Plot LIME explanation
fig = exp.as_pyplot_figure()
plt.title('LIME Explanation for One Prediction')
plt.tight_layout()
plt.savefig('lime_explanation.png')
plt.close()

# Output results
print(f"LIME explanation for test instance {idx}:")
for feature, weight in lime_exp_list:
    print(f"{feature}: {weight:.3f}")
Added SHAP TreeExplainer to explain feature importance globally and for one prediction.
Added LIME TabularExplainer to explain one prediction locally.
Visualized SHAP summary plot and force plot for interpretability.
Visualized LIME explanation plot and printed feature contributions.
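The SHAP and LIME rankings above can be cross-checked with a model-agnostic baseline. As a sketch (not part of the original solution), scikit-learn's permutation_importance shuffles each feature column in turn and measures the resulting accuracy drop on held-out data; features whose shuffling hurts accuracy most are the ones the model relies on:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Same data and model recipe as the solution above
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Shuffle each feature and measure the mean accuracy drop over 10 repeats
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=42)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"{iris.feature_names[i]}: {result.importances_mean[i]:.3f}")
```

If the permutation ranking broadly agrees with SHAP and LIME, that increases confidence that all three methods are describing the same model behaviour rather than an artifact of one technique.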
Results Interpretation

Before: the model's accuracy was high, but there was no insight into which features drove its predictions.

After: SHAP and LIME provide clear visual and numeric explanations showing which features most affect predictions.

Using SHAP and LIME helps us open the 'black box' of machine learning models. We can see which features matter most and understand individual predictions better.
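As one more quick numeric check, sketched here as an optional extra rather than part of the solution, a random forest also exposes built-in impurity-based importances via feature_importances_. These should broadly agree with the SHAP and LIME rankings (for Iris, the petal measurements dominate), though unlike SHAP they are computed from the training data and cannot explain individual predictions:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
model = RandomForestClassifier(random_state=42).fit(iris.data, iris.target)

# Impurity-based importances: fast and global, but training-data only
for name, imp in sorted(zip(iris.feature_names, model.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```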
Bonus Experiment
Try using SHAP and LIME to explain predictions on a different dataset, such as the Breast Cancer dataset.
💡 Hint
Load the new dataset from sklearn, train a similar model, and apply the same SHAP and LIME methods to interpret it.
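Following that hint, here is a minimal setup sketch for the bonus experiment, using the same RandomForestClassifier recipe as the solution; the shap.TreeExplainer and lime.lime_tabular.LimeTabularExplainer calls from the solution then apply unchanged (only the feature and class names differ):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Breast Cancer dataset: 30 features, binary target (malignant/benign)
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
print(f"Validation accuracy: {model.score(X_test, y_test):.3f}")

# From here, reuse the solution's interpretability steps, e.g.:
#   explainer_shap = shap.TreeExplainer(model)
#   explainer_lime = lime.lime_tabular.LimeTabularExplainer(
#       X_train, feature_names=data.feature_names,
#       class_names=data.target_names, discretize_continuous=True)
```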