SHAP & LIME Explainability
Master SHAP & LIME for explainable AI: model interpretability, feature importance, and production deployment of XAI systems.
What is Model Explainability?
Model explainability, often referred to as explainable AI (XAI), provides insight into how machine learning models arrive at their predictions. SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are two leading frameworks for making black-box models interpretable to stakeholders.
Critical Need: Regulatory requirements (GDPR, CCPA), ethical AI commitments, model debugging, and stakeholder trust all depend on explainable AI systems.
SHAP vs LIME: Core Differences
SHAP (SHapley Additive exPlanations)
Core Principle
Based on cooperative game theory: a prediction is decomposed into feature contributions using Shapley values, which distribute the model output fairly among the features (a small worked example follows below).
Strengths
- Mathematically grounded (satisfies the efficiency, symmetry, dummy, and additivity axioms)
- Global and local explanations
- Model-agnostic and model-specific variants
- Consistent feature attributions
Best For
- Tree-based models (TreeSHAP)
- Deep learning (DeepSHAP)
- When you need theoretical guarantees
- Global feature importance analysis
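To make "fair distribution" concrete, here is a minimal worked example that computes exact Shapley values by enumerating every coalition of a toy payout function. The feature names and payout numbers are invented for illustration only; real SHAP implementations approximate this sum efficiently (e.g., TreeSHAP for tree ensembles) rather than enumerating coalitions.

from itertools import combinations
from math import factorial

# Toy coalition "payout" function: in SHAP terms, v(S) would be the model's
# expected output when only the features in S are known. The feature names
# and numbers here are hypothetical, chosen purely for illustration.
PAYOUTS = {
    frozenset(): 0.0,
    frozenset({"income"}): 0.3,
    frozenset({"debt"}): 0.1,
    frozenset({"income", "debt"}): 0.5,
}

def v(coalition):
    return PAYOUTS[frozenset(coalition)]

def shapley_value(feature, all_features):
    """Exact Shapley value: weighted average of the feature's marginal
    contribution over every coalition of the remaining features."""
    others = [f for f in all_features if f != feature]
    n = len(all_features)
    total = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            s = set(subset)
            weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
            total += weight * (v(s | {feature}) - v(s))
    return total

features = ["income", "debt"]
for f in features:
    print(f, round(shapley_value(f, features), 3))  # income 0.35, debt 0.15
# The values sum to v(all) - v(empty) = 0.5: the efficiency axiom in action.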
LIME (Local Interpretable Model-agnostic Explanations)
Core Principle
Approximates the model's behavior locally by fitting an interpretable surrogate model (typically a weighted linear model) on perturbed samples around the prediction being explained (a minimal sketch follows below).
Strengths
- Fast computation for individual predictions
- Works with any model type (truly model-agnostic)
- Intuitive local explanations
- Supports text, images, and tabular data
Best For
- Real-time explanation needs
- Individual prediction explanations
- When computational resources are limited
- Complex black-box models
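The following is a hand-rolled sketch of that local-surrogate idea, not the lime library's actual implementation: the function name, the Gaussian perturbation scheme, and the kernel width are simplifying assumptions (the real library samples from per-feature training statistics and can discretize continuous features).

import numpy as np
from sklearn.linear_model import Ridge

def lime_style_attributions(predict_proba, instance, n_samples=2000, kernel_width=0.75):
    """Minimal sketch of LIME's core loop for one tabular instance
    (assumes roughly standardized features)."""
    rng = np.random.default_rng(0)
    n_features = instance.shape[0]
    # 1. Perturb the instance to build a local neighborhood around it
    neighborhood = instance + rng.normal(scale=1.0, size=(n_samples, n_features))
    # 2. Query the black-box model on the perturbed points
    targets = predict_proba(neighborhood)[:, 1]
    # 3. Weight each sample by its proximity to the original instance
    distances = np.linalg.norm(neighborhood - instance, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # 4. Fit an interpretable surrogate (weighted linear model) locally;
    #    its coefficients serve as the local feature attributions
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(neighborhood, targets, sample_weight=weights)
    return surrogate.coef_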
Implementation Examples
SHAP Implementation for Tree Models
import shap
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer
import matplotlib.pyplot as plt
# Load and prepare data
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)
# Initialize SHAP explainer (TreeExplainer for tree-based models)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
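# Note: for a binary classifier, shap_values may be a list with one array per
# class (as assumed by the shap_values[1] indexing below) or a single 3-D
# array, depending on the installed SHAP version; adjust indexing accordingly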
# Global explanations - Feature importance across all predictions
shap.summary_plot(shap_values[1], X, plot_type="bar", show=False)
plt.title("Global Feature Importance (SHAP)")
plt.tight_layout()
plt.show()
# Local explanation for a single prediction
sample_idx = 0
# waterfall_plot expects a shap.Explanation object for a single instance
shap.waterfall_plot(
    shap.Explanation(
        values=shap_values[1][sample_idx],
        base_values=explainer.expected_value[1],
        data=X.iloc[sample_idx].values,
        feature_names=list(X.columns)
    )
)
# Feature interaction analysis (positive class) on a subset for speed
shap_interaction_values = explainer.shap_interaction_values(X[:100])
shap.summary_plot(shap_interaction_values[1], X[:100], show=False)
print(f"Base value (expected model output): {explainer.expected_value[1]:.3f}")
print(f"SHAP values sum: {shap_values[1][sample_idx].sum():.3f}")
print(f"Model prediction: {model.predict_proba(X.iloc[sample_idx:sample_idx+1])[0][1]:.3f}")LIME Implementation for Any Model
import lime
import lime.lime_tabular
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
# Create a more complex model pipeline
pipeline = Pipeline([
('scaler', StandardScaler()),
('classifier', MLPClassifier(hidden_layer_sizes=(100, 50), random_state=42))
])
# Train the pipeline
pipeline.fit(X, y)
# Initialize LIME explainer
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
X.values,
feature_names=X.columns,
class_names=['malignant', 'benign'],
mode='classification',
discretize_continuous=True
)
# Explain a single prediction
sample_idx = 0
lime_explanation = lime_explainer.explain_instance(
X.iloc[sample_idx].values,
pipeline.predict_proba,
num_features=10,
num_samples=5000
)
# Display explanation
lime_explanation.show_in_notebook(show_table=True)
# Get explanation as list
explanation_list = lime_explanation.as_list()
print("LIME Feature Importances:")
for feature, importance in explanation_list:
print(f"{feature}: {importance:.3f}")
# Save explanation as HTML
lime_explanation.save_to_file('lime_explanation.html')

Production Explainability Service
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import joblib
import shap
import lime.lime_tabular
import pandas as pd
import numpy as np
from typing import List, Dict, Any, Optional
import json
class ExplainabilityService:
def __init__(self, model_path: str, feature_names: List[str]):
self.model = joblib.load(model_path)
self.feature_names = feature_names
# Initialize SHAP explainer
# Assuming we have background data for SHAP
background_data = np.load('background_data.npy')
self.shap_explainer = shap.Explainer(self.model, background_data)
# Initialize LIME explainer
self.lime_explainer = lime.lime_tabular.LimeTabularExplainer(
background_data,
feature_names=feature_names,
mode='classification',
discretize_continuous=True
)
# Cache for explanations
self.explanation_cache = {}
def explain_prediction_shap(self, features: np.ndarray,
instance_id: str = None) -> Dict[str, Any]:
"""Generate SHAP explanation for a single prediction"""
try:
# Check cache first
cache_key = f"shap_{instance_id}" if instance_id else None
if cache_key and cache_key in self.explanation_cache:
return self.explanation_cache[cache_key]
# Compute SHAP values
shap_values = self.shap_explainer(features.reshape(1, -1))
# Extract explanation data
explanation = {
'method': 'SHAP',
'base_value': float(shap_values.base_values[0]),
'prediction': float(self.model.predict_proba(features.reshape(1, -1))[0][1]),
'feature_contributions': {
self.feature_names[i]: float(shap_values.values[0][i])
for i in range(len(self.feature_names))
},
'feature_values': {
self.feature_names[i]: float(features[i])
for i in range(len(features))
},
'top_positive_features': [],
'top_negative_features': []
}
# Get top contributing features
contributions = list(explanation['feature_contributions'].items())
contributions.sort(key=lambda x: abs(x[1]), reverse=True)
explanation['top_positive_features'] = [
{'feature': name, 'contribution': contrib, 'value': explanation['feature_values'][name]}
for name, contrib in contributions if contrib > 0
][:5]
explanation['top_negative_features'] = [
{'feature': name, 'contribution': contrib, 'value': explanation['feature_values'][name]}
for name, contrib in contributions if contrib < 0
][:5]
# Cache the result
if cache_key:
self.explanation_cache[cache_key] = explanation
return explanation
except Exception as e:
raise HTTPException(status_code=500, detail=f"SHAP explanation failed: {str(e)}")
def explain_prediction_lime(self, features: np.ndarray,
num_features: int = 10) -> Dict[str, Any]:
"""Generate LIME explanation for a single prediction"""
try:
# Generate LIME explanation
lime_explanation = self.lime_explainer.explain_instance(
features,
self.model.predict_proba,
num_features=num_features,
num_samples=1000
)
# Extract explanation data
explanation_list = lime_explanation.as_list()
explanation = {
'method': 'LIME',
'prediction': float(self.model.predict_proba(features.reshape(1, -1))[0][1]),
'local_prediction': float(lime_explanation.local_pred[0]),
'intercept': float(lime_explanation.intercept[1]),
'feature_contributions': dict(explanation_list),
'feature_values': {
self.feature_names[i]: float(features[i])
for i in range(len(features))
},
'explanation_score': float(lime_explanation.score),
'num_features_used': len(explanation_list)
}
return explanation
except Exception as e:
raise HTTPException(status_code=500, detail=f"LIME explanation failed: {str(e)}")
def compare_explanations(self, features: np.ndarray) -> Dict[str, Any]:
"""Compare SHAP and LIME explanations for the same prediction"""
shap_exp = self.explain_prediction_shap(features)
lime_exp = self.explain_prediction_lime(features)
# Find overlapping important features
shap_important = set([item['feature'] for item in
(shap_exp['top_positive_features'] + shap_exp['top_negative_features'])[:10]])
lime_important = set(lime_exp['feature_contributions'].keys())
overlap = shap_important.intersection(lime_important)
return {
'shap_explanation': shap_exp,
'lime_explanation': lime_exp,
'agreement_analysis': {
'overlapping_features': list(overlap),
'agreement_ratio': len(overlap) / max(len(shap_important), len(lime_important)),
'shap_only_features': list(shap_important - lime_important),
'lime_only_features': list(lime_important - shap_important)
},
'recommendation': self._get_explanation_recommendation(shap_exp, lime_exp)
}
def _get_explanation_recommendation(self, shap_exp: Dict, lime_exp: Dict) -> str:
"""Provide recommendation on which explanation to trust more"""
if abs(shap_exp['prediction'] - lime_exp['prediction']) < 0.05:
return "Both methods agree on the prediction. High confidence in explanations."
elif abs(shap_exp['prediction'] - lime_exp['local_prediction']) < 0.1:
return "Methods show reasonable agreement. Consider ensemble explanation."
else:
return "Significant disagreement between methods. Manual review recommended."
# FastAPI Application
app = FastAPI(title="ML Explainability Service")
# Initialize service
feature_names = ['feature_' + str(i) for i in range(30)] # Update with actual names
explainer_service = ExplainabilityService('model.pkl', feature_names)
class PredictionRequest(BaseModel):
features: List[float]
instance_id: Optional[str] = None
explanation_method: str = "both" # "shap", "lime", or "both"
@app.post("/explain")
async def explain_prediction(request: PredictionRequest):
features = np.array(request.features)
if request.explanation_method == "shap":
return explainer_service.explain_prediction_shap(features, request.instance_id)
elif request.explanation_method == "lime":
return explainer_service.explain_prediction_lime(features)
elif request.explanation_method == "both":
return explainer_service.compare_explanations(features)
else:
raise HTTPException(status_code=400, detail="Invalid explanation method")
@app.get("/health")
async def health_check():
return {"status": "healthy", "service": "ML Explainability"}
@app.get("/methods")
async def available_methods():
return {
"available_methods": ["shap", "lime", "both"],
"model_type": str(type(explainer_service.model)),
"num_features": len(explainer_service.feature_names)
}

Real-World Explainability Implementations
Microsoft Azure ML Explain
- SHAP integration for AutoML models
- Global and local explanations in Azure ML Studio
- Compliance with EU AI Act requirements
- Used in fraud detection (99.2% accuracy with explanations)
- Serves 50M+ explanations monthly across customers
Google Cloud Explainable AI
- Integrated Gradients for deep learning models
- SHAP support for tabular data models
- Real-time explanations via Vertex AI
- Used in healthcare AI (medical imaging explanations)
- Powers explanation features across Google products
Capital One Model Risk Management
- SHAP explanations for credit decisioning models
- LIME for real-time fraud detection explanations
- Regulatory compliance (Fair Credit Reporting Act)
- Processes 100M+ financial decisions annually
- Reduces model review time by 75%
Netflix Recommendation Explainability
- SHAP for understanding recommendation drivers
- Custom LIME implementation for content features
- A/B testing explanation effectiveness
- 15% increase in user engagement with explanations
- Applied to 500M+ daily recommendations
Explainability Best Practices
✅ Do
- Choose explanation method based on model type and use case
- Validate explanations with domain experts
- Use background datasets representative of your training data
- Monitor explanation stability over time (see the sketch after these lists)
- Implement explanation caching for performance
- Document explanation methodology for compliance
❌ Don't
- Treat explanations as absolute truth without validation
- Use tiny sample sizes for LIME approximations
- Ignore computational costs in production systems
- Apply global explanations to individual predictions
- Neglect explanation drift as models are updated
- Overcomplicate explanations for non-technical stakeholders
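One way to act on the "monitor explanation stability" item above is to compare global SHAP importance rankings between two scoring windows. The sketch below is a suggestion under stated assumptions: the function name, the example window names, and the 0.8 alert threshold are hypothetical and should be tuned per model.

import numpy as np
from scipy.stats import spearmanr

def explanation_stability(shap_values_old, shap_values_new):
    """Rank correlation of global feature importance between two batches of
    SHAP values (each an (n_samples, n_features) array), e.g. last month's
    explanations vs. this month's. A low value suggests explanation drift."""
    old_importance = np.abs(shap_values_old).mean(axis=0)
    new_importance = np.abs(shap_values_new).mean(axis=0)
    correlation, _ = spearmanr(old_importance, new_importance)
    return correlation

# Hypothetical usage; threshold and alerting hook are assumptions:
# if explanation_stability(shap_last_month, shap_this_month) < 0.8:
#     flag_for_review()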