A Unified Approach to Interpreting Model Predictions (SHAP)


TLDR

Similar to LIME, but instead of running permutations on features and analysing the resulting predictions, SHAP works off of feature subsets (coalitions). However, many types of ML models, like most neural networks, can't handle NULL data, i.e. incomplete feature sets, so you have to do some computationally intensive hacks to get around this. People seem to like it better than LIME, but that may just be because it's more confusing and nobody I've met seems to understand how it works. Which people love, of course. But hey, maybe it is better when it works, dunno!

SHAP Overview

SHAP is a unified measure of feature importance that assigns each feature an importance value for a particular prediction. It's based on the concept of Shapley values from cooperative game theory. The key idea of Shapley values is to fairly distribute the "payout" (in this case, the prediction of the model) among the "players" (the features), taking into account all possible coalitions (subsets of features).

The Shapley value for a feature is calculated as the weighted sum of the marginal contributions of the feature to the prediction for all possible coalitions of features. The marginal contribution of a feature is the difference in the prediction when including the feature versus not including the feature, keeping all other features the same.

In formal terms, the Shapley value \(\phi_i\) for feature \(i\) is given by:

\[ \phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \left[ f(S \cup \{i\}) - f(S) \right] \]

where:

- \(N\) is the set of all features,
- \(S\) is a coalition (subset) of features that does not contain feature \(i\),
- \(f(S)\) is the model's prediction when only the features in \(S\) are present.
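
To make the formula concrete, here is a minimal brute-force sketch that evaluates it directly by enumerating every coalition. It's a toy illustration rather than anything from the paper: the callable `f` and how it treats "missing" features (e.g. averaging the model's output over a background dataset) are assumptions left to the caller.

```python
import itertools
import math

def shapley_values(f, n_features):
    """Exact Shapley values via direct enumeration -- O(2^n) calls to f.

    f -- callable mapping a frozenset of feature indices to a prediction;
         how it handles features outside the coalition is up to the caller.
    """
    features = range(n_features)
    phi = {}
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n_features):
            for subset in itertools.combinations(others, size):
                S = frozenset(subset)
                # Coalition weight |S|! (|N|-|S|-1)! / |N|! from the formula.
                weight = (math.factorial(len(S))
                          * math.factorial(n_features - len(S) - 1)
                          / math.factorial(n_features))
                # Marginal contribution of feature i to coalition S.
                total += weight * (f(S | {i}) - f(S))
        phi[i] = total
    return phi
```

The exponential enumeration is exactly why the practical algorithms below matter.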

The authors extended the concept of Shapley values to SHAP values, which have three desirable properties:

1. Local accuracy: the feature attributions sum to the model's output for the prediction being explained.
2. Missingness: features absent from the original input receive no attribution.
3. Consistency: if a model changes so that a feature's marginal contribution increases or stays the same regardless of the other features, that feature's attribution does not decrease.

KernelSHAP and TreeSHAP

The paper also presents practical algorithms to compute SHAP values: KernelSHAP and TreeSHAP.

KernelSHAP is a model-agnostic method that estimates SHAP values with a specially weighted local linear regression. Given a prediction to explain, it samples coalitions of features, builds perturbed instances by filling in the "missing" features with background values, and weights each sample with the Shapley kernel, a weight that depends only on the coalition's size. The coefficients of the resulting linear regression are the estimated SHAP values.
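
As a sketch of that idea, here is a toy, enumerative version (the real algorithm samples coalitions rather than enumerating all \(2^M\), and marginalizes missing features over a background distribution rather than a single reference row; `f`, `x`, and `background` are placeholders for your model and data):

```python
import itertools
import math

import numpy as np

def kernel_shap_toy(f, x, background):
    """Toy KernelSHAP: exact, enumerative weighted linear regression.

    f          -- callable mapping a 1-D feature array to a scalar prediction
    x          -- instance to explain (1-D array)
    background -- reference values substituted for "missing" features
    """
    M = len(x)
    masks, weights = [], []
    for size in range(M + 1):
        for idx in itertools.combinations(range(M), size):
            z = np.zeros(M)
            z[list(idx)] = 1.0
            masks.append(z)
            if size == 0 or size == M:
                # Huge weight approximately enforces the boundary conditions
                # (base value and local accuracy).
                weights.append(1e6)
            else:
                # The Shapley kernel from the paper.
                weights.append((M - 1) / (math.comb(M, size) * size * (M - size)))
    Z, w = np.array(masks), np.array(weights)
    # Hybrid inputs: a feature comes from x when it is in the coalition,
    # from the background reference otherwise.
    hybrids = Z * x + (1 - Z) * background
    y = np.array([f(row) for row in hybrids])
    # Weighted least squares with an intercept (the base value).
    A = np.column_stack([np.ones(len(Z)), Z])
    beta = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))
    return beta[1:]  # estimated SHAP values; beta[0] is the base value

# Sanity check on a linear model, where the exact Shapley values are known:
# phi_i = w_i * (x_i - background_i).
w_lin = np.array([2.0, -1.0, 0.5])
f_lin = lambda v: float(v @ w_lin)
print(kernel_shap_toy(f_lin, np.array([1.0, 2.0, 3.0]), np.zeros(3)))
# -> approximately [2.0, -2.0, 1.5]
```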

TreeSHAP is a fast, exact method to compute SHAP values for tree-based models (e.g., decision trees, random forests, gradient boosting). It exploits the structure of the trees to compute exact SHAP values in polynomial time, rather than the exponential time of naive enumeration.
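
For reference, here is a minimal end-to-end sketch using the authors' shap library with a scikit-learn forest (the dataset and model here are arbitrary placeholders):

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Train a small tree ensemble on a toy dataset.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer runs the TreeSHAP algorithm: exact SHAP values,
# computed efficiently by walking the tree structure.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Local accuracy: base value + attributions should match the prediction.
print(explainer.expected_value + shap_values[0].sum(),
      model.predict(X.iloc[[0]]))
```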

Implications

The SHAP method provides a unified, theoretically grounded approach to interpret the predictions of machine learning models. It has been widely adopted in various fields due to its ability to provide reliable and interpretable insights into complex models.

While SHAP greatly aids interpretability, it should be noted that it can be computationally expensive, particularly for high-dimensional data. Moreover, although SHAP provides a measure of feature importance, it does not directly provide insights into the interactions between features.

Finally, the interpretations given by SHAP are inherently local, which means they apply to individual predictions. While these interpretations can be aggregated to understand global behavior, care must be taken to ensure these aggregations are meaningful and do not overlook important local behaviors.

The authors have also developed a Python library named shap that implements these methods, making them accessible to the broader data science community. The shap library includes functionality for visualizing SHAP values, which can help to communicate interpretations more effectively.
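
For instance, continuing the TreeExplainer sketch from earlier (these plot functions are part of the library, though exact signatures vary across versions):

```python
import shap

# Global view: one dot per (sample, feature); features ranked by
# mean |SHAP value| across the dataset.
shap.summary_plot(shap_values, X)

# Global feature importance as a bar chart of mean |SHAP value|.
shap.summary_plot(shap_values, X, plot_type="bar")

# Local view: how each feature pushes a single prediction away
# from the base value.
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0],
                matplotlib=True)
```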



Tags: interpretability, SHAP