Interpret Model Prediction with SHAP

SHAP stands for SHapley Additive exPlanations. It is a game-theoretic approach to explaining model predictions. The original code and papers are available in the SHAP GitHub repository.

SHAP, in particular KernelExplainer, has been implemented in flow-forecast to provide model-agnostic explanations. Several visualizations are generated to help interpret feature behavior, and the plots are logged to Weights & Biases. The run copper-sweep-920 is used as an example to illustrate the meaning of each plot.
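Below is a minimal sketch of how a KernelExplainer can wrap a windowed forecaster. The data shapes and the predict_fn are illustrative placeholders, not flow-forecast's actual implementation.

```python
import numpy as np
import shap

# Hypothetical shapes: 9 historical time steps, 3 features, flattened so
# that KernelExplainer sees one value per (time step, feature) input.
history_len, n_features = 9, 3
background = np.random.rand(20, history_len * n_features)  # reference samples
to_explain = np.random.rand(5, history_len * n_features)   # samples to explain

def predict_fn(flat_inputs):
    # Reshape flattened windows back to (batch, time, features) and run
    # the forecaster. The sum below is a stand-in for a real forward pass.
    windows = flat_inputs.reshape(-1, history_len, n_features)
    return windows.sum(axis=(1, 2))

explainer = shap.KernelExplainer(predict_fn, background)
# One SHAP value per flattened (time step, feature) input, per sample.
shap_values = explainer.shap_values(to_explain, nsamples=100)
```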

Bar plot

The bar plot shows the overall feature ranking based on mean absolute SHAP values. In the example plot, the most important feature is rolling_7, with mobility_transit_stations and mobility_workplaces ranking second and third.


A second bar plot shows the feature ranking separately for each future time step. In this example, the model predicts only one step ahead, so the plot is identical to the overall feature ranking by SHAP values.
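The ranking itself is just the mean absolute SHAP value per feature, averaged over samples and historical time steps. The sketch below reproduces such a bar plot; the array shapes and feature names mirror the example run but are placeholder assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder SHAP values shaped (samples, historic steps, features);
# in practice these come from the explainer output above.
feature_names = ["rolling_7", "mobility_transit_stations", "mobility_workplaces"]
shap_vals = np.random.randn(5, 9, len(feature_names))

# Rank features by mean absolute SHAP value over samples and time steps.
mean_abs = np.abs(shap_vals).mean(axis=(0, 1))
order = np.argsort(mean_abs)  # ascending, so the top bar is the most important

plt.barh(np.array(feature_names)[order], mean_abs[order])
plt.xlabel("mean(|SHAP value|)")
plt.title("Overall feature importance")
plt.tight_layout()
plt.show()
```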

Another example from a test set is also logged on Weights & Biases.


Heatmap

The heatmap shows each feature's contribution at each historical time step to a given future time step. In this example, the historical window length is 9 and the prediction window length is 1, meaning the model predicts one time step ahead using features from 9 historical time steps. The color encodes the SHAP value of a particular feature at a particular time step: red indicates higher positive SHAP values, so the feature mobility_workplaces at time steps 1, 2, 6, and 7 drives the predicted value higher, while at time steps 4 and 8 it tends, on average, to drive the predicted value lower. The larger the absolute SHAP value, the stronger the feature's impact on the prediction, whether positive or negative.

Note: the colorbar is mislabeled in these plots; it should read SHAP values rather than feature values.
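As a rough sketch, a feature-by-historic-step heatmap like this can be drawn with matplotlib; the SHAP array below is random placeholder data matching the 9-step historical window of the example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder SHAP values shaped (features, historic steps).
feature_names = ["rolling_7", "mobility_transit_stations", "mobility_workplaces"]
shap_vals = np.random.randn(len(feature_names), 9)

fig, ax = plt.subplots()
im = ax.imshow(shap_vals, cmap="coolwarm", aspect="auto")
ax.set_xlabel("historic step")
ax.set_yticks(range(len(feature_names)))
ax.set_yticklabels(feature_names)
# Label the colorbar as SHAP values, per the note above.
fig.colorbar(im, ax=ax, label="SHAP value")
plt.show()
```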

Below is an example from a test set with multiple prediction steps.

The test model has a historical window of 20 steps (x-axis) and predicts 10 steps ahead (y-axis). The plot shows that the feature cfs at historic step 0 strongly drives the prediction at prediction step 0 higher, while it drives the prediction at step 6 lower.
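A similar sketch for the multi-step case plots one feature's SHAP values across historic steps (x-axis) and prediction steps (y-axis); the 10x20 array is placeholder data matching the test example's dimensions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder SHAP values for one feature,
# shaped (prediction steps, historic steps).
pred_len, history_len = 10, 20
shap_one_feature = np.random.randn(pred_len, history_len)

fig, ax = plt.subplots()
im = ax.imshow(shap_one_feature, cmap="coolwarm", aspect="auto")
ax.set_xlabel("historic step")
ax.set_ylabel("prediction step")
ax.set_title("SHAP values for feature cfs")
fig.colorbar(im, ax=ax, label="SHAP value")
plt.show()
```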

The same heatmap can also be generated for a specific prediction. For example, it can show the SHAP values of the feature mobility_workplaces for a prediction made on 2020-06-19. In that case, historic step 0 corresponds to the mobility_workplaces value on 2020-06-11, and the color represents its impact on the predicted COVID case count for 2020-06-20.


Scatter plot

The scatter plot helps explain the prediction at a particular time. For example, a scatter plot for the feature mobility_grocery_pharmacy and a prediction made on 2020-06-19 shows the SHAP values on the x-axis, indicating this feature's impact on the prediction, with the color representing the feature values described in the colormap. In that plot, larger positive feature values (red) cluster at negative SHAP values, meaning that larger mobility_grocery_pharmacy values tend to drive the predicted case count lower.
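The sketch below mimics such a scatter plot with placeholder data, assuming one point per historic step; the negative relationship between feature value and SHAP value is hard-coded purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data: one point per historic step, with an artificial
# negative relationship between feature value and SHAP value.
history_len = 9
steps = np.arange(history_len)
feature_vals = np.random.rand(history_len)
shap_vals = -0.5 * feature_vals + 0.05 * np.random.randn(history_len)

sc = plt.scatter(shap_vals, steps, c=feature_vals, cmap="coolwarm")
plt.colorbar(sc, label="feature value")
plt.xlabel("SHAP value")
plt.ylabel("historic step")
plt.title("mobility_grocery_pharmacy, prediction on 2020-06-19")
plt.show()
```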