State-of-the-art machine learning and deep learning algorithms are built to always return an output (even if the input has nothing to do with the training set) and were originally designed for interpolation rather than extrapolation. Moreover, as data volume and model complexity grow, their predictions can be very accurate yet prone to rely on spurious correlations, encode and amplify bias, and draw conclusions that ignore the underlying dynamics governing the system. As a result, the uncertainty of the predictions and our confidence in the model are difficult to estimate, and the relationship between inputs and outputs becomes hard to interpret.

While many promising proof-of-concept examples are being developed to improve the parameterization schemes in weather and climate modeling, little attention has been paid to uncertainty quantification and interpretability. In fact, most machine learning and deep learning applications aim to optimize performance metrics (for instance accuracy), which are rarely good indicators of trust. Since it is challenging to shift a community from "black" to "glass" boxes, it is more effective to implement Explainable Artificial Intelligence (XAI) techniques at the very beginning of machine learning and deep learning adoption rather than to try to fix fundamental problems later.

The good news is that most popular XAI techniques are essentially sensitivity analyses: they systematically perturb some component of the model and observe how the perturbation affects its predictions. Many XAI techniques rely on random sampling, Monte-Carlo simulations, and ensemble runs, which are common methods in weather and climate modeling. Moreover, many XAI techniques are reusable because they are model-agnostic and are applied after the model has been fitted.
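As a minimal sketch of this perturbation idea, the snippet below shuffles one input feature at a time and records how much the prediction error degrades; the synthetic data, random-forest model, and error metric are illustrative placeholders, and any fitted model with a predict method could be substituted.

```python
# Perturbation-based sensitivity analysis: break one feature at a time and
# observe the effect on the model's predictions. Data and model are placeholders.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=500, n_features=5, noise=0.1, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)
baseline = mean_squared_error(y, model.predict(X))

rng = np.random.default_rng(0)
for j in range(X.shape[1]):
    X_perturbed = X.copy()
    # Shuffling column j destroys its relationship with the target
    X_perturbed[:, j] = rng.permutation(X_perturbed[:, j])
    score = mean_squared_error(y, model.predict(X_perturbed))
    print(f"feature {j}: increase in MSE after perturbation = {score - baseline:.3f}")
```

Features whose perturbation causes a large increase in error are the ones the model relies on most, which is the intuition behind the permutation-based importance measures discussed below.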

We will summarize some popular techniques of XAI and of aleatoric and epistemic Uncertainty Quantification: Permutation Importance and Gaussian processes (i.e., perturbation of the model inputs), and Monte-Carlo Dropout, Deep ensembles, and Quantile Regression (i.e., perturbation of the model architecture). We will also introduce useful techniques to understand how each feature in the data affected a particular prediction, such as Layer-wise Relevance Propagation (LRP), Shapley values, and Local Interpretable Model-Agnostic Explanations (LIME). Finally, we will present some best practices, such as detecting anomalies in the training data, implementing fallbacks when a prediction is not reliable, and physics-guided learning.
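To illustrate one of the architecture-perturbation techniques listed above, the sketch below shows Monte-Carlo Dropout for estimating epistemic uncertainty: dropout is kept active at prediction time and the spread across repeated stochastic forward passes is taken as a measure of model uncertainty. The network size, dropout rate, number of passes, and input data are assumed for illustration and do not correspond to any particular application.

```python
# Monte-Carlo Dropout sketch: keep dropout active at inference and treat the
# spread of repeated forward passes as epistemic uncertainty.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(5, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 1),
)
# ... assume the model has been trained at this point ...

model.train()  # train mode keeps the dropout layers stochastic (the "perturbation")
x = torch.randn(100, 5)  # placeholder inputs
with torch.no_grad():
    samples = torch.stack([model(x) for _ in range(50)])  # 50 stochastic passes

mean_prediction = samples.mean(dim=0)
epistemic_uncertainty = samples.std(dim=0)  # large spread = low confidence in the model
```

In practice, inputs with large spread across the stochastic passes can be flagged as unreliable, which connects this technique to the fallback strategies mentioned among the best practices.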