An introduction to explainable AI, and why we need it
by Patrick Ferris
Neural networks (and all of their subtypes) are increasingly being used to build programs that can predict and classify in a myriad of different settings.
Examples include machine translation using recurrent neural networks, and image classification using a convolutional neural network. Research published by Google DeepMind has sparked interest in reinforcement learning.
All of these approaches have advanced many fields and produced usable models that can improve productivity and efficiency.
However, we don’t really know how they work.
I was fortunate enough to attend the Knowledge Discovery and Data Mining (KDD) conference this year. Of the talks I went to, there were two main areas of research that seem to be on a lot of people’s minds:
- Firstly, finding a meaningful representation of graph structures to feed into neural networks. Oriol Vinyals from DeepMind gave a talk about their Message Passing Neural Networks.
- The second area, and the focus of this article, is explainable AI models. As we generate newer and more innovative applications for neural networks, the question of ‘How do they work?’ becomes more and more important.
Why the need for Explainable Models?
Neural Networks are not infallible.
Besides the problems of overfitting and underfitting that we’ve developed many tools (like Dropout or increasing the data size) to counteract, neural networks operate in an opaque way.
We don’t really know why they make the choices they do. As models become more complex, the task of producing an interpretable version of the model becomes more difficult.
Take, for example, the one pixel attack (see here for a great video on the paper). This is carried out by using a sophisticated approach which analyzes the CNNs and applies differential evolution (a member of the evolutionary class of algorithms).
Unlike other optimisation strategies which restrict the objective function to be differentiable, this approach uses an iterative evolutionary algorithm to produce better solutions. Specifically, for this one pixel attack, the only information required was the probabilities of the class labels.
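To give a concrete sense of how few moving parts such an attack needs, here is a minimal sketch using SciPy's differential evolution optimiser. The classifier is abstracted behind a hypothetical `model_predict_proba` function and the image is assumed to be an HxWx3 array of 0-255 values; this illustrates the idea rather than reproducing the paper's exact setup.

```python
# A minimal sketch of the one-pixel attack idea using SciPy's differential
# evolution. `model_predict_proba` is a hypothetical stand-in for any
# black-box classifier that returns class probabilities for a batch of images.
import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(image, true_class, model_predict_proba):
    # image: HxWx3 array with values in [0, 255]
    h, w, _ = image.shape

    def objective(candidate):
        # candidate = (x, y, r, g, b): where to place the pixel and its colour
        x, y, r, g, b = candidate
        perturbed = image.copy()
        perturbed[int(x), int(y)] = [r, g, b]
        # We only need the probability of the true class -- the attack
        # succeeds by driving it down.
        probs = model_predict_proba(perturbed[np.newaxis, ...])[0]
        return probs[true_class]

    bounds = [(0, h - 1), (0, w - 1), (0, 255), (0, 255), (0, 255)]
    result = differential_evolution(objective, bounds, maxiter=75, popsize=20, seed=0)
    # Best pixel position/colour found, and the remaining true-class confidence
    return result.x, result.fun
```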
The relative ease of fooling these neural networks is worrying. Beyond this lies a more systemic problem: trusting a neural network.
The best example of this is in the medical domain. Say you are building a neural network (or any black-box model) to help predict heart disease given a patient’s records.
When you train and test your model, you get a good accuracy and a convincing positive predictive value. You bring it to a clinician and they agree it seems to be a powerful model.
But they will be hesitant to use it because you (or the model) cannot answer the simple question: “Why did you predict this person as more likely to develop heart disease?”
This lack of transparency is a problem for the clinician who wants to understand the way the model works to help them improve their service. It is also a problem for the patient who wants a concrete reason for this prediction.
Ethically, is it right to tell a patient that they have a higher probability of a disease if your only reasoning is that “the black-box told me so”? Health care is as much about science as it is about empathy for the patient.
The field of explainable AI has grown in recent years, and this trend looks set to continue.
What follows are some of the interesting and innovative avenues researchers and machine learning experts are exploring in their search for models which not only perform well, but can tell you why they make the choices they do.
Reversed Time Attention Model (RETAIN)
The RETAIN model was developed at Georgia Institute of Technology by Edward Choi et al. It was introduced to help doctors understand why a model was predicting patients to be at risk of heart failure.
The idea is that, given a patient's hospital visit records (which also contain the events of each visit), the model can predict the risk of heart failure.
The researchers split the input into two recurrent neural networks. This let them use the attention mechanism on each to understand what the neural network was focusing on.
Once trained, the model could predict a patient’s risk. But it could also make use of the alpha and beta parameters to output which hospital visits (and which events within a visit) influenced its choice.
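The sketch below (in PyTorch) shows how the two recurrent networks and the alpha/beta attention weights fit together. It is a heavily simplified illustration, not the authors' reference implementation, and the layer sizes and names are made up for the example.

```python
# A heavily simplified sketch of RETAIN's two-RNN attention idea in PyTorch.
# Shapes and names are illustrative only.
import torch
import torch.nn as nn

class RetainSketch(nn.Module):
    def __init__(self, num_codes, emb_dim=128, hidden=128):
        super().__init__()
        self.embed = nn.Linear(num_codes, emb_dim)                   # multi-hot visit -> embedding
        self.rnn_alpha = nn.GRU(emb_dim, hidden, batch_first=True)   # visit-level attention
        self.rnn_beta = nn.GRU(emb_dim, hidden, batch_first=True)    # event-level attention
        self.w_alpha = nn.Linear(hidden, 1)
        self.w_beta = nn.Linear(hidden, emb_dim)
        self.out = nn.Linear(emb_dim, 1)

    def forward(self, visits):
        # visits: (batch, num_visits, num_codes), fed in reverse time order
        v = self.embed(visits)
        g, _ = self.rnn_alpha(v)
        h, _ = self.rnn_beta(v)
        alpha = torch.softmax(self.w_alpha(g), dim=1)   # which visits matter
        beta = torch.tanh(self.w_beta(h))               # which events within a visit matter
        context = (alpha * beta * v).sum(dim=1)
        risk = torch.sigmoid(self.out(context))
        # alpha and beta can be inspected afterwards to explain the prediction
        return risk, alpha, beta
```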
Local Interpretable Model-Agnostic Explanations (LIME)
Another approach that has become fairly common in use is LIME.
This is a post-hoc model — it provides an explanation of a decision after it has been made. This means it isn’t a pure ‘glass-box’, transparent model (like decision trees) from start to finish.
One of the main strengths of this approach is that it’s model agnostic. It can be applied to any model in order to produce explanations for its predictions.
The key concept underlying this approach is perturbing the inputs and watching how doing so affects the model’s outputs. This lets us build up a picture of which inputs the model is focusing on and using to make its predictions.
For instance, imagine some kind of CNN for image classification. There are four main steps to using the LIME model to produce an explanation:
- Start with a normal image and use the black-box model to produce a probability distribution over the classes.
- Then perturb the input in some way. For images, this could be hiding pixels by coloring them grey. Now run these through the black-box model to see how the probabilities for the class it originally predicted changed.
- Use an interpretable (usually linear) model like a decision tree on this dataset of perturbations and probabilities to extract the key features which explain the changes. The model is locally weighted — meaning that we care more about the perturbations that are most similar to the original image we were using.
- Output the features (in our case, pixels) with the greatest weights as our explanation (a minimal code sketch of these four steps follows the list).
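Below is a from-scratch sketch of those four steps, standing in for the real lime library. It simplifies "superpixels" to a fixed grid of square patches, assumes a hypothetical `model_predict_proba` black-box and a 0-255 uint8 image, and uses a weighted ridge regression as the interpretable local model.

```python
# A from-scratch sketch of the four LIME steps above, with "superpixels"
# simplified to a fixed grid of square patches. `model_predict_proba` is a
# hypothetical black-box classifier returning class probabilities.
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(image, model_predict_proba, target_class, patch=8, n_samples=500, rng=None):
    # image: HxWxC uint8 array with values in [0, 255]
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[:2]
    rows, cols = h // patch, w // patch
    n_features = rows * cols

    masks = rng.integers(0, 2, size=(n_samples, n_features))  # 1 = keep patch, 0 = grey it out
    probs, weights = [], []
    for mask in masks:
        perturbed = image.copy()
        for idx, keep in enumerate(mask):
            if not keep:
                r, c = divmod(idx, cols)
                perturbed[r*patch:(r+1)*patch, c*patch:(c+1)*patch] = 128  # grey out this patch
        probs.append(model_predict_proba(perturbed[np.newaxis, ...])[0][target_class])
        # Locally weighted: samples closer to the original image count more
        distance = 1.0 - mask.mean()
        weights.append(np.exp(-(distance ** 2) / 0.25))

    # Fit a simple interpretable (linear) model on the perturbation dataset
    linear = Ridge(alpha=1.0)
    linear.fit(masks, np.array(probs), sample_weight=np.array(weights))
    # Patches with the largest coefficients form the explanation
    return np.argsort(linear.coef_)[::-1]
```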
Layer-wise Relevance Propagation (LRP)
This approach uses the idea of relevance redistribution and conservation.
We start with an input (say, an image) and its probability of a classification. Then, work backwards to redistribute this to all of the inputs (in this case pixels).
The redistribution process is fairly simple from layer to layer.
R_j = \sum_k \frac{x_j \, w_{j,k}}{\sum_{j'} x_{j'} \, w_{j',k} + \epsilon} R_k

In the above equation, each term represents the following ideas:

- x_j: the activation value for neuron j in layer l
- w_{j,k}: the weighting of the connection between neuron j in layer l and neuron k in layer l+1
- R_j: the relevance score for neuron j in layer l
- R_k: the relevance score for neuron k in layer l+1

The epsilon is just a small value to prevent dividing by zero.
As you can see, we can work our way backwards to determine the relevance of individual inputs. Further, we can sort these in order of relevance. This lets us extract a meaningful subset of inputs as our most useful or powerful in making a prediction.
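A small NumPy sketch of the epsilon rule above, for a plain fully connected network, might look like this; the activation vectors and weight matrices are assumed to come from a forward pass you have already run.

```python
# A NumPy sketch of the epsilon rule for a small fully connected network.
# `activations[l]` is the activation vector of layer l and `weights[l]` is the
# matrix connecting layer l to layer l+1, both taken from an earlier forward pass.
import numpy as np

def lrp_epsilon(activations, weights, output_relevance, eps=1e-6):
    relevance = output_relevance             # R for the final layer (e.g. the class score)
    for x, W in zip(reversed(activations), reversed(weights)):
        z = x[:, np.newaxis] * W             # contribution of neuron j to neuron k: x_j * w_jk
        denom = z.sum(axis=0) + eps          # sum over j, with epsilon to avoid dividing by zero
        relevance = (z / denom) @ relevance  # redistribute R_k back onto the j neurons
    return relevance                         # relevance of each input feature (e.g. pixel)

# Example usage: rank input features by relevance
# input_relevance = lrp_epsilon(acts, ws, class_score)
# top_features = np.argsort(input_relevance)[::-1]
```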
What next?
The above methods for producing explainable models are by no means exhaustive. They are a sample of some of the approaches researchers have tried using to produce interpretable predictions from black-box models.
Hopefully this post also sheds some light onto why it is such an important area of research. We need to continue researching these methods, and develop new ones, in order for machine learning to benefit as many fields as possible — in a safe and trustworthy fashion.
If you find yourself wanting more papers and areas to read about, try some of the following.
- DeepMind’s research on Concept Activation Vectors, as well as the slides from Victoria Krakovna’s talk at the Neural Information Processing Systems (NIPS) conference.
- A paper by Dong Huk Park et al. on datasets for measuring explainable models.
- Finale Doshi-Velez and Been Kim’s paper on the field in general.
Artificial intelligence should not become a powerful deity which we follow blindly. But neither should we forget about it and the beneficial insight it can have. Ideally, we will build flexible and interpretable models that can work in collaboration with experts and their domain knowledge to provide a brighter future for everyone.
Translated from: https://www.freecodecamp.org/news/an-introduction-to-explainable-ai-and-why-we-need-it-a326417dd000/