Bayesian Network Modeling

I am feeling sick. Fever. Cough. Stuffy nose. And it’s wintertime. Do I have the flu? Likely. Plus I have muscle pain. More likely.

Bayesian networks are great for these types of inferences. We have variables, some of whose values have been fixed. We are interested in the probabilities of some free variables given these fixed values.

In our example, we want the probability that we have the flu, given some symptoms we have observed, and the season we are in.

So far it looks like reasoning with conditional probabilities. Is there more to it? Yes. A lot more. Let’s scale up this example and it will come out.

Towards A Large-scale Bayes Network

Imagine that our network models every possible symptom, every possible disease, outcomes of every possible medical test, and every possible external factor that might affect the probability of some disease. External factors break down into behavioral ones (smoking, being a couch potato, eating too much), physiological ones (weight, gender, age), and others. For good measure, let’s also throw in treatments. And side-effects.

By now there is enough useful medical knowledge to capture tens of thousands of variables (at the very least) and their interactions. For any set of symptoms, together with the values of some of the behavioral, physiological, and other external factors, we could estimate the probabilities of various diseases. And more. For a given disease, we could ask the network for the most likely symptoms. And way more. Such as: I have a cough and a high fever but the flu has been ruled out, so what other diseases are likely? For a given diagnosis, and our particular symptoms, and possibly additional factors such as our gender and age, we could ask it to recommend treatments.

Now we are getting somewhere. How does all this magic work? This is what we will explore here.

Connectivity

First question, where does the network come in? In modeling the interactions among the tens of thousands of variables.

Modeling all possible interactions among that many variables is nearly impossible. It is the network that gives us a mechanism to cut through this complexity, by letting us specify which interactions to model. The aim is to seek a model that is rich enough. But not overly complex.

Speaking of interactions, how do we decide which ones to model? Typically via domain knowledge. In our case, leveraging the collective knowledge of the medical field acquired over millennia of clinical practice and research.

What would our Bayes net look like? Structurally, a giant directed graph with nodes for the various symptoms, diseases, medical tests, behavioral factors, physiological factors, and treatment options. With suitably chosen (or inferred) arcs to model significant interactions among them. Such as among specific symptoms and specific diseases.

Connectivity Refined

A Bayes network is structurally a directed graph, an acyclic one at that. Directed means that edges have a direction to them, which is why they are called arcs. Acyclic means there are no directed cycles. Here is an example of a directed cycle: A → B → C → A.

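To make the acyclicity constraint concrete, here is a minimal sketch that checks a set of arcs for directed cycles. The helper name is_acyclic is an assumption of this sketch, not something from the text above; it relies on Python’s standard graphlib module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+


def is_acyclic(arcs):
    """Return True if the directed graph given as (tail, head) arcs has no directed cycle."""
    # TopologicalSorter expects a mapping {node: set of predecessors}.
    predecessors = {}
    for tail, head in arcs:
        predecessors.setdefault(head, set()).add(tail)
        predecessors.setdefault(tail, set())
    try:
        # Iterating static_order() raises CycleError when a directed cycle exists.
        list(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False


chain = [("A", "B"), ("B", "C")]
cycle = [("A", "B"), ("B", "C"), ("C", "A")]
print(is_acyclic(chain))  # True
print(is_acyclic(cycle))  # False
```

A network builder could run such a check each time an arc is added and reject the arc if it closes a cycle.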

Apart from the acyclicity constraint, the modeler has full control over what nodes to connect with arcs and how to orient them. That said, in complex real-world use cases such as the one we are discussing here (medical diagnosis) there is an appealing guiding principle.

Choose arcs to model direct causes. Orient them in the direction of causality.

So if A is a direct cause of B, we would add the arc A → B. Such a network is called a causal Bayes network.

A causal network’s structure is only as accurate as its variables and the fidelity of the causal relationships. For instance, the truth might be that A causes B and B causes C. But we might not even know of B’s existence. So the best we would be able to do is to model this via the arc A → C.

Causal Modeling

Okay, so let’s think causally in the medical setting. This is what we come up with.

Variable Type A        causes     Variable Type B   Example
disease                causes     symptom           flu causes you to cough
behavior               causes     disease           smoking causes lung cancer
physiological factor   causes     disease           aging “causes” various diseases
treatment              “causes”   disease           chemotherapy reduces cancer
treatment              causes     side-effect       chemotherapy causes hair-loss

Before closing this section, let’s note that we shouldn’t worry too much about getting a few causal arcs wrong. (Of course, we prefer not to.) The consequences are not severe. In fact, we’ll likely have quite a few non-causal arcs in the network anyhow, to model correlations whose links to causation are unclear or non-existent. In fact, the network can’t even distinguish between causal and non-causal arcs. Not in our use case.

Take this example. Say A and B are strongly correlated. Say you thought A causes B, so you modeled this with the arc A → B. But you were wrong. Adding this arc is still a good thing, as it models the correlation. The next section discusses non-causal arcs in more detail.

Non-causal Arcs

Causality is a compelling guiding principle in the network’s design. However, it is not sufficient. That is, adding non-causal arcs can improve the model further.

Consider correlations among variables. Such as among a set of symptoms or a set of diseases. Causal relationships within the set may not be known or even exist. We do want to model the correlations though. So we should add suitable “non-causal” arcs.

Here is a simple example. Say there is strong belief or evidence that dry cough and irritated throat are correlated. Say these are the only two variables in the network. Connecting them with an arc in either direction will capture this correlation. Leaving the arc out will treat them as independent. We don’t want that.

The Network’s Master Equation

At some juncture, just like a picture can reveal a vista, so can math. We are at that point. So here goes.

Formally, a Bayes Network is a directed acyclic graph on n nodes. The nodes, call them X1, X2, …, Xn, model random variables. The arcs model interactions among them.

More precisely, the structure of the network factors the joint distribution over the n variables as

P(X1, X2, …, Xn) = product_i P(Xi|parents(Xi))

There is a lot to unpack here. Let’s start with: parents(Xi) is the set of nodes with arcs coming into Xi. Huh?

Let’s ease into it with simple examples. All have the same 5 nodes A, B, C, D, E.

Our first network will have no arcs. So none of the nodes will have any parents either. So

P(A,B,C,D,E) = P(A)P(B)P(C)P(D)P(E)

Our second network will be a Markov chain. Structurally, the graph is a single path A → B → C → D → E. Node A does not have any parents. Node B’s parent is A. Node C’s parent is B. Etc. So

P(A,B,C,D,E) = P(A)P(B|A)P(C|B)P(D|C)P(E|D)

Our third network is the Naive Bayes classifier in which E serves as the class variable and A, B, C, and D as the predictor variables. Its graphical structure is

E → A, E → B, E → C, E → D

E has no parents. Each of A, B, C, and D has one parent: E. Accordingly

P(A,B,C,D,E) = P(A|E)P(B|E)P(C|E)P(D|E)P(E)

Readers familiar with naive Bayes classifiers will recognize the form on the right-hand side of this equation. Think of A, B, C, D as the predictors, E as the class variable.

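The three factorizations above all follow one recipe: multiply, over the nodes, the probability of each node given its parents. Below is a rough sketch of that recipe for boolean variables, applied to a three-node Markov chain. The function name joint_probability, the dictionary layout, and all the numbers are made up for illustration.

```python
from itertools import product


def joint_probability(assignment, parents, cpts):
    """P(assignment) as the product over nodes of P(node | its parents)."""
    p = 1.0
    for node, value in assignment.items():
        parent_values = tuple(assignment[q] for q in parents[node])
        # cpts[node][parent_values] holds P(node = True | parent values).
        p_true = cpts[node][parent_values]
        p *= p_true if value else 1.0 - p_true
    return p


# Markov chain A -> B -> C over boolean variables (made-up numbers).
parents = {"A": (), "B": ("A",), "C": ("B",)}
cpts = {
    "A": {(): 0.3},
    "B": {(True,): 0.8, (False,): 0.1},
    "C": {(True,): 0.6, (False,): 0.2},
}

# P(A) * P(B|A) * P(not C|B) = 0.3 * 0.8 * 0.4
p = joint_probability({"A": True, "B": True, "C": False}, parents, cpts)
print(p)

# Summing the joint over all 2^3 assignments should give 1 (up to rounding).
total = sum(
    joint_probability(dict(zip("ABC", values)), parents, cpts)
    for values in product([True, False], repeat=3)
)
print(total)
```

Changing only the parents dictionary (and the matching tables) switches between the no-arc network, the chain, and the naive Bayes structure.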

Now we are ready for a clinical example.

Clinical Network Example: Flu and its Symptoms

Consider the network whose variables are flu, fever, cough, stuffy nose, and season. For simplicity suppose the first four are boolean (yes/no) and the fifth, season, is categorical (spring, summer, fall, winter).

Causal modeling would yield the following arcs:

flu → fever, flu → cough, flu → stuffy nose

To these let’s add the arc flu → season. This is not a causal arc, i.e., we could have flipped its direction. But we won’t. So that its direction is aligned with the direction of the causal arcs emanating from flu. This will be convenient for the diagnosis covered in the next section.

Interestingly, it’s not a coincidence that this network’s structure is that of the naive Bayes classifier.

Diagnosis: From Symptoms To Flu

We want the probability that we have the flu, given that we have a fever, a cough, and a stuffy nose, and that it is wintertime. Let’s formally express this as

P(flu = yes | fever = yes, cough = yes, stuffy nose = yes, season = winter)

or more concisely (and a bit more generally) as

P(flu|fever,cough,stuffy nose, season)

To infer this, we just apply the Bayes rule:

numerator(x) = P(fever|flu=x)*P(cough|flu=x)*P(stuffy nose|flu=x)*P(season|flu=x)*P(flu=x)
P(flu=yes|fever, cough, stuffy nose, season) = numerator(yes)/(numerator(yes)+numerator(no))

This is why this network is called a Bayesian network. The inference from symptoms to a disease involves Bayesian reasoning.

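Here is a small sketch of this calculation. Every probability in it is invented purely for illustration (it is not medical data), and the names numerator and posterior_flu simply mirror the formula above.

```python
# All probabilities below are invented for illustration only.
P_FLU = {"yes": 0.05, "no": 0.95}
P_SYMPTOM_GIVEN_FLU = {  # P(symptom = yes | flu = x)
    "fever": {"yes": 0.90, "no": 0.05},
    "cough": {"yes": 0.80, "no": 0.10},
    "stuffy nose": {"yes": 0.70, "no": 0.20},
}
P_SEASON_GIVEN_FLU = {  # P(season | flu = x)
    "yes": {"spring": 0.15, "summer": 0.05, "fall": 0.20, "winter": 0.60},
    "no": {"spring": 0.25, "summer": 0.25, "fall": 0.25, "winter": 0.25},
}


def numerator(x, symptoms, season):
    """Product of P(symptom|flu=x) over symptoms, times P(season|flu=x) and P(flu=x)."""
    value = P_FLU[x] * P_SEASON_GIVEN_FLU[x][season]
    for name, present in symptoms.items():
        p_yes = P_SYMPTOM_GIVEN_FLU[name][x]
        value *= p_yes if present else 1.0 - p_yes
    return value


def posterior_flu(symptoms, season):
    """P(flu = yes | symptoms, season) via Bayes' rule."""
    yes = numerator("yes", symptoms, season)
    no = numerator("no", symptoms, season)
    return yes / (yes + no)


observed = {"fever": True, "cough": True, "stuffy nose": True}
print(posterior_flu(observed, "winter"))
print(posterior_flu(observed, "summer"))
```

With these made-up tables the same three symptoms point to flu more strongly in winter than in summer, which is the effect of the flu → season arc.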

The “Beyond Flu” Network

We already have a prescription, so let’s execute. First, start adding nodes for additional diseases and symptoms. Second, add nodes for behaviors, physiological factors, medical tests, etc. Third, start adding more causality arcs, following the guidance given earlier. Such as

smoking → lung cancer, aging → disease-1, aging → disease-2, …, aging → disease-k
chemotherapy → cancer, chemotherapy → hair-loss

Next, start adding suitable non-causal arcs. To capture correlations among symptoms, correlations among diseases, etc.

The macrostructure of the “backbone” of such a network is below.

behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases
diseases                         ⇒ symptoms
treatments                       ⇒ side-effects
tests ?

The terms in plural denote sets of nodes of certain types. Such as diseases. X ⇒ Y denotes a set of arcs from X to Y. This level does not reveal the heads and tails of specific arcs.

We have already discussed why the arc sets are oriented the way they are. The reason we have chosen behaviors and physiological factors to jointly influence diseases is that these two types of factors interact. For instance, the adverse effect of certain bad behavior choices on certain diseases is often higher in older people than in younger people.

The macro-parents of diseases could in fact be more elaborate. Such as

behaviors, physiological factors, treatments ⇒ diseases

This would model the joint interaction of all three types of factors (behaviors, physiological factors, and treatments) on diseases. Such a macro-level interaction would in general produce quite a complex network. So to convey the essence of the backbone, we’ll stick to our earlier macro-structure. That said, exceptions, i.e. specific triplets of (behavior, physiological factor, treatment) that influence a particular disease, can always be added in. The macro-structure is just a big picture view, not an enforceable schema. The schema is only at the fine level, specified by the network’s arcs.

Notice we have a set of nodes, tests, which is dangling. We’ll let you ponder how this set should be connected to the rest of the network. Should we have tests ⇒ diseases, or diseases ⇒ tests, or something else?

Training the “Beyond Flu” Network

Training means estimating the various probability distributions P(Xi|parents(Xi)) of the model from data, belief, or a combination.

Training Symptom Distributions

Let’s start with learning the probability distribution of any one symptom conditioned on its parents. Let’s make a simplifying assumption that a symptom’s parents can only be diseases. For instance, parents of the symptom cough would include flu and bronchitis.

Given a symptom S and its parents pa(S), the conditional probability table that captures P(S|pa(S)) has a size exponential in the number of diseases in pa(S). This is because in principle any subset of the n diseases in pa(S) can occur. (By “occur” we mean diagnosed in a particular visit.) There are 2^n such subsets. This can be quite large when n is large.

Three factors will collectively mitigate this issue. One is that most symptoms will not have a huge number of parents, i.e. a huge number of diseases that can cause them.

The second is that in any one instance, the diagnosed diseases will be a sparse subset of the parents. A diagnosis instance corresponds to taking a snapshot of the state of the diseases of a particular person displaying the symptom. Of all the potential diseases the symptom can appear in, a single person will almost certainly be diagnosed with at most a few. If even more than one. This sparsity will greatly help the training. Simply put, sparsity implies “no significant higher-order interactions”. A numeric example below will illustrate this phenomenon.

The third factor is that we have some control over what we deem to include in the set of parents pa(S) of a given symptom S. If a symptom’s parent set gets especially large, we can prune away diseases that are less correlated with the symptom.

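To put some numbers on the second factor, the sketch below compares the 2^n parent configurations of a full table with the number of configurations in which at most k diseases are diagnosed together. The cutoff k = 2 and the function names are assumptions made for illustration.

```python
from math import comb  # Python 3.8+


def full_table_rows(n):
    """Number of parent configurations when any subset of n diseases can occur."""
    return 2 ** n


def sparse_rows(n, k):
    """Number of configurations in which at most k of the n diseases are diagnosed."""
    return sum(comb(n, j) for j in range(k + 1))


for n in (5, 10, 20):
    print(n, full_table_rows(n), sparse_rows(n, 2))
```

For n = 20 the full table has 1,048,576 rows, while only 211 configurations have two or fewer diagnosed diseases, so the rows that actually need data are a tiny fraction of the table.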

Discovering A Symptom’s Parents From Data

Which diseases should we set as the parents of a given symptom S? Previously we suggested, as a general guideline, using domain knowledge for this. In our particular case, there is a better way. Patient records will reveal which symptoms correlate with which diseases. So this aspect of the structure can also be fruitfully learned from data. The patient records capture within them the collective wisdom of lots of experts making diagnoses in varying scenarios.

The benefits of learning a symptom’s parents from the data are huge. It spares the network designer from having to acquire the domain knowledge to do this, whether via discussions with domain experts, extended readings, or some more elaborate mechanism. Even if this work were distributed over a large team of modelers and domain experts, such manual design is laborious and error-prone. There are too many symptoms and too many diseases.

That said, domain knowledge can still help fill in the gaps for situations that may not be covered by patient records, or to surface inconsistencies between belief and data. Simply put, domain-knowledge + data-driven learning is generally better than either alone.

We’ll discuss patient visit records in detail in the next section, as we will anyhow need them for learning the parameters of the network, such as the probabilities in P(S|pa(S)). Regardless of how we have arrived at the structure of pa(S).

Patient Visit Records

We’ll assume every interaction with a medical expert generates a new record, capturing the symptoms observed and the diseases diagnosed. If multiple diseases were diagnosed, the record also captures which of the observed symptoms were implicated in which disease, as deemed by the medical expert. The diagnosis may be as certain or as speculative as the expert sees fit. All we care about is that it was done by a professional.

Let’s see an example patient visit record. Made up. Not medical advice!

(symptoms = high fever, cough, sore throat, lump in throat; disease = flu)
(symptoms = lump in throat, chest pain; disease = gerd)

During this visit, two diseases were diagnosed: flu and GERD. The health expert implicated lump in throat in both.

From such a record we can derive symptom-centered representations, one for each observed symptom. Such a representation lists the diagnosed diseases that the symptom was implicated in during the visit. These diseases will also be referred to as the symptom’s parents in that visit record.

In our above example, lump in throat’s parents in the record are flu and GERD.

Symptom-centered representations lend themselves to learning symptom distributions.

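As a sketch of this derivation, the snippet below turns the made-up visit record from above into symptom-centered form. The list-of-pairs encoding of a visit and the function name symptom_centered are assumptions of this sketch.

```python
from collections import defaultdict

# One patient visit, written as (symptoms, disease) pairs like the made-up record above.
visit = [
    (["high fever", "cough", "sore throat", "lump in throat"], "flu"),
    (["lump in throat", "chest pain"], "gerd"),
]


def symptom_centered(visit):
    """Map each observed symptom to the set of diseases it was implicated in."""
    parents_in_record = defaultdict(set)
    for symptoms, disease in visit:
        for symptom in symptoms:
            parents_in_record[symptom].add(disease)
    return dict(parents_in_record)


for symptom, diseases in symptom_centered(visit).items():
    print(symptom, sorted(diseases))
```

Here lump in throat ends up with two parents in the record (flu and gerd), while every other symptom has one.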

Discovering A Symptom’s Parents

From the collection of symptom-centered representations derived from all the patient visit records we have access to, we can easily determine the symptom’s parents. These are all the diseases implicated in this data. The parents of lump in throat would be flu and GERD if all we had is the single patient visit record to learn from.

A huge and diverse set of patient visit records may yield, for some symptoms, huge sets of parents. As mentioned earlier, we can prune such large sets by dropping parents that are less correlated with the symptom.

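One possible way to sketch this: collect every disease implicated in a symptom across all visits, then drop those seen fewer than some threshold number of times. Pruning by raw count is a crude stand-in for “less correlated”, chosen here for brevity; the function name, the min_count parameter, and the data are all hypothetical.

```python
from collections import Counter


def discover_parents(representations, symptom, min_count=1):
    """Diseases implicated in `symptom` at least `min_count` times across all visits."""
    counts = Counter()
    for rep in representations:  # one symptom-centered representation per visit
        counts.update(rep.get(symptom, ()))
    return {disease for disease, c in counts.items() if c >= min_count}


# Four made-up visits, already in symptom-centered form.
representations = [
    {"cough": {"flu"}, "high fever": {"flu"}},
    {"cough": {"flu", "bronchitis"}},
    {"cough": {"asthma"}, "wheezing": {"asthma"}},
    {"cough": {"flu"}},
]

print(sorted(discover_parents(representations, "cough")))               # all implicated diseases
print(sorted(discover_parents(representations, "cough", min_count=2)))  # pruned set
```

A real system would more likely prune on a correlation or association score than on raw counts.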

Training Symptom Distributions From Patient Visit Records

We want to learn, for each symptom, its distribution conditioned on its parents. We have a symptom-centered data set available for this learning. (This was derived from patient visit records as described earlier.)

Consider any one instance in this data set. It lists a symptom, together with the diseases implicated with it during a patient visit. What it does not list is the diseases among the symptom’s parents that were not implicated. As we will see below, we need this information as well. Fortunately, we can deduce these diseases by subtracting the implicated diseases from the symptom’s parents.

Let’s see an example. Say cough’s parents are flu, pneumonia, and asthma. (In a real network this list would include a lot more diseases.) Say cough’s only parent in a particular patient record is flu. From this, we can deduce that in this instance cough is not caused by pneumonia or asthma. While this deduction is not correct with 100% certainty in any one instance, repeated occurrences of the same deduction do give a good estimate of the associated conditional probabilities.

From these two pieces of information (which diseases among a symptom’s parents are implicated in a particular patient record, and which are not) we will derive a training vector of the following form.

cough flu pneumonia asthma
  1    1      0       0

This is easy to read. It says that, in this patient record, cough is present, and of cough’s parents, flu is diagnosed, pneumonia is not diagnosed, and asthma is not diagnosed.

Next, consider a patient record whose observed list of symptoms does not include cough. Next, derive values for cough’s parents in this record depending on whether a disease in this set of parents is diagnosed in that record or not.

Here is an example. Say a patient record resulted in the diagnosis

(symptoms = shortness of breath, chest pain, wheezing; diseases = asthma)

From this, we may derive the record

cough flu pneumonia asthma
  0    0      0       1

Armed with a rich enough collection of such records, which of course will keep growing as people will keep getting sick in the foreseeable future, we can learn P(cough|parents(cough)). More broadly, the distribution for any symptom conditioned on its parents.

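The sketch below ties these steps together: it builds training vectors like the two shown above, then estimates P(cough|parents(cough)) as a relative frequency per parent configuration. It assumes, for simplicity, that every diagnosed disease in a record is implicated in every symptom of that record; the records and the function names are made up.

```python
from collections import defaultdict

COUGH_PARENTS = ("flu", "pneumonia", "asthma")


def training_vector(symptoms, diseases, symptom="cough", parents=COUGH_PARENTS):
    """Return (symptom present?, tuple of 0/1 flags, one per parent) for one visit record."""
    return int(symptom in symptoms), tuple(int(d in diseases) for d in parents)


def estimate_cpt(vectors):
    """Estimate P(symptom = 1 | parent configuration) as a relative frequency."""
    seen = defaultdict(int)
    positive = defaultdict(int)
    for present, config in vectors:
        seen[config] += 1
        positive[config] += present
    return {config: positive[config] / seen[config] for config in seen}


# Made-up visit records: (observed symptoms, diagnosed diseases).
records = [
    ({"cough", "high fever"}, {"flu"}),
    ({"high fever"}, {"flu"}),
    ({"cough"}, {"flu"}),
    ({"shortness of breath", "chest pain", "wheezing"}, {"asthma"}),
    ({"cough", "wheezing"}, {"asthma"}),
]

vectors = [training_vector(s, d) for s, d in records]
print(vectors[0])  # (1, (1, 0, 0)): cough present, flu diagnosed
print(vectors[3])  # (0, (0, 0, 1)): no cough, asthma diagnosed
cpt = estimate_cpt(vectors)
print(cpt)
```

Only parent configurations that actually occur in the data get an entry, which is where the sparsity discussed earlier pays off.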

Are such training instances, looked at individually, perfect? No. The absence of a disease in a diagnosis does not mean with certainty that it is not present, now or soon. The same applies to a symptom. That said, over a larger number of training instances in diverse-enough settings, such noise should get drowned out by the signal. For example, if only 30% of the records in which flu is diagnosed also reveal cough as an observed symptom, we can infer with high confidence that flu produces cough as an observed symptom no more than half the time.

Training The Influence Of Behaviors And Physiological Factors On Diseases

Here we refine the macro-structure

behaviors, physiological factors ⇒ diseases

We’ll assume the needed information may also be derived from patient records.

We seek to estimate, for every disease D, the parameters of D’s distribution conditioned on its parents. The parents of D are suitable subsets of the behaviors and physiological factors. Which behaviors and which physiological factors? These could be set via domain knowledge as a lot is known about which behaviors affect which diseases. (Adversely or beneficially.) Similarly for physiological factors. Alternatively or in addition, a disease’s parents could also be inferred from data.

Let’s illustrate such training from data. Consider the following patient record

smoker, 50 years old, male, diagnosed: lung cancer

First, from a collection of such records we can infer lung cancer’s parents, i.e. the behaviors and physiological factors that influence its diagnosis. As with symptom distributions, we need two more types of information to estimate the distribution of lung cancer given its parents.

1. In a particular diagnosis of lung cancer, which of the parents were missing?
2. How to estimate the probability that one does not have lung cancer in the presence of some of its parents?

For 1, as in the symptoms case, the missing parents are the full set of parents minus those in this patient record. For 2, again as in the symptoms case, we derive these from patient records in which some of lung cancer’s parents occur but the patient is diagnosed as being free of lung cancer. An example is a smoker who does not have lung cancer. How do we decide whether a factor is “key” enough to be a parent? Try domain knowledge.

Training The Influence Of Treatments On Diseases

We have a problem here. Our macro-structure schema had

behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases

That is, any single disease D would have two sets of parents, one involving certain combinations of behaviors and physiological factors, and the other involving treatments. We could, of course, combine these two sets of parents into one. Doing this widely has the issues discussed earlier. That said, specific triplets of behavior, physiological factor, and treatment in the context of specific diseases may be worth including. (As was discussed earlier.)

To summarize, we wouldn’t want to collapse

behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases

into

behaviors, physiological factors, treatments ⇒ diseases

as a general rule.

Keeping Two Sets Of Parents Separate

So how do we keep the two sets of parents separate for a given disease D? One way is to introduce an additional variable for D (we’ll call it DI) as below.

behaviors, physiological factors ⇒ DI
treatments, DI                   ⇒ D

We can think of DI as modeling disease onset and D as modeling the disease’s next state, following one or more treatments. That said, this scheme is incapable of modeling the dynamic evolution of a disease in response to treatments. This would require D to be a parent of DI, which would violate the acyclicity constraint on a Bayes network.

Let’s see this in a specific example.

diet, age, gender               → heart-disease-I
heart-disease-I, treatment      → heart disease
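
Under this two-node scheme, a query about heart disease sums out the onset node. A minimal sketch with invented numbers follows; the single “risk profile” variable stands in for diet, age, and gender, and is an assumption made to keep the tables tiny.

```python
# Made-up numbers. P(onset = yes | risk profile); the profile stands in for diet, age, gender.
P_ONSET = {"low risk": 0.02, "high risk": 0.30}

# P(heart disease = yes | heart-disease-I, treatment), also made up.
P_DISEASE = {
    (True, "none"): 0.95,
    (True, "medication"): 0.40,
    (False, "none"): 0.01,
    (False, "medication"): 0.01,
}


def p_heart_disease(profile, treatment):
    """Sum out the onset node DI: sum over DI of P(D | DI, treatment) * P(DI | profile)."""
    p_onset = P_ONSET[profile]
    return (P_DISEASE[(True, treatment)] * p_onset
            + P_DISEASE[(False, treatment)] * (1.0 - p_onset))


print(p_heart_disease("high risk", "none"))
print(p_heart_disease("high risk", "medication"))
```

Keeping the two parent sets on separate nodes means two small tables are learned instead of one table over every combination of behavior, physiological factor, and treatment.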

Treatments And Side-Effects

Let’s start simple. We have a node for every side-effect. We have a node for every treatment. A side-effect’s parents are all treatments that have that side-effect.

Let’s see an example.

chemotherapy, bone marrow transplantation, …, → fatigue

What is the value of including such arcs in our network? One is that it lets us seek treatments that are both effective for a particular disease and have relatively mild side-effects.

Inferences In This Scaled Network

Let’s start by repeating our network’s macro-structure here. This helps to see what types of inferences the network lends itself to.

behaviors, physiological factors ⇒ diseases
treatments                       ⇒ diseases
diseases                         ⇒ symptoms
treatments                       ⇒ side-effects
tests ?

Now onto specific inferences. Each is followed by an explanation of how it can be made to work. In this explanation, we focus on whether and how the various probabilities involved can be computed from data or domain knowledge. The aim is to provide insights into how the structure of the network simplifies various calculations.

In practice, one may be using an inference algorithm as a black-box, which will do whatever it does behind the scenes.

What is the likelihood of getting lung cancer if I smoke, am a female, and am 75 years old?

We seek P(lung cancer | smokes, female, 75 years old).

The good news is that all the observations this inference is conditioned on are lung cancer’s parents.

The bad news is that lung cancer may have additional parents. These need to be marginalized out. Marginalization involves averaging over the various values these additional parents can take, weighted by their probabilities. As the number of such values is exponential in the number of additional parents, marginalization is a slow process. Sophisticated algorithms do exist to speed it up. Their discussion is beyond the scope of this post.

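Here is a bare-bones sketch of marginalization by brute force. The two extra parents (asbestos exposure, family history), their probabilities, and the additive toy table are all hypothetical; the sketch also assumes the extra parents are independent of each other and of smoking.

```python
from itertools import product

# Hypothetical unobserved parents of lung cancer, with made-up probabilities of being present.
P_EXTRA = {"asbestos exposure": 0.05, "family history": 0.10}


def p_cancer_given_all(smokes, asbestos, family):
    """Made-up P(lung cancer | all parents); stands in for a learned table."""
    p = 0.01
    if smokes:
        p += 0.10
    if asbestos:
        p += 0.05
    if family:
        p += 0.04
    return p


def p_cancer_given_smoking(smokes):
    """Average over the unobserved parents, weighted by their probabilities."""
    names = list(P_EXTRA)
    total = 0.0
    for values in product([True, False], repeat=len(names)):  # 2^k terms
        weight = 1.0
        for name, value in zip(names, values):
            weight *= P_EXTRA[name] if value else 1.0 - P_EXTRA[name]
        total += weight * p_cancer_given_all(smokes, *values)
    return total


print(p_cancer_given_smoking(True))
print(p_cancer_given_smoking(False))
```

The loop visits 2^k combinations for k extra parents, which is the exponential cost described above.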

Frequently used restrictions of node distributions can be cached at the node. Think of this as attaching, to a node S, not only P(S|parents(S)) but also P(S|subset(parents(S))) for suitable subsets of parents(S). Such cached distributions may then be used as appropriate, reducing the need for on-the-fly marginalization.

I smoke, am female, and am 75 years old. And I have a persistent cough. What is the likelihood I have lung cancer?

We seek P(lung cancer | smokes, female, 75 years old, persistent cough). By Bayes' rule,

P(lung cancer | smokes, female, 75 years old, persistent cough) =
P(smokes, female, 75 years old, persistent cough | lung cancer)*P(lung cancer)/P(smokes, female, 75 years old, persistent cough)

(We'll return below to how each term on the right-hand side can be estimated.)

Next, we leverage an important property.


A node is conditionally independent of its non-descendants given its parents.


As this is the first time we are seeing this property in this post, let's delve into it a bit. Consider the network A → B → C. (A Markov chain.) The parent of C is B, and A is a non-descendant of C. Applying the aforementioned conditional independence property, we get that C is independent of A given B. That is, P(C|B, A) equals P(C|B). In other words, once we have observed B, the value of A provides no additional information for predicting the value of C.
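A quick numeric check makes this concrete. The sketch below builds the joint distribution of a chain A → B → C from made-up conditional probability tables and confirms that P(C|B, A) computed from the joint equals P(C|B) for every combination of values. The tables are invented; the equality holds for any choice of them.

```python
from itertools import product

# Made-up CPTs for the chain A -> B -> C (all variables binary).
p_a = {True: 0.3, False: 0.7}
p_b_given_a = {True: {True: 0.9, False: 0.1}, False: {True: 0.2, False: 0.8}}
p_c_given_b = {True: {True: 0.6, False: 0.4}, False: {True: 0.05, False: 0.95}}

# Joint distribution from the chain factorization P(A) * P(B|A) * P(C|B).
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product([True, False], repeat=3)
}

def p_c_given_b_and_a(c, b, a):
    """P(C=c | B=b, A=a), computed from the joint alone."""
    return joint[(a, b, c)] / sum(joint[(a, b, cc)] for cc in (True, False))

for a, b, c in product([True, False], repeat=3):
    # Once B is known, A does not change the distribution of C.
    assert abs(p_c_given_b_and_a(c, b, a) - p_c_given_b[b][c]) < 1e-12
```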

Applying this conditional independence property to our situation gives


P(smokes, female, 75 years old, persistent cough | lung cancer) =
P(smokes, female, 75 years old | lung cancer)*P(persistent cough | lung cancer)

Okay, let's now collect together the terms that remain to be estimated: the prior and the evidence probability from Bayes' rule, and the two factors of the likelihood. We have copied them below.

P(lung cancer)
P(smokes, female, 75 years old, persistent cough)
P(smokes, female, 75 years old | lung cancer)
P(persistent cough | lung cancer)

P(lung cancer) is easy to estimate from a sufficiently rich set of patient records. Some usable estimates may already exist in the public domain.


P(persistent cough|lung cancer) can also be estimated from patient records as the fraction of records diagnosed with lung cancer that have persistent cough as an observed symptom.

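Both estimates are simple counting. Here is a minimal sketch, assuming each patient record is a dictionary of boolean fields. The field names and the eight toy records are invented for illustration, so the resulting numbers mean nothing clinically.

```python
# Toy patient records; every record is invented for illustration.
records = [
    {"lung_cancer": True, "persistent_cough": True},
    {"lung_cancer": True, "persistent_cough": True},
    {"lung_cancer": True, "persistent_cough": False},
    {"lung_cancer": False, "persistent_cough": True},
    {"lung_cancer": False, "persistent_cough": False},
    {"lung_cancer": False, "persistent_cough": False},
    {"lung_cancer": False, "persistent_cough": False},
    {"lung_cancer": False, "persistent_cough": False},
]

def estimate_prior(records, disease):
    """Fraction of all records with the disease: an estimate of P(disease)."""
    return sum(r[disease] for r in records) / len(records)

def estimate_conditional(records, symptom, disease):
    """Fraction of records with the disease that also show the symptom:
    an estimate of P(symptom | disease)."""
    with_disease = [r for r in records if r[disease]]
    if not with_disease:
        raise ValueError("no records with the disease; cannot estimate")
    return sum(r[symptom] for r in with_disease) / len(with_disease)

p_cancer = estimate_prior(records, "lung_cancer")
p_cough_given_cancer = estimate_conditional(records, "persistent_cough", "lung_cancer")
```

With real data one would also want enough records in the diseased group for the fraction to be stable.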

To estimate P(smokes, female, 75 years old, persistent cough), we'll invoke the independence assumption, which approximates it by the product P(smokes)*P(female)*P(75 years old)*P(persistent cough). P(smokes), P(75 years old), and P(persistent cough) are easy to estimate from data combined with knowledge. P(female) we can just set to 0.5.

As a slight digression, strictly speaking, the variables mentioned in the previous paragraph are not all entirely independent. For instance, women live longer than men so age and gender are at least mildly dependent.


Finally, we are left with P(smokes, female, 75 years old|lung cancer). Smoking, gender, and age are all parents of lung cancer, so conditioning on lung cancer makes the three conditionally dependent, even if they are independent beforehand. So we should avoid invoking independence if we can. If we can't, well, it's not the end of the world. The resulting inference is still meaningfully interpretable. Specifically, it operates as a Naive Bayes classifier that predicts lung cancer from smokes, female, age, and persistent cough, treated as conditionally independent of one another given the outcome.
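To see the Naive Bayes reading end to end, here is a small sketch. All probabilities are invented. Note one deliberate swap: instead of estimating the denominator P(smokes, female, 75 years old, persistent cough) separately, the code normalizes over the two outcomes (cancer, no cancer), which is the usual way a Naive Bayes classifier obtains a posterior.

```python
# Invented numbers. likelihood[label][feature] is P(feature observed | label).
prior = {"cancer": 0.01, "no_cancer": 0.99}
likelihood = {
    "cancer": {"smokes": 0.80, "female": 0.45, "age_75": 0.10, "persistent_cough": 0.60},
    "no_cancer": {"smokes": 0.20, "female": 0.50, "age_75": 0.05, "persistent_cough": 0.05},
}

def naive_bayes_posterior(prior, likelihood, evidence):
    """Posterior over labels, treating the evidence features as
    conditionally independent of one another given the label."""
    unnormalized = {}
    for label, p_label in prior.items():
        score = p_label
        for feature in evidence:
            score *= likelihood[label][feature]
        unnormalized[label] = score
    # Normalizing over all labels plays the role of dividing by P(evidence).
    total = sum(unnormalized.values())
    return {label: score / total for label, score in unnormalized.items()}

evidence = ["smokes", "female", "age_75", "persistent_cough"]
posterior = naive_bayes_posterior(prior, likelihood, evidence)
```

Even with a small prior, several pieces of evidence that each favor cancer can move the posterior a long way, which is the behavior the opening flu example relied on.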

Macro Lesson


The macro lesson from the above example is that when seeking to diagnose a disease from some observed physiological factors and some observed symptoms, the physiological factors can reasonably be assumed to be independent of the symptoms given the disease. Sure, older people may be more likely to exhibit certain symptoms than younger ones. However, when we additionally condition on a disease that could explain the symptom, the added influence of being old is small in comparison.

What cancer treatments have minimal side-effects?


Let's express this in terms of a hybrid of logic and probabilities. We seek treatments T such that P(cancer|T) is high (that is, T is strongly associated with cancer) and for every side-effect SE, P(SE|T) is low. The key observation here is that in both probabilities, the variable being conditioned on is among the parents of the variable whose probability distribution we seek to compute. (In the previous sentence, if the word “variable” is causing confusion, replace it by “event”.) Thus both probabilities can be obtained from the conditional distributions stored at the nodes, marginalizing out other parents where needed, and we can leverage the network's structure to compute what we want efficiently.
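The logic-plus-probability query can be sketched as a simple filter. The treatment names, side-effect names, probabilities, and the two thresholds that stand in for "high" and "low" are all assumptions made up for this example.

```python
# Hypothetical values read off the network; all names and numbers are invented.
p_cancer_given_treatment = {
    "treatment_a": 0.90,
    "treatment_b": 0.85,
    "treatment_c": 0.10,
}
p_side_effect_given_treatment = {
    "treatment_a": {"nausea": 0.40, "fatigue": 0.10},
    "treatment_b": {"nausea": 0.05, "fatigue": 0.08},
    "treatment_c": {"nausea": 0.01, "fatigue": 0.01},
}

def low_side_effect_treatments(p_disease, p_side_effects,
                               min_disease=0.5, max_side_effect=0.1):
    """Treatments T with P(disease | T) >= min_disease and
    P(SE | T) <= max_side_effect for every side-effect SE."""
    return [
        treatment
        for treatment, p in p_disease.items()
        if p >= min_disease
        and all(q <= max_side_effect for q in p_side_effects[treatment].values())
    ]

selected = low_side_effect_treatments(
    p_cancer_given_treatment, p_side_effect_given_treatment
)
```

Here only `treatment_b` passes both conditions: `treatment_a` fails on nausea and `treatment_c` is not associated with cancer.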

Further Reading


https://www.sciencedirect.com/science/article/pii/S1532046418302041

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5519723/ In this article, disease and symptom mentions are also extracted from unstructured text such as nurse notes. Named entity recognition (NER) techniques are useful for this purpose. (In this case, the named entities are diseases and symptoms.) Check out https://towardsdatascience.com/named-entity-recognition-in-nlp-be09139fa7b8 for more on NER.

http://www.cs.cmu.edu/~guestrin/Class/10701-S05/slides/bns-inference.pdf These slides contain an insightful example:

flu, allergy → sinus, sinus → headache, sinus → nose

Read this as “flu or allergy cause sinus, sinus causes a headache, and sinus can hamper the proper functioning of your nose”.


Translated from: https://towardsdatascience.com/modeling-with-bayesian-networks-c7ebf28a8b6b
