Chest CT Scan Machine Learning in 5 Minutes

This post provides an overview of chest CT scan machine learning organized by clinical goal, data representation, task, and model.

A chest CT scan is a grayscale 3-dimensional medical image that depicts the chest, including the heart and lungs. CT scans are used for the diagnosis and monitoring of many different conditions including cancer, fractures, and infections.

Clinical Goal

The clinical goal refers to the medical abnormality that is the focus of the study. The following figure illustrates some example abnormalities, shown as 2D axial slices through the CT volume:

Image sources: Radiology Assistant (pneumonia), Kalpana Bansal (nodules), pulmonarychronicles (honeycombing), Radiopaedia.org (emphysema), TES.com (atelectasis), ResearchGate.

Many CT machine learning papers focus on lung nodules.

Other recent work has looked at pneumonia (lung infection), emphysema (a kind of lung damage that can be caused by smoking), lung cancer, or pneumothorax (air in the space around the lungs rather than inside them).

I have been focused on multiple abnormality prediction, in which the model predicts 83 different abnormal findings simultaneously.

Data

There are several different ways to represent CT data in a machine learning model, illustrated in this figure:

Image by Author

3D representations include a whole CT volume which is roughly 1000 x 512 x 512 pixels, and a 3D patch which can be large (e.g. half or a quarter of a whole volume) or small (e.g. 32 x 32 x 32 pixels).
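
As a rough illustration of the 3D representations, the sketch below cuts a small 3D patch and a "half volume" out of a NumPy array. The toy volume, the (slices, height, width) axis order, the helper name `extract_3d_patch`, and the 16-bit storage type are all assumptions made for this example, not details from any particular study.

```python
import numpy as np

# Hypothetical stand-in for a real scan: a small random volume so the sketch
# runs quickly. A real chest CT would be roughly (1000, 512, 512).
rng = np.random.default_rng(0)
volume = rng.integers(-1000, 1000, size=(100, 128, 128), dtype=np.int16)

def extract_3d_patch(vol, center, size=32):
    """Return a cubic patch of side `size` centered (approximately) at `center`."""
    half = size // 2
    # Clamp each start index so the patch stays inside the volume.
    starts = [min(max(c - half, 0), dim - size) for c, dim in zip(center, vol.shape)]
    z, y, x = starts
    return vol[z:z + size, y:y + size, x:x + size]

small_patch = extract_3d_patch(volume, center=(50, 64, 64))  # small 3D patch
half_volume = volume[: volume.shape[0] // 2]                 # a "large" 3D patch

# Memory for one full-size volume if stored as 16-bit integers (assumed dtype).
full_size_bytes = 1000 * 512 * 512 * 2
print(small_patch.shape, half_volume.shape, full_size_bytes / 1e6, "MB")
```

Under these assumptions a single whole volume takes about half a gigabyte before any network activations are computed, which is one practical reason patches are attractive.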

2.5D representations make use of different perpendicular planes.

  • The axial plane is horizontal like a belt, the coronal plane is vertical like a headband or old-style headphones, and the sagittal plane is vertical like the plane of a bow and arrow in front of an archer.

  • If we take one axial slice, one sagittal slice, and one coronal slice, and stack them up into a 3-channel image, then we have a 2.5D slice representation.

  • If this is done with small patches, e.g. 32 x 32 pixels, then we have a 2.5D patch representation.
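
The 2.5D patch idea can be sketched in a few lines of NumPy. This is a minimal example under stated assumptions: the toy cubic volume, the (z, y, x) axis order, and the helper name `patch_2_5d` are hypothetical, and the function does no bounds checking, so the center must lie at least `size // 2` voxels from every edge.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy volume with assumed axis order (z, y, x).
volume = rng.random((64, 64, 64), dtype=np.float32)

def patch_2_5d(vol, center, size=32):
    """Stack axial, coronal, and sagittal patches through `center` into (3, size, size)."""
    z, y, x = center
    h = size // 2
    axial = vol[z, y - h:y + h, x - h:x + h]      # fixed z: the "belt" plane
    coronal = vol[z - h:z + h, y, x - h:x + h]    # fixed y: the "headband" plane
    sagittal = vol[z - h:z + h, y - h:y + h, x]   # fixed x: the "bow and arrow" plane
    return np.stack([axial, coronal, sagittal], axis=0)

patch = patch_2_5d(volume, center=(32, 32, 32))
print(patch.shape)  # (3, 32, 32)
```

For full slices rather than patches, the three planes of a non-cubic volume have different shapes, so they would need to be resized or cropped to a common size before stacking.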

Finally, 2D representations are also used. This could be a full slice (e.g. 512 x 512), or a 2D patch (e.g. 16 x 16, 32 x 32, 48 x 48). These 2D slices or patches are usually from the axial view.
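
A minimal sketch of the 2D case: take one axial slice and tile it into non-overlapping patches. The toy volume and the helper name `tile_into_patches` are assumptions for illustration; real pipelines often use overlapping or abnormality-centered patches instead.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy volume: 10 axial slices of 512 x 512.
volume = rng.random((10, 512, 512), dtype=np.float32)

axial_slice = volume[5]  # one full 512 x 512 axial slice

def tile_into_patches(slice_2d, size=32):
    """Split a 2D slice into non-overlapping size x size patches."""
    h, w = slice_2d.shape
    assert h % size == 0 and w % size == 0, "slice must divide evenly into patches"
    tiles = slice_2d.reshape(h // size, size, w // size, size)
    # Bring the two grid axes together, then flatten the grid into one patch axis.
    return tiles.swapaxes(1, 2).reshape(-1, size, size)

patches = tile_into_patches(axial_slice, size=32)
print(axial_slice.shape, patches.shape)  # (512, 512) (256, 32, 32)
```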

Task

There are many different tasks in chest CT machine learning.

The following figure illustrates a few tasks:

Image by Author. Sub-images from Yan et al. 2018 (DeepLesion) and Jiang et al. 2019.

Binary classification involves assigning a 1 or 0 to the CT representation, for the presence (1) or absence (0) of an abnormality.

Multi-class classification is for mutually exclusive categories, like different clinical subtypes of interstitial lung disease. In this case the model assigns 1 to exactly one category and 0 to all the others.

Multi-label classification is for non-mutually-exclusive categories, like atelectasis (collapsed lung tissue), cardiomegaly (enlarged heart), and mass. A CT scan might have some, all, or none of these findings, and the model determines which ones if any are present.
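
The difference between the three classification tasks shows up in how raw model scores are turned into 0/1 predictions. The sketch below is illustrative only: the logits are made-up numbers, and the use of softmax with argmax for multi-class and sigmoid with a 0.5 threshold for binary and multi-label is a common convention assumed here, not something specific to any one paper.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (mutually exclusive classes)."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def sigmoid(logits):
    """Turn each raw score into an independent probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical raw model outputs (logits); the numbers are made up.
binary_logit = np.array([2.0])
multiclass_logits = np.array([0.5, 2.0, -1.0])   # e.g. three disease subtypes
multilabel_logits = np.array([-2.0, 3.0, -0.5])  # atelectasis, cardiomegaly, mass

# Binary: one score, thresholded at an assumed cutoff of 0.5.
binary_pred = (sigmoid(binary_logit) >= 0.5).astype(int)
# Multi-class: exactly one category gets a 1 (one-hot vector).
multiclass_pred = np.eye(3, dtype=int)[np.argmax(softmax(multiclass_logits))]
# Multi-label: each category is thresholded independently.
multilabel_pred = (sigmoid(multilabel_logits) >= 0.5).astype(int)

print(binary_pred, multiclass_pred, multilabel_pred)
```

With softmax the probabilities compete with one another, whereas with sigmoid any number of findings, including none, can be predicted present.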

Object detection involves predicting the coordinates of bounding boxes around abnormalities of interest.

Segmentation involves labeling every pixel, which is conceptually like “tracing the outlines of abnormalities and coloring them in.”

Different labels are needed to train these models. “Presence or absence” labels for abnormalities are needed to train classification models, e.g. [atelectasis = 0, cardiomegaly = 1, mass = 0]. Bounding box labels are needed to train an object detection model. Segmentation masks (traced and filled-in outlines) are needed to train a segmentation model. Of the three, only “presence or absence” labels scale to tens of thousands of CT scans, provided they are extracted automatically from free-text radiology reports (e.g. the RAD-ChestCT data set of 36,316 CTs). Segmentation masks are the most time-consuming to obtain because they must be drawn manually on each slice; thus, segmentation studies typically use on the order of 100–1,000 CT scans.
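
One way to see why the three label types differ so much in cost is that they go from fine to coarse: a segmentation mask can be reduced to a bounding box, and a bounding box to a presence flag, but not the other way around. The sketch below uses a hypothetical toy mask and a hypothetical helper `mask_to_bounding_box` to show that reduction.

```python
import numpy as np

# Hypothetical segmentation mask for one abnormality on a toy 8 x 8 x 8 volume:
# 1 marks voxels inside the abnormality, 0 marks everything else.
mask = np.zeros((8, 8, 8), dtype=np.uint8)
mask[2:4, 3:6, 1:5] = 1

def mask_to_bounding_box(m):
    """Return ((z_min, y_min, x_min), (z_max, y_max, x_max)), or None if the mask is empty."""
    coords = np.argwhere(m > 0)
    if coords.size == 0:
        return None
    lower = tuple(int(v) for v in coords.min(axis=0))
    upper = tuple(int(v) for v in coords.max(axis=0))  # inclusive upper corner
    return lower, upper

bounding_box = mask_to_bounding_box(mask)  # object detection label
presence = int(mask.any())                 # classification label (1 = present)
print(bounding_box, presence)
```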

Model

Convolutional neural networks (CNNs) are the most popular machine learning model used on CT data. For a 5-minute intro to CNNs, see Convolutional Neural Networks in 5 minutes, listed under Additional Reading.

  • 3D CNNs are used for whole CT volumes or 3D patches.

  • 2D CNNs are used for 2.5D representations (3 channels, axial/coronal/sagittal), in the same way that 2D CNNs can take a 3-channel RGB image as input (3 channels, red/green/blue).

  • 2D CNNs are used for 2D slices or 2D patches.
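
To make the 2D versus 3D distinction concrete, here is a minimal NumPy sketch of a single 2D filter and a single 3D filter, written without padding or stride. The function names and toy inputs are hypothetical, and the operation is the cross-correlation that deep learning libraries conventionally call "convolution". A real CNN would apply many learned filters per layer through a framework rather than code like this.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_one_filter(image, kernel):
    """Apply one 2D filter to a (channels, H, W) input, with no padding.

    The filter has shape (channels, k, k) and the result sums over all
    channels, so a 3-channel 2.5D input is handled like a 3-channel RGB image.
    """
    k = kernel.shape[-1]
    windows = sliding_window_view(image, (k, k), axis=(1, 2))  # (C, H-k+1, W-k+1, k, k)
    return np.einsum("chwij,cij->hw", windows, kernel)

def conv3d_one_filter(volume, kernel):
    """Apply one 3D filter of shape (k, k, k) to a (D, H, W) volume, with no padding."""
    k = kernel.shape[-1]
    windows = sliding_window_view(volume, (k, k, k))  # (D-k+1, H-k+1, W-k+1, k, k, k)
    return np.einsum("dhwijk,ijk->dhw", windows, kernel)

# Hypothetical toy inputs filled with ones, so every output equals the filter size.
slice_2_5d = np.ones((3, 8, 8))  # axial/coronal/sagittal channels
patch_3d = np.ones((8, 8, 8))

out_2d = conv2d_one_filter(slice_2_5d, np.ones((3, 3, 3)))
out_3d = conv3d_one_filter(patch_3d, np.ones((3, 3, 3)))
print(out_2d.shape, out_3d.shape)  # (6, 6) (6, 6, 6)
```

The 2D filter slides in two directions and collapses the channel axis, while the 3D filter also slides along depth, which is why its output keeps a third spatial dimension.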

Some CNNs combine 2D and 3D convolutions. CNNs can also be “pretrained,” which typically means first training the CNN on a natural image dataset like ImageNet and then refining the CNN’s weights on the CT data.

Here is an example architecture in which a pretrained 2D CNN (ResNet18) is applied to groups of 3 adjacent slices, followed by 3D convolution:

Image by Author
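
The input side of such an architecture can be sketched as a reshape: groups of 3 adjacent slices become 3-channel "images" that a pretrained 2D CNN can accept. The helper name `group_adjacent_slices`, the toy volume, and the choice to drop leftover slices are assumptions for this example; the CNN itself and the later 3D convolution are not shown.

```python
import numpy as np

def group_adjacent_slices(volume, group_size=3):
    """Reshape (num_slices, H, W) into (num_groups, group_size, H, W).

    Trailing slices that do not fill a complete group are dropped here;
    padding the volume instead would be another reasonable choice.
    """
    num_slices, h, w = volume.shape
    num_groups = num_slices // group_size
    usable = volume[: num_groups * group_size]
    return usable.reshape(num_groups, group_size, h, w)

# Hypothetical toy volume: 10 slices of 64 x 64, each filled with its slice index.
slice_index = np.arange(10, dtype=np.float32)[:, None, None]
volume = slice_index * np.ones((1, 64, 64), dtype=np.float32)

groups = group_adjacent_slices(volume)
print(groups.shape)  # (3, 3, 64, 64)
```

Each entry along the first axis can then be treated like one RGB image by the 2D CNN, and the per-group feature maps can be stacked back along the slice direction for the 3D convolution.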

Interstitial Lung Disease Classification Examples

The following table includes several example studies focused on interstitial lung disease, organized by clinical goal, data, task, and model.

  • Clinical goal: these papers are all focused on interstitial lung disease. The exact classes used differ between studies. Some studies focus on clinical groupings like idiopathic pulmonary fibrosis or idiopathic non-specific interstitial pneumonia (e.g. Wang et al. 2019 and Walsh et al. 2018). Other studies focus on lung patterns like reticulation or honeycombing (e.g. Anthimopoulos et al. 2016 and Gao et al. 2016).

  • Data: the data sets consist of 100–1,200 CTs because all of these studies rely on manual labeling of patches, slices, or pixels, which is very time-consuming. The upside of doing patch, slice, or pixel-level classification is that it provides localization information in addition to diagnostic information.

  • Task: the tasks are mostly multi-class classification, in which each patch or slice is assigned to exactly one class out of multiple possible classes.

  • Model: some of the studies use custom CNN architectures, like Wang et al. 2019 and Gao et al. 2018, whereas other studies adapt existing CNN architectures like ResNet and AlexNet.

Additional Reading

  • For a longer, more in-depth article on this topic, see Automatic Interpretation of Chest CT Scans with Machine Learning

  • For an article about machine learning on chest x-rays, which are 2D rather than 3D medical images of the chest, see Automated Chest X-Ray Interpretation

  • For more info about CNNs, see Convolutional Neural Networks in 5 minutes and How Computers See: Intro to Convolutional Neural Networks

  • For more details about segmentation tasks, see Segmentation: U-Net, Mask R-CNN, and Medical Applications

  • For more details about classification tasks, see Multi-label vs. Multi-class Classification: Sigmoid vs. Softmax

Originally published at http://glassboxmedicine.com on August 4, 2020.

Translated from: https://towardsdatascience.com/chest-ct-scan-machine-learning-in-5-minutes-ae7613192fdc
