Visualizing Classification Results in Python
I love good data visualizations. Back when I was doing my PhD in particle physics, I was stunned by the histograms my colleagues built and by how much information they packed into a single plot.
Information in Plots
It is genuinely challenging to improve existing visualization methods or to transfer methods from other research fields. You have to think about the dimensions in your plot and about ways to add more of them. A good example is the path from a box plot to a violin plot to a swarm plot: each step adds a dimension and therefore information.
The possibilities for adding information or dimensions to a plot are almost endless. Categories can be encoded with different marker shapes, a color map (as in a heat map) can serve as another dimension, and marker size can convey a further parameter.
Plots of Classifier Performance
When it comes to machine learning, there are many ways to plot the performance of a classifier. There is an overwhelming number of metrics for comparing estimators, such as accuracy, precision, recall, or the helpful MCC (Matthews correlation coefficient).
All of the common classification metrics are calculated from the counts of true positives, true negatives, false positives, and false negatives. The most popular plots are the ROC curve, the precision-recall curve (PRC), the CAP curve, and the confusion matrix.
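As a minimal sketch of how these metrics follow from the four counts, the snippet below computes accuracy, precision, recall, and MCC for a binary classifier. The function name `binary_metrics` and the example counts are made up for illustration and do not come from the article.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute common metrics from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (e.g. no positive predictions at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
print(metrics)
```

If scikit-learn is available, its `sklearn.metrics` module provides these metrics directly from label arrays, which is usually preferable to hand-rolled formulas.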
I won’t go into detail on the three curves, but there are many different ways to present the confusion matrix, such as overlaying a heat map.
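One way to draw a confusion matrix as a heat map is sketched below with matplotlib. This is an illustration under stated assumptions, not the code behind the article's figure: the helper names `row_normalize` and `plot_confusion_heatmap` are hypothetical, rows are assumed to be true classes and columns predicted classes, and the color encodes the fraction of each true class.

```python
def row_normalize(cm):
    """Turn counts into per-true-class fractions (each non-empty row sums to 1)."""
    return [
        [count / sum(row) if sum(row) else 0.0 for count in row]
        for row in cm
    ]


def plot_confusion_heatmap(cm, labels):
    """Draw cm (rows: true class, columns: predicted class) as a heat map."""
    # Imported here so that row_normalize can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    im = ax.imshow(row_normalize(cm), cmap="Blues", vmin=0.0, vmax=1.0)
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    ax.set_xlabel("Predicted class")
    ax.set_ylabel("True class")
    # Write the raw count into each cell on top of the color.
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            ax.text(j, i, str(count), ha="center", va="center")
    fig.colorbar(im, ax=ax, label="Fraction of true class")
    return fig


# Hypothetical three-class confusion matrix, for illustration only.
example_cm = [[50, 3, 2], [4, 20, 6], [1, 2, 12]]
# fig = plot_confusion_heatmap(example_cm, ["A", "B", "C"]); fig.savefig("cm.png")
```

Normalizing by row keeps small classes visible in the color scale, while the printed counts preserve the absolute numbers.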

A Classification Mosaic Diagram
In many cases, a confusion matrix is probably sufficient and makes it easy to pick up all relevant information, but for a multi-class problem this can get much harder.
While reading some papers, I stumbled across:
Jakob Raymaekers, Peter J. Rousseeuw, Mia Hubert. Visualizing classification results. arXiv:2007.14495 [stat.ML]
which in turn led me to:
Friendly, Michael. “Mosaic Displays for Multi-Way Contingency Tables.” Journal of the American Statistical Association, vol. 89, no. 425, 1994, pp. 190–200. JSTOR, www.jstor.org/stable/2291215. Accessed 13 Aug. 2020.
The authors propose a mosaic diagram for plotting discrete values. We can transfer this idea to machine learning by treating the predicted classes as the discrete values.
In a multi-class setting, such a plot would look like the following:

It has several advantages over a classical confusion matrix. One can easily see the predicted classes on the y-axis and the proportion of each class on the x-axis. The big difference from a simple stacked bar plot is the width of the bars, which gives an idea of the class imbalance.
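The layout described above can be sketched from a confusion matrix as follows. This is a minimal illustration rather than the author's implementation: it assumes rows are true classes and columns are predicted classes, gives each true class one bar whose width is its share of all samples, and stacks the predicted-class proportions inside that bar. The names `mosaic_rectangles` and `plot_mosaic` are hypothetical.

```python
def mosaic_rectangles(cm):
    """Return (true_idx, pred_idx, x, y, width, height) tuples in the unit square.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    rects = []
    x = 0.0
    for i, row in enumerate(cm):
        row_sum = sum(row)
        width = row_sum / total  # bar width shows the class share
        y = 0.0
        for j, count in enumerate(row):
            height = count / row_sum if row_sum else 0.0
            rects.append((i, j, x, y, width, height))
            y += height
        x += width
    return rects


def plot_mosaic(cm, labels):
    """Draw the mosaic diagram; colors encode the predicted class."""
    # Imported here so that mosaic_rectangles works without matplotlib.
    import matplotlib.pyplot as plt
    from matplotlib.patches import Patch

    cmap = plt.get_cmap("tab10")
    rects = mosaic_rectangles(cm)
    fig, ax = plt.subplots()
    for i, j, x, y, w, h in rects:
        ax.bar(x, h, width=w, bottom=y, align="edge",
               color=cmap(j % 10), edgecolor="white")
    # Put one tick at the center of each true-class bar.
    centers = [x + w / 2 for i, j, x, y, w, h in rects if j == 0]
    ax.set_xticks(centers)
    ax.set_xticklabels(labels)
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_xlabel("True class (bar width = share of samples)")
    ax.set_ylabel("Proportion of predicted classes")
    handles = [Patch(color=cmap(j % 10), label=label)
               for j, label in enumerate(labels)]
    ax.legend(handles=handles, title="Predicted class")
    return fig


# Hypothetical imbalanced three-class confusion matrix, for illustration only.
example_cm = [[50, 3, 2], [4, 20, 6], [1, 2, 12]]
# fig = plot_mosaic(example_cm, ["A", "B", "C"]); fig.savefig("mosaic.png")
```

Keeping the geometry in its own function makes it easy to check that the rectangles tile the unit square before any drawing happens.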
You can find the code for such a plot, which takes a confusion matrix as input, here:
Have fun plotting your next classification results!
Translated from: https://towardsdatascience.com/a-different-way-to-visualize-classification-results-c4d45a0a37bb