How to Read Papers Closely with the Help of AI

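- 1. The original paper
- 2. Getting a quick understanding of the paper
- 3. Decoding the technical terms
- 4. Answering your questions
- 5. Translating the abstract and introduction with AI
  - 5.1 **Abstract**
  - 5.2 **Introduction**
- 6. Introducing the paper's "Stack-Propagation"
- 7. Going through the paper's references to see how the Introduction cites research-progress literature ("research progress": the history and state of the field at the time of writing), then finding online one paper published in 2023 or 2024 that uses the ATIS dataset for its experiments, and one paper published in 2023 or 2024 that compares against SMOTE
  - 7.1 How the Introduction cites research-progress literature
  - 7.2 Finding a 2023 or 2024 paper that uses the ATIS dataset for its experiments
  - 7.3 Finding a 2023 or 2024 paper that compares against SMOTE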

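1. The original paper

Still getting a headache from English papers packed with jargon? Faced with a top-conference paper, do you often not know where to start, and still miss the point after reading for ages?

Don't worry, you are far from alone. Fortunately, we live in an era where AI tools are within easy reach.

Using the well-known NLP paper "A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding" (Qin et al., EMNLP 2019) as the example, this article walks you through, step by step, how to use AI to read papers much faster and in much greater depth.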

A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding

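The paper is in English, and just looking at it can be daunting. That's fine: we usually rely on AI for translation. ~~Someone once said~~ A Shanghai Jiao Tong University professor teaching an open course on close reading of CS papers also recommends going straight to AI translation, and only looking up a careful human translation for passages that read awkwardly or that you can't follow. It is best to avoid the translators built into tools such as WPS, though, because the result will most likely miss the meaning!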

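2. Getting a quick understanding of the paper

Here is the prompt:

**Role:** You are now a professional academic assistant with expertise across many fields, especially computer science, mathematics, statistics, engineering disciplines, and deep learning. Your task is to help the user read English papers, translate them into Chinese, summarize their content, and distill the key points. For parts involving mathematical computation, you need to explain in detail how each formula is derived, the computational details of every step, and the logic and conclusions behind them. You also need to explain complex concepts in plain language so that readers without a technical background can follow. Specifically, your workflow is as follows:
1. First, carefully read and understand the English paper the user provides.
2. Then, translate the paper accurately into Chinese, preserving the professionalism and accuracy of the original, with the English and Chinese shown side by side.
3. Next, summarize the paper's core elements, such as its main claims, research methods, experimental results, and conclusions, in concise and clear language.
4. For the mathematical formulas and computations in the paper, provide a detailed step-by-step walkthrough, including but not limited to: where each formula comes from, what each variable means, the key steps of the derivation, and the final conclusion.
5. Finally, explain the above points and complex concepts in plain language, using as few technical terms as possible or giving clear definitions and explanations where they are needed, so that the user can fully understand the paper.

Always treat academic material objectively and rigorously, and make sure the information you convey is accurate.

The reading the AI produces at this point is broad but rough. Because of context-length limits, it may well translate only three or four paragraphs loosely, and the plain-language part tends to be so simplified that you can't really understand how a concept works in a real scenario.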
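One way to work around the context-length limit is to feed the paper to the AI in pieces rather than all at once. The sketch below is only an illustration: `split_into_chunks` and the `max_chars` budget are hypothetical names and values (not part of any tool's API), and it assumes the paper text has paragraphs separated by blank lines.

```python
def split_into_chunks(text: str, max_chars: int = 6000) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk
    rather than being cut mid-sentence.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue  # skip empty paragraphs caused by extra blank lines
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate      # paragraph still fits (or starts a chunk)
        else:
            chunks.append(current)   # close the full chunk
            current = para           # start a new chunk with this paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent in turn together with the role prompt above, so that every part of the paper gets translated and explained instead of only the first few paragraphs.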

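Still, this first pass matters a great deal. While reading, you are bound to run into many questions the AI didn't answer (if you are a beginner), such as: What is the F1 score? How exactly does Stack-Propagation work? What are encoders and decoders? ... If you know little about machine learning, you will probably also wonder what CNNs, RNNs, self-attention, and loss functions are. That's fine! Keep reading with your questions in mind, and mark the places that puzzle you.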

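3. Decoding the technical terms

With the question list in hand, the first obstacle to clear is the most basic one: technical terms. If you can't even read the paper's "language", there is no way to understand its ideas.

If you are new to machine learning, you of course need to know what CNN, RNN, and the Softmax function are, and what every relevant technical term in the paper means, because the paper usually assumes you already know them.

Here is the prompt:

Summarize all the relevant technical terms in the paper and tell me what each one means.
For example: advanced load profile analysis (LPA) refers to ......
Photovoltaic (PV) load identification refers to ......
Vision Transformer (ViT) technology refers to ......
......
Give not only the meaning but also a real-world example.
Note: cover all the relevant terms, not just the ones I mentioned!

A good example is worth a thousand words. Asking the AI for everyday analogies helps you quickly build an intuitive grasp of an abstract concept.

Of course, a rough understanding of these terms is enough here. You don't need to know how they work or how they are implemented under the hood, or it would take far too long!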

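4. Answering your questions

What is the F1 score? How exactly does Stack-Propagation work? What are encoders and decoders? ...

Let the AI answer them for you!

Here is the prompt:

**Topic:** `[insert any concept, technique, or topic you want to understand here]`

**Please explain this topic in the following two stages, so that I can understand it fully, from the basics to the details:**

**Stage 1: Plain-language explanation (so I "know what it is")**
1. **Core analogy:** First, use an extremely accessible everyday analogy to explain the core idea of `[topic]`. Do not use any technical terms in this analogy.
2. **Basic definition:** After the analogy, give a concise, clear basic definition.

**Stage 2: Underlying principles and practical application (so I "understand how it works")**
1. **Pick a typical scenario:** Choose the most common real-world application scenario that best shows the value of `[topic]`.
2. **Break down the workflow:**
   - **Input and output:** In this scenario, what is its "input" and what is its "output"?
   - **The "black box" process:** Break down, in detail and step by step, what happens between "input" and "output". This is the key part: explain the core principles and mechanisms inside, and do not skip any key step. If an algorithm is involved, explain its key decision logic.
3. **Value and rationale:** Explain why `[topic]` is needed in this scenario. What specific problem does it solve? Why choose it over other methods?

**Final goal:**
After your explanation, I should be able to do two things:
- Explain what `[topic]` is to someone with no background at all, in my own words, using that everyday analogy.
- Understand how `[topic]` is actually implemented in a real-world scenario and the core working principle behind it.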
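If you have marked a whole list of questions, filling in the template by hand gets tedious. A minimal sketch, assuming an abbreviated version of the prompt above stored in a hypothetical `EXPLAIN_TEMPLATE` string (in practice you would paste in the full prompt text):

```python
# Abbreviated stand-in for the two-stage prompt above; {topic} marks the slot
# that replaces the [topic] placeholder.
EXPLAIN_TEMPLATE = (
    "Topic: {topic}\n"
    "Stage 1: explain {topic} with an everyday analogy, then give a basic definition.\n"
    "Stage 2: pick a typical scenario for {topic}, walk through its input, output, "
    "and the steps in between, and explain why it is used instead of alternatives."
)


def build_explain_prompts(topics: list[str]) -> list[str]:
    """Return one filled-in prompt per non-empty topic, in the original order."""
    return [EXPLAIN_TEMPLATE.format(topic=t.strip()) for t in topics if t.strip()]


questions = ["F1 score", "Stack-Propagation", "encoder and decoder"]
prompts = build_explain_prompts(questions)
```

Each string in `prompts` can then be pasted into the chat one at a time, so every marked question gets the same two-stage treatment.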

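If you have read this far and know what all the technical terms mean, you should now have a rough idea of what the paper studies and its core idea. I have prepared the following tasks; let's keep exploring with them in mind.

Task 1: Translate the Abstract and Introduction
Task 2: Briefly introduce the paper's "Stack-Propagation"
Task 3: Go through the paper's references to see how the Introduction cites research-progress literature ("research progress": the history and state of the field at the time of writing), then find online one paper published in 2023 or 2024 that uses the ATIS dataset for its experiments, and one paper published in 2023 or 2024 that compares against SMOTE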

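5. Translating the abstract and introduction with AI

The abstract is the soul of a paper, and the introduction is its map. Master these two parts and you have grasped about 80% of the paper.

Tips for translation:

- Tell the AI you want the English and Chinese side by side
- Tell the AI you want an interpretation

Here is the prompt:

Please translate the paper's abstract and introduction for me. Workflow:
1. **Side-by-side translation:** Translate the paper accurately into Chinese. Provide a **side-by-side English-Chinese** version so I can check the original at any time. The translation must remain academically professional and accurate.
2. **Core interpretation 【Interpretation】:**
   - After the translation, provide a core summary section titled **【Interpretation】**.
   - This section should, like an expert, get straight to the point and distill the "skeleton" of the whole paper, clearly summarizing its essence. You **must** use the following structured format:
     - **1. Problem:** What core problem does the paper aim to solve? What are the shortcomings of existing methods?
     - **2. Solution:** What core idea, model, or framework do the authors propose to solve the problem?
     - **3. Optimization/Innovation:** Compared with previous methods, what are the specific innovations or optimizations of this approach? (If there are several, list them point by point.)
     - **4. Result:** What are the experimental results? Did they reach the intended goal (for example, SOTA, i.e. state of the art)?
     - **5. Enhancement:** (Optional) Did the authors use an existing powerful technique such as BERT to further improve the model or approach?

**Final requirement:**
Always maintain an objective, rigorous academic attitude. Your answer should be clearly structured: first let me quickly grasp the whole paper through 【Interpretation】, then help me understand the implementation details through in-depth analysis.

Below is the answer to Task 1:

5.1 Abstract

【English original】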

Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. The two tasks are closely tied and the slots often highly depend on the intent. In this paper, we propose a novel framework for SLU to better incorporate the intent information, which further guides the slot filling. In our framework, we adopt a joint model with Stack-Propagation which can directly use the intent information as input for slot filling, thus to capture the intent semantic knowledge. In addition, to further alleviate the error propagation, we perform the token-level intent detection for the Stack-Propagation framework. Experiments on two publicly datasets show that our model achieves the state-of-the-art performance and outperforms other previous methods by a large margin. Finally, we use the Bidirectional Encoder Representation from Transformer (BERT) model in our framework, which further boost our performance in SLU task.

【Chinese translation】

意圖檢測和槽位填充是構建口語理解（SLU）系統的兩大主要任務。這兩個任務聯系緊密，并且槽位經常高度依賴于意圖。在這篇論文中，我們為SLU更好地融合意圖信息構建了一個新穎的框架，進一步指導槽位填充。在我們的框架中，我們采用了一個帶有棧傳播機制的聯合模型，這個模型可以直接讓意圖信息作為槽位填充的輸入，以此來獲取意圖的語義知識。此外，為了進一步緩解錯誤傳播問題，我們為棧傳播框架執行了token級別的意圖檢測。在兩個公開數據集上的實驗表明，我們的模型取得了當前最先進的表現，并大幅超越了以往的其他方法。最后，我們在框架中使用了BERT模型，進一步提升了我們在SLU任務上的表現。

5.2 Introduction

【English original】

Spoken language understanding (SLU) is a critical component in task-oriented dialogue systems. It usually consists of intent detection to identify users’ intents and slot filling task to extract semantic constituents from the natural language utterances (Tur and De Mori, 2011). As shown in Table 1, given a movie-related utterance “watch action movie”, there are different slot labels for each token and an intent for the whole utterance.

Usually, intent detection and slot filling are implemented separately. But intuitively, these two tasks are not independent and the slots often highly depend on the intent (Goo et al., 2018). For example, if the intent of a utterance is WatchMovie, it is more likely to contain the slot movie_name rather than the slot music_name. Hence, it is promising to incorporate the intent information to guide the slot filling.

Considering this strong correlation between the two tasks, some joint models are proposed based on the multi-task learning framework (Zhang and Wang, 2016; Hakkani-Tür et al., 2016; Liu and Lane, 2016) and all these models outperform the pipeline models via mutual enhancement between two tasks. However, their work just modeled the relationship between intent and slots by sharing parameters. Recently, some work begins to model the intent information for slot filling explicitly in joint model. Goo et al. (2018) and Li et al. (2018) proposed the gate mechanism to explore incorporating the intent information for slot filling. Though achieving the promising performance, their models still suffer from two issues including: (1) They all adopt the gate vector to incorporate the intent information. In the paper, we argue that it is risky to simply rely on the gate function to summarize or memorize the intent information. Besides, the interpretability of how the intent information guides slot filling procedure is still weak due to the interaction with hidden vector between the two tasks. (2) The utterance-level intent information they use for slot filling may mislead the prediction for all slots in an utterance if the predicted utterance-level intent is incorrect.

In this paper, we propose a novel framework to address both two issues above. For the first issue, inspired by the Stack-Propagation which was proposed by Zhang and Weiss (2016) to leverage the POS tagging features for parsing and achieved good performance, we propose a joint model with Stack-Propagation for SLU tasks. Our framework directly use the output of the intent detection as the input for slot filling to better guide the slot prediction process. In addition, the framework make it easy to design oracle intent experiment to intuitively show how intent information enhances slot filling task. For the second issue, we perform a token-level intent prediction in our framework, which can provide the token-level intent information for slot filling. If some token-level intents in the utterance are predicted incorrectly, other correct token-level intents will still be useful for the corresponding slot prediction. In practice, we use a self-attentive encoder for intent detection to capture the contextual information at each token and hence predict an intent label at each token. The intent of an utterance is computed by voting from predictions at each token of the utterance. This token-level prediction, like ensemble neural networks (Lee et al., 2016), reduces the predicted variance to improve the performance of intent detection. And it fits better in our Stack-Propagation framework, where intent detection can provide token-level intent features and retain more useful intent information for slot filling.

We conduct experiments on two benchmarks SNIPS (Coucke et al., 2018) and ATIS (Goo et al., 2018) datasets. The results of both experiments show the effectiveness of our framework by outperforming the current state-of-the-art methods by a large margin. Finally, Bidirectional Encoder Representation from Transformer (Devlin et al., 2018, BERT), as the pre-trained model, is used to further boost the performance of our model.

To summarize, the contributions of this work are as follows:

- We propose a Stack-Propagation framework in SLU task, which can better incorporate the intent semantic knowledge to guide the slot filling and make our joint model more interpretable.
- We perform the token-level intent detection for Stack-Propagation framework, which improves the intent detection performance and further alleviate the error propagation.
- We present extensive experiments demonstrating the benefit of our proposed framework. Our experiments on two publicly available datasets show substantial improvement and our framework achieve the state-of-the-art performance.
- We explore and analyze the effect of incorporating BERT in SLU tasks.

【Chinese translation】

口語理解（Spoken Language Understanding, SLU）是面向任務的對話系統中的一個關鍵組成部分。它通常包括意圖檢測來識別用戶意圖和槽位填充來從自然語言語句中提取語義成分兩個任務 (Tur and De Mori, 2011)。如表1所示，對于一個與電影相關的語句“watch action movie”，每個token都有不同的槽位標簽，而整個語句則對應一個整體的意圖。

通常情況下，意圖檢測和槽位填充是分開執行的。但直觀上，這兩個任務并非各自獨立，槽位往往在很大程度上依賴于意圖 (Goo et al., 2018)。例如，如果一個語句的意圖是“WatchMovie”，那么它就更有可能包含movie_name這個槽位，而不是music_name。因此，引入意圖信息來指導槽位填充是很有前景的。

考慮到這兩個任務之間的強相關性，一些基于多任務學習框架的聯合模型被提了出來 (Zhang and Wang, 2016; Hakkani-Tür et al., 2016; Liu and Lane, 2016)，并且這些模型通過兩個任務間的相互促進，表現均優于流水線模型。然而，他們的工作僅僅通過共享參數來對意圖和槽位之間的關系進行建模。最近，一些工作開始在聯合模型中顯式地為槽位填充任務對意圖信息進行建模。Goo等人 (2018) 和Li等人 (2018) 提出了門控機制來探索如何將意圖信息融入槽位填充。盡管取得了不錯的表現，但他們的模型仍然存在兩個問題：(1) 他們都采用門控向量來融合意圖信息。我們在論文中指出，僅僅依靠門控函數來總結或記憶意圖信息是有風險的。此外，由于兩個任務之間是通過隱藏向量進行交互的，意圖信息如何指導槽位填充過程的可解釋性仍然很弱。(2) 如果預測的語句級意圖不正確，他們用于槽位填充的語句級意圖信息可能會誤導該語句中所有槽位的預測。

在本文中，我們提出了一個新的框架來解決上述兩個問題。針對第一個問題，我們受到Stack-Propagation的啟發——該方法由Zhang和Weiss (2016) 提出，利用詞性標注的特征來進行句法分析并取得了良好表現——我們為口語理解任務提出了一個帶有Stack-Propagation的聯合模型。我們的框架直接將意圖檢測的輸出作為槽位填充的輸入，從而更好地指導槽位預測過程。此外，該框架也使得設計oracle意圖實驗變得容易，可以直觀地展示意圖信息如何增強槽位填充任務。針對第二個問題，我們在框架中執行了token級的意圖檢測，這可以為槽位填充提供token級的意圖信息。如果語句中的某些token級意圖預測錯誤，其他正確的token級意圖對于相應的槽位預測仍然有用。在實踐中，我們使用一個自注意力編碼器進行意圖檢測，以捕捉每個token的上下文信息，從而預測每個token的意圖標簽。整個語句的意圖是通過對每個token的預測結果進行投票來計算的。這種token級的預測，類似于集成神經網絡 (Lee et al., 2016)，通過降低預測方差來提高意圖檢測的表現。并且它更適合我們的Stack-Propagation框架，在該框架中，意圖檢測可以提供token級的意圖特征，并為槽位填充保留更多有用的意圖信息。

我們在兩個基準數據集SNIPS (Coucke et al., 2018) 和 ATIS (Goo et al., 2018) 上進行了實驗。兩個實驗的結果都顯示了我們框架的有效性，其表現大幅超越了當前最先進的方法。最后，我們在模型中使用了預訓練的BERT模型 (Devlin et al., 2018)，進一步提升了我們在口語理解任務上的表現。

總而言之，本文的貢獻如下：

- 我們提出了一個用于口語理解任務的Stack-Propagation框架，該框架可以更好地融合意圖語義知識來指導槽位填充，并使我們的聯合模型更具可解釋性。
- 我們為Stack-Propagation框架執行了token級的意圖檢測，這提升了意圖檢測的表現，并進一步減輕了錯誤傳播。
- 我們通過大量的實驗證明了我們所提出框架的優勢。我們在兩個公開數據集上的實驗顯示出顯著的改進，并且我們的框架達到了最先進的表現。
- 我們探索并分析了在口語理解任務中引入BERT模型的效果。
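The voting step described in the Introduction ("The intent of an utterance is computed by voting from predictions at each token") can be illustrated with a simple majority vote. This is a minimal sketch, not the authors' code: the function name and the intent labels are made up for illustration, and ties are resolved in favor of the label seen first, which is an assumption of this sketch.

```python
from collections import Counter


def vote_utterance_intent(token_intents: list[str]) -> str:
    """Return the intent label predicted for the most tokens.

    Counter.most_common keeps first-seen order among equal counts,
    so a tie goes to the label that appears earliest in the utterance.
    """
    if not token_intents:
        raise ValueError("need at least one token-level prediction")
    return Counter(token_intents).most_common(1)[0][0]


# One token is mispredicted, but the utterance-level result is still correct.
token_level = ["BookFlight", "BookFlight", "GetWeather", "BookFlight"]
utterance_intent = vote_utterance_intent(token_level)
print(utterance_intent)  # BookFlight
```

This also shows why token-level prediction softens error propagation: a single wrong token-level intent is outvoted instead of misleading the whole utterance.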

6. Introducing the paper's "Stack-Propagation"

Task 2 can be handled with the prompt from Section 4, "Answering your questions".

Below is the answer to Task 2:

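Stack-Propagation is a framework for multi-task learning. It concatenates the output of intent detection with the encoded representation of the original input to form a new input, which is passed to slot filling, so that intent information is explicitly incorporated into the slot filling task.

For example, suppose the user says:

"Book me a flight to Shanghai tomorrow"

The input is this string, and the output is in JSON format:

{"intent": "BookFlight", "slots": {"departure_date": "tomorrow", "destination_city": "Shanghai"}}

Stack-Propagation is what happens in between. Specifically:

- Embedding: each word in the sentence ("Book", "me", "a", "flight", "to", "Shanghai", "tomorrow") is converted into a word vector that carries the word's original semantic information.
- Encoding: a BiLSTM combined with self-attention (the paper's "self-attentive encoder") reads the sentence in both directions to work out what each word means in the current context. After this step, the vector for "Shanghai" no longer stands only for the city; it has also absorbed information from the nearby "to" and "flight", so its features point strongly to "a travel destination". The output of this step is a new, context-aware vector $e_i$ for each word.
- Intent detection: the input is the vector $e_i$. After the intent decoder, which ends in a linear layer and a Softmax function, the output $y_i^I$ is the probability of each possible intent for that word in the current context.
- Stack-Propagation: the input is each word's $y_i^I$ and $e_i$, which are simply concatenated, that is, [the intent probability list for this word] + [the context-rich word vector]. The output is $(y_i^I \oplus e_i)$.
- Slot filling: the input is $h_{i-1}^S$, $y_{i-1}^S$, and $(y_i^I \oplus e_i)$. For each token $t_i$, an LSTM combines the three parts to update its state, $h_i^S = f(h_{i-1}^S, y_{i-1}^S, y_i^I \oplus e_i)$, which is converted into a probability distribution $y_i^S = \mathrm{softmax}(W_S h_i^S)$. The label at the position of the largest value in that distribution is returned, $o_i^S = \mathrm{argmax}(y_i^S)$. The output $o_i^S$ is the final slot label for the $i$-th word.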
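The steps above can be sketched in a few lines of NumPy. This is a toy illustration under loud assumptions, not the paper's implementation: the encoder output `e` and all weights are random stand-ins, the recurrent function $f$ is a plain tanh cell instead of an LSTM, intent detection is reduced to a single linear layer plus softmax, and every name and dimension here is hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    ez = np.exp(z)
    return ez / ez.sum(axis=axis, keepdims=True)


def stack_propagation_forward(e, W_I, W_f, W_S):
    """Toy forward pass: token-level intents -> concatenation -> slot labels."""
    T = e.shape[0]
    y_I = softmax(e @ W_I)                       # (T, n_intents): y_i^I per token
    stacked = np.concatenate([y_I, e], axis=1)   # (T, n_intents + d_e): y_i^I (+) e_i

    h = np.zeros(W_f.shape[0])                   # h_0^S
    y_prev = np.zeros(W_S.shape[0])              # y_0^S
    slot_ids = []
    for i in range(T):
        x = np.concatenate([h, y_prev, stacked[i]])
        h = np.tanh(W_f @ x)                     # h_i^S = f(h_{i-1}^S, y_{i-1}^S, y_i^I (+) e_i)
        y_S = softmax(W_S @ h)                   # y_i^S = softmax(W_S h_i^S)
        slot_ids.append(int(np.argmax(y_S)))     # o_i^S = argmax(y_i^S)
        y_prev = y_S

    # Utterance-level intent: majority vote over token-level argmax predictions.
    utterance_intent = int(np.bincount(y_I.argmax(axis=1)).argmax())
    return y_I, stacked, slot_ids, utterance_intent


rng = np.random.default_rng(0)
T, d_e, d_h, n_intents, n_slots = 7, 8, 16, 3, 5   # 7 tokens, toy sizes
e = rng.normal(size=(T, d_e))                      # stand-in for encoder outputs
W_I = rng.normal(size=(d_e, n_intents))
W_f = rng.normal(size=(d_h, d_h + n_slots + n_intents + d_e))
W_S = rng.normal(size=(n_slots, d_h))

y_I, stacked, slot_ids, intent = stack_propagation_forward(e, W_I, W_f, W_S)
print(stacked.shape)  # (7, 11)
```

The point to notice is the `stacked` array: the slot decoder never sees the encoder output alone, only the encoder output with each token's intent distribution attached, which is what makes the intent information an explicit input to slot filling.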

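7. Going through the paper's references to see how the Introduction cites research-progress literature ("research progress": the history and state of the field at the time of writing), then finding online one paper published in 2023 or 2024 that uses the ATIS dataset for its experiments, and one paper published in 2023 or 2024 that compares against SMOTE

7.1 How the Introduction cites research-progress literature

Adding authority:
- Quote: It usually consists of intent detection to identify users' intents and slot filling task to extract semantic constituents from the natural language utterances (Tur and De Mori, 2011).
- Analysis: After stating that the core tasks of SLU are intent detection and slot filling, the authors cite an established work to add authority and show that this is a notion accepted across the field.
Pointing out how the tasks are related:
- Quote: But intuitively, these two tasks are not independent and the slots often highly depend on the intent (Goo et al., 2018).
- Analysis: The cited paper supports the claim that the two tasks are related, with the latter depending on the former.
Reviewing the early development of joint models:
- Quote: Considering this strong correlation between the two tasks, some joint models are proposed based on the multi-task learning framework (Zhang and Wang, 2016; Hakkani-Tür et al., 2016; Liu and Lane, 2016) and all these models outperform the pipeline models via mutual enhancement between two tasks.
- Analysis: Having recognized that the tasks are correlated, the authors introduce the joint-model solutions proposed by the research community, citing three landmark papers:
  - Zhang and Wang (2016): proposed an RNN-based joint model.
  - Hakkani-Tür et al. (2016): proposed multi-domain joint semantic parsing with a single recurrent neural network.
  - Liu and Lane (2016): proposed an attention-based joint model.
- Citing these key works puts the focus on joint models and sets the stage for the joint model the authors propose later.
Pointing out the limitations of current mainstream methods:
- Quote: Goo et al. (2018) and Li et al. (2018) proposed the gate mechanism to explore incorporating the intent information for slot filling.
- Analysis: A standard move in papers: pointing out the limitations of other methods sets up the case for the advantages of one's own.
Introducing this paper's new idea:
- Quote: For the first issue, inspired by the Stack-Propagation which was proposed by Zhang and Weiss (2016) to leverage the POS tagging features for parsing and achieved good performance...
- Analysis: This states the theoretical origin of the method, making it more reasonable and convincing.
- Quote: This token-level prediction, like ensemble neural networks (Lee et al., 2016), reduces the predicted variance to improve the performance of intent detection.
- Analysis: An analogy that helps readers understand how token-level intent detection works.
Naming the benchmark datasets used in the experiments:
- Quote: We conduct experiments on two benchmarks SNIPS (Coucke et al., 2018) and ATIS (Goo et al., 2018) datasets.
- Analysis: Public datasets make comparison possible. Experiments on public datasets are generally preferable to experiments on private ones, because the results can be compared with other work to see whether the method is actually better.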

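7.2 Finding a 2023 or 2024 paper that uses the ATIS dataset for its experiments

Google Scholar is the usual tool for finding papers, though I prefer an aggregated version.

Title: Joint intent detection and slot filling with syntactic and semantic features using multichannel CNN-BiLSTM
Published in: PeerJ Computer Science, 8 October 2024
Summary: This paper proposes a new deep learning model called MCNN-BiLSTM. Its core innovation is a multichannel architecture that fuses three kinds of linguistic features, contextual (BERT), general semantic (word2vec), and syntactic (part-of-speech tags), so that intent detection and slot filling in dialogue systems can be performed jointly in a more comprehensive and accurate way. It reports performance surpassing earlier advanced methods on two standard datasets.
Link: Joint intent detection and slot filling with syntactic and semantic features using multichannel CNN-BiLSTM [PeerJ]

7.3 Finding a 2023 or 2024 paper that compares against SMOTE

Title: CLE-SMOTE: Addressing Extreme Imbalanced Data Classification with Contrastive Learning-Enhanced SMOTE
Published: 04 Mar 2024, Last Modified: 02 May 2024
Summary: This paper studies how to handle classification under extreme class imbalance in machine learning. The widely used SMOTE oversampling technique sometimes generates "harmful" noisy samples that hurt model performance, so the authors propose a new method called CLE-SMOTE. It optimizes training in two stages. First, a network is pre-trained with contrastive learning so that it learns a clear, distinct feature space for each class. Then, during fine-tuning, a novel capped loss function assesses the quality of each SMOTE-synthesized sample and limits the negative effect on training of low-quality noisy samples that drift away from the core features of their class. The minority class is thus augmented effectively, while the model's robustness and classification accuracy on imbalanced data improve significantly.
PDF: https://openreview.net/pdf?id=e641mv2xsf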
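For readers who have not met SMOTE before, the oversampling step that the summary above refers to can be sketched as follows. This is a minimal illustration of the basic SMOTE interpolation idea only (a synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbors); it is not CLE-SMOTE, and the function name and parameters are hypothetical.

```python
import numpy as np


def smote_sample(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by neighbor interpolation."""
    rng = np.random.default_rng() if rng is None else rng
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances; the diagonal is masked so that a sample
    # is never chosen as its own neighbor.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]          # (n, k) nearest indices

    base = rng.integers(0, n, size=n_new)             # random base samples
    nn = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                      # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[nn] - X_min[base])


X_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sample(X_minority, n_new=10, k=2, rng=np.random.default_rng(0))
print(synthetic.shape)  # (10, 2)
```

Because every synthetic point lies between two existing minority samples, a pair of neighbors that straddles another class can produce exactly the kind of noisy sample the paper sets out to down-weight.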