Managing Infodemics: Using Data Science to Slow the Spread of Misinformation

With more people than ever relying on social media to stay updated on current events, the companies that host these platforms have an ethical responsibility to defend against false information. Disinformation, a type of misinformation intended to manipulate and mislead, can create unrest and panic. Other types of misinformation, such as rumors and hoaxes, also have the potential to bring mental and physical harm to unwary readers if left unchecked. The key to stopping the spread of misinformation is to act on it swiftly, since it tends to travel very quickly. In fact, studies show that falsehood spreads exponentially faster than the truth (source). Social media companies have put protocols in place to limit the virality of inaccurate content, but these only take effect once the content has been reviewed by third-party fact-checking partners. The focus is therefore on assessing veracity rapidly. We’ve seen remarkable ingenuity from technology companies here, namely the use of machine learning algorithms to complement fact-checking programs in identifying inaccurate content. However, this is not yet a complete solution. In this article, we’ll study the process and explore how it might evolve.

How Misinformation is Identified

Figure: Fact-Checking Program workflow

The process of evaluating a piece of content’s accuracy begins with an internal screening for potential falsehood. This step uses automation and machine learning models to pick up various signals. If the content is judged to be potential misinformation, it is routed to fact-checking partners for further review. After manual research and/or consultation with the primary source, the fact-checkers assign a content rating. The rating tells the social media company whether action needs to be taken. It also helps train the machine learning models to become better at catching misinformation in the future. Machine learning contributes to the process in two ways:

  • The prediction models significantly reduce the number of reviews that third-party fact-checking partners need to perform
  • Finding duplicate or near-duplicate content frees up capacity for fact-checking partners to review new instances of misinformation
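The second point can be illustrated with a minimal sketch of near-duplicate detection. The article does not say which technique the platforms use; the version below assumes word shingles compared by Jaccard similarity, and the function names, the shingle size `k` and the `threshold` value are all hypothetical choices made for illustration.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(candidate: str, reviewed: list, threshold: float = 0.6) -> bool:
    """True if candidate is similar enough to any post that was already reviewed.

    The 0.6 threshold is an illustrative assumption, not a recommended value.
    """
    cand = shingles(candidate)
    return any(jaccard(cand, shingles(post)) >= threshold for post in reviewed)
```

A post that matches something already rated could then inherit the existing rating instead of taking up a fact-checker's time. Pairwise comparison like this does not scale to a full platform; it only shows the idea.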

It’s a fairly robust process, but not one without challenges. The main ones are:

  • The large and growing number of active users makes the platform a target for coordinated propaganda attacks, bringing urgency and a heavy workload to the fact-checking program
  • The scarcity of verified deceptive content to use as a corpus for training predictive classification models is a roadblock for machine learning methods. This is further exacerbated by the desire for narrower categories of “truthiness”, since each category requires different treatment, which dilutes the available data
  • “Bad actors” who hide misleading context behind genuine content are hard to detect. For example, a meme can use text layered on top of a photo or video to form deceitful content
  • Satire may be misunderstood by people and is even harder for computers to interpret

Figure: Monthly Active Users continue to grow as social media becomes the dominant medium for people to get news

A Closer Look at the Screening Process

Figure: Automation and Machine Learning look for signals to screen content
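To make the screening step more concrete, here is a rough sketch of how signals might be combined into a routing decision. The article does not list the actual signals or how they are weighted, so the `Signals` fields, the weights and the `review_threshold` below are all made up for illustration; a production system would presumably learn them from past fact-checker ratings.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-post signals; the article does not list the real ones."""
    user_reports: int         # times users flagged the post as false
    source_prior_false: int   # earlier posts from this source rated false
    matches_debunked: bool    # near-duplicate of content already rated false


def screening_score(s: Signals) -> float:
    """Combine the signals into a 0..1 suspicion score using made-up weights."""
    score = 0.0
    score += 0.4 * min(s.user_reports, 10) / 10       # capped at 10 reports
    score += 0.3 * min(s.source_prior_false, 5) / 5   # capped at 5 prior ratings
    score += 0.3 * (1.0 if s.matches_debunked else 0.0)
    return score


def route(s: Signals, review_threshold: float = 0.5) -> str:
    """Decide what happens next to a post."""
    if s.matches_debunked:
        return "apply_existing_rating"   # already reviewed, no new human review
    if screening_score(s) >= review_threshold:
        return "send_to_fact_checkers"
    return "no_action"
```

Only posts that score above the threshold reach the fact-checking partners, which is how a screen of this kind would reduce the number of manual reviews.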

In Development

Technology companies are working to improve this process by significantly expanding the databases that help them build artificial intelligence to combat sophisticated attacks such as “deep fakes” and “weaponized memes”. The effectiveness of the algorithms and models depends largely on having a diverse data set to train on. Fortunately, with wide collaboration on data sharing across the technology community, the models are becoming better at understanding content. Nevertheless, this remains a work in progress.

Recommendations

Some measures are worth exploring for more immediate improvement. One recommendation that I’m exploring is the prioritization and specialization of content for third-party fact-checkers. We can run A/B tests that compare review turnaround time and overall virality to measure the impact of these measures.
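As a sketch of what such an A/B comparison could look like, the snippet below applies a two-proportion z-test to the share of flagged posts that went viral before they were reviewed, with and without the new routing. The choice of metric, the function name and the counts are assumptions for illustration; none of them come from a real experiment.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both groups share one proportion."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution via erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: posts that went viral before review, out of all flagged posts.
z, p = two_proportion_z_test(success_a=120, n_a=1000,   # control: current process
                             success_b=80, n_b=1000)    # treatment: new routing
print(f"z = {z:.2f}, p = {p:.4f}")
```

The function does not guard against a pooled proportion of exactly 0 or 1, where the standard error is zero. A difference in turnaround time, being a continuous measure, would call for a different test.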

  • Prioritization: review dangerous content that has a propensity to spread before it goes viral
  • Specialization: direct content to third-party fact-checkers within their area of expertise to cut the time required for review
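The two measures above could be combined in a single structure: one priority queue per topic, so that each fact-checker pulls the most urgent item in their own specialty. This is only a sketch of the idea. The `ReviewQueues` class, the topic labels and the rule that priority equals estimated harm times spread rate are all hypothetical.

```python
import heapq
from collections import defaultdict


class ReviewQueues:
    """One priority queue per topic so each checker pulls from their specialty."""

    def __init__(self):
        self._queues = defaultdict(list)
        self._counter = 0  # tie-breaker so equal priorities pop in insertion order

    def add(self, post_id: str, topic: str, harm: float, spread_rate: float) -> None:
        priority = harm * spread_rate  # hypothetical scoring rule
        self._counter += 1
        # heapq is a min-heap, so store the negated priority to pop the largest first.
        heapq.heappush(self._queues[topic], (-priority, self._counter, post_id))

    def next_for(self, expertise: str):
        """Return the most urgent post id in a checker's area, or None if empty."""
        queue = self._queues.get(expertise)
        if not queue:
            return None
        return heapq.heappop(queue)[2]


queues = ReviewQueues()
queues.add("post-1", topic="health", harm=0.9, spread_rate=50)
queues.add("post-2", topic="health", harm=0.5, spread_rate=500)
queues.add("post-3", topic="politics", harm=0.2, spread_rate=10)
print(queues.next_for("health"))  # the fast-spreading post-2 comes out first
```

Keeping the queues separate means a surge in one topic does not delay reviews in another, at the cost of needing a reliable topic label for every post.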

Summary

The infodemic is a disease that has plagued us since long before the recent health crisis. Without proper management, it can do tremendous harm to our society. Thankfully, there are technological tools to help us mitigate those risks. We reviewed the fact-checking process and, specifically, how machine learning is being applied in this use case.

Translated from: https://towardsdatascience.com/managing-infodemics-slowing-the-spread-of-misinformation-b8b74e3e2618
