Why are you seeing GPT-3 everywhere?

openai-gpt

Disclaimer: My opinions are informed by my experience maintaining Cortex, an open source platform for machine learning engineering.

If you frequent any part of the tech internet, you’ve come across GPT-3, OpenAI’s new state-of-the-art language model. Hype cycles around new technology aren’t new (GPT-3’s predecessor, GPT-2, generated quite a few headlines as well), but GPT-3 is in a league of its own.

Over the last couple of months, Hacker News has seen dozens of hugely popular posts, all about GPT-3.

If you’re on Twitter, you’ve no doubt seen projects built on GPT-3 going viral, such as one by an Apple engineer who used GPT-3 to write JavaScript against a specific 3D rendering library.

And of course, there have been plenty of “Is this the beginning of SkyNet?” articles written:

Nuanced and insightful journalism courtesy of Coindesk

The excitement over GPT-3 is just one piece of a bigger trend. Every month, we see more and more new initiatives released, all built on machine learning.

GPT-3 is a useful case study for understanding why this is happening and what the trend’s broader implications are.

What’s so special about GPT-3?

The obvious take here is that GPT-3 is simply more powerful than any other language model, and that the increase in production machine learning lately can be chalked up to similar improvements across the field.

Undoubtedly, yes. This is a factor. But, and this is crucial, GPT-3 isn’t so popular just because it’s powerful. GPT-3 is ubiquitous because it is usable.

By “usable,” I mean that anyone can build with it, and it’s easy. For context, after the full GPT-2 was released, most of the popular projects built on it were built by machine learning specialists, and required substantial effort:

By comparison, it has only been a couple of months since GPT-3’s announcement, and we’re already seeing dozens of viral projects built on it, often of the “I got bored and built this in an afternoon” variety.

Anyone with some basic engineering chops can now build an application leveraging state-of-the-art machine learning, and this increase in the usability of models, not just their raw power, is an industry-wide phenomenon.

Why it’s suddenly so easy to build with machine learning

One of the biggest blockers to using machine learning in production has been infrastructure. We’ve had models capable of doing incredible things for a long time, but actually building with them has remained a major challenge.

For example, consider GPT-2. How would you build a GPT-2 application?

Intuitively, the model is more or less an input-output machine, and the most logical thing to do would be to treat it as some sort of microservice, a predict() function your application could call. Pass in some text and receive GPT-2 generated text in return, just like any other API.
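
To make the "predict() as a microservice" idea concrete, here is a minimal sketch using only Python's standard library. It is an illustration, not how any particular platform does it: the `predict` function is a hypothetical stub that echoes its input (a real service would run GPT-2 inference there), and the `{"text": ...}` JSON shape, `PredictHandler`, and `make_server` are names chosen for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(prompt: str) -> str:
    """Hypothetical stand-in for model inference.

    A real service would run the loaded GPT-2 model here. This stub only
    appends a marker so the example runs without a model or a GPU.
    """
    return prompt + " [generated text would go here]"


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"text": "Once upon a time"}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")

        # Call the model and wrap its output in a JSON response.
        body = json.dumps({"text": predict(payload.get("text", ""))}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Suppress per-request logging to keep the example quiet.


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) a local server exposing predict() over HTTP."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling `make_server().serve_forever()` would start the service locally. The application code then only needs to POST text and read text back, which is exactly what makes the microservice framing attractive. The hard part is everything behind `predict`.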

This is the main way of deploying GPT-2 (what is known as realtime inference), and it comes with some serious challenges:

  • GPT-2 is massive. The fully trained model is roughly 6 GB. Hosting a GPT-2 microservice requires a lot of disk space.

  • GPT-2 is compute-hungry. Without at least one GPU, you will not be able to generate predictions with anywhere near acceptable latency.

  • GPT-2 is expensive. Given the above, you need to deploy GPT-2 to a cluster provisioned with large GPU instances—very expensive at scale.
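
The "roughly 6 GB" figure can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes the full GPT-2 has about 1.5 billion parameters and that each weight is stored as a 32-bit float (4 bytes). Both numbers are assumptions of this example rather than statements from the article, and `model_size_gb` is a hypothetical helper.

```python
def model_size_gb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Assumption: the full GPT-2 has roughly 1.5 billion parameters.
GPT2_FULL_PARAMETERS = 1_500_000_000

print(f"float32 weights: {model_size_gb(GPT2_FULL_PARAMETERS):.1f} GB")     # 6.0 GB
print(f"float16 weights: {model_size_gb(GPT2_FULL_PARAMETERS, 2):.1f} GB")  # 3.0 GB
```

This counts only the weights. Serving the model also needs memory for activations and the framework runtime, which is part of why a GPU instance ends up being necessary.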

And this is just for the vanilla, pretrained GPT-2 model. If you want to fine-tune GPT-2 for other tasks, that too will be its own technical challenge.

This is why machine learning has been so unusable. Using it in production required you to be versed not only in machine learning, but also in DevOps and backend development. That describes very few people.

Over the last several years, this has changed. The community has put an emphasis on improving infrastructure, and as a result, it has become much easier to actually use models. Now, you can take a new model, write your API, and hit deploy, with no DevOps needed.

GPT-3 is an extreme example of this trend. The model, which is almost certainly too large for most teams to host, was actually released as an API.

While this move rankled many, it had a secondary effect. All of a sudden, using the most powerful language model in the world was easier than sending a text message with Twilio or setting up payments with Stripe.

In other words, you could call GPT-3 the most complex language model in history, but you could also call it just another API.
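
To show what "just another API" looks like from the caller's side, here is a hedged sketch that builds (but does not send) an HTTP request for a hosted text-completion endpoint. The URL, the bearer-token header, and the `model`, `prompt`, and `max_tokens` fields follow OpenAI's publicly documented completions API as I understand it, and should be treated as assumptions that may not match the API at the time of the article or today. `build_completion_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumption: endpoint taken from OpenAI's public API documentation; it may have changed.
COMPLETIONS_URL = "https://api.openai.com/v1/completions"


def build_completion_request(
    prompt: str, api_key: str, model: str, max_tokens: int = 64
) -> urllib.request.Request:
    """Build a POST request asking a hosted language model to continue `prompt`."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        COMPLETIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is a single call such as `urllib.request.urlopen(build_completion_request(prompt, api_key, model))`, given a valid key and model name. There is no model to download, no GPU to provision, and no cluster to manage, which is the point of the comparison with Twilio and Stripe.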

The number of people who can query an API, as it turns out, is orders of magnitude higher than the number of people that can deploy GPT-2 to production, hence the huge number of GPT-3 projects.

Machine learning engineering is mainstream now

GPT-3’s hype train is a convergence of things. It does have unprecedented accuracy, but it is also incredibly usable, and was released at a time when machine learning engineering has matured as an ecosystem and discipline.

For context, machine learning engineering is a field focused on building applications out of models. “How can I train a model to most accurately generate text?” is an ML research question. “How can I use GPT-2 to write folk music?” is a machine learning engineering question.

Because the machine learning engineering community is growing rapidly, companies are releasing new models the way they release web frameworks, hoping to attract engineers to build with them. Usability therefore has to be a consideration: they want to release not just the most powerful model, but the most used one.

Obviously, the proliferation of machine learning has many implications, but for engineers, there are two big conclusions to draw from this GPT-3 situation:

  • It is easier than ever for you to actually build with machine learning.

  • It is unlikely that in the near future you will be working on a piece of software that doesn’t incorporate machine learning in some way.

Machine learning is becoming a standard part of the software stack, and that trend is only accelerating. If you aren’t already familiar with production machine learning, it’s time to get familiar.

Translated from: https://towardsdatascience.com/why-are-you-seeing-gpt-3-everywhere-f156a71b77b0
