So … What Is ChatGPT Doing, and Why Does It Work?
The basic concept of ChatGPT is at some level rather simple. Start from a huge sample of human-created text from the web, books, etc. Then train a neural net to generate text that’s “like this”. And in particular, make it able to start from a “prompt” and then continue with text that’s “like what it’s been trained with”.
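To make the "continue from a prompt" idea concrete, here is a minimal sketch in Python. It is emphatically not ChatGPT's actual mechanism (a toy word-pair model rather than a neural net), but it shows the shape of the task: accumulate statistics from a sample of text, then extend a prompt with words that are "like" what was seen in training. The tiny corpus and function names are illustrative assumptions.

```python
import random
from collections import defaultdict

# Toy illustration (not ChatGPT's actual mechanism): record which word
# follows which in a small sample, then "continue a prompt" by repeatedly
# sampling a plausible next word. The corpus is a stand-in for the
# "huge sample of human-created text".
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# "Training": count which words follow which.
next_words = defaultdict(list)
for word, following in zip(corpus, corpus[1:]):
    next_words[word].append(following)

def continue_prompt(prompt, n_words=8):
    words = prompt.split()
    for _ in range(n_words):
        candidates = next_words.get(words[-1])
        if not candidates:                        # no statistics for this word: stop
            break
        words.append(random.choice(candidates))   # sample a "likely" continuation
    return " ".join(words)

print(continue_prompt("the cat"))   # e.g. "the cat sat on the mat and the dog sat"
```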
As we’ve seen, the actual neural net in ChatGPT is made up of very simple elements—though billions of them. And the basic operation of the neural net is also very simple, consisting essentially of passing input derived from the text it’s generated so far “once through its elements” (without any loops, etc.) for every new word (or part of a word) that it generates.
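A schematic sketch of that "once through its elements" operation may help. Everything here is an illustrative assumption (the shapes, the random stand-in weights, the mean-of-embeddings pooling); a real transformer is far more elaborate. The point being sketched is only the structure of the computation: each new token comes from a single pass through a fixed stack of layers, and the only loop is the outer one that appends the freshly sampled token and repeats.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, N_LAYERS = 50, 16, 4          # illustrative sizes, not ChatGPT's

# Stand-ins for trained weights.
embed = rng.normal(size=(VOCAB, DIM))
layers = [rng.normal(size=(DIM, DIM)) / np.sqrt(DIM) for _ in range(N_LAYERS)]
unembed = rng.normal(size=(DIM, VOCAB))

def next_token_probs(token_ids):
    # Input derived from the text generated so far (here: a crude mean of embeddings).
    x = embed[token_ids].mean(axis=0)
    for W in layers:                      # a fixed number of layers, traversed once
        x = np.tanh(x @ W)
    logits = x @ unembed
    p = np.exp(logits - logits.max())
    return p / p.sum()

tokens = [3, 17, 42]                      # the "prompt", as token ids
for _ in range(5):                        # outer loop: one forward pass per new token
    p = next_token_probs(tokens)
    tokens.append(int(rng.choice(VOCAB, p=p)))
print(tokens)
```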
But the remarkable—and unexpected—thing is that this process can produce text that’s successfully “like” what’s out there on the web, in books, etc. And not only is it coherent human language, it also “says things” that “follow its prompt” making use of content it’s “read”. It doesn’t always say things that “globally make sense” (or correspond to correct computations)—because (without, for example, accessing the “computational superpowers” of Wolfram|Alpha) it’s just saying things that “sound right” based on what things “sounded like” in its training material.
The specific engineering of ChatGPT has made it quite compelling. But ultimately (at least until it can use outside tools) ChatGPT is “merely” pulling out some “coherent thread of text” from the “statistics of conventional wisdom” that it’s accumulated. But it’s amazing how human-like the results are. And as I’ve discussed, this suggests something that’s at least scientifically very important: that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought. ChatGPT has implicitly discovered it. But we can potentially explicitly expose it, with semantic grammar, computational language, etc.
What ChatGPT does in generating text is very impressive—and the results are usually very much like what we humans would produce. So does this mean ChatGPT is working like a brain? Its underlying artificial-neural-net structure was ultimately modeled on an idealization of the brain. And it seems quite likely that when we humans generate language many aspects of what’s going on are quite similar.
When it comes to training (AKA learning) the different “hardware” of the brain and of current computers (as well as, perhaps, some undeveloped algorithmic ideas) forces ChatGPT to use a strategy that’s probably rather different (and in some ways much less efficient) than the brain. And there’s something else as well: unlike even in typical algorithmic computation, ChatGPT doesn’t internally “have loops” or “recompute on data”. And that inevitably limits its computational capability—even with respect to current computers, but definitely with respect to the brain.
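A small illustration of that limitation (my own example, not one from the essay): ordinary algorithmic computation can loop a data-dependent number of times, while a feed-forward pass only ever applies a fixed number of steps, so anything that needs "more iterations than the depth allows" is simply out of reach.

```python
def step(n):
    return n // 2 if n % 2 == 0 else 3 * n + 1   # one Collatz step

def with_loop(n):
    """Ordinary computation: keep iterating until the condition is met."""
    count = 0
    while n != 1:
        n = step(n)
        count += 1
    return count

def fixed_depth(n, depth=10):
    """Feed-forward style: only `depth` steps are available, whatever the input."""
    for _ in range(depth):
        if n == 1:
            return True
        n = step(n)
    return n == 1    # inputs needing more than `depth` steps cannot be finished

print(with_loop(27))     # 111 steps -- far more than any fixed depth of 10
print(fixed_depth(27))   # False: a 10-step "pass" cannot complete this case
print(fixed_depth(16))   # True: 4 steps suffice
```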
It’s not clear how to “fix that” and still maintain the ability to train the system with reasonable efficiency. But to do so will presumably allow a future ChatGPT to do even more “brain-like things”. Of course, there are plenty of things that brains don’t do so well—particularly involving what amount to irreducible computations. And for these both brains and things like ChatGPT have to seek “outside tools”—like Wolfram Language.
But for now it’s exciting to see what ChatGPT has already been able to do. At some level it’s a great example of the fundamental scientific fact that large numbers of simple computational elements can do remarkable and unexpected things. But it also provides perhaps the best impetus we’ve had in two thousand years to understand better just what the fundamental character and principles might be of that central feature of the human condition that is human language and the processes of thinking behind it.