webassembly

by Patrick Ferris

The Design of WebAssembly

I love the web. It is a modern-day superpower for the dissemination of information and empowerment of the individual. Of course, it has its downsides like trolling (largely possible through anonymity) and privacy issues, not to mention the problems of ownership and copyright infringement about to come into effect with the highly divisive article 13. But, let’s forget about that for just a moment and marvel at the technological innovation of the internet and the browsers which support it.

I first learnt to code in Javascript and have since been ridiculed by many for liking it. Yes, I know there are weird bits like this gem: [] == ![] // true. But it has become one of the most ubiquitous languages on the planet thanks to the internet, browsers and the engines that run the code (Google’s V8 and Firefox’s SpiderMonkey, to name a couple).

As I got more into web development, I noticed a new name on the block: WebAssembly. As a computer science student and a developer, I believe one of the best ways to learn something is to try and understand why the engineers who built it made those design choices. So here is a brief look at some of the interesting design principles in WebAssembly and also why I think everyone should be excited.

Why do we need WebAssembly?

Okay, so first of all, to all my Javascript fans out there: no, you shouldn’t be worried. When Javascript first came about, it was designed to be used in a lightweight way, but it has since gone on to do a lot of heavy lifting. Back then it might have manipulated some DOM elements or done some client-side verification in forms, but not everything that people are trying to do on the web now. Certainly not running fully-fledged games.

Why is Javascript not so fast? One of the main reasons is that it is an interpreted language: the engine scans the code line by line and executes it. Luckily, Just-in-Time compilers improved the efficiency massively, but there is still only so much room to improve. Even then, Javascript’s dynamic typing puts another ceiling on performance.

Alex Danilo discussed the improvements WebAssembly could make in his Google I/O talk in 2017. What really brought home the inefficiencies was his example add(a, b) function and the complexity that the Javascript interpreters have to go through in order to make sense of it.
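
To make the add(a, b) point a little more concrete, here is a minimal sketch of my own (it is not taken from the talk) showing how many different things a single + can mean in Javascript. The particular inputs are just illustrative assumptions; the point is that the engine cannot know which case it will see until the call actually happens.

```javascript
// A single, innocent-looking function.
function add(a, b) {
  return a + b;
}

// The same source line has to handle every one of these cases,
// so the engine must inspect the operand types at run time.
const results = [
  add(1, 2),         // number + number: numeric addition
  add("1", 2),       // string + number: string concatenation
  add(1.5, 2),       // float + integer: floating-point addition
  add([], []),       // array + array: both coerced to strings
  add(1, undefined), // number + undefined: not a number
];

console.log(results); // [ 3, '12', 3.5, '', NaN ]
```

A statically typed source language removes most of these checks, because the compiler already knows that both operands are, say, 32 bit integers.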

WebAssembly opens the door to compilation, which opens another door to optimisation. Its ability to take C/C++ as a source language allows for static type checking, which helps improve speed. This is what the developers at the Mozilla Foundation realised and wanted to fix. To summarise this great video: Javascript was designed for humans, and browsers were left to try and make it fast; WebAssembly was designed as a target language for compilers that browsers could already run quickly.

The realisation that engines could run two kinds of code was an exciting prospect, and the four major browsers (Chrome, Safari, Firefox and Edge) all began plans to let their engines run both Javascript and WebAssembly. Again, let me reiterate: WebAssembly is not replacing Javascript.

Why compile code?

Compiling code really means taking it from one (source) language and translating it into another (target) language. This is an incredibly simplified understanding of compilation. Most modern day compilation pipelines involve many more stages that allow us to really fine-tune and optimise our code making it faster and more energy-efficient.

The first steps usually include lexical, syntactic and semantic analysers to get the code into some kind of intermediate language that is well suited to optimisation. Then we apply machine-independent optimisations, generate the target code and then maybe apply optimisations that depend on the hardware or environment.

All projects need to start small, and the engineers at Mozilla decided to begin with C/C++ as their source language and to compile it using an existing toolchain called LLVM (not an acronym).

Initially, the search for a better performing web started with asm.js (at least in the WebAssembly narrative; see PNaCl for Google’s earlier attempts). It is a small subset of Javascript that could serve as the compile target for C/C++ programs, using annotations and other clever tricks to improve Javascript performance.

Unfortunately, it lacked one crucial design principle underlying what was wanted: portability. Different Javascript engines gave different performance results, but it was a clear indication that this may be a good approach.

The developers of WebAssembly decided their target representation would be a binary format that provided a “dense, linear encoding of the abstract syntax”… Which is a lot of words, so let’s unpack that.

The “dense” part refers to the high-level goal of achieving a format that is efficient in both size and load time. The internet is all about sending data along wires, and whilst there are lots of projects to improve network latency, one foolproof way of getting data there sooner is to send less of it. Another important aspect is the increased decoding speed thanks to array indexing rather than the dictionary lookup that a compressed text format would need. Read more about this design choice here.

What is wat?

The C and C++ programs compile to a binary format stored in .wasm files, which have a 1:1 mapping straight to a (somewhat) human readable text format. These text files are labelled .wat. WasmExplorer is great for getting your head around the text representation and how it relates to the original code. Let’s take a simple example.

There’s a lot going on here so let’s take it slowly and explain the concepts as they come.

First, there is this weird module word, where did that come from? Mejin Leechor gave a great talk on modules in Javascript and describes them as giving code “structure and boundaries”. This is very similar to the idea of WebAssembly modules (and there are plans in the future to try and integrate with es6 modules).

Straight from the docs, we have that the module is the “distributable, loadable, and executable unit of code in WebAssembly”. Modules can have the following sections each with their own unique responsibility: import, export, start, global, memory, data, table, elements, function and code. For now, let’s just look at what we have in our module.

The first declaration is (type $type0 (func (param i32) (result i32))) . This is intimately linked to the table declaration on the next line. We are declaring a new type with a func signature that takes a 32 bit integer parameter and returns a 32 bit integer. If we wanted to call the function we wrote indirectly, we would have to make a call_indirect into our table, and the declared type would then be used to check that the call is correct. As part of the minimum viable product (MVP) only one table is allowed, but there are future plans to allow multiple tables and for these to be indexed.

The next declaration is (table 0 anyfunc) . The table section is reserved for defining zero or more tables. A table is similar to a linear memory in that it is a resizable array, but it holds references rather than raw bytes. The 0 is the table’s initial size, reflecting the fact that we have nothing in our table, but we still need to provide the element type, and the MVP’s only possible value is anyfunc (a function).
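
The same kind of table can also be created from the Javascript side. The snippet below is a small sketch using the standard WebAssembly.Table constructor; the variable name and the choice to grow by two slots are my own assumptions for illustration.

```javascript
// An empty table of function references, mirroring (table 0 anyfunc).
const table = new WebAssembly.Table({ initial: 0, element: "anyfunc" });

console.log(table.length); // 0: the table has no slots yet

// Tables are resizable. New slots start out empty (null) until
// a function reference is stored in them.
table.grow(2);

console.log(table.length); // 2
console.log(table.get(0)); // null
```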

The problem that the developers had was linked to security. If a function wanted to call another function, giving it direct access to a function stored in linear memory was unsafe. Instead functions are stored in the table ready to be indexed if needed. Lin Clark wrote a great article describing tables (as used in imports) in more detail and how they provide better security.

We then have a declaration of (memory 1) . This is the linear memory used by the module, and we declare that we need 1 page of memory (64KiB).
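
As a quick sanity check of the page size, here is a sketch that creates one page of linear memory from Javascript with the standard WebAssembly.Memory constructor. The names PAGE_SIZE and bytes are only illustrative.

```javascript
// One page of linear memory, mirroring (memory 1).
const memory = new WebAssembly.Memory({ initial: 1 });

const PAGE_SIZE = 64 * 1024; // 65536 bytes per WebAssembly page
console.log(memory.buffer.byteLength === PAGE_SIZE); // true

// Linear memory is just raw bytes, read and written through a typed array.
const bytes = new Uint8Array(memory.buffer);
bytes[0] = 42;
console.log(bytes[0]); // 42
```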

The next declaration is (export "memory" memory) . An export is something that is returned to the host at instantiation time. Basically, the cool bits we want from the WebAssembly code.

The structure is quite simple, (export <name-of-export> (<type> <name/index>)), so here we are just exporting the memory we declared in the previous line. This allows for direct memory access within our Javascript code as an ArrayBuffer, which drastically improves efficiency because there are no backwards and forwards calls across the WASM/JS border. Similarly, we then export our function with (export "main" $func0) .
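
To see what the host actually gets back, here is a hedged sketch that hand-assembles a tiny module with the declarations discussed so far and instantiates it from Javascript. The type, table, memory and exports follow the text above, but the function body (add 1 to the argument) is purely my own placeholder, not the original example. Compiling synchronously with new WebAssembly.Module is fine for a module this small, for instance in Node.

```javascript
// Binary encoding of roughly this text format (function body is hypothetical):
//   (module
//     (type $type0 (func (param i32) (result i32)))
//     (table 0 anyfunc)
//     (memory 1)
//     (export "memory" memory)
//     (export "main" $func0)
//     (func $func0 (param $var0 i32) (result i32)
//       get_local $var0  i32.const 1  i32.add))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  // type section: one signature, (func (param i32) (result i32))
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,
  // function section: one function using signature 0
  0x03, 0x02, 0x01, 0x00,
  // table section: one table of function references, initial size 0
  0x04, 0x04, 0x01, 0x70, 0x00, 0x00,
  // memory section: one memory, initial size 1 page
  0x05, 0x03, 0x01, 0x00, 0x01,
  // export section: "memory" (memory 0) and "main" (function 0)
  0x07, 0x11, 0x02,
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00,
  0x04, 0x6d, 0x61, 0x69, 0x6e, 0x00, 0x00,
  // code section: get_local 0, i32.const 1, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x41, 0x01, 0x6a, 0x0b,
]);

const wasmModule = new WebAssembly.Module(wasmBytes);
const instance = new WebAssembly.Instance(wasmModule, {});

// The exports are exactly the names declared in the module.
const { main, memory: exportedMemory } = instance.exports;

console.log(main(41)); // 42
console.log(exportedMemory.buffer instanceof ArrayBuffer); // true
console.log(exportedMemory.buffer.byteLength); // 65536
```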

Now to the slightly more interesting bit, our code and its representation.

Before moving on, this is the perfect opportunity to introduce yet another design component: the stack machine.

Register versus Stack Machines

Computers, at their simplest, consume inputs and produce outputs. A ‘machine’ can execute a program in multiple different ways. Two of the main approaches are register machines and stack machines. In a register machine, parameters to functions are kept in named storage locations (registers) and are then manipulated depending on the program in execution.

A simple, but somewhat flawed, analogy could be a kitchen and making a recipe. The ingredients are stored in different locations, you get them and make something which you might put somewhere for another day or immediately consume (yum). It’s far from perfect but hopefully you get the idea.

Stack machines, on the other hand, employ a different model. Imagine you are a journalist or secretary whose job is to read and respond to letters. You ‘pop’ the top letter from your pile and begin writing a response, whilst someone else comes along with more work and ‘pushes’ it onto the top of the pile. These are the ones you are going to have to do next. Again, grossly oversimplified, but it should help visualise the mechanics.

WebAssembly uses a stack machine model for code execution. If you’re short of some great reading, and are into programming semantics, the paper “Bringing the Web up to Speed with WebAssembly” is really good. It also indicates why they chose the stack machine representation: “The stack organization is merely a way to achieve a compact program representation, as it has been shown to be smaller than a register machine”, with reference to this paper which found “… the bytecode size of the register machine being only 26% larger than that of the corresponding stack one”.

Even though the stack machine approach isn’t necessarily faster, it offered smaller bytecode, an incredibly important design goal for internet-based transactions.

So how can we understand the text format as a stack machine? As we read the code line by line, we end up pushing arguments onto the stack, then popping them off, doing some computation and pushing the result back. And repeat.
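
The push, pop, compute, push cycle can be sketched as a toy interpreter. This is only an illustration of the idea and not how a real engine executes WebAssembly; the opcode names are borrowed from the text format, while the run function and the sample program are hypothetical.

```javascript
// A toy stack machine. Each instruction is [opcode, optionalOperand].
function run(program, locals) {
  const stack = [];
  for (const [op, arg] of program) {
    switch (op) {
      case "get_local": // push the local at the given index
        stack.push(locals[arg]);
        break;
      case "i32.const": // push a constant
        stack.push(arg);
        break;
      case "i32.add": { // pop two values, push their 32 bit sum
        const b = stack.pop();
        const a = stack.pop();
        stack.push((a + b) | 0);
        break;
      }
      case "i32.mul": { // pop two values, push their 32 bit product
        const b = stack.pop();
        const a = stack.pop();
        stack.push(Math.imul(a, b));
        break;
      }
      default:
        throw new Error(`unknown opcode: ${op}`);
    }
  }
  return stack.pop(); // whatever is left on top is the result
}

// Computes (local0 * 2) + 1.
const program = [
  ["get_local", 0],
  ["i32.const", 2],
  ["i32.mul"],
  ["i32.const", 1],
  ["i32.add"],
];

console.log(run(program, [20])); // 41
```

Notice that no instruction names where its operands live: they are always implicitly the top of the stack, which is part of why the encoding can stay so compact.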

At first it might seem a little odd to have a text format if, in the end, the code is shipped in the binary format for compactness. But the web has always had a culture of being able to view the source, and that’s why the developers behind WebAssembly produced the text format. To go one step further and avoid conflicts of syntax, they used the Lisp-like s-expression style.

Safety and Sandboxing

One of the greatest sources of bugs (and exploits) in unsafe languages is buffer overflows. C and C++ are almost synonymous with this problem, and it is one of the first things you are taught when learning these languages. In exchange for a little overhead, WebAssembly adds a safety net by enforcing fixed-size, indexed memory (although certain memory can be grown).

The local variables of our function, for example $var0 , are not referenced by address but are instead indexed, providing a layer of security. Access is granted via the get_local and set_local commands, which operate entirely within the index space of the local variables.

Memory security was a top priority when designing WebAssembly. Straight from the documentation: “Linear memory is sandboxed; it does not alias other linear memories, the execution engine’s internal data structures, the execution stack, local variables, or other process memory.” Lin Clark, again, wrote a great article describing this.

The basic idea is comparable to the Javascript ArrayBuffer object: resizable and bounds-checked. What we’re trying to achieve is program isolation, to prevent errors and malicious code from spreading and corrupting data they shouldn’t even have access to.
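
Here is a small sketch of what bounds checking and growing look like from the Javascript side, using the standard WebAssembly.Memory and DataView APIs. The variable names and the one page plus one page sizes are assumptions picked for the example.

```javascript
// One page to start with, allowed to grow to at most two pages.
const sandbox = new WebAssembly.Memory({ initial: 1, maximum: 2 });
const view = new DataView(sandbox.buffer);

// An in-bounds write and read works as expected.
view.setInt32(0, 1234, true);
console.log(view.getInt32(0, true)); // 1234

// Reading past the end does not touch neighbouring memory; it throws.
let outOfBounds = false;
try {
  view.getInt32(sandbox.buffer.byteLength, true);
} catch (err) {
  outOfBounds = err instanceof RangeError;
}
console.log(outOfBounds); // true

// Growing returns the previous size in pages. The old ArrayBuffer is
// detached, so a fresh view has to be taken from sandbox.buffer.
const oldBuffer = sandbox.buffer;
const previousPages = sandbox.grow(1);
console.log(previousPages); // 1
console.log(oldBuffer.byteLength); // 0
console.log(sandbox.buffer.byteLength); // 131072
```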

What can WebAssembly do?

One of the major end-goals for WebAssembly was to revolutionise what was possible in terms of graphics on the web. The classic examples are Tanks! and Zen Garden by Epic Games.

Thanks to its design, WebAssembly marks a pivotal moment in web development. The internet has a new tool in its arsenal to create amazing experiences and share information. WebAssembly provides smaller code sizes, faster execution, greater security and a lot of room for extensibility. With ideas like threads, single-instruction multiple-data (SIMD) primitives and zero-cost exceptions on the horizon, WebAssembly’s abilities look set only to expand.
