LuaJIT Study Notes (5): The string.buffer Library

Table of Contents

    • Using the String Buffer Library
      • Buffer Objects
      • Buffer Method Overview
    • Buffer Creation and Management
      • `local buf = buffer.new([size [,options]])`, `local buf = buffer.new([options])`
      • `buf = buf:reset()`
      • `buf = buf:free()`
    • Buffer Writers
      • `buf = buf:put([str|num|obj] [,…])`
      • `buf = buf:putf(format, …)`
      • `buf = buf:putcdata(cdata, len)` (FFI)
      • `buf = buf:set(str)`
      • `buf = buf:set(cdata, len)` (FFI)
      • `ptr, len = buf:reserve(size)` (FFI)
      • `buf = buf:commit(used)` (FFI)
    • Buffer Readers
      • `len = #buf`
      • `res = str|num|buf .. str|num|buf […]`
      • `buf = buf:skip(len)`
      • `str, … = buf:get([len|nil] [,…])`
      • `str = buf:tostring()`
      • `str = tostring(buf)`
      • `ptr, len = buf:ref()` (FFI)
    • Serialization of Lua Objects
        • Example: Serializing Lua Objects
    • Error handling
    • FFI caveats
        • Explanation of the example

The string buffer library allows high-performance manipulation of string-like data.

Unlike Lua strings, which are constants, string buffers are mutable sequences of 8-bit (binary-transparent) characters. Data can be stored, formatted and encoded into a string buffer and later converted, extracted or decoded.

The convenient string buffer API simplifies common string manipulation tasks that would otherwise require creating many intermediate strings. String buffers improve performance by eliminating redundant memory copies, object creation, string interning and garbage collection overhead. In conjunction with the FFI library, they allow zero-copy operations.

The string buffer library also includes a high-performance serializer for Lua objects.

Using the String Buffer Library

The string buffer library is built into LuaJIT by default, but it’s not loaded by default. Add this to the start of every Lua file that needs one of its functions:

local buffer = require("string.buffer")

The convention for the syntax shown on this page is that buffer refers to the buffer library and buf refers to an individual buffer object.

Please note the difference between a Lua function call, e.g. buffer.new() (with a dot) and a Lua method call, e.g. buf:reset() (with a colon).

Buffer Objects

A buffer object is a garbage-collected Lua object. After creation with buffer.new(), it can (and should) be reused for many operations. When the last reference to a buffer object is gone, it will eventually be freed by the garbage collector, along with the allocated buffer space.

Buffers operate like a FIFO (first-in first-out) data structure. Data can be appended (written) to the end of the buffer and consumed (read) from the front of the buffer. These operations may be freely mixed.
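The FIFO behavior described above can be sketched in a few lines. This is only an illustration and assumes LuaJIT 2.1 with the string.buffer library available; the variable names are made up for the example.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
buf:put("hello", " ", "world")  -- append three strings to the end
local head = buf:get(5)         -- consume 5 bytes from the front
buf:put("!")                    -- writes and reads may be freely mixed
local rest = buf:get()          -- consume everything that is left
print(head, rest, #buf)
```

Data leaves the buffer in the order it was written, and the buffer is empty once everything has been consumed.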

The buffer space that holds the characters is managed automatically — it grows as needed and already consumed space is recycled. Use buffer.new(size) and buf:free(), if you need more control.

The maximum size of a single buffer is the same as the maximum size of a Lua string, which is slightly below two gigabytes. For huge data sizes, neither strings nor buffers are the right data structure — use the FFI library to directly map memory or files up to the virtual memory limit of your OS.

Buffer Method Overview

  • The buf:put*()-like methods append (write) characters to the end of the buffer.
  • The buf:get*()-like methods consume (read) characters from the front of the buffer.
  • Other methods, like buf:tostring(), only read the buffer contents and don’t change the buffer.
  • The buf:set() method allows zero-copy consumption of a string or an FFI cdata object as a buffer.
  • The FFI-specific methods allow zero-copy read/write-style operations or modifying the buffer contents in-place. Please check the FFI caveats below, too.
  • Methods that don’t need to return anything specific, return the buffer object itself as a convenience. This allows method chaining, e.g.: buf:reset():encode(obj) or buf:skip(len):get()
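As a small illustration of the method chaining mentioned in the last item, the sketch below strings several calls together on one buffer. It assumes LuaJIT 2.1; the data is arbitrary.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
-- put(), reset() and skip() return the buffer itself, so calls can be chained.
-- get() ends the chain because it returns a string.
local s = buf:put("stale"):reset():put("abcdef"):skip(2):get(3)
print(s, buf:tostring())
```

Only the final get() returns something other than the buffer, so it has to come last in a chain.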

Buffer Creation and Management

local buf = buffer.new([size [,options]])
local buf = buffer.new([options])

Creates a new buffer object.

The optional size argument ensures a minimum initial buffer size. This is strictly an optimization when the required buffer size is known beforehand. The buffer space will grow as needed, in any case.

The optional table options sets various serialization options.

buf = buf:reset()

Reset (empty) the buffer. The allocated buffer space is not freed and may be reused.

buf = buf:free()

The buffer space of the buffer object is freed. The object itself remains intact, empty and may be reused.

Note: you normally don’t need to use this method. The garbage collector automatically frees the buffer space when the buffer object is collected. Use this method only if you need to free the associated memory immediately.
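A minimal sketch of the three management calls, assuming LuaJIT 2.1. The initial size of 4096 is an arbitrary value chosen for the example.

```lua
local buffer = require("string.buffer")

local buf = buffer.new(4096)   -- pre-size the buffer (optimization only)
buf:put("first message")
buf:reset()                    -- empty the buffer, keep the allocated space
local len_after_reset = #buf
buf:put("second message")
buf:free()                     -- release the space now; the object stays usable
local len_after_free = #buf
buf:put("third message")       -- space is allocated again on demand
local s = buf:tostring()
print(len_after_reset, len_after_free, s)
```

In a loop that builds many messages, calling reset() between iterations reuses the same allocation, which is usually the reason to keep one buffer around.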

Buffer Writers

buf = buf:put([str|num|obj] [,…])

Appends a string str, a number num or any object obj with a __tostring metamethod to the buffer. Multiple arguments are appended in the given order.

Appending a buffer to a buffer is possible and short-circuited internally, but it still involves a copy. It is better to combine the writes so that they all go to a single buffer.

buf = buf:putf(format, …)

Appends the formatted arguments to the buffer. The format string supports the same options as string.format().
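The two writers above can be combined as in the following sketch, which assumes LuaJIT 2.1. The point object and its __tostring output are hypothetical and exist only to show the third kind of argument that put() accepts.

```lua
local buffer = require("string.buffer")

-- A hypothetical object type with a __tostring metamethod.
local point = setmetatable({ x = 1, y = 2 }, {
  __tostring = function(p) return "<point " .. p.x .. "," .. p.y .. ">" end,
})

local buf = buffer.new()
buf:put("id=", 42, ";")              -- strings and numbers, in the given order
buf:putf("%s=%.2f;", "pi", 3.14159)  -- same format options as string.format()
buf:put(point)                       -- converted through __tostring
local line = buf:tostring()
print(line)
```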

buf = buf:putcdata(cdata, len) (FFI)

Appends the given len number of bytes from the memory pointed to by the FFI cdata object to the buffer. The object needs to be convertible to a (constant) pointer.

buf = buf:set(str)

buf = buf:set(cdata, len) (FFI)

This method allows zero-copy consumption of a string or an FFI cdata object as a buffer. It stores a reference to the passed string str or the FFI cdata object in the buffer. Any buffer space originally allocated is freed. This is not an append operation, unlike the buf:put*() methods.

After calling this method, the buffer behaves as if buf:free():put(str) or buf:free():putcdata(cdata, len) had been called. However, the data is only referenced and not copied, as long as the buffer is only consumed.

In case the buffer is written to later on, the referenced data is copied and the object reference is removed (copy-on-write semantics).

The stored reference is an anchor for the garbage collector and keeps the originally passed string or FFI cdata object alive.
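The following sketch walks through set() followed by a later write, assuming LuaJIT 2.1. The payload string is made up for the example. The copy-on-write step itself is not observable from Lua; only the resulting contents are.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()
buf:put("old contents")

local payload = "HEADERbody"
buf:set(payload)            -- not an append: the old contents are dropped
local header = buf:get(6)   -- consuming only reads the referenced string
buf:put("+more")            -- first write: remaining data is copied, then appended to
local rest = buf:get()
print(header, rest)
```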

ptr, len = buf:reserve(size) (FFI)

buf = buf:commit(used) (FFI)

The reserve method reserves at least size bytes of write space in the buffer. It returns an uint8_t * FFI cdata pointer ptr that points to this space.

The available length in bytes is returned in len. This is at least size bytes, but may be more to facilitate efficient buffer growth. You can either make use of the additional space or ignore len and only use size bytes.

The commit method appends the used bytes of the previously returned write space to the buffer data.

This pair of methods allows zero-copy use of C read-style APIs:

local MIN_SIZE = 65536
repeat
  local ptr, len = buf:reserve(MIN_SIZE)
  local n = C.read(fd, ptr, len)
  if n == 0 then break end -- EOF.
  if n < 0 then error("read error") end
  buf:commit(n)
until false

The reserved write space is not initialized. At least the used bytes must be written to before calling the commit method. There’s no need to call the commit method, if nothing is added to the buffer (e.g. on error).
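For a version of the reserve/commit pattern that runs without a file descriptor, the sketch below uses ffi.copy() as a stand-in for a C function that fills the reserved space. It assumes LuaJIT 2.1 with the FFI enabled; the source string is arbitrary.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

local buf = buffer.new()
local src = "zero-copy"

-- Ask for at least #src bytes; len may be larger than requested.
local ptr, len = buf:reserve(#src)
-- Stand-in for a C producer: write directly into the reserved space.
ffi.copy(ptr, src, #src)
-- Only the bytes that were actually written are committed.
buf:commit(#src)

local out = buf:tostring()
print(len >= #src, out)
```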

Buffer Readers

len = #buf

Returns the current length of the buffer data in bytes.

res = str|num|buf .. str|num|buf […]

The Lua concatenation operator .. also accepts buffers, just like strings or numbers. It always returns a string and not a buffer.

Note that although this is supported for convenience, this thwarts one of the main reasons to use buffers, which is to avoid string allocations. Rewrite it with buf:put() and buf:get().

Mixing this with unrelated objects that have a __concat metamethod may not work, since these probably only expect strings.
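The rewrite suggested above might look like the following sketch, assuming LuaJIT 2.1. Both variants produce the same text; the second one collects the pieces in a buffer and creates only one string at the end.

```lua
local buffer = require("string.buffer")

local name = buffer.new():put("world")

-- Convenient, but every .. allocates a new intermediate string.
local s1 = "hello " .. name .. "!"

-- Same result with a single output buffer; put() also accepts a buffer.
local out = buffer.new()
out:put("hello ", name, "!")
local s2 = out:get()
print(type(s1), s1, s2)
```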

buf = buf:skip(len)

Skips (consumes) len bytes from the buffer up to the current length of the buffer data.

str, … = buf:get([len|nil] [,…])

Consumes the buffer data and returns one or more strings. If called without arguments, the whole buffer data is consumed. If called with a number, up to len bytes are consumed. A nil argument consumes the remaining buffer space (this only makes sense as the last argument). Multiple arguments consume the buffer data in the given order.

Note: a zero length or no remaining buffer data returns an empty string and not nil.
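A short sketch of get() with several arguments, assuming LuaJIT 2.1. The request line is made-up input.

```lua
local buffer = require("string.buffer")

local buf = buffer.new():put("GET /index.html")

-- Three arguments, three results: 3 bytes, 1 byte, then the remainder (nil).
local method, sep, path = buf:get(3, 1, nil)

-- The buffer is now empty, so a further get() yields "" rather than nil.
local empty = buf:get()
print(method, path, empty == "")
```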

str = buf:tostring()

str = tostring(buf)

Creates a string from the buffer data, but doesn’t consume it. The buffer remains unchanged.

Buffer objects also define a __tostring metamethod. This means buffers can be passed to the global tostring() function and many other functions that accept this in place of strings. The important internal uses in functions like io.write() are short-circuited to avoid the creation of an intermediate string object.
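The difference from get() is easy to check with a small sketch (assuming LuaJIT 2.1): both conversions leave the data in place.

```lua
local buffer = require("string.buffer")

local buf = buffer.new():put("abc")
local s1 = buf:tostring()   -- method form
local s2 = tostring(buf)    -- goes through the __tostring metamethod
local remaining = #buf      -- nothing was consumed
print(s1, s2, remaining)
```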

ptr, len = buf:ref() (FFI)

Returns an uint8_t * FFI cdata pointer ptr that points to the buffer data. The length of the buffer data in bytes is returned in len.

The returned pointer can be directly passed to C functions that expect a buffer and a length. You can also do bytewise reads (local x = ptr[i]) or writes (ptr[i] = 0x40) of the buffer data.

In conjunction with the skip method, this allows zero-copy use of C write-style APIs:

repeat
  local ptr, len = buf:ref()
  if len == 0 then break end
  local n = C.write(fd, ptr, len)
  if n < 0 then error("write error") end
  buf:skip(n)
until n >= len

Unlike Lua strings, buffer data is not implicitly zero-terminated. It’s not safe to pass ptr to C functions that expect zero-terminated strings. If you’re not using len, then you’re doing something wrong.
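The bytewise access described above can be tried without any C API. The sketch below assumes LuaJIT 2.1 with the FFI enabled and always passes len along with ptr.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

local buf = buffer.new():put("abc")
local ptr, len = buf:ref()

local first = ptr[0]              -- bytewise read: 97 is the code for "a"
ptr[len - 1] = 0x40               -- bytewise write: replace "c" with "@"
local copy = ffi.string(ptr, len) -- explicit length, no zero terminator assumed
print(first, copy, buf:tostring())
```

The write through ptr changes the buffer contents in place, so buf:tostring() reflects it as well.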

Serialization of Lua Objects

(Omitted in these notes.)

Example: Serializing Lua Objects

local buffer = require("string.buffer")

-- Create two metatables.
local mt1 = { __index = function(t, k) return "default" end }
local mt2 = { __index = function(t, k) return "another default" end }

-- Create the tables to be serialized.
local t1 = setmetatable({ key1 = "value1", key2 = "value2" }, mt1)
local t2 = setmetatable({ key1 = "value3", key2 = "value4" }, mt2)

-- Define the arrays of dictionary keys and metatables.
local dict = {"key1", "key2"}
local metatable = {mt1, mt2}

-- Pass both as serialization options to buffer.new().
local buffer_obj = buffer.new({ dict = dict, metatable = metatable })

-- buf:encode() appends the serialized data to the buffer and returns the buffer itself.
buffer_obj:encode({t1, t2})

-- buf:decode() takes no argument; it consumes one serialized object from the front of the buffer.
local decoded_data = buffer_obj:decode()

-- Access the decoded data.
for _, tbl in ipairs(decoded_data) do
  print(tbl.key1, tbl.key2)
end

Error handling

Many of the buffer methods can throw an error. Out-of-memory or usage errors are best caught with an outer wrapper for larger parts of code. There’s not much one can do after that, anyway.

On the other hand, you may want to catch some errors individually. Buffer methods need to receive the buffer object as the first argument. The Lua colon-syntax obj:method() does that implicitly. But to wrap a method with pcall(), the arguments need to be passed like this:

local ok, err = pcall(buf.encode, buf, obj)
if not ok then
  -- Handle error in err.
end
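A runnable variant of this pattern is sketched below, assuming LuaJIT 2.1. It relies on the serializer rejecting Lua functions, which is used here only as a convenient way to provoke an error.

```lua
local buffer = require("string.buffer")

local buf = buffer.new()

-- The method is looked up with a dot and the buffer is passed explicitly.
local ok, err = pcall(buf.encode, buf, function() end)
print(ok, err)

-- The buffer object is still usable after the failed call.
buf:reset()
local ok2 = pcall(buf.encode, buf, { 1, 2, 3 })
local decoded = buf:decode()
print(ok2, decoded[3])
```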

FFI caveats

The string buffer library has been designed to work well together with the FFI library. But due to the low-level nature of the FFI library, some care needs to be taken:

First, please remember that FFI pointers are zero-indexed. The space returned by buf:reserve() and buf:ref() starts at the returned pointer and ends before len bytes after that.

I.e. the first valid index is ptr[0] and the last valid index is ptr[len-1]. If the returned length is zero, there’s no valid index at all. The returned pointer may even be NULL.
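The indexing rule can be written as a loop from 0 to len-1, which also handles the empty case without touching the pointer. This is a sketch assuming LuaJIT 2.1 with the FFI enabled; byte_sum is a hypothetical helper, not part of the library.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

-- Hypothetical helper: sums all bytes currently held by a buffer.
local function byte_sum(buf)
  local ptr, len = buf:ref()
  local sum = 0
  -- Valid indexes are 0 .. len-1. With len == 0 the loop body never runs,
  -- so a NULL pointer is never dereferenced.
  for i = 0, len - 1 do
    sum = sum + ptr[i]
  end
  return sum
end

local empty_sum = byte_sum(buffer.new())
local ab_sum = byte_sum(buffer.new():put("AB"))  -- 65 + 66
print(empty_sum, ab_sum)
```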

The space pointed to by the returned pointer is only valid as long as the buffer is not modified in any way (neither append, nor consume, nor reset, etc.). The pointer is also not a GC anchor for the buffer object itself.

Buffer data is only guaranteed to be byte-aligned. Casting the returned pointer to a data type with higher alignment may cause unaligned accesses. It depends on the CPU architecture whether this is allowed or not (it’s always OK on x86/x64 and mostly OK on other modern architectures).

FFI pointers or references do not count as GC anchors for an underlying object. E.g. an array allocated with ffi.new() is anchored by buf:set(array, len), but not by buf:set(array+offset, len). The addition of the offset creates a new pointer, even when the offset is zero. In this case, you need to make sure there’s still a reference to the original array as long as its contents are in use by the buffer.

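Explanation of the example:
  1. Normal reference: with buf:set(array, len), array is an array created through the FFI and is passed to the buffer directly. The buffer stores a reference to array, so as long as buf exists and holds that reference, the garbage collector will not reclaim array. The reference stored in the buffer acts as the GC anchor for array.
  2. After adding an offset: array + offset creates a new pointer object, even when offset is zero. This new pointer does not keep array alive, because it is a plain pointer and not a reference to the array object.
    • Problem: if only array + offset is kept, the garbage collector may conclude that the original array object is no longer in use and reclaim it, even though buf still depends on its contents. Reading the reclaimed memory is undefined behavior and may crash.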
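Both cases can be put side by side in a short sketch, assuming LuaJIT 2.1 with the FFI enabled. The array size and contents are arbitrary. In the second case the local variable array is what keeps the memory alive, not the buffer.

```lua
local ffi = require("ffi")
local buffer = require("string.buffer")

local array = ffi.new("uint8_t[?]", 8)
ffi.copy(array, "ABCDEFGH", 8)

local buf = buffer.new()

-- Case 1: the buffer stores a reference to the array object and anchors it.
buf:set(array, 8)
local whole = buf:get()

-- Case 2: array + 2 is a new pointer object. The buffer anchors only that
-- pointer, so `array` must stay referenced until the data has been consumed.
buf:set(array + 2, 4)
local part = buf:get()

print(whole, part)
```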

Even though each LuaJIT VM instance is single-threaded (but you can create multiple VMs), FFI data structures can be accessed concurrently. Be careful when reading/writing FFI cdata from/to buffers to avoid concurrent accesses or modifications. In particular, the memory referenced by buf:set(cdata, len) must not be modified while buffer readers are working on it. Shared, but read-only memory mappings of files are OK, but only if the file does not change.
