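Preface:
These notes summarize the RISC-V DSP extension instruction set document, P-ext-proposal.pdf. Its key contents are as follows:
The document covers the RISC-V P extension instruction set and its details.
It opens with an overview of the P extension instructions and lists those that duplicate instructions in other extensions.
It then describes the subsets of the P extension, including the instructions of the Zbpbo and Zpn extensions (for both RV32 and RV64).
It also gives detailed descriptions of the instructions that apply only to RV64.
Finally, it introduces the new user control and status registers, provides instruction encoding tables, and lists the instructions removed because they overlap with RVB.
Taken together, the document describes the P extension's instructions, their encodings, and their relationship to other extensions, which makes it useful for understanding, developing, and optimizing RISC-V based systems. It also flags the compatibility and optimization issues developers should keep in mind when using the P extension.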
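1. Introduction
Digital signal processing (DSP) has become an essential technology in modern electronic systems. A wide range of modern applications use DSP algorithms to solve domain-specific problems, including sensor fusion, servo motor control, audio decoding/encoding, speech synthesis and coding, MPEG4 decoding, medical imaging, computer vision, embedded control, robotics, and human-machine interfaces.
The proposed P instruction set extension improves the DSP algorithm processing capability of RISC-V CPU IP products. With the P extension added, a RISC-V CPU can run these DSP applications with lower power consumption and higher performance.
2. Abbreviations and Terminology
2.1 Abbreviations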
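- r.H == r.H1: r[31:16], r.L == r.H0: r[15:0]
  - r.H is the upper 16 bits of the register (bits 31 to 16), the same as r.H1.
  - r.L is the lower 16 bits of the register (bits 15 to 0), the same as r.H0.
- r.B3: r[31:24], r.B2: r[23:16], r.B1: r[15:8], r.B0: r[7:0]
  - r.B3 through r.B0 are the four 8-bit fields, from most significant to least significant.
- r.B[x]: r[(x*8+7):(x*8+0)]
  - r.B[x] is the x-th 8-bit field of the register, counting from the least significant end.
- r.H[x]: r[(x*16+15):(x*16+0)]
  - r.H[x] is the x-th 16-bit field of the register.
- r.W[x]: r[(x*32+31):(x*32+0)]
  - r.W[x] is the x-th 32-bit field of the register.
- r.D[x]: r[(x*64+63):(x*64+0)]
  - r.D[x] is the x-th 64-bit field of the register.
- r[xU]: the upper 32 bits of a 64-bit number; xU is the number of the GPR (general-purpose register) that holds this upper 32-bit value.
- r[xL]: the lower 32 bits of a 64-bit number; xL is the number of the GPR that holds this lower 32-bit value.
- r[xU].r[xL]: a 64-bit number formed from a pair of GPRs.
- s>>: signed arithmetic right shift.
- u>>: unsigned logical right shift.
- u<<: logical left shift, shifting in 0 from the right.
- SAT.Qn(): saturate to the range [-2^n, 2^n-1]; the OV flag is set if saturation occurs.
- SAT.Um(): saturate to the range [0, 2^m-1]; the OV flag is set if saturation occurs.
- ROUND(): "rounding", i.e. adding 1 to the most significant discarded bit.
These abbreviations give a compact notation for the register fields and operations used in the instruction descriptions. Notation of this kind is common in hardware description languages, assembly, and other low-level programming, where it keeps complex operations short and readable.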
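The field and saturation notation above can be sketched in a few lines of Python. This is only an illustrative model: the helper names `lane`, `sat_q`, and `sat_u` are hypothetical, and the OV flag is modeled as a returned boolean rather than as a real status register bit.

```python
def lane(r, x, width):
    """Model r.B[x] (width=8), r.H[x] (width=16), r.W[x] (width=32):
    the x-th width-bit field of r, counted from the least significant end."""
    return (r >> (x * width)) & ((1 << width) - 1)

def sat_q(value, n):
    """Model SAT.Qn(): clamp to [-2**n, 2**n - 1].
    The second result stands in for the OV flag (True if clamping happened)."""
    lo, hi = -(1 << n), (1 << n) - 1
    clamped = max(lo, min(hi, value))
    return clamped, clamped != value

def sat_u(value, m):
    """Model SAT.Um(): clamp to [0, 2**m - 1], again returning an OV stand-in."""
    hi = (1 << m) - 1
    clamped = max(0, min(hi, value))
    return clamped, clamped != value

r = 0x12345678
print(hex(lane(r, 1, 16)))   # r.H[1] -> 0x1234
print(hex(lane(r, 2, 8)))    # r.B[2] -> 0x34
print(sat_q(40000, 15))      # (32767, True)
print(sat_u(-3, 8))          # (0, True)
```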
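2.2. Terminology
- Q format (Qm.n): a signed binary fixed-point number format. "m" is the number of bits before the notional binary point, including the sign bit and the integer bits, and "n" is the number of fraction bits after it. The notation represents a signed binary fixed-point value in the range from -2^(m-1) (inclusive) to 2^(m-1) (exclusive), with 2^(m+n) unique values in that range. For example, Q1.15 represents a number from -1 (inclusive) to 1 (exclusive), with 65536 unique values.
- Qn: shorthand for Q1.n. For example, Q7, Q15, Q31, Q63.
- Um: an unsigned binary number in the range 0 to (2^m)-1.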
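As a quick illustration of the Q format, the sketch below converts between real numbers and their stored integer form. The function names `to_q` and `from_q` are hypothetical, and clamping out-of-range inputs is a choice made for this example, not something the definitions above prescribe.

```python
def to_q(value, n, m=1):
    """Quantize a real number to Qm.n and return the stored integer.
    Out-of-range inputs are clamped (an assumption made for this sketch)."""
    scaled = round(value * (1 << n))
    lo = -(1 << (m + n - 1))          # stored form of -2**(m-1)
    hi = (1 << (m + n - 1)) - 1       # largest representable stored value
    return max(lo, min(hi, scaled))

def from_q(raw, n):
    """Convert a stored Qm.n integer back to a real number."""
    return raw / (1 << n)

print(to_q(0.5, 15))        # 16384
print(to_q(-1.0, 15))       # -32768
print(to_q(1.0, 15))        # 32767, since +1.0 itself is not representable in Q15
print(from_q(16384, 15))    # 0.5
```

Note that Q15 cannot represent +1.0 exactly, which is why saturating instructions matter when multiplying or adding values near the ends of the range.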
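3. RISC-V P Extension Instructions
3.1. SIMD Data Processing Instructions
3.1.1. 16-bit Add/Subtract Instructions
Based on how the two 16-bit arithmetic operations within a 32-bit word element are combined, the SIMD 16-bit add/subtract instructions fall into 6 main categories: addition (two 16-bit additions), subtraction (two 16-bit subtractions), crossed add & sub (one addition and one subtraction), crossed sub & add (one subtraction and one addition), straight add & sub (one addition and one subtraction), and straight sub & add (one subtraction and one addition).
Based on how overflow is handled, the SIMD 16-bit add/subtract instructions fall into 5 groups: wrap-around (the overflow is dropped), signed halving (the overflow is kept by dropping the least significant bit), unsigned halving, signed saturation (the overflow is clipped), and unsigned saturation.
No. | Instruction | Description |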
1 | ADD16 rd, rs1, rs2 | 16-bit Addition |
2 | RADD16 rd, rs1, rs2 | 16-bit Signed Halving Addition |
3 | URADD16 rd, rs1, rs2 | 16-bit Unsigned Halving Addition |
4 | KADD16 rd, rs1, rs2 | 16-bit Signed Saturating Addition |
5 | UKADD16 rd, rs1, rs2 | 16-bit Unsigned Saturating Addition |
6 | SUB16 rd, rs1, rs2 | 16-bit Subtraction |
7 | RSUB16 rd, rs1, rs2 | 16-bit Signed Halving Subtraction |
8 | URSUB16 rd, rs1, rs2 | 16-bit Unsigned Halving Subtraction |
9 | KSUB16 rd, rs1, rs2 | 16-bit Signed Saturating Subtraction |
10 | UKSUB16 rd, rs1, rs2 | 16-bit Unsigned Saturating Subtraction |
11 | CRAS16 rd, rs1, rs2 | 16-bit Cross Add & Sub |
12 | RCRAS16 rd, rs1, rs2 | 16-bit Signed Halving Cross Add & Sub |
13 | URCRAS16 rd, rs1, rs2 | 16-bit Unsigned Halving Cross Add & Sub |
14 | KCRAS16 rd, rs1, rs2 | 16-bit Signed Saturating Cross Add & Sub |
15 | UKCRAS16 rd, rs1, rs2 | 16-bit Unsigned Saturating Cross Add & Sub |
16 | CRSA16 rd, rs1, rs2 | 16-bit Cross Sub & Add |
17 | RCRSA16 rd, rs1, rs2 | 16-bit Signed Halving Cross Sub & Add |
18 | URCRSA16 rd, rs1, rs2 | 16-bit Unsigned Halving Cross Sub & Add |
19 | KCRSA16 rd, rs1, rs2 | 16-bit Signed Saturating Cross Sub & Add |
20 | UKCRSA16 rd, rs1, rs2 | 16-bit Unsigned Saturating Cross Sub & Add |
21 | STAS16 rd, rs1, rs2 | 16-bit Straight Add & Sub |
22 | RSTAS16 rd, rs1, rs2 | 16-bit Signed Halving Straight Add & Sub |
23 | URSTAS16 rd, rs1, rs2 | 16-bit Unsigned Halving Straight Add & Sub |
24 | KSTAS16 rd, rs1, rs2 | 16-bit Signed Saturating Straight Add & Sub |
25 | UKSTAS16 rd, rs1, rs2 | 16-bit Unsigned Saturating Straight Add & Sub |
26 | STSA16 rd, rs1, rs2 | 16-bit Straight Sub & Add |
27 | RSTSA16 rd, rs1, rs2 | 16-bit Signed Halving Straight Sub & Add |
28 | URSTSA16 rd, rs1, rs2 | 16-bit Unsigned Halving Straight Sub & Add |
29 | KSTSA16 rd, rs1, rs2 | 16-bit Signed Saturating Straight Sub & Add |
30 | UKSTSA16 rd, rs1, rs2 | 16-bit Unsigned Saturating Straight Sub & Add |
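To make the overflow-handling groups concrete, here is a minimal Python model of one 32-bit word for ADD16 (wrap-around), RADD16 (signed halving), KADD16 (signed saturating), and CRAS16 (crossed add & sub). It is a sketch based on the descriptions in the table, not a reference implementation: the helper names are hypothetical, the OV flag is not modeled, and CRAS16 is assumed to compute the upper half as rs1.H1 + rs2.H0 and the lower half as rs1.H0 - rs2.H1.

```python
MASK16 = 0xFFFF

def halves(r):
    """Split a 32-bit word into [r.H0, r.H1] as unsigned values."""
    return [r & MASK16, (r >> 16) & MASK16]

def s16(v):
    """Reinterpret an unsigned 16-bit value as signed."""
    return v - 0x10000 if v & 0x8000 else v

def pack(h0, h1):
    """Pack two results into a word, truncating each to 16 bits."""
    return ((h1 & MASK16) << 16) | (h0 & MASK16)

def sat16(v):
    """Clip a signed value to the 16-bit signed range."""
    return max(-0x8000, min(0x7FFF, v))

def add16(a, b):
    # Wrap-around: the carry out of each half is dropped by pack().
    (a0, a1), (b0, b1) = halves(a), halves(b)
    return pack(a0 + b0, a1 + b1)

def radd16(a, b):
    # Signed halving: compute the full sum, then arithmetic shift right by 1.
    (a0, a1), (b0, b1) = halves(a), halves(b)
    return pack((s16(a0) + s16(b0)) >> 1, (s16(a1) + s16(b1)) >> 1)

def kadd16(a, b):
    # Signed saturating: clip each sum to [-32768, 32767].
    (a0, a1), (b0, b1) = halves(a), halves(b)
    return pack(sat16(s16(a0) + s16(b0)), sat16(s16(a1) + s16(b1)))

def cras16(a, b):
    # Crossed add & sub (assumed operand pairing, see the lead-in).
    (a0, a1), (b0, b1) = halves(a), halves(b)
    return pack(a0 - b1, a1 + b0)

a, b = 0x7FFF0001, 0x00010001
print(hex(add16(a, b)))    # 0x80000002 (upper half wraps to negative)
print(hex(radd16(a, b)))   # 0x40000001
print(hex(kadd16(a, b)))   # 0x7fff0002 (upper half clipped)
print(hex(cras16(a, b)))   # 0x80000000
```

The same input overflows in the upper half under ADD16, keeps the full-precision sum at half scale under RADD16, and is clipped under KADD16, which is the practical difference between the groups.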
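3.1.2. 8-bit Add/Subtract Instructions
Based on the type of the four 8-bit arithmetic operations within a 32-bit word element, the SIMD 8-bit add/subtract instructions fall into 2 main categories: addition (four 8-bit additions) and subtraction (four 8-bit subtractions).
Based on how overflow is handled in signed or unsigned operations, the SIMD 8-bit add/subtract instructions are further divided into 5 groups: wrap-around (the overflow is dropped), signed halving (the overflow is kept by dropping the least significant bit), unsigned halving, signed saturation (the overflow is clipped), and unsigned saturation.
No. | Instruction | Description |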
1 | ADD8 rd, rs1, rs2 | 8-bit Addition |
2 | RADD8 rd, rs1, rs2 | 8-bit Signed Halving Addition |
3 | URADD8 rd, rs1, rs2 | 8-bit Unsigned Halving Addition |
4 | KADD8 rd, rs1, rs2 | 8-bit Signed Saturating Addition |
5 | UKADD8 rd, rs1, rs2 | 8-bit Unsigned Saturating Addition |
6 | SUB8 rd, rs1, rs2 | 8-bit Subtraction |
7 | RSUB8 rd, rs1, rs2 | 8-bit Signed Halving Subtraction |
8 | URSUB8 rd, rs1, rs2 | 8-bit Unsigned Halving Subtraction |
9 | KSUB8 rd, rs1, rs2 | 8-bit Signed Saturating Subtraction |
10 | UKSUB8 rd, rs1, rs2 | 8-bit Unsigned Saturating Subtraction |
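The 8-bit variants apply the same idea to four byte lanes. The sketch below models ADD8, URADD8, and UKADD8 on one 32-bit word; the helper names are hypothetical and the OV flag is not modeled.

```python
def bytes_of(r):
    """Split a 32-bit word into [r.B0, r.B1, r.B2, r.B3] as unsigned values."""
    return [(r >> (8 * i)) & 0xFF for i in range(4)]

def pack_bytes(bs):
    """Pack four results into a word, truncating each to 8 bits."""
    word = 0
    for i, b in enumerate(bs):
        word |= (b & 0xFF) << (8 * i)
    return word

def add8(a, b):
    # Wrap-around: each lane keeps only its low 8 bits.
    return pack_bytes([x + y for x, y in zip(bytes_of(a), bytes_of(b))])

def uradd8(a, b):
    # Unsigned halving: 9-bit sum shifted right by 1, so no lane can overflow.
    return pack_bytes([(x + y) >> 1 for x, y in zip(bytes_of(a), bytes_of(b))])

def ukadd8(a, b):
    # Unsigned saturating: each lane is clipped at 255.
    return pack_bytes([min(x + y, 0xFF) for x, y in zip(bytes_of(a), bytes_of(b))])

a, b = 0xFF804001, 0x01804003
print(hex(add8(a, b)))     # 0x8004 (the two upper lanes wrap to 0)
print(hex(uradd8(a, b)))   # 0x80804002
print(hex(ukadd8(a, b)))   # 0xffff8004
```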
3.1.3. 16-bit Shift Instructions
Table 3. SIMD 16-bit Shift Instructions
No. | Instruction | Description |
1 | SRA16 rd, rs1, rs2 | 16-bit Shift Right Arithmetic |
2 | SRAI16 rd, rs1, im4u | 16-bit Shift Right Arithmetic Immediate |
3 | SRA16.u rd, rs1, rs2 | 16-bit Rounding Shift Right Arithmetic |
4 | SRAI16.u rd, rs1, im4u | 16-bit Rounding Shift Right Arithmetic Immediate |
5 | SRL16 rd, rs1, rs2 | 16-bit Shift Right Logical |
6 | SRLI16 rd, rs1, im4u | 16-bit Shift Right Logical Immediate |
7 | SRL16.u rd, rs1, rs2 | 16-bit Rounding Shift Right Logical |
8 | SRLI16.u rd, rs1, im4u | 16-bit Rounding Shift Right Logical Immediate |
9 | SLL16 rd, rs1, rs2 | 16-bit Shift Left Logical |
10 | SLLI16 rd, rs1, im4u | 16-bit Shift Left Logical Immediate |
11 | KSLL16 rd, rs1, rs2 | 16-bit Saturating Shift Left Logical |
12 | KSLLI16 rd, rs1, im4u | 16-bit Saturating Shift Left Logical Immediate |
13 | KSLRA16 rd, rs1, rs2 | 16-bit Shift Left Logical with Saturation & Shift Right Arithmetic |
14 | KSLRA16.u rd, rs1, rs2 | 16-bit Shift Left Logical with Saturation & Rounding Shift Right Arithmetic |
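The ".u" (rounding) shifts differ from the plain shifts only in how the discarded bits are treated. The following sketch contrasts SRA16 with SRA16.u on one 32-bit word, assuming the rounding form adds 1 at the position of the most significant discarded bit before the final shift, and that only the low 4 bits of the shift amount are used. The helper names are hypothetical.

```python
def s16(v):
    """Reinterpret an unsigned 16-bit value as signed."""
    return v - 0x10000 if v & 0x8000 else v

def map_halves(r, fn):
    """Apply fn to each signed 16-bit half of r and repack the results."""
    lo = fn(s16(r & 0xFFFF)) & 0xFFFF
    hi = fn(s16((r >> 16) & 0xFFFF)) & 0xFFFF
    return (hi << 16) | lo

def sra16(r, sa):
    # Plain arithmetic right shift: discarded bits are simply dropped.
    sa &= 0xF
    return map_halves(r, lambda h: h >> sa)

def sra16_u(r, sa):
    # Rounding arithmetic right shift (assumed semantics, see the lead-in):
    # shift by sa-1, add 1, then shift by one more bit.
    sa &= 0xF
    if sa == 0:
        return r & 0xFFFFFFFF
    return map_halves(r, lambda h: ((h >> (sa - 1)) + 1) >> 1)

r = 0xFFF90007                 # halves: -7 and +7
print(hex(sra16(r, 1)))        # 0xfffc0003 -> -4 and 3 (rounds toward minus infinity)
print(hex(sra16_u(r, 1)))      # 0xfffd0004 -> -3 and 4 (rounds to nearest)
```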
3.1.4. 8-bit Shift Instructions
Table 4. SIMD 8-bit Shift Instructions
No. | Instruction | Description |
1 | SRA8 rd, rs1, rs2 | 8-bit Shift Right Arithmetic |
2 | SRAI8 rd, rs1, im3u | 8-bit Shift Right Arithmetic Immediate |
3 | SRA8.u rd, rs1, rs2 | 8-bit Rounding Shift Right Arithmetic |
4 | SRAI8.u rd, rs1, im3u | 8-bit Rounding Shift Right Arithmetic Immediate |
5 | SRL8 rd, rs1, rs2 | 8-bit Shift Right Logical |
6 | SRLI8 rd, rs1, im3u | 8-bit Shift Right Logical Immediate |
7 | SRL8.u rd, rs1, rs2 | 8-bit Rounding Shift Right Logical |
8 | SRLI8.u rd, rs1, im3u | 8-bit Rounding Shift Right Logical Immediate |
9 | SLL8 rd, rs1, rs2 | 8-bit Shift Left Logical |
10 | SLLI8 rd, rs1, im3u | 8-bit Shift Left Logical Immediate |
11 | KSLL8 rd, rs1, rs2 | 8-bit Saturating Shift Left Logical |
12 | KSLLI8 rd, rs1, im3u | 8-bit Saturating Shift Left Logical Immediate |
13 | KSLRA8 rd, rs1, rs2 | 8-bit Shift Left Logical with Saturation & Shift Right Arithmetic |
14 | KSLRA8.u rd, rs1, rs2 | 8-bit Shift Left Logical with Saturation & Rounding Shift Right Arithmetic |
3.1.5. 16-bit Compare Instructions
Table 5. SIMD 16-bit Compare Instructions
No. | Instruction | Description |
1 | CMPEQ16 rd, rs1, rs2 | 16-bit Compare Equal |
2 | SCMPLT16 rd, rs1, rs2 | 16-bit Signed Compare Less Than |
3 | SCMPLE16 rd, rs1, rs2 | 16-bit Signed Compare Less Than & Equal |
4 | UCMPLT16 rd, rs1, rs2 | 16-bit Unsigned Compare Less Than |
5 | UCMPLE16 rd, rs1, rs2 | 16-bit Unsigned Compare Less Than & Equal |
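SIMD compares produce a per-element mask rather than a single boolean. The sketch below models CMPEQ16, SCMPLT16, and UCMPLT16 on one 32-bit word, assuming a true comparison writes 0xFFFF into the corresponding half of rd and a false one writes 0. The helper names are hypothetical.

```python
def halves(r):
    """Split a 32-bit word into [r.H0, r.H1] as unsigned values."""
    return [r & 0xFFFF, (r >> 16) & 0xFFFF]

def s16(v):
    """Reinterpret an unsigned 16-bit value as signed."""
    return v - 0x10000 if v & 0x8000 else v

def compare16(a, b, pred):
    """Build the result word: 0xFFFF in each half where pred holds, else 0."""
    h0, h1 = [0xFFFF if pred(x, y) else 0 for x, y in zip(halves(a), halves(b))]
    return (h1 << 16) | h0

def cmpeq16(a, b):
    return compare16(a, b, lambda x, y: x == y)

def scmplt16(a, b):
    return compare16(a, b, lambda x, y: s16(x) < s16(y))

def ucmplt16(a, b):
    return compare16(a, b, lambda x, y: x < y)

a, b = 0x80000005, 0x00010005
print(hex(cmpeq16(a, b)))    # 0xffff     (only the low halves are equal)
print(hex(scmplt16(a, b)))   # 0xffff0000 (0x8000 is -32768 when signed)
print(hex(ucmplt16(a, b)))   # 0x0        (0x8000 is 32768 when unsigned)
```

A mask of this shape can be combined with bitwise AND/OR to select between two packed values without branching.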
3.1.6. 8-bit Compare Instructions
Table 6. SIMD 8-bit Compare Instructions
No. | Instruction | Description |
1 | CMPEQ8 rd, rs1, rs2 | 8-bit Compare Equal |
2 | SCMPLT8 rd, rs1, rs2 | 8-bit Signed Compare Less Than |
3 | SCMPLE8 rd, rs1, rs2 | 8-bit Signed Compare Less Than & Equal |
4 | UCMPLT8 rd, rs1, rs2 | 8-bit Unsigned Compare Less Than |
5 | UCMPLE8 rd, rs1, rs2 | 8-bit Unsigned Compare Less Than & Equal |
3.1.7. 16-bit Multiply Instructions
Table 7. SIMD 16-bit Multiply Instructions
No. | Instruction | Description |
1 | SMUL16 rd, rs1, rs2 | 16-bit Signed Multiply |
2 | SMULX16 rd, rs1, rs2 | 16-bit Signed Crossed Multiply |
3 | UMUL16 rd, rs1, rs2 | 16-bit Unsigned Multiply |
4 | UMULX16 rd, rs1, rs2 | 16-bit Unsigned Crossed Multiply |
5 | KHM16 rd, rs1, rs2 | Q15 Signed Saturating Multiply |
6 | KHMX16 rd, rs1, rs2 | Q15 Signed Saturating Crossed Multiply |
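KHM16 is the Q15 fractional multiply. A minimal model, assuming each half computes (a * b) >> 15 and saturates to 0x7FFF only when both operands are 0x8000 (the single case where the Q15 product would be +1.0), looks like this. The function names are hypothetical, and the OV flag is modeled as a returned boolean.

```python
def s16(v):
    """Reinterpret an unsigned 16-bit value as signed."""
    return v - 0x10000 if v & 0x8000 else v

def khm_half(x, y):
    """Q15 x Q15 -> Q15 for one element; returns (result, overflow)."""
    if x == 0x8000 and y == 0x8000:
        return 0x7FFF, True            # -1.0 * -1.0 does not fit in Q15
    return ((s16(x) * s16(y)) >> 15) & 0xFFFF, False

def khm16(a, b):
    """Model KHM16 on one 32-bit word; returns (rd, OV stand-in)."""
    lo, ov0 = khm_half(a & 0xFFFF, b & 0xFFFF)
    hi, ov1 = khm_half((a >> 16) & 0xFFFF, (b >> 16) & 0xFFFF)
    return (hi << 16) | lo, ov0 or ov1

rd, ov = khm16(0x40008000, 0x40008000)
print(hex(rd), ov)   # 0x20007fff True: 0.5 * 0.5 = 0.25, and -1 * -1 saturates
```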
3.1.8. 8-bit Multiply Instructions
Table 8. SIMD 8-bit Multiply Instructions
No. | Instruction | Description |
1 | SMUL8 rd, rs1, rs2 | 8-bit Signed Multiply |
2 | SMULX8 rd, rs1, rs2 | 8-bit Signed Crossed Multiply |
3 | UMUL8 rd, rs1, rs2 | 8-bit Unsigned Multiply |
4 | UMULX8 rd, rs1, rs2 | 8-bit Unsigned Crossed Multiply |
5 | KHM8 rd, rs1, rs2 | Q7 Signed Saturating Multiply |
6 | KHMX8 rd, rs1, rs2 | Q7 Signed Saturating Crossed Multiply |
3.1.9. 16-bit Miscellaneous Instructions
Table 9. SIMD 16-bit Miscellaneous Instructions
No. | Instruction | Description |
1 | SMIN16 rd, rs1, rs2 | 16-bit Signed Minimum |
2 | UMIN16 rd, rs1, rs2 | 16-bit Unsigned Minimum |
3 | SMAX16 rd, rs1, rs2 | 16-bit Signed Maximum |
4 | UMAX16 rd, rs1, rs2 | 16-bit Unsigned Maximum |
5 | SCLIP16 rd, rs1, imm4u | 16-bit Signed Clip Value |
6 | UCLIP16 rd, rs1, imm4u | 16-bit Unsigned Clip Value |
7 | KABS16 rd, rs1 | 16-bit Absolute Value |
8 | CLRS16 rd, rs1 | 16-bit Count Leading Redundant Sign |
9 | CLZ16 rd, rs1 | 16-bit Count Leading Zero |
10 | SWAP16 rd, rs1 | Swap Halfword within Word |
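Two of the miscellaneous operations are easy to misread from the one-line descriptions, so here is a small model of SCLIP16 and CLZ16 on one 32-bit word. It assumes SCLIP16 clips each signed half to [-2^imm, 2^imm - 1] and that CLZ16 counts the leading zero bits of each half independently; the helper names are hypothetical and the OV flag is not modeled.

```python
def s16(v):
    """Reinterpret an unsigned 16-bit value as signed."""
    return v - 0x10000 if v & 0x8000 else v

def map_halves(r, fn):
    """Apply fn to each unsigned 16-bit half of r and repack the results."""
    lo = fn(r & 0xFFFF) & 0xFFFF
    hi = fn((r >> 16) & 0xFFFF) & 0xFFFF
    return (hi << 16) | lo

def sclip16(r, imm):
    # Clip each signed half to [-2**imm, 2**imm - 1] (assumed range, see lead-in).
    lo, hi = -(1 << imm), (1 << imm) - 1
    return map_halves(r, lambda h: max(lo, min(hi, s16(h))))

def clz16(r):
    # Leading zeros of a 16-bit value: 16 minus the position of its top set bit.
    return map_halves(r, lambda h: 16 - h.bit_length())

print(hex(sclip16(0x7FFFFF00, 7)))   # 0x7fff80: 32767 -> 127, -256 -> -128
print(hex(clz16(0x00010000)))        # 0xf0010: 15 leading zeros, and 16 for a zero half
```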
3.1.10. 8-bit Miscellaneous Instructions
Table 10. SIMD 8-bit Miscellaneous Instructions
No. | Instruction | Description |
1 | SMIN8 rd, rs1, rs2 | 8-bit Signed Minimum |
2 | UMIN8 rd, rs1, rs2 | 8-bit Unsigned Minimum |
3 | SMAX8 rd, rs1, rs2 | 8-bit Signed Maximum |
4 | UMAX8 rd, rs1, rs2 | 8-bit Unsigned Maximum |
5 | SCLIP8 rd, rs1, imm3u | 8-bit Signed Clip Value |
6 | UCLIP8 rd, rs1, imm3u | 8-bit Unsigned Clip Value |
7 | KABS8 rd, rs1 | 8-bit Absolute Value |
8 | CLRS8 rd, rs1 | 8-bit Count Leading Redundant Sign |
9 | CLZ8 rd, rs1 | 8-bit Count Leading Zero |
10 | SWAP8 rd, rs1 | Swap Byte within Halfword |
3.1.11. 8-bit Unpacking Instructions
Table 11. SIMD 8-bit Unpacking Instructions
No. | Instruction | Description |
1 | SUNPKD810 rd, rs1 | Signed Unpacking Bytes 1 & 0 |
2 | SUNPKD820 rd, rs1 | Signed Unpacking Bytes 2 & 0 |
3 | SUNPKD830 rd, rs1 | Signed Unpacking Bytes 3 & 0 |
4 | SUNPKD831 rd, rs1 | Signed Unpacking Bytes 3 & 1 |
5 | SUNPKD832 rd, rs1 | Signed Unpacking Bytes 3 & 2 |
6 | ZUNPKD810 rd, rs1 | Unsigned Unpacking Bytes 1 & 0 |
7 | ZUNPKD820 rd, rs1 | Unsigned Unpacking Bytes 2 & 0 |
8 | ZUNPKD830 rd, rs1 | Unsigned Unpacking Bytes 3 & 0 |
9 | ZUNPKD831 rd, rs1 | Unsigned Unpacking Bytes 3 & 1 |
10 | ZUNPKD832 rd, rs1 | Unsigned Unpacking Bytes 3 & 2 |
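The unpacking instructions widen two selected bytes of a word into two 16-bit halves. The sketch below models the family with one generic helper, assuming that in SUNPKD8xy / ZUNPKD8xy byte x goes to the upper half and byte y to the lower half, with sign extension for the "S" forms and zero extension for the "Z" forms. The function names are hypothetical.

```python
def byte_of(r, i):
    """Return r.B[i] as an unsigned value."""
    return (r >> (8 * i)) & 0xFF

def sext8_to_16(b):
    """Sign-extend an 8-bit value to 16 bits."""
    return b | 0xFF00 if b & 0x80 else b

def unpkd8(r, x, y, signed):
    """Model SUNPKD8xy (signed=True) or ZUNPKD8xy (signed=False) on one word."""
    widen = sext8_to_16 if signed else (lambda b: b)
    return (widen(byte_of(r, x)) << 16) | widen(byte_of(r, y))

r = 0x11F28480
print(hex(unpkd8(r, 1, 0, signed=True)))    # SUNPKD810 -> 0xff84ff80
print(hex(unpkd8(r, 3, 1, signed=True)))    # SUNPKD831 -> 0x11ff84
print(hex(unpkd8(r, 3, 2, signed=False)))   # ZUNPKD832 -> 0x1100f2
```

Widening like this is typically the first step before running 16-bit SIMD arithmetic on 8-bit input data such as pixels or audio samples.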
RISC-V DSP extension instruction set document:
https://download.csdn.net/download/u011376987/88898800