[Linux] Building the onnxruntime-gpu wheel from source on an Arm server

Server details:

aarch64 architecture
Ubuntu 20.04
NVIDIA T4 GPU

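Prerequisites for building onnxruntime-gpu:

A suitable version of CUDA is installed
A suitable version of cuDNN is installed
A suitable version of CMake is installed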
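
Before starting, it can help to confirm that these tools are actually visible on PATH. The following is a minimal sketch rather than an official check: `require_tool` is a hypothetical helper name, the cuDNN header locations are assumed typical paths, and `nvcc` may only be found after adding /usr/local/cuda/bin to PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a command is available on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Tools the build relies on; nvcc ships with the CUDA toolkit.
for tool in git cmake nvcc gcc g++; do
    require_tool "$tool" || true   # keep going so every missing tool is listed
done

# cuDNN has no command-line tool, so look for its version header instead.
# These two locations are assumptions; adjust them to your installation.
found_cudnn=no
for header in /usr/include/cudnn_version.h /usr/include/aarch64-linux-gnu/cudnn_version.h; do
    if [ -f "$header" ]; then
        echo "found: $header"
        found_cudnn=yes
    fi
done
[ "$found_cudnn" = yes ] || echo "cudnn_version.h not found in the assumed locations" >&2
```

Anything reported as missing should be installed or added to PATH before running the build.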

Steps for building onnxruntime-gpu from source

1. Download the source

git clone --recursive https://github.com/microsoft/onnxruntime.git
cd onnxruntime

2. Select a version

Then check out the tag that matches the onnxruntime-gpu version you want to install:

git checkout v1.16.3

3. Run the build command

From the onnxruntime root directory, run:

./build.sh \
    --config Release \
    --update \
    --build \
    --parallel \
    --build_wheel \
    --use_cuda \
    --allow_running_as_root \
    --cuda_home /usr/local/cuda \
    --cudnn_home /usr/lib/aarch64-linux-gnu \
    --skip_tests \
    --cmake_extra_defines \
        CMAKE_CUDA_ARCHITECTURES=75 \
        onnxruntime_ENABLE_NVTX_PROFILE=ON \
        onnxruntime_USE_MEMORY_EFFICIENT_ATTENTION=OFF \
        onnxruntime_USE_FLASH_ATTENTION=OFF \
        onnxruntime_BUILD_UNIT_TESTS=OFF \
        CMAKE_POLICY_VERSION_MINIMUM=3.5
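
Once the build finishes, the wheel still has to be located. The sketch below assumes the usual layout in which the wheel is written to build/Linux/Release/dist when run from the onnxruntime root; that path and the helper name `newest_wheel` are assumptions, not something stated in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified .whl in a directory.
# $1 = directory to search (defaults to the assumed Release dist directory)
newest_wheel() {
    ls -t "${1:-build/Linux/Release/dist}"/*.whl 2>/dev/null | head -n 1
}

wheel=$(newest_wheel)
if [ -n "$wheel" ]; then
    echo "built wheel: $wheel"
    echo "install with: python3 -m pip install $wheel"
else
    echo "no wheel found yet under build/Linux/Release/dist" >&2
fi
```

The script only prints the install command instead of running it, so it is safe to run at any stage of the build.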

4. Errors and how to fix them

4.1 Compute capability setting does not match the GPU

[ 53%] Built target onnxruntime_optimizer
make: *** [Makefile:166: all] Error 2
Traceback (most recent call last):
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 2684, in <module>
    sys.exit(main())
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 2577, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 1487, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 798, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/home/tc/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/home/tc/cmake-3.26.0-linux-aarch64/bin/cmake', '--build', '/home/tc/onnxruntime/build/Linux/Release', '--config', 'Release', '--', '-j40']' returned non-zero exit status 2.

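Cause:
The build.sh parameter CMAKE_CUDA_ARCHITECTURES=87 targets a GPU with compute capability 8.7, so check that this matches your hardware.
The correct value is easy to look up, for example by asking an AI assistant or checking NVIDIA's list of CUDA GPUs. The T4 has compute capability 7.5, so setting the value to 75 resolves the problem.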
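
The value can often be read from the driver instead of being looked up by hand. This is a minimal sketch, assuming a driver recent enough that `nvidia-smi` supports the `compute_cap` query field (older drivers may reject it); `cap_to_arch` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a compute capability such as "7.5" into the
# form CMAKE_CUDA_ARCHITECTURES expects ("75").
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '. '
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Read the capability of the first GPU only.
    cap=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
    echo "CMAKE_CUDA_ARCHITECTURES=$(cap_to_arch "$cap")"
else
    echo "nvidia-smi not found; look the GPU up in NVIDIA's compute capability table" >&2
fi
```

On a machine with several different GPUs, only the first one is reported, so the other values would have to be added manually.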

4.2 Dependency download times out

-- Using src='https://github.com/pytorch/cpuinfo/archive/ca678952a9a8eaa6de112d154e8e104b22f9ab3f.zip'
CMake Error at pytorch_cpuinfo-subbuild/pytorch_cpuinfo-populate-prefix/src/pytorch_cpuinfo-populate-stamp/download-pytorch_cpuinfo-populate.cmake:170 (message):
  Each download failed!

    error: downloading 'https://github.com/pytorch/cpuinfo/archive/ca678952a9a8eaa6de112d154e8e104b22f9ab3f.zip' failed
    status_code: 28
    status_string: "Timeout was reached"
    log:
    --- LOG BEGIN ---
    Trying 20.205.243.166:443...
    connect to 20.205.243.166 port 443 failed: Connection timed out
    Failed to connect to github.com port 443 after 131336 ms: Couldn't connect to server
    Closing connection 0
    --- LOG END ---

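Cause:
The dependency archive downloads too slowly and the request times out.
Solution:
Download the file manually, put it in the expected location, then rerun the build command.
Taking the error above as an example:
Download the archive in a browser. Entering the following URL starts the download automatically: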

https://github.com/pytorch/cpuinfo/archive/ca678952a9a8eaa6de112d154e8e104b22f9ab3f.zip

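Extract the downloaded cpuinfo-ca678952a9a8eaa6de112d154e8e104b22f9ab3f.zip into the directory onnxruntime/build/Linux/Release/_deps/pytorch_cpuinfo-subbuild/pytorch_cpuinfo-populate-prefix/src/ (a relative path). The error message states where the file is expected, so adjust the path to match your case.

Then rerun the build.sh command and the build continues from where it stopped.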
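
The manual workaround can also be scripted on the server itself. This is a sketch under stated assumptions: `populate_src_dir` and `fetch_dep` are hypothetical helpers, the directory layout is copied from the error message in this section, and the script is meant to be run from the onnxruntime root. The `curl` options used are `-L` (follow redirects), `--retry` and `-C -` (resume a partial file).

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the directory a dependency is extracted into.
# $1 = dependency name as it appears in the error (e.g. pytorch_cpuinfo)
# $2 = build directory (defaults to the Release build used in this article)
populate_src_dir() {
    local dep="$1" build_dir="${2:-build/Linux/Release}"
    echo "${build_dir}/_deps/${dep}-subbuild/${dep}-populate-prefix/src"
}

# Hypothetical helper: download an archive with retries and unpack it there.
# $1 = dependency name, $2 = archive URL taken from the error message
fetch_dep() {
    local dep="$1" url="$2" dest archive
    dest=$(populate_src_dir "$dep")
    archive="${dest}/${url##*/}"   # keep the file name from the URL
    mkdir -p "$dest"
    curl -L --retry 5 --retry-delay 10 -C - -o "$archive" "$url" \
        && unzip -o "$archive" -d "$dest"
}

# Usage, run from the onnxruntime root (not executed automatically):
# fetch_dep pytorch_cpuinfo https://github.com/pytorch/cpuinfo/archive/ca678952a9a8eaa6de112d154e8e104b22f9ab3f.zip
```

If the server cannot reach GitHub at all, downloading on another machine and copying the archive over remains the fallback.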

4.3 BFLOAT16 not supported

NVCC_ERROR = nvcc fatal   : Unknown option '-Wstrict-aliasing'
NVCC_OUT = 1
CMake Error at CMakeLists.txt:695 (message):
  The compiler doesn't support BFLOAT16!!!

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 2998, in <module>
    sys.exit(main())
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 2853, in main
    generate_build_tree(
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 1674, in generate_build_tree
    run_subprocess(
  File "/home/tc/onnxruntime/tools/ci_build/build.py", line 867, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
  File "/home/tc/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,

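Cause:
Starting with v1.17.0, ONNX Runtime requires BFLOAT16 instruction-set support on ARM, and the current compiler version does not provide it.
Solution 1: upgrade the compiler / system environment
According to the official documentation and discussions, building on JetPack 5.x (Ubuntu 18.04/20.04) requires GCC ≥ 10 (JetPack 6 corresponds to Ubuntu 22.04, which ships GCC 11+). The simplest fix is therefore to upgrade to JetPack 6 (Ubuntu 22.04 / GCC 11) or to install a newer GCC manually (for example with sudo apt install gcc-11 g++-11, then update the alternatives). Note that on anything older than Ubuntu 22.04 you may not be able to install gcc-12 and g++-12 directly.
After the upgrade, the -march=armv8.2-a+bf16 check passes. The failure simply means that a newer compiler is needed for BF16 support.

First, add the PPA that carries newer GCC versions: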

sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update

This PPA provides several versions of GCC and G++, including gcc-12 and g++-12.

Install the required versions of GCC and G++:

sudo apt install gcc-12 g++-12

Use update-alternatives to set the default version:

sudo update-alternatives --install /usr/bin/gcc  gcc  /usr/bin/gcc-12 120
sudo update-alternatives --install /usr/bin/g++  g++  /usr/bin/g++-12 120

If several versions of GCC and G++ are installed, the default can be chosen manually with:

sudo update-alternatives --config gcc
sudo update-alternatives --config g++

Verify the active version:

gcc --version
g++ --version
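
The version check can also be made mechanical. The sketch below is an assumption-laden illustration: `version_at_least` and `supports_bf16_flag` are hypothetical helpers, the threshold of 10 comes from the GCC requirement quoted above, and the flag probe only approximates what the CMake check does and is only meaningful on an aarch64 host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the major part of version $1 is >= $2.
version_at_least() {
    local major="${1%%.*}"
    [ "$major" -ge "$2" ]
}

# Hypothetical helper: ask the compiler to parse an empty program with the
# same -march flag that the CMake check probes. -fsyntax-only avoids writing
# any output file. $1 = compiler to test (defaults to gcc).
supports_bf16_flag() {
    echo 'int main(void) { return 0; }' \
        | "${1:-gcc}" -march=armv8.2-a+bf16 -fsyntax-only -x c - 2>/dev/null
}

if command -v gcc >/dev/null 2>&1; then
    ver=$(gcc -dumpversion)
    if version_at_least "$ver" 10; then
        echo "gcc $ver meets the GCC >= 10 requirement mentioned above"
    else
        echo "gcc $ver is older than 10; the BF16 check is likely to fail" >&2
    fi
fi

# Usage on the aarch64 build host (not executed automatically):
# supports_bf16_flag gcc-12 && echo "bf16 flag accepted"
```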

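Then rebuild.

Solution 2: modify the source to bypass the BFLOAT16 check
If you have to build in the existing environment, you can edit the source by hand to skip the BFLOAT16 check. In onnxruntime/cmake/CMakeLists.txt, find the following check: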

check_cxx_compiler_flag(-march=armv8.2-a+bf16 HAS_ARM64_BFLOAT16)
if(NOT HAS_ARM64_BFLOAT16)
  message(FATAL_ERROR "The compiler doesn't support BFLOAT16!!!")
endif()

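Change it to:

if(NOT HAS_ARM64_BFLOAT16)
  # message(WARNING "BFLOAT16 not supported, disabling BF16 optimizations")
  set(HAS_ARM64_BFLOAT16 TRUE)
endif()

This skips the fatal error raised when the compiler does not support BF16. Be aware that after skipping the check, BF16-optimized code may still be missing, so performance or functionality may be affected. Save the change and rerun CMake to continue the build.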
