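Preparation before installing on Windows
- Check hardware compatibility: confirm that the computer has an NVIDIA GPU. Press Win + R to open Run, enter "control /name Microsoft.DeviceManager" to open Device Manager, and expand "Display adapters" to see whether an NVIDIA device is listed.
- Verify CUDA support: press Win + R to open Run, enter "cmd" to open a command prompt, and run "nvidia-smi". The CUDA version shown in the top-right corner of the output is the highest CUDA version the installed driver supports. The CUDA Toolkit you install must not be newer than that, and the cuDNN build must be one released for that CUDA version; a mismatch causes compatibility problems.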
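The rule that the toolkit must not be newer than the driver's maximum CUDA version can be checked mechanically. The following is a minimal sketch in POSIX shell (on Windows it would need an environment such as Git Bash or WSL). The function name version_le and the two sample version numbers are illustrative assumptions, and the comparison relies on the -V option of GNU sort.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is less than or equal to $2.
# Relies on GNU sort's -V (version sort) option.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

driver_max="12.9"   # assumed value read from the nvidia-smi header
toolkit="12.4"      # assumed CUDA Toolkit version you plan to install

if version_le "$toolkit" "$driver_max"; then
    echo "CUDA $toolkit is within the driver limit ($driver_max)"
else
    echo "CUDA $toolkit is newer than the driver supports ($driver_max)"
fi
```

Version sort is used instead of a plain string comparison so that a value such as 12.10 is ordered after 12.9.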
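Install CUDA
- Download the CUDA Toolkit: visit the CUDA Toolkit Archive (https://developer.nvidia.com/cuda-toolkit-archive) and download the installer that matches your operating system version and the CUDA version you need, keeping to a version that your GPU and driver support.
- Run the installer: double-click the downloaded installer and follow the wizard. Custom installation is recommended, so that settings such as the installation path can be adjusted as needed.
- Configure environment variables: after installation, the CUDA paths need to be on the system PATH. On Windows, right-click "Computer" (or "This PC") -> Properties -> Advanced system settings -> Environment Variables, find the "Path" variable under System variables, and add the CUDA installation path. The installer normally adds these entries automatically; manual configuration is only needed if commands such as nvcc or nvidia-smi cannot be found.
- Verify the installation: open a command prompt and enter "nvcc -V". If version information is printed, CUDA is installed correctly.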
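Install cuDNN
- Download cuDNN: visit the cuDNN Archive (https://developer.nvidia.com/rdp/cudnn-archive) and download a cuDNN version that matches the installed CUDA version.
- Extract and install: extract the downloaded cuDNN archive into the CUDA installation directory. With the default installation path this is "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4", where v12.4 is the installed CUDA version. If Windows prompts to replace files in the destination while extracting, choose Replace.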
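- Verify the installation: open a command prompt, change to the folder under the CUDA installation directory that contains "deviceQuery.exe" (in toolkits that ship the demo suite this is typically "extras\demo_suite" rather than "bin"), and run "deviceQuery.exe". A result of PASS confirms that the CUDA runtime can reach the GPU. It does not exercise cuDNN itself, so also check that the cuDNN DLL files were copied into the "bin" folder.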
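Detailed tutorial: installing the NVIDIA driver, CUDA and cuDNN on Ubuntu
First, use Docker to pull an Ubuntu image and work inside a container, so that the original environment is left untouched.
Installing docker and docker-compose is omitted here.
Pull the Ubuntu image:
docker pull docker.m.daocloud.io/ubuntu:20.04
Create a directory named Ubuntu to hold the files, and create a docker-compose.yaml file in it: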
services:
  ubuntu:
    build: ./build
    image: ubuntu_k
    container_name: ubuntu_k
    restart: always
    runtime: nvidia
    privileged: true
    environment:
      # - CUDA_VISIBLE_DEVICES=1
      - HF_ENDPOINT=https://hf-mirror.com
      - HF_HUB_ENABLE_HF_TRANSFER=1
    ports:
      # quoted so that YAML does not read 60:22 as a base-60 number
      - "60:22"
    volumes:
      - ./data:/data
      - ./root:/root
    tty: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
Dockerfile (in the ./build directory named by the compose file):
FROM docker.m.daocloud.io/ubuntu:20.04
MAINTAINER Csars (Csars@qq.com)
ADD ./sources.list /etc/apt/
RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get update \
    && apt-get install -y curl \
    && apt-get install -y git \
    && apt-get install -y openssh-server
# Configure SSH server
RUN mkdir /var/run/sshd
RUN echo 'root:root' | chpasswd
RUN sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config
# SSH login fix. Otherwise user is kicked off after login
RUN sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd
#ENV NOTVISIBLE "in users profile"
RUN echo "export VISIBLE=now" >> /etc/profile
ADD ./sshd_config /etc/ssh/
#RUN npm install -g https://gaccode.com/claudecode/install
# Expose the SSH port
EXPOSE 22
ENTRYPOINT ["/usr/sbin/sshd", "-D"]
# sources.list: the mirror can be swapped for one that suits your Ubuntu release
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
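Changing the source file on Ubuntu 20.04
Step 1: back up the source file:
sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup
Step 2: edit /etc/apt/sources.list and add the following entries at the top of the file (make a backup before doing so):
vi /etc/apt/sources.list
NetEase 163 mirror
# Source-code mirrors are commented out by default to speed up apt update; uncomment them if needed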
deb http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
# Pre-release software source; enabling it is not recommended
# deb http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
Step 3: run the update commands:
sudo apt-get update
sudo apt-get upgrade
Commonly used mirrors in China:
Alibaba Cloud mirror
deb http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ focal-backports main restricted universe multiverse
Tsinghua mirror
# Source-code mirrors are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-security main restricted universe multiverse
# Pre-release software source; enabling it is not recommended
# deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
USTC mirror
deb https://mirrors.ustc.edu.cn/ubuntu/ focal main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-updates main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-backports main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-security main restricted universe multiverse
deb https://mirrors.ustc.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
deb-src https://mirrors.ustc.edu.cn/ubuntu/ focal-proposed main restricted universe multiverse
NetEase 163 mirror
deb http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-security main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-updates main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-proposed main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ focal-backports main restricted universe multiverse
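All of the mirror lists above follow the same pattern: one deb line per suite, with only the mirror host changing. As a rough sketch of that pattern, the hypothetical helper gen_sources below prints the four commonly enabled suites for a given mirror and release codename. The function name and the choice to leave out deb-src and the proposed suite are assumptions made for brevity.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal sources.list for a mirror and release.
# $1 = mirror base URL (no trailing slash), $2 = release codename.
gen_sources() {
    for suite in "$2" "$2-security" "$2-updates" "$2-backports"; do
        printf 'deb %s/ubuntu/ %s main restricted universe multiverse\n' "$1" "$suite"
    done
}

# Prints to stdout only; nothing under /etc/apt is modified.
gen_sources "http://mirrors.163.com" "focal"
```

The output can be reviewed first and then written to /etc/apt/sources.list once the original file has been backed up.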
Contents of the sshd_config configuration file (cat sshd_config):
# $OpenBSD: sshd_config,v 1.103 2018/04/09 20:41:22 tj Exp $

# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented. Uncommented options override the
# default value.

#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key

# Ciphers and keying
#RekeyLimit default none

# Logging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
#PermitRootLogin prohibit-password
PermitRootLogin yes
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

PubkeyAuthentication yes
#RSAAuthentication yes

# Expect .ssh/authorized_keys2 to be disregarded by default in future.
#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2

#AuthorizedPrincipalsFile none

#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody

# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# To disable tunneled clear text passwords, change to no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no

# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes

#AllowAgentForwarding yes
#AllowTcpForwarding yes
#GatewayPorts no
X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
#TCPKeepAlive yes
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS no
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none

# no default banner path
#Banner none

# Allow client to pass locale environment variables
AcceptEnv LANG LC_*

# override default of no subsystems
Subsystem sftp /usr/lib/openssh/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
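With docker-compose.yaml, the Dockerfile, sources.list and sshd_config in place, the container can be built and reached over SSH. The sketch below assumes that the Dockerfile, sources.list and sshd_config sit in a build subdirectory next to docker-compose.yaml, that Docker Compose v2 is installed (with v1 the command is docker-compose), and that it is run on the Docker host itself. The run and bring_up helpers are hypothetical names, and by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

bring_up() {
    # Build the image from ./build and start the service in the background.
    run docker compose up -d --build
    # The compose file maps host port 60 to the container's SSH port 22;
    # the Dockerfile sets the root password to "root".
    run ssh -p 60 root@localhost
}

bring_up
```

Setting DRY_RUN=0 before calling bring_up executes the commands for real.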
Before installing the driver, update the package list and install the required tools and dependencies (mandatory)
sudo apt-get update      # refresh the package list
sudo apt-get install g++
sudo apt-get install gcc
sudo apt-get install make
# Check the GPU model
lspci | grep -i nvidia
# Example output
01:00.0 VGA compatible controller: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
# Check the hardware architecture; if the result is x86_64, choose Linux 64-bit
uname -m
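The two checks above can be wrapped into a small preflight script. This is only a sketch: has_nvidia_gpu and download_arch are hypothetical helper names, the sample line is the lspci output quoted above, and architectures other than x86_64 are deliberately left unhandled.

```shell
#!/bin/sh
# Hypothetical helper: reads lspci-style text on stdin and succeeds
# when at least one line mentions NVIDIA (case-insensitive).
has_nvidia_gpu() {
    grep -qi 'nvidia'
}

# Hypothetical helper: map a `uname -m` value to the download choice.
download_arch() {
    case "$1" in
        x86_64) echo "Linux 64-bit" ;;
        *)      echo "not covered by this sketch: $1" ;;
    esac
}

sample='01:00.0 VGA compatible controller: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] (rev a1)'
if printf '%s\n' "$sample" | has_nvidia_gpu; then
    echo "NVIDIA GPU found"
fi
download_arch "$(uname -m)"
```

On a real machine the sample line would be replaced by piping lspci into has_nvidia_gpu.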
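Install the NVIDIA driver
- Check that the GPU is detected
lspci | grep -i nvidia
If NVIDIA GPU information is listed, the system has detected the card.
- Install the kernel headers:
sudo apt-get install linux-headers-$(uname -r)
- Add the CUDA repository and install the driver: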
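# The cuda-keyring package has to be downloaded before dpkg can install it
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install nvidia-driver-535 -y
sudo reboot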
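After rebooting, verify that the driver was installed with the following command
nvidia-smi
If the driver version and CUDA version are shown, the driver is installed correctly.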
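The CUDA version printed by nvidia-smi can also be extracted in a script, for example to compare it with the toolkit version you intend to install. The following is a minimal sketch; smi_cuda_version is a hypothetical helper, and the sample header line uses made-up version numbers purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read nvidia-smi output on stdin and print the
# value of the "CUDA Version" field from the header line.
smi_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Illustrative header line; the version numbers are made up for the example.
sample='| NVIDIA-SMI 535.00.00    Driver Version: 535.00.00    CUDA Version: 12.2  |'
printf '%s\n' "$sample" | smi_cuda_version

# On a machine with the driver installed:
#   nvidia-smi | smi_cuda_version
```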
Install the CUDA Toolkit
- Add the graphics-drivers PPA (a community-maintained Ubuntu driver PPA, not NVIDIA's own CUDA repository)
sudo apt-get install -y software-properties-common
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
- Install the CUDA Toolkit (Ubuntu 20.04 is used here; for other versions, download from the CUDA Toolkit 12.9 Downloads page on NVIDIA Developer)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.9.0/local_installers/cuda-repo-ubuntu2004-12-9-local_12.9.0-575.51.03-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-12-9-local_12.9.0-575.51.03-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2004-12-9-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-9
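The installation resolves the toolkit's dependencies automatically. The cuda-toolkit-12-9 package itself does not include the NVIDIA driver, so if a matching driver is not already present, install one of the following flavors:
To install the open kernel module flavor: sudo apt-get install -y nvidia-open
To install the proprietary kernel module flavor: sudo apt-get install -y cuda-drivers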
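The download URLs above contain the identifier ubuntu2004, which is the distribution ID followed by the release number without the dot. As a hedged sketch for other releases, the hypothetical helper cuda_repo_id below derives such an identifier from /etc/os-release. Whether NVIDIA actually publishes a repository under the resulting name has to be confirmed on the download page.

```shell
#!/bin/sh
# Hypothetical helper: build a repository identifier (e.g. ubuntu2004)
# from an os-release style file. $1 defaults to /etc/os-release.
cuda_repo_id() {
    (
        # Source the file in a subshell so its variables do not leak out.
        . "${1:-/etc/os-release}"
        printf '%s%s\n' "$ID" "$(printf '%s' "$VERSION_ID" | tr -d '.')"
    )
}

if [ -r /etc/os-release ]; then
    echo "repository id: $(cuda_repo_id)"
fi
```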
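- Configure the environment variables: edit the ~/.bashrc file and append the following lines at the end (the directory must match the installed toolkit, 12.9 in this walkthrough)
export PATH=/usr/local/cuda-12.9/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.9/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
After saving and closing the file, run the following command so that the change takes effect immediately
source ~/.bashrc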
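The ${PATH:+:${PATH}} form in the export lines appends the old value, preceded by a colon, only when that value is set and non-empty. This avoids a trailing colon, which the shell and the dynamic linker would treat as an empty entry meaning the current directory. The sketch below shows the same expansion in isolation; prepend_path and the /opt/demo paths are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prepend directory $1 to the colon-separated list $2.
# ${2:+:$2} expands to ":<list>" only when $2 is set and non-empty.
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

prepend_path /opt/demo/lib64 ""               # empty list: no trailing colon
prepend_path /opt/demo/bin "/usr/bin:/bin"    # existing list follows the new entry
```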
- Verify the CUDA installation
nvcc --version
If version information is printed, CUDA is installed correctly.
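To check in a script that nvcc reports the release you expect, the release number can be pulled out of the nvcc --version output. A minimal sketch follows; nvcc_release is a hypothetical helper, and the sample line imitates the usual "Cuda compilation tools" line with an illustrative build number.

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` output on stdin and print
# the release number (the value after "release").
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Illustrative line; the build number after "V" is made up for the example.
sample='Cuda compilation tools, release 12.9, V12.9.0'
printf '%s\n' "$sample" | nvcc_release

# With the toolkit on PATH:
#   nvcc --version | nvcc_release
```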
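Install cuDNN
- Download cuDNN: visit the cuDNN Archive (https://developer.nvidia.com/rdp/cudnn-archive) and download a cuDNN version that matches the installed CUDA version.
- Extract and install: extract the downloaded cuDNN archive and copy its contents into the CUDA installation directory, for example /usr/local/cuda-12.9. The header files (include/cudnn*.h) go into its include directory and the libraries (lib/libcudnn*) into its lib64 directory; overwrite files of the same name if any already exist.
- Verify the installation: run the following command to refresh the dynamic linker cache; afterwards, "ldconfig -p | grep libcudnn" should list the cuDNN libraries
sudo ldconfig /usr/local/cuda-12.9/lib64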
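A further way to confirm which cuDNN release was copied in is to read the version macros from its header. The sketch below assumes a cuDNN release that ships cudnn_version.h with CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL definitions, and that the header was copied to /usr/local/cuda/include. Both the helper name cudnn_version and that path are assumptions to adjust to your installation.

```shell
#!/bin/sh
# Hypothetical helper: print major.minor.patch from a cudnn_version.h file.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }' "$1"
}

header="/usr/local/cuda/include/cudnn_version.h"   # assumed location
if [ -f "$header" ]; then
    echo "cuDNN $(cudnn_version "$header")"
else
    echo "cudnn_version.h not found at $header"
fi
```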
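Driver and CUDA installation locations
- NVIDIA driver: usually installed under /usr/lib/nvidia-<driver-version> and /usr/lib32/nvidia-<driver-version>; the exact location depends on how the driver was packaged.
- CUDA Toolkit: the default installation path is /usr/local/cuda-<version>, for example /usr/local/cuda-12.9.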
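When several toolkits are installed side by side, each one gets its own /usr/local/cuda-<version> directory. The following sketch lists the versions found under a given root; list_cuda_versions is a hypothetical helper name, and the default root of /usr/local simply follows the default path stated above.

```shell
#!/bin/sh
# Hypothetical helper: print the version suffix of every cuda-<version>
# directory under $1 (defaults to /usr/local).
list_cuda_versions() {
    root="${1:-/usr/local}"
    for d in "$root"/cuda-*; do
        # Skip the unexpanded pattern and anything that is not a directory.
        [ -d "$d" ] || continue
        printf '%s\n' "${d##*/cuda-}"
    done
}

list_cuda_versions
```

If nothing is printed, no versioned toolkit directory exists under the chosen root.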
Optional step: install the NVIDIA Container Toolkit (for Docker)
To let Docker containers use the GPU, install the NVIDIA Container Toolkit:
- Set up the GPG key and the package repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
- Update the package list and install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
- Configure the Docker daemon
sudo nvidia-ctk runtime configure --runtime=docker
- Restart the Docker service
sudo systemctl restart docker
- Verify that a Docker container can use the GPU
sudo docker run --rm --gpus all nvidia/cuda:12.5.1-base-ubuntu22.04 nvidia-smi
If the command succeeds and the container prints the same nvidia-smi table as the host, the configuration works.
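If the container test fails, one thing worth checking is whether the "configure the Docker daemon" step actually registered the NVIDIA runtime. The sketch below assumes that nvidia-ctk wrote its settings to /etc/docker/daemon.json and does a plain text match for an "nvidia" entry rather than parsing the JSON. The helper name has_nvidia_runtime is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given daemon.json mentions
# an "nvidia" entry (a plain text match, not a JSON parser).
has_nvidia_runtime() {
    [ -f "$1" ] && grep -q '"nvidia"' "$1"
}

config="/etc/docker/daemon.json"   # assumed location of the Docker daemon config
if has_nvidia_runtime "$config"; then
    echo "nvidia runtime is registered in $config"
else
    echo "no nvidia runtime entry found in $config"
fi
```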