ER-NeRF is a NeRF-based method for generating talking digital humans, and it can achieve real-time rendering.
Download the source code
cd D:\Projects\
git clone https://github.com/Fictionarry/ER-NeRF
cd D:\Projects\ER-NeRF
Download the models
Prepare the face-parsing model
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_parsing/79999_iter.pth?raw=true -O data_utils/face_parsing/79999_iter.pth
Prepare the Basel Face Model
Create a 3DMM folder inside data_utils/face_tracking:
mkdir -p data_utils/face_tracking/3DMM
Download 01_MorphableModel.mat
https://faces.dmi.unibas.ch/bfm/main.php?nav=1-2&id=downloads
Tick the required options and fill in your details. After submitting, an email containing the download link and account credentials will be sent to your mailbox. Once you enter them correctly, you can download a tar archive; extract it and place 01_MorphableModel.mat into the project's data_utils/face_tracking/3DMM folder.
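Once the archive arrives, placing the file can be scripted. A minimal sketch — the archive and folder names here are assumptions based on the BFM 2009 distribution, so adjust them to what you actually downloaded:

```shell
# Sketch: extract the Basel Face Model archive and copy the .mat file
# into the project. The archive name and the PublicMM1 folder inside it
# are assumptions; check the contents of your download.
install_bfm() {  # usage: install_bfm <downloaded-archive.tgz>
  tar -xf "$1"
  cp PublicMM1/01_MorphableModel.mat data_utils/face_tracking/3DMM/
}
```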
Download the remaining files
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/exp_info.npy?raw=true -O data_utils/face_tracking/3DMM/exp_info.npy
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/keys_info.npy?raw=true -O data_utils/face_tracking/3DMM/keys_info.npy
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/sub_mesh.obj?raw=true -O data_utils/face_tracking/3DMM/sub_mesh.obj
wget https://github.com/YudongGuo/AD-NeRF/blob/master/data_util/face_tracking/3DMM/topology_info.npy?raw=true -O data_utils/face_tracking/3DMM/topology_info.npy
Deploy the project
Pull the CUDA 11.6 image
docker pull nvcr.io/nvidia/cuda:11.6.1-cudnn8-devel-ubuntu20.04
Create the container (pass --gpus all so the container can see the GPU)
docker run -it --gpus all --name ernerf -v D:\Projects\ER-NeRF:/ernerf nvcr.io/nvidia/cuda:11.6.1-cudnn8-devel-ubuntu20.04
Install system dependencies
apt-get update -yq --fix-missing
DEBIAN_FRONTEND=noninteractive apt-get install -yq --no-install-recommends pkg-config wget cmake curl git vim
# On Ubuntu, pyaudio needs portaudio support to work properly.
apt install portaudio19-dev
Install Miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh -b -u -p ~/miniconda3
~/miniconda3/bin/conda init
source ~/.bashrc
Create the conda environment
conda create -n ernerf python=3.10
conda activate ernerf
Install the Python dependencies
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
pip install -r requirements.txt
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d==0.7.4 -c pytorch3d
conda install ffmpeg
pip install tensorflow-gpu==2.8.0
pip install numpy==1.22.4
pip install opencv-python-headless
pip install protobuf==3.20.0
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
Run convert_BFM.py
cd data_utils/face_tracking
python convert_BFM.py
Preprocessing
Video preprocessing
Place the video at data/<ID>/<ID>.mp4
The video must be 25 FPS, with the talking person present in every frame. The resolution should be about 512x512 and the duration about 1-5 minutes.
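A raw recording can be conformed to these requirements with an ffmpeg pass. A minimal sketch, assuming a hypothetical raw/<ID>_source.mp4 input path and that a plain resize is acceptable (cropping to the face region first usually preserves proportions better):

```shell
# Sketch: resample a raw recording to 25 FPS and resize it to 512x512
# before running the preprocessing script. The raw/<ID>_source.mp4
# input path is an assumption; the output path matches the layout above.
conform_video() {  # usage: conform_video <ID>
  mkdir -p "data/$1"
  ffmpeg -y -i "raw/$1_source.mp4" \
    -vf "scale=512:512" -r 25 \
    -c:a aac "data/$1/$1.mp4"
}
```

For example, conform_video obama would write data/obama/obama.mp4, ready for the processing script below.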
Run the script to process the video
python data_utils/process.py data/<ID>/<ID>.mp4
Audio preprocessing
Specify the type of audio features to use at training and test time:
--asr_model <deepspeech, esperanto, hubert>
DeepSpeech
python data_utils/deepspeech_features/extract_ds_features.py --input data/<name>.wav
# save to data/<name>.npy
Wav2Vec
python data_utils/wav2vec.py --wav data/<name>.wav --save_feats
# save to data/<name>_eo.npy
HuBERT
# Borrowed from GeneFace. English pre-trained.
python data_utils/hubert.py --wav data/<name>.wav
# save to data/<name>_hu.npy
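Each extractor above writes a feature file with a different suffix. As a small convenience, a helper can map an --asr_model choice to the file it produces (a sketch; the suffixes come from the comments above):

```shell
# Map an --asr_model choice to the feature file the matching extractor
# writes for data/<name>.wav (suffixes taken from the extraction steps above).
feat_path() {  # usage: feat_path <name> <asr_model>
  case "$2" in
    deepspeech) echo "data/$1.npy" ;;
    esperanto)  echo "data/$1_eo.npy" ;;  # wav2vec features
    hubert)     echo "data/$1_hu.npy" ;;
    *) echo "unknown asr_model: $2" >&2; return 1 ;;
  esac
}
```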
Training
The first run takes some time to compile the CUDA extensions.
# train (head and lpips finetune, run in sequence)
python main.py data/obama/ --workspace trial_obama/ -O --iters 100000
python main.py data/obama/ --workspace trial_obama/ -O --iters 125000 --finetune_lips --patch_size 32

# train (torso)
# <head>.pth should be the latest checkpoint in trial_obama
python main.py data/obama/ --workspace trial_obama_torso/ -O --torso --head_ckpt <head>.pth --iters 200000
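After the torso model finishes training, inference can be run with the same entry point. The sketch below wraps the test invocations as helpers; the --test and --aud flags follow the upstream ER-NeRF README and should be verified against your checkout, and the audio argument is a feature file produced by the audio preprocessing step:

```shell
# Sketch: render with the trained torso model (flags are assumptions
# based on the upstream README; verify against your checkout).
run_test() {  # renders using the dataset's own audio features
  python main.py data/obama/ --workspace trial_obama_torso/ -O --torso --test
}
run_test_with_audio() {  # $1 = path to a preprocessed .npy feature file
  python main.py data/obama/ --workspace trial_obama_torso/ -O --torso --test --aud "$1"
}
```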