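one4all troubleshooting log
- Task
- Pitfalls recap
- Action
- Pitfall
- Action
- Pitfall
- Action
- Next step
- Testing Habitat-sim
- Testing habitat-lab
- Continuing with ONE4ALL
Task
I read the paper "One-4-All: Neural Potential Fields for Embodied Navigation" and found it quite interesting, and the authors have open-sourced the code. Vision-and-language navigation (VLN) is something I have wanted to work on for a long time, and this project uses the Habitat simulation environment, which many VLN projects seem to use. My goal is to reproduce the project and, along the way, figure out how Habitat works.
Pitfalls recap
Nothing special to say here: I simply followed the original project's readme.
Action
- As usual, create a new folder, then git clone.
- The readme says to create the virtual environment with venv, but I am used to conda, so: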
conda create -n ONE4ALL python=3.10
conda activate ONE4ALL
Pitfall
When installing the dependencies:
pip3 install -r requirements.txt
Error:
Running command git clone --filter=blob:none --quiet https://github.com/facebookincubator/submitit /tmp/pip-install-l7el9026/submitit_2fc7c3624f664ace92cc27ec82088ad7
fatal: unable to access 'https://github.com/facebookincubator/submitit/': gnutls_handshake() failed: Error in the pull function.
error: subprocess-exited-with-error
× git clone --filter=blob:none --quiet https://github.com/facebookincubator/submitit /tmp/pip-install-l7el9026/submitit_2fc7c3624f664ace92cc27ec82088ad7 did not run successfully.
│ exit code: 128
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
I run into this kind of problem often; it is usually caused by git's proxy settings. I checked them and found nothing wrong, so I ran the command again and got a different error:
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu113
Collecting submitit (from -r requirements.txt (line 6))
Cloning https://github.com/facebookincubator/submitit (to revision escape_all) to /tmp/pip-install-4zalxp9i/submitit_20dd99e9b81e4548b972292505b58f6e
Running command git clone --filter=blob:none --quiet https://github.com/facebookincubator/submitit /tmp/pip-install-4zalxp9i/submitit_20dd99e9b81e4548b972292505b58f6e
WARNING: Did not find branch or tag 'escape_all', assuming revision or ref.
Running command git checkout -q escape_all
error: pathspec 'escape_all' did not match any file(s) known to git
error: subprocess-exited-with-error
× git checkout -q escape_all did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
The problem is that the escape_all branch cannot be found. I opened the submitit repository and, sure enough, there is no such branch. I tried dropping the branch pin by changing the line in requirements.txt to:
git+https://github.com/facebookincubator/submitit#egg=submitit
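This time there was no error. If something breaks later, the cause may well be a version mismatch here. Honestly, why insist on an experimental branch when a stable one would do?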
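The edit to the requirements line can be sketched as a small helper. This is only an illustration: the pinned form of the line (with `@escape_all` before `#egg=`) is an assumption inferred from the pip log, and `drop_vcs_ref` is a hypothetical name, not part of pip.

```python
def drop_vcs_ref(requirement: str) -> str:
    """Remove an '@<ref>' pin from a pip VCS requirement, keeping any '#egg=' fragment."""
    # Separate the '#egg=...' fragment so it is carried over untouched.
    url, sep, fragment = requirement.partition("#")
    scheme, scheme_sep, rest = url.partition("://")
    # A ref pin is an '@' after the last '/'; an earlier '@' is a user name (git@host).
    at = rest.rfind("@")
    if at > rest.rfind("/"):
        rest = rest[:at]
    return scheme + scheme_sep + rest + sep + fragment


# Assumed original line, reconstructed from "to revision escape_all" in the log.
pinned = "git+https://github.com/facebookincubator/submitit@escape_all#egg=submitit"
print(drop_vcs_ref(pinned))
```

The helper does not handle refs that contain a slash (for example `@feature/x`), which is enough for this case but not a general parser.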
Then I ran into version errors with other packages:
Collecting matplotlib==3.5.1 (from -r requirements.txt (line 16))
Downloading matplotlib-3.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
ERROR: Could not find a version that satisfies the requirement numpy==1.22.2 (from versions: 1.24.1, 1.26.3)
ERROR: No matching distribution found for numpy==1.22.2
I tried without pinning the version:
pip3 install matplotlib
pip printed an error, but the install succeeded:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
vcstools 0.1.42 requires pyyaml, which is not installed.
wstool 0.1.17 requires pyyaml, which is not installed.
Successfully installed contourpy-1.2.0 cycler-0.12.1 fonttools-4.49.0 kiwisolver-1.4.5 matplotlib-3.8.3 numpy-1.26.4 packaging-23.2 pillow-10.2.0 pyparsing-3.1.1 python-dateutil-2.8.2 six-1.16.0
So, following the hint, I installed the missing packages:
pip3 install pyyaml catkin-pkg roskpkg
That ended with an error:
ERROR: Could not find a version that satisfies the requirement roskpkg (from versions: none)
ERROR: No matching distribution found for roskpkg
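Not found, so be it; I probably will not need it anyway. (In hindsight, roskpkg is a typo for rospkg, which is the name used in the full command list further down.)
I continued installing the remaining packages from the readme by hand, no longer pinning versions: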
pip3 install numpy Pillow pytorch-lightning protobuf scikit-image scipy torch torchvision setuptools pykeops seaborn tensorflow-gpu tensorflow-probability einops prettytable tqdm imageio-ffmpeg
Then, surprisingly, pip could not find any version of torch:
ERROR: Ignored the following versions that require a different python version: 1.6.2 Requires-Python >=3.7,<3.10; 1.6.3 Requires-Python >=3.7,<3.10; 1.7.0 Requires-Python >=3.7,<3.10; 1.7.1 Requires-Python >=3.7,<3.10
ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch
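The "Ignored the following versions" line means pip skipped releases whose Requires-Python range excludes the running interpreter. A minimal sketch of that check follows, using the `>=3.7,<3.10` range from the log; `python_allowed` is a hypothetical helper that only understands simple comparison clauses, whereas pip itself relies on the `packaging` library.

```python
import sys


def _parse(version: str) -> tuple:
    """Turn '3.10' into (3, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def python_allowed(requires_python: str, python=sys.version_info[:2]) -> bool:
    """Check a (major, minor) Python version against a simple Requires-Python string."""
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    for clause in requires_python.split(","):
        clause = clause.strip()
        # Two-character operators are tried first so '>=' is not read as '>'.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = _parse(clause[len(symbol):].strip())
                if not ops[symbol](tuple(python), bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(python_allowed(">=3.7,<3.10", (3, 9)))
print(python_allowed(">=3.7,<3.10", (3, 10)))
```

Comparing tuples matters here: as plain strings, "3.10" sorts before "3.7", which would give the wrong answer.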
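So I deleted the environment, recreated it, and installed the packages by hand one at a time. Almost everything went through; only tensorflow-gpu stubbornly refused!
Specifications:
- tensorflow-gpu -> python[version='2.7.*|3.6.*|3.7.*|3.8.*|3.9.*|3.5.*|>=3.5,<3.6.0a0|>=3.6,<3.7.0a0|>=2.7,<2.8.0a0']
Your python: python=3.10
Do I really have to create an environment with an older Python and reinstall everything? I was not ready to give up, so I searched online for a glimmer of hope. And guess what? There was one! A blog post I found says that TensorFlow 2.9 supports Python 3.10. Looking at requirements.txt again, it asks for tensorflow-gpu==2.9.1 in the first place. So, naturally, I ran: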
pip3 install tensorflow-gpu==2.9.1
Solved!
The full sequence:
pip3 install empy rospkg pyyaml catkin_pkg
pip3 install submitit hydra-submitit-launcher
pip3 install torch torchvision
pip3 install pandas albumentations networkx rich hydra-core hydra-colorlog
pip3 install hydra_optuna_sweeper scikit-learn comet_ml gym imageio matplotlib numpy Pillow pytorch-lightning protobuf scikit-image scipy
pip3 install setuptools pykeops seaborn einops prettytable tqdm imageio-ffmpeg
pip3 install tensorflow-probability
pip3 install tensorflow-gpu==2.9.1
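Because most version pins were dropped along the way, it helps to record what actually got installed and compare it with requirements.txt by hand. A minimal sketch using only the standard library; the package names in the example call are just a sample from the commands above.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    sample = ["torch", "tensorflow-gpu", "pytorch-lightning", "numpy", "rospkg"]
    for name, version in installed_versions(sample).items():
        print(f"{name:20s} {version or 'MISSING'}")
```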
Action
Following the readme, run:
pip3 install geomloss
Pitfall
This part of the readme:
cd mazelab
pip3 install -e .
export PYTHONPATH=<path_to>/one4all/:$PYTHONPATH
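left me baffled, because there is no mazelab directory to cd into. A quick search showed that mazelab is a separate Python project for generating mazes: the mazelab project.
So I created a new directory, entered it, confirmed that everything in its requirements was already in my environment, and then followed the mazelab readme: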
pip3 install -e .
The install went through: Successfully installed mazelab-0.2.0
Action
As the readme instructs, check the cmake version:
cmake --version
Mine is cmake version 3.22.1, which satisfies the requirement of being newer than 3.10.
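The same check can be scripted, which avoids the trap of comparing version strings as text. A minimal sketch; `parse_cmake_version` and `cmake_is_new_enough` are hypothetical helper names, and the 3.10 minimum is the one quoted from the readme.

```python
import re
import shutil
import subprocess


def parse_cmake_version(text: str) -> tuple:
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no cmake version found")
    # A missing patch component is treated as 0.
    return tuple(int(group) for group in match.groups(default="0"))


def cmake_is_new_enough(minimum=(3, 10)) -> bool:
    """Return True if cmake is on PATH and at least `minimum`."""
    if shutil.which("cmake") is None:
        return False
    output = subprocess.run(
        ["cmake", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return parse_cmake_version(output) >= minimum


print(parse_cmake_version("cmake version 3.22.1"))
```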
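Next comes Habitat, which consists of Habitat-sim and Habitat-lab. The one4all readme installs both from source. Meta's official repositories, however, recommend installing Habitat-sim with conda and installing Habitat-lab from a clone. Being stubborn, I decided to do it the official way!
For the conda install of Habitat-sim, Meta offers several options:
Comparing them, they are essentially the same, with only two differences: add withbullet if you want Bullet physics, and add headless if the machine has no display. I have a display. The source build command in the one4all readme, python setup.py --bullet --with-cuda build_ext --parallel 8 install --cmake-args="-DUSE_SYSTEM_ASSIMP=ON", also mentions bullet, and the habitat-lab install instructions likewise ask for conda install habitat-sim withbullet -c conda-forge -c aihabitat, so I chose to run: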
conda install habitat-sim withbullet -c conda-forge -c aihabitat
It just kept looping at this point, so I had to kill it:
Collecting package metadata (current_repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Solving environment: unsuccessful attempt using repodata from current_repodata.json, retrying with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
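It looks like an environment problem, some incompatibility with the other packages. Following the official instructions, I created a fresh environment to test: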
conda create -n habitat python=3.9 cmake=3.14.0
conda activate habitat
conda install habitat-sim withbullet -c conda-forge -c aihabitat
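That worked without any problem. So there are three possible reasons why it fails in my environment:
- cmake
- Python 3.10
- other packages
I decided to delete the official-style environment and create a new one with Python 3.10 and no cmake pin: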
conda create -n habitat python=3.10
conda activate habitat
conda install habitat-sim withbullet -c conda-forge -c aihabitat
That failed, so the cause is most likely Python 3.10 or cmake.
Next attempt:
conda create -n habitat python=3.10 cmake=3.14.0
conda activate habitat
conda install habitat-sim withbullet -c conda-forge -c aihabitat
Still failing. So maybe cmake is irrelevant and Python 3.9 is required?
One more try with Python 3.9 and no cmake pin:
conda create -n habitat python=3.9
conda activate habitat
conda install habitat-sim withbullet -c conda-forge -c aihabitat
That worked... so it really does need Python 3.9...
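The four fresh-environment attempts form a small experiment table, and the conclusion can be read off mechanically. A sketch of that reasoning, with the outcomes taken from the attempts above; `decisive_factors` is a hypothetical helper written for this illustration.

```python
# Outcome of each fresh environment: (python version, cmake pin) -> did the solve succeed?
attempts = {
    ("3.9", "3.14.0"): True,
    ("3.10", None): False,
    ("3.10", "3.14.0"): False,
    ("3.9", None): True,
}


def decisive_factors(results):
    """Return the indices of factors whose value alone predicts the outcome."""
    decisive = []
    n_factors = len(next(iter(results)))
    for i in range(n_factors):
        outcome_by_value = {}
        consistent = True
        for combo, ok in results.items():
            # setdefault stores the first outcome seen for this value and returns it afterwards.
            if outcome_by_value.setdefault(combo[i], ok) != ok:
                consistent = False
        # Keep the factor only if it never contradicts itself and its values differ in outcome.
        if consistent and len(set(outcome_by_value.values())) > 1:
            decisive.append(i)
    return decisive


print(decisive_factors(attempts))  # index 0 is the Python version, index 1 the cmake pin
```

With these four results only the Python version separates success from failure; the cmake pin gives both outcomes for the same value.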
Forget it, I will clone the source and build it myself! I am not downgrading Python!
Following the readme, I ran up to:
pip3 install -r requirements.txt
This mostly went fine, apart from one version conflict:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
albumentations 1.4.0 requires numpy>=1.24.4, but you have numpy 1.23.5 which is incompatible.
Reinstalling that version fixes it:
pip3 install numpy==1.24.4
Then the key step:
python setup.py --bullet --with-cuda build_ext --parallel 8 install --cmake-args="-DUSE_SYSTEM_ASSIMP=ON"
Success! No errors.
Installing Habitat-Lab: the first few steps were fine, but the last one failed:
python setup.py develop --all
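usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
error: option --allow-hosts requires argument
The message suggests that --all was being read as an abbreviation of --allow-hosts. Following ChatGPT's suggestion, I ran: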
python setup.py develop --all --allow-hosts=*
Success.
I realized I had not tested Habitat-sim earlier, so I tested it now:
python -m habitat_sim.utils.datasets_download --uids habitat_test_scenes --data-path /home/lcy-magic/VLN_TEST/Habitat_data
Error:
git: 'lfs' is not a git command. See 'git --help'.
ChatGPT said this means Git LFS is not installed, so I installed it:
sudo apt install git-lfs
git lfs install
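Before starting a long dataset download, it can be worth confirming that Git LFS is actually usable. A minimal sketch; `git_lfs_available` is a hypothetical helper that simply runs `git lfs version` and looks at the exit code.

```python
import shutil
import subprocess


def git_lfs_available() -> bool:
    """Return True if git is on PATH and `git lfs version` exits successfully."""
    if shutil.which("git") is None:
        return False
    result = subprocess.run(
        ["git", "lfs", "version"], capture_output=True, text=True
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("Git LFS available:", git_lfs_available())
```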
Running it again:
python -m habitat_sim.utils.datasets_download --uids habitat_test_scenes --data-path /home/lcy-magic/VLN_TEST/Habitat_data
Error:
stderr: 'fatal: unable to access 'https://huggingface.co/datasets/ai-habitat/habitat_test_scenes.git/': gnutls_handshake() failed: Error in the pull function.'
Following ChatGPT's suggestion, I ran:
git remote set-url origin git@huggingface.co:datasets/ai-habitat/habitat_test_scenes.git
After that the download worked.
Then this command failed as well:
python examples/viewer.py --scene /home/lcy-magic/VLN_TEST/Habitat_data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
Error:
EGL: Failed to get EGL display: Success
Platform::GlfwApplication::tryCreate(): cannot create a window with core OpenGL context, falling back to compatibility context
EGL: Failed to get EGL display: Success
Platform::GlfwApplication::tryCreate(): cannot create a window with OpenGL context
Yet running this works fine:
./build/viewer /home/lcy-magic/VLN_TEST/Habitat_data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
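Someone else hit the same problem and opened an issue: reference issue. It is too long; I will read it tomorrow.
First, I deleted the dataset I had downloaded and downloaded it again to the default location, because editing the paths in every command was a hassle. Then I started working through the issue.
The first step is to check that GLVND is installed (the GL Vendor-Neutral Dispatch library, which routes OpenGL calls to the right vendor driver):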
sudo apt install libglvnd-dev
glxinfo | grep OpenGL
It is installed and working. However, the renderer is the Intel GPU, which may be the problem:
OpenGL vendor string: Intel
OpenGL renderer string: Mesa Intel(R) Graphics (ADL GT2)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 21.2.6
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.6 (Compatibility Profile) Mesa 21.2.6
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 21.2.6
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:
According to a blog post I found, Bumblebee can be used to switch GPUs. Install Bumblebee and the NVIDIA driver, then configure it:
sudo apt install bumblebee primus nvidia-prime
sudo gedit /etc/bumblebee/bumblebee.conf
Change the setting to:
Driver=nvidia
Restart the bumblebee service:
sudo systemctl restart bumblebeed
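After running this, wait a while. I was too impatient and forced a shutdown, which broke the driver and left me unable to reach the graphical desktop. In the end I pressed ctrl+alt+fn+f2 to get a tty, installed the 525 driver, and rebooted, which fixed it (the recommended driver did not work for me; I saw others using 525, tried it, and to my surprise it worked).
Then I rebooted, but it did not seem to help, and my laptop no longer detected the external monitor. So I also ran: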
sudo prime-select nvidia
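After another reboot, both displays worked normally, and:
This is probably the command I should have used from the start; thanks to the blog post that pointed it out.
To switch back to the integrated GPU: sudo prime-select intel
I do not know whether there is a way to choose which GPU to use on the fly.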
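One candidate for per-command GPU selection is NVIDIA's PRIME render offload, which is controlled by two environment variables. The sketch below only builds that environment and launches a command with it. It assumes the installed driver supports render offload and that the system is in on-demand mode (sudo prime-select on-demand); I have not verified it on this machine, and EGL-based contexts may need additional settings. The helper names are hypothetical.

```python
import os
import subprocess


def nvidia_offload_env(base=None) -> dict:
    """Return a copy of the environment with the PRIME render offload variables set."""
    env = dict(os.environ if base is None else base)
    env["__NV_PRIME_RENDER_OFFLOAD"] = "1"
    env["__GLX_VENDOR_LIBRARY_NAME"] = "nvidia"
    return env


def run_on_nvidia(command):
    """Run one command with offload enabled while the desktop stays on the integrated GPU."""
    return subprocess.run(command, env=nvidia_offload_env(), check=False)


# Example (not executed here): run_on_nvidia(["glxinfo", "-B"]) should then report the NVIDIA renderer.
```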
Now go back and run:
./build/viewer /home/lcy-magic/VLN_TEST/habitat-sim/data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
python examples/viewer.py --scene /home/lcy-magic/VLN_TEST/habitat-sim/data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
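Next step
Next up are data generation and training. The readme says:
We’ll use the Habitat simulator for data generation and navigation. You first need to download the relevant scenes (at least Annawan) from the official Gibson repository. Make sure to put the relevant .glb and .navmesh files under the data_habitat/versioned_data/habitat_test_scenes_1.0 directory.
I filled in the PDF form, submitted it, and downloaded the smallest dataset, the one prepared for Habitat. After that I hit a lot of setbacks and could not get anything to run. So I will verify the simulation environment first; the problem may well be in there.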
Testing Habitat-sim
These two were tested earlier and work:
/home/lcy-magic/VLN_TEST/habitat-sim/build/viewer /home/lcy-magic/VLN_TEST/habitat-sim/data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
python /home/lcy-magic/VLN_TEST/habitat-sim/examples/viewer.py --scene /home/lcy-magic/VLN_TEST/habitat-sim/data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
Then the physics interaction test:
python -m habitat_sim.utils.datasets_download --uids replica_cad_dataset
/home/lcy-magic/VLN_TEST/habitat-sim/build/viewer /home/lcy-magic/VLN_TEST/habitat-sim/data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
python /home/lcy-magic/VLN_TEST/habitat-sim/src_python/habitat_sim/utils/datasets_download.py --uids replica_cad_dataset
/home/lcy-magic/VLN_TEST/habitat-sim/build/viewer --enable-physics --dataset /home/lcy-magic/VLN_TEST/habitat-sim/data/replica_cad/replicaCAD.scene_dataset_config.json -- apt_1
python /home/lcy-magic/VLN_TEST/habitat-sim/examples/viewer.py --dataset /home/lcy-magic/VLN_TEST/habitat-sim/data/replica_cad/replicaCAD.scene_dataset_config.json --scene apt_1
No problems.
Non-interactive test:
python /home/lcy-magic/VLN_TEST/habitat-sim/examples/example.py --scene /home/lcy-magic/VLN_TEST/habitat-sim/data/scene_datasets/habitat-test-scenes/skokloster-castle.glb
Benchmark test:
The data can be fetched like this:
wget http://dl.fbaipublicfiles.com/habitat/mp3d_example.zip
Run:
python /home/lcy-magic/VLN_TEST/habitat-sim/examples/benchmark.py --scene /home/lcy-magic/VLN_TEST/habitat-sim/data/mp3d_example/17DRP5sb8fy/17DRP5sb8fy.glb
Error:
---------------------- rgb ------------------------
Traceback (most recent call last):
  File "/home/lcy-magic/VLN_TEST/habitat-sim/examples/benchmark.py", line 120, in <module>
    perf[key] = demo_runner.benchmark(settings)
  File "/home/lcy-magic/VLN_TEST/habitat-sim/examples/demo_runner.py", line 395, in benchmark
    perfs = pool.map(self._bench_target, range(nprocs))
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/multiprocessing/pool.py", line 540, in _handle_tasks
    put(task)
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot pickle '_magnum.Color4' object
whereas running:
python /home/lcy-magic/VLN_TEST/habitat-sim/examples/example.py --scene /home/lcy-magic/VLN_TEST/habitat-sim/data/mp3d_example/17DRP5sb8fy/17DRP5sb8fy.glb
python /home/lcy-magic/VLN_TEST/habitat-sim/examples/example.py --scene /home/lcy-magic/VLN_TEST/habitat-sim/data/mp3d_example/17DRP5sb8fy/17DRP5sb8fy.glb --enable_physics
works fine. Sigh, stuck here again.
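The traceback shows pool.map failing while it pickles the task for a worker process, so some object reachable from the task has no pickle support. The sketch below reproduces that failure mode with a stand-in class and shows one generic workaround, converting such values to plain tuples before they cross the process boundary. FakeColor4 and the settings keys are hypothetical; this is not the actual fix for demo_runner.py, which I have not worked out.

```python
import pickle


class FakeColor4:
    """Hypothetical stand-in for an extension type that cannot be pickled."""

    def __init__(self, r, g, b, a):
        self.rgba = (r, g, b, a)

    def __reduce__(self):
        # Mimic the behaviour seen in the traceback.
        raise TypeError("cannot pickle 'FakeColor4' object")


def picklable_settings(settings: dict) -> dict:
    """Replace FakeColor4 values with plain tuples so the dict can be pickled."""
    return {
        key: value.rgba if isinstance(value, FakeColor4) else value
        for key, value in settings.items()
    }


settings = {"width": 640, "some_color": FakeColor4(0, 0, 0, 1)}

try:
    pickle.dumps(settings)
except TypeError as error:
    print("as-is:", error)

restored = pickle.loads(pickle.dumps(picklable_settings(settings)))
print("converted:", restored)
```

On the receiving side the worker would have to rebuild the real color object from the tuple, which is the part that depends on the actual code.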
Testing habitat-lab
Run:
python /home/lcy-magic/VLN_TEST/habitat-lab/examples/example.py
No problems. Along the way I hit an issue I had seen before, and the same old fix worked.
Run:
python /home/lcy-magic/VLN_TEST/habitat-lab/examples/interactive_play.py --never-end
It complained that Pygame was missing, so install it:
pip3 install pygame
After running it again, it said Pybullet was missing, so install it:
pip3 install pybullet
Running it once more still gives an error:
X Error of failed request: BadAccess (attempt to access private resource denied)
Major opcode of failed request: 152 (GLX)
Minor opcode of failed request: 5 (X_GLXMakeCurrent)
Serial number of failed request: 178
Current serial number in output stream: 178
The habitat-lab readme says:
Note: Interactive testing currently fails on Ubuntu 20.04 with an error: X Error of failed request: BadAccess (attempt to access private resource denied). We are working on fixing this, and will update instructions once we have a fix. The script works without errors on MacOS.
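Ugh, so no luck with this any time soon.
Continuing with ONE4ALL
I wanted to skip straight to navigation to see the results, so I downloaded the files as the readme instructs and arranged the directories.
The download is an archive that unpacks into a folder called components, which is exactly the directory the project expects. The project already contains a directory with that name, but it is empty, so I replaced it with the downloaded one. At first glance the layout looks different from what the readme asks for:
components
└── habitat
    ├── backbone.ckpt
    ├── fk.ckpt
    └── geodesic_regressors
        ├── annawan.ckpt
        ...
But the tree command shows that it is actually the same: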
./components/
├── habitat
│ ├── backbone.ckpt
│ ├── fd.ckpt
│ └── geodesic_regressors
│ ├── aloha.ckpt
│ ├── annawan.ckpt
│ ├── cantwell.ckpt
│ ├── dunmor.ckpt
│ ├── eastville.ckpt
│ ├── hambleton.ckpt
│ ├── nicut.ckpt
│ └── sodaville.ckpt
└── jackal
    ├── backbone.ckpt
    ├── backbone_finetuned.ckpt
    ├── fd.ckpt
    └── gr.ckpt
3 directories, 14 files
Next, run:
python /home/lcy-magic/VLN_TEST/one4all/run_habitat.py policy=habitat_o4a env=habitat sim_env=Annawan difficulty=hard test_params.n_trajectories=10
Error:
Traceback (most recent call last):
  File "/home/lcy-magic/VLN_TEST/one4all/run_habitat.py", line 10, in main
    from src.run_habitat import run_habitat
  File "/home/lcy-magic/VLN_TEST/one4all/src/run_habitat.py", line 17, in <module>
    from src import utils
  File "/home/lcy-magic/VLN_TEST/one4all/src/utils/__init__.py", line 146, in <module>
    logger: List[pl.loggers.LightningLoggerBase],
AttributeError: module 'pytorch_lightning.loggers' has no attribute 'LightningLoggerBase'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Uh-oh, this looks like a pytorch_lightning version problem.
requirements.txt asks for 1.6.0, but mine, as reported by:
pip3 show pytorch_lightning
is 2.2.0:
Name: pytorch-lightning
Version: 2.2.0.post0
Summary: PyTorch Lightning is the lightweight PyTorch wrapper for ML researchers. Scale your models. Write less boilerplate.
Home-page: https://github.com/Lightning-AI/lightning
Author: Lightning AI et al.
Author-email: pytorch@lightning.ai
License: Apache-2.0
Location: /home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/site-packages
Requires: fsspec, lightning-utilities, numpy, packaging, PyYAML, torch, torchmetrics, tqdm, typing-extensions
Required-by:
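So a downgrade is needed. According to a blog post I found, this class was removed after version 1.9. I will go back to the pinned version first: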
pip3 install pytorch_lightning==1.6.0
pip then warned that there may be problems:
DEPRECATION: pytorch-lightning 1.6.0 has a non-standard dependency specifier torch>=1.8.*. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of pytorch-lightning or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
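An alternative to downgrading would be a small compatibility lookup in the project code that picks whichever logger base class the installed release provides. The helper below is generic and is demonstrated with stand-in objects; the assumption that newer pytorch_lightning releases expose the base class as `Logger` in `pytorch_lightning.loggers` should be checked against the installed version, and I have not tried this route in one4all.

```python
from types import SimpleNamespace


def first_available(namespace, names):
    """Return the first attribute from `names` that `namespace` defines."""
    for name in names:
        if hasattr(namespace, name):
            return getattr(namespace, name)
    raise AttributeError(f"none of {names} found")


# Intended use (assumption, untested in this project):
#   import pytorch_lightning as pl
#   LoggerBase = first_available(pl.loggers, ["LightningLoggerBase", "Logger"])
#   ... then annotate with List[LoggerBase] in src/utils/__init__.py

# Stand-ins so the helper can be tried without pytorch_lightning installed.
old_style = SimpleNamespace(LightningLoggerBase="old base class")
new_style = SimpleNamespace(Logger="new base class")
print(first_available(old_style, ["LightningLoggerBase", "Logger"]))
print(first_available(new_style, ["LightningLoggerBase", "Logger"]))
```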
First, test whether it works:
Error:
Error executing job with overrides: ['policy=habitat_o4a', 'env=habitat', 'sim_env=Annawan', 'difficulty=hard', 'test_params.n_trajectories=10']
Traceback (most recent call last):
  File "/home/lcy-magic/VLN_TEST/one4all/run_habitat.py", line 17, in main
    return run_habitat(cfg)
  File "/home/lcy-magic/VLN_TEST/one4all/src/run_habitat.py", line 83, in run_habitat
    habitat_config = habitat.get_config(config_paths="conf_habitat/imagenav.yaml")
TypeError: get_config() got an unexpected keyword argument 'config_paths'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Fine, let's set it to 1 then:
export HYDRA_FULL_ERROR=1
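There was indeed more output, but nothing useful, sigh. Looking at the code:
habitat_config = habitat.get_config(config_paths="conf_habitat/imagenav.yaml")
Then looking at the function signature:
def get_config(
    config_path: str,
    overrides: Optional[List[str]] = None,
    configs_dir: str = _HABITAT_CFG_DIR,
) -> DictConfig:
It looks like a spelling mistake, although it may equally be an API difference between the habitat-lab version the project was written for and the one I installed. Either way, I changed the call to: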
habitat_config = habitat.get_config(config_path="conf_habitat/imagenav.yaml")
Running again produced a new error. The output was too long, so the environment variable set earlier should be unset:
unset HYDRA_FULL_ERROR
Run again; error:
Error executing job with overrides: ['policy=habitat_o4a', 'env=habitat', 'sim_env=Annawan', 'difficulty=hard', 'test_params.n_trajectories=10']
Traceback (most recent call last):
  File "/home/lcy-magic/VLN_TEST/one4all/run_habitat.py", line 17, in main
    return run_habitat(cfg)
  File "/home/lcy-magic/VLN_TEST/one4all/src/run_habitat.py", line 83, in run_habitat
    habitat_config = habitat.get_config(config_path="conf_habitat/imagenav.yaml")
  File "/home/lcy-magic/VLN_TEST/habitat-lab/habitat-lab/habitat/config/default.py", line 131, in get_config
    with lock, initialize_config_dir(
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/site-packages/hydra/initialize.py", line 170, in __init__
    Hydra.create_main_hydra2(task_name=job_name, config_search_path=csp)
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 68, in create_main_hydra2
    GlobalHydra.instance().initialize(hydra)
  File "/home/lcy-magic/anaconda3/envs/ONE4ALL/lib/python3.10/site-packages/hydra/core/global_hydra.py", line 16, in initialize
    raise ValueError(
ValueError: GlobalHydra is already initialized, call GlobalHydra.instance().clear() if you want to re-initialize
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
That does not sound serious; it is a problem with the initialization call. I changed the code to:
GlobalHydra.instance().clear()
habitat_config = habitat.get_config(config_path="conf_habitat/imagenav.yaml")
and added this import at the top of the file:
from hydra.core.global_hydra import GlobalHydra
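A slightly more defensive variant clears the global state only when it is actually initialized. This is a sketch, with `reset_global_hydra` as a hypothetical helper name; whether clearing the outer application's Hydra state has side effects later in run_habitat.py is something I have not verified.

```python
def reset_global_hydra() -> bool:
    """Clear Hydra's global state if it is initialized; return True if something was cleared."""
    try:
        from hydra.core.global_hydra import GlobalHydra
    except ImportError:
        return False  # Hydra is not installed, so there is nothing to reset
    instance = GlobalHydra.instance()
    if instance.is_initialized():
        instance.clear()
        return True
    return False


if __name__ == "__main__":
    print("cleared:", reset_global_hydra())
```

It would be called right before habitat.get_config(...), in place of the unconditional clear() call.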
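Running again, I hit the classic problem once more:
  File "/home/lcy-magic/VLN_TEST/habitat-lab/habitat-lab/habitat/config/default.py", line 85, in patch_config
    sim_config = cfg.habitat.simulator
omegaconf.errors.ConfigAttributeError: Key 'habitat' is not in struct
    full_key: habitat
    object_type=dict
The data-generation and training code got stuck at exactly this spot earlier; I am truly fed up. The configs I see other people using online do not have a habitat key either!
Forget it, I am giving up for now, sob. I am still not familiar enough with configuring habitat-sim. My plan is to find a project with many stars and forks, reproduce it, learn how to use Habitat properly, and then come back to debug this with a much clearer head. If any expert knows what I should do, I would be grateful for your advice. Thank you.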