[Robotics] Reproducing UniGoal Embodied Navigation | Universal Zero-Shot Goal Navigation, CVPR 2025

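UniGoal proposes a universal zero-shot goal navigation framework that handles several types of navigation task in a unified way.

It supports object-category navigation, instance-image-goal navigation, and text-goal navigation, with no task-specific training or fine-tuning.

This post walks through reproducing UniGoal and running model inference.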

Example: searching for a sofa. The model matches what it sees against the input instance image.

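1. Create the Conda environment

2. Install the Habitat simulation environment

3. Install the third-party dependencies

3.1 Install the LightGlue dependency

3.2 Install the detectron2 dependency

3.3 Install the Grounded-Segment-Anything dependency

3.4 Install the other dependencies

4. Download the model weights

5. Download the HM3D dataset

6. Install Ollama and configure the LLM and VLM

7. Model inference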

1. Create the Conda environment

First create a Conda environment named unigoal with Python 3.8,

then activate the unigoal environment:

conda create -n unigoal python=3.8
conda activate unigoal
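Before going further, it can help to confirm that the shell really is inside the new environment. The snippet below is a minimal sketch, not part of UniGoal: `env_report` and `env_matches` are hypothetical helper names, and it relies on the `CONDA_DEFAULT_ENV` variable that `conda activate` sets.

```python
import os
import sys


def env_report(environ=os.environ, version_info=sys.version_info):
    """Collect the active Conda environment name and the Python major.minor version."""
    return {
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        "python": "%d.%d" % tuple(version_info[:2]),
    }


def env_matches(report, name="unigoal", python="3.8"):
    """True when the report matches the environment created in this step."""
    return report["conda_env"] == name and report["python"] == python


report = env_report()
print(report, "OK" if env_matches(report) else "not the unigoal / Python 3.8 environment")
```

Run it with the `python` from the activated environment; if it reports a different environment or version, re-run `conda activate unigoal` first.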

Then download the UniGoal code and unpack it: https://github.com/bagh2178/UniGoal

2. Install the Habitat simulation environment

Run the following commands to install it:

cd UniGoal
conda install habitat-sim==0.2.3 -c conda-forge -c aihabitat
pip install -e third_party/habitat-lab

If the installation log ends without errors, the installation has succeeded.

3. Install the third-party dependencies

3.1 Install the LightGlue dependency

pip install git+https://github.com/cvg/LightGlue.git


3.2 Install the detectron2 dependency

This step needs CUDA >= 12.1; check your version with nvcc --version:

(unigoal) lgp@lgp-MS-7E07:~/2025_project/UniGoal$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0

If your CUDA version is 11.x or lower, install CUDA >= 12.1 or switch to it.

Add the following to ~/.bashrc:

# Make CUDA 12.1 the default version
export CUDA_HOME=/usr/local/cuda-12.1
export PATH=/usr/local/cuda-12.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH

Then run:

source ~/.bashrc
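After reloading the shell, the version check can also be scripted instead of read by eye. This is a small sketch under the assumption that `nvcc --version` prints a `release X.Y` fragment, as in the output shown earlier; the function names are made up for illustration.

```python
import re
import subprocess


def parse_cuda_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_is_new_enough(nvcc_output, minimum=(12, 1)):
    """True when the parsed release is at least `minimum` (12.1 in this walkthrough)."""
    release = parse_cuda_release(nvcc_output)
    return release is not None and release >= minimum


def read_nvcc_version():
    """Run `nvcc --version`; return an empty string if nvcc is not on PATH."""
    try:
        result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return ""
    return result.stdout
```

Calling `cuda_is_new_enough(read_nvcc_version())` returns False both for an older toolkit and when nvcc is not found, which is usually a sign that the PATH change in ~/.bashrc has not taken effect yet.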

Then install detectron2:

pip install git+https://github.com/facebookresearch/detectron2.git

3.3 Install the Grounded-Segment-Anything dependency

Run the following commands and wait for the installation to finish:

git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git third_party/Grounded-Segment-Anything
cd third_party/Grounded-Segment-Anything
git checkout 5cb813f
pip install -e segment_anything
pip install --no-build-isolation -e GroundingDINO

3.4 Install the other dependencies

First install pytorch::faiss-gpu and wait for it to finish:

conda install pytorch::faiss-gpu

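Then install the remaining dependencies. The shell is still inside third_party/Grounded-Segment-Anything at this point, so point pip at the requirements.txt in the UniGoal root:

pip install -r ../../requirements.txt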

Additional packages installed as a fix on 2025/5/12:

pip install openai mkl faiss-gpu
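With this many packages installed from different sources, a quick check that each one can be located saves time later. The sketch below only looks the modules up and does not import them. The import names in `REQUIRED_MODULES` are my assumption about how the packages installed above are exposed; adjust the list if your versions differ.

```python
import importlib.util

# Assumed import names for the packages installed in steps 2 and 3.
REQUIRED_MODULES = [
    "habitat_sim",
    "habitat",
    "lightglue",
    "detectron2",
    "segment_anything",
    "groundingdino",
    "faiss",
    "openai",
]


def missing_modules(names):
    """Return the names that cannot be located, without actually importing them."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


print("missing:", missing_modules(REQUIRED_MODULES))
```

An empty list means every module was found. A module that is found can still fail at import time (for example on a CUDA mismatch), so treat this as a first filter only.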

4. Download the model weights

Download the sam_vit_h_4b8939.pth and groundingdino_swint_ogc.pth weights and place them in the data/models directory:

cd ../../
mkdir -p data/models
wget -O data/models/sam_vit_h_4b8939.pth https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget -O data/models/groundingdino_swint_ogc.pth https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

Wait for the downloads to complete.
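An interrupted wget can leave a missing or zero-byte file behind, so it is worth checking the two files before moving on. This is a minimal sketch with a hypothetical helper name, `check_weights`; it only checks presence and size and does not verify file integrity.

```python
from pathlib import Path

# File names taken from the wget commands above.
WEIGHT_FILES = ["sam_vit_h_4b8939.pth", "groundingdino_swint_ogc.pth"]


def check_weights(models_dir="data/models", filenames=WEIGHT_FILES):
    """Map each expected file to its size in bytes, or None if it is missing or empty."""
    result = {}
    for name in filenames:
        path = Path(models_dir) / name
        size = path.stat().st_size if path.is_file() else 0
        result[name] = size if size > 0 else None
    return result
```

Run `check_weights()` from the UniGoal root; any entry that comes back as None should be downloaded again.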

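5. Download the HM3D dataset

Download the HM3D scene dataset and the instance-image-goal navigation episodes dataset; the download links are in the UniGoal repository README.

The dataset layout is as follows: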

UniGoal/
└── data/
    ├── datasets/
    │   └── instance_imagenav/
    │       └── hm3d/
    │           └── v3/
    │               └── val/
    │                   ├── content/
    │                   │   ├── 4ok3usBNeis.json.gz
    │                   │   ├── 5cdEh9F2hJL.json.gz
    │                   │   ├── ...
    │                   │   └── zt1RVoi7PcG.json.gz
    │                   └── val.json.gz
    └── scene_datasets/
        └── hm3d_v0.2/
            └── val/
                ├── 00800-TEEsavR23oF/
                │   ├── TEEsavR23oF.basis.glb
                │   └── TEEsavR23oF.basis.navmesh
                ├── 00801-HaxA7YrQdEC/
                ├── ...
                └── 00899-58NLZxWBSpk/
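A misplaced folder is an easy mistake with a tree this deep. The sketch below checks the layout shown above by pairing each episode file with a scene folder. It assumes, based on the names in the tree, that episode files are called `<scene_id>.json.gz` and scene folders `<index>-<scene_id>`; `summarize_dataset` is a hypothetical helper, not part of UniGoal.

```python
from pathlib import Path


def summarize_dataset(root="data"):
    """List episode scene ids and flag those with no matching .basis.glb scene file."""
    root = Path(root)
    episodes_dir = root / "datasets" / "instance_imagenav" / "hm3d" / "v3" / "val" / "content"
    scenes_dir = root / "scene_datasets" / "hm3d_v0.2" / "val"

    # Episode files are assumed to be named <scene_id>.json.gz
    suffix = ".json.gz"
    episode_ids = sorted(p.name[: -len(suffix)] for p in episodes_dir.glob("*" + suffix))

    # Scene folders are assumed to be named <index>-<scene_id> and to hold <scene_id>.basis.glb
    scene_ids = set()
    for folder in scenes_dir.glob("*-*"):
        scene_id = folder.name.split("-", 1)[1]
        if (folder / (scene_id + ".basis.glb")).is_file():
            scene_ids.add(scene_id)

    return {
        "episodes": episode_ids,
        "episodes_without_scene": [e for e in episode_ids if e not in scene_ids],
    }
```

Calling `summarize_dataset()` from the UniGoal root should give an empty `episodes_without_scene` list when both downloads are unpacked in the right place.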

6. Install Ollama and configure the LLM and VLM

Run the following commands in turn:

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2-vision

Once the pull finishes, Ollama is ready.
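To confirm that the model is actually available to a running Ollama server, you can query its model list. This sketch assumes the server listens on the default local address, http://localhost:11434, and uses the `GET /api/tags` endpoint; the helper names are hypothetical.

```python
import json
import urllib.request


def model_is_available(tags_payload, wanted="llama3.2-vision"):
    """True if a listed model name equals `wanted` or is `wanted` plus a tag such as ':latest'."""
    for entry in tags_payload.get("models", []):
        name = entry.get("name", "")
        if name == wanted or name.startswith(wanted + ":"):
            return True
    return False


def fetch_tags(base_url="http://localhost:11434", timeout=5):
    """Ask a locally running Ollama server for its model list (GET /api/tags)."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as response:
        return json.load(response)
```

With the server running, `model_is_available(fetch_tags())` should return True after the pull above. A connection error from `fetch_tags` means the server is not running or is bound to a different address.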

7. Model inference

Run main.py to start model inference:

python main.py

Printed output:

[22:03:27:032178]:[Assets] ResourceManager.cpp(2210)::loadMaterials : Idx 26:Flat.
[22:03:27:032183]:[Assets] ResourceManager.cpp(2210)::loadMaterials : Idx 27:Flat.
[22:03:27:062166]:[Sim] Simulator.cpp(442)::instanceStageForSceneAttributes : Successfully loaded stage named : data/scene_datasets/hm3d_v0.2/val/00877-4ok3usBNeis/4ok3usBNeis.basis.glb
[22:03:27:062184]:[Sim] Simulator.cpp(474)::instanceStageForSceneAttributes : 
---
The active scene does not contain semantic annotations : activeSemanticSceneID_ = 0  
---
[22:03:27:062207]:[Sim] Simulator.cpp(208)::reconfigure : CreateSceneInstance success == true for active scene name : data/scene_datasets/hm3d_v0.2/val/00877-4ok3usBNeis/4ok3usBNeis.basis.glb  with renderer.
[22:03:27:067606]:[Nav] PathFinder.cpp(568)::build : Building navmesh with 222 x 162 cells
[22:03:27:121110]:[Nav] PathFinder.cpp(842)::build : Created navmesh with 340 vertices 163 polygons
[22:03:27:121130]:[Sim] Simulator.cpp(898)::recomputeNavMesh : reconstruct navmesh successful
2025-05-12 22:03:27,122 Initializing task InstanceImageNav-v1
[05/12 22:03:27 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', input=['input1.jpeg'], opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl', 'MODEL.DEVICE', 'cuda:0'], output=None, video_input=None, webcam=False)
[05/12 22:03:27 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl ...
[22:03:27:595720]:[Sensor] Sensor.cpp(69)::~Sensor : Deconstructing Sensor
Loading episodes from: data/datasets/instance_imagenav/hm3d/v3/val/content/4ok3usBNeis.json.gz
Changing scene: 0/data/scene_datasets/hm3d_v0.2/val/00877-4ok3usBNeis/4ok3usBNeis.basis.glb
rank:0, episode:1, cat_id:0, cat_name:chair
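The last line of the log names the episode and its goal category, which is handy to pull out when a run is redirected to a file. The parser below is a sketch that assumes the line format is exactly as printed above; `parse_episode_line` is a made-up helper and not part of UniGoal.

```python
import re

# Pattern for lines such as "rank:0, episode:1, cat_id:0, cat_name:chair"
EPISODE_LINE = re.compile(r"rank:(\d+), episode:(\d+), cat_id:(\d+), cat_name:(.+)")


def parse_episode_line(line):
    """Return the fields of an episode line as a dict, or None for any other log line."""
    match = EPISODE_LINE.search(line)
    if match is None:
        return None
    rank, episode, cat_id, cat_name = match.groups()
    return {
        "rank": int(rank),
        "episode": int(episode),
        "cat_id": int(cat_id),
        "cat_name": cat_name.strip(),
    }
```

Applying it to every line of a saved log and dropping the None results gives the sequence of goal categories the run went through.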

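A look at the results, searching for a chair:

Searching for a different chair; the model matches against the input instance image:

Searching for the bathroom:

That's all for this post.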
