Reproducing MODEST: Monocular Transparent-Object Grasping for Robots (ICRA 2025)

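MODEST is a monocular algorithm for grasping transparent objects, published at ICRA 2025. This post walks through reproducing it.

Given an RGB image from a single viewpoint, the model handles depth estimation and segmentation jointly, producing a segmentation of the transparent objects and a depth prediction for the scene.

Paper: Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion

Code: https://github.com/D-Robotics-AI-Lab/MODEST

The algorithm was ported to a real robot platform, and transparent-object grasping experiments were carried out on it. The platform consists mainly of a UR robot arm and a depth camera.

MODEST segments the transparent objects and predicts their depth; the result is converted to a point cloud, which GraspNet takes as input to generate grasp poses.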

1. Create the Conda environment

Use conda to create a virtual environment named modest with Python 3.8, then activate it:

conda create -n modest python=3.8
conda activate modest

2. Install torch and CUDA

Install torch==1.10.1+cu111 with the following command:

pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html

Then install the remaining system dependencies:

sudo apt-get install openexr libopenexr-dev
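
Before moving on, it can help to confirm that the pinned builds actually landed in the modest environment. The following is a minimal sketch, not part of the MODEST repository: the helper name check_versions is made up for illustration, and it only compares the installed version strings against the ones pinned in this post.

```python
from importlib import metadata

# Versions pinned in this walkthrough; the "+cu111" local tag is compared as-is.
EXPECTED = {"torch": "1.10.1+cu111", "torchvision": "0.11.2+cu111"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions().items():
        print(f"{name}: installed={found} expected={EXPECTED[name]} ok={ok}")
    try:
        import torch
        print("CUDA available:", torch.cuda.is_available())
    except ImportError:
        print("torch is not importable in this environment")
```

If torch reports that CUDA is not available, the usual suspects are a CPU-only wheel or a GPU driver too old for CUDA 11.1.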

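3. Install the dependencies in requirements.txt

Download the MODEST code and unpack it.

Open requirements.txt and comment out torch==1.10.1+cu111 and torchvision==0.11.2+cu111, since both were installed in the previous step.

Then install the remaining dependencies: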

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
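
The manual edit of requirements.txt described above can also be scripted, which is handy when the environment is rebuilt often. This is a small sketch under the assumption that the file uses ordinary one-requirement-per-line pins; the function names comment_out_torch and patch_file are illustrative and not part of MODEST. It would be run before the pip install command.

```python
import re
from pathlib import Path

# Matches requirement lines whose package name is exactly torch or torchvision
# (torchaudio and already-commented lines are left alone).
_PIN = re.compile(r"^\s*(torch|torchvision)\s*(==|>=|<=|~=|!=|<|>|$)")


def comment_out_torch(text):
    """Prefix torch/torchvision requirement lines with '# ', leaving others untouched."""
    out = []
    for line in text.splitlines():
        out.append("# " + line if _PIN.match(line) else line)
    return "\n".join(out) + "\n"


def patch_file(path="requirements.txt"):
    """Apply comment_out_torch to a requirements file in place."""
    p = Path(path)
    p.write_text(comment_out_torch(p.read_text()))
```

Calling patch_file() from the repository root rewrites requirements.txt in place; running it twice is harmless because commented lines no longer match.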

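4. Prepare the ClearPose dataset

The ClearPose dataset was captured in indoor environments with a RealSense L515 camera and covers 63 transparent objects.

It contains RGB images, raw depth, ground-truth depth, ground-truth surface normals, and the 6D pose of every object instance.

Code: https://github.com/opipari/ClearPose

Download: follow the dataset link in the ClearPose repository README.

ClearPose is split into 9 sets: Set1 contains only transparent chemistry-lab objects, Set2-7 contain only household objects, and Set8-9 additionally include other adversarial factors.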

The folder structure is as follows:

<dataset_path>
|-- set1
    |-- scene1
        |-- metadata.mat
        |-- 000000-color.png        # RGB image
        |-- 000000-depth.png        # Raw depth image
        |-- 000000-depth_true.png   # Ground truth depth image
        |-- 000000-label.png
        |-- 000000-normal_true.png
        ...
|-- model
    |-- <object1>
        |-- <object1>.obj
    |-- <object2>
        |-- <object2>.obj
    ...
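
To check that a downloaded scene matches the layout above, the per-frame files can be grouped by their numeric prefix. The sketch below relies only on the file-name pattern shown in the listing; list_frames is a hypothetical helper, not something shipped with ClearPose or MODEST.

```python
from pathlib import Path

# File suffixes taken from the folder listing above.
SUFFIXES = ("color", "depth", "depth_true", "label", "normal_true")


def list_frames(scene_dir):
    """Return {frame_id: {suffix: Path}} for every frame that has a color image."""
    scene = Path(scene_dir)
    frames = {}
    for color in sorted(scene.glob("*-color.png")):
        frame_id = color.name[: -len("-color.png")]
        files = {}
        for suffix in SUFFIXES:
            candidate = scene / f"{frame_id}-{suffix}.png"
            if candidate.exists():
                files[suffix] = candidate
        frames[frame_id] = files
    return frames
```

A frame whose dictionary is missing a suffix such as depth_true points to an incomplete download or extraction.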

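Sample data:

5. Download the model weights

Weights pre-trained on the Syn-TODD dataset: https://drive.google.com/file/d/1haxiir4PdBNE9Zr1AA4D9bVJ4KCzqa8v/view

Weights for the real-world ClearPose dataset: https://drive.google.com/file/d/1798AE_u6KrMV6mpUGBxz_jaLrg_21A39/view

Then create a folder named ckpt and put the weights in it.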

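6. Run inference

First edit the configuration file config/config.json so that it points at the pre-trained weights ISGNet_clearpose.p.

To run on the CPU, set "device":"cpu"; to run on a GPU, set "device":"cuda".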
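
Switching the device by hand is easy to forget when moving between machines, so the setting can be chosen automatically. The sketch below is an assumption-laden illustration: it does not know where "device" sits inside config.json, so it updates every key named "device" at any nesting level and leaves all other entries alone. The helper names pick_device and set_device are invented for this example.

```python
import json
from pathlib import Path


def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"


def _set_key(node, key, value):
    """Set every occurrence of `key` in nested dicts/lists; return how many were set."""
    count = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                count += 1
            else:
                count += _set_key(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            count += _set_key(item, key, value)
    return count


def set_device(config_path="config/config.json", device=None):
    """Rewrite only the "device" entries of the config; return how many were updated."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    changed = _set_key(config, "device", device or pick_device())
    path.write_text(json.dumps(config, indent=4) + "\n")
    return changed
```

A return value of 0 means no "device" key was found, in which case the file should be edited manually.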

Then change the image path in the inference script inference.py, for example:

image_path = "./datasets/clearpose_downsample_100/set1/scene1/000000-color.png"

The inference code is as follows:

import json
from models.Trainer import Trainer
from utils.visualize import *

image_path = "./datasets/clearpose_downsample_100/set1/scene1/000000-color.png"

################ load the config file ##################
with open('config/config.json', 'r') as f:
    config = json.load(f)

############### load the trainer ###############
trainer = Trainer(config)

############### start inference ##############
trainer.inference(image_path)
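
The script above handles one image per run. To process a whole scene, the same trainer.inference call can be looped over every color image in a folder. This is a sketch rather than part of the repository: run_on_scene is a hypothetical helper that accepts any callable taking an image path, and whether earlier outputs in results get overwritten depends on how inference names its files, which is not verified here.

```python
from pathlib import Path


def run_on_scene(infer, scene_dir, limit=None):
    """Call infer(path) on each *-color.png in scene_dir, in sorted order.

    Returns the list of image paths that were processed.
    """
    images = sorted(Path(scene_dir).glob("*-color.png"))
    if limit is not None:
        images = images[:limit]
    for image in images:
        infer(str(image))
    return [str(p) for p in images]


# Usage with the objects from the script above (uncomment inside the MODEST repo):
# trainer = Trainer(config)
# run_on_scene(trainer.inference, "./datasets/clearpose_downsample_100/set1/scene1", limit=5)
```

Passing the callable in, instead of importing Trainer inside the helper, keeps the loop usable with a stub when checking paths before loading the model.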

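Run the script from the repository root, for example with python inference.py.

The results are saved in the results directory.

The original image:

The depth map predicted by the model:

The segmentation predicted by the model:

From here, the segmentation and depth predicted by MODEST can be turned into a point cloud and passed to GraspNet to generate grasp poses.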
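
The step from a depth map and a mask to a point cloud is a standard pinhole back-projection. The sketch below illustrates that step only. It assumes the depth is in metres and that the camera intrinsics fx, fy, cx, cy are known from calibration; it does not reproduce how the original experiments prepared GraspNet's input. depth_to_points is a hypothetical helper name.

```python
import numpy as np


def depth_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels into an (N, 3) camera-frame point cloud.

    depth: (H, W) array of depth values; mask: (H, W) array, nonzero where the
    pixel belongs to a transparent object. Pixels with depth <= 0 are dropped.
    """
    depth = np.asarray(depth, dtype=np.float32)
    valid = np.asarray(mask, dtype=bool) & (depth > 0)
    v, u = np.nonzero(valid)      # row (v) and column (u) indices of kept pixels
    z = depth[v, u]
    x = (u - cx) * z / fx         # pinhole model: X = (u - cx) * Z / fx
    y = (v - cy) * z / fy         # pinhole model: Y = (v - cy) * Z / fy
    return np.stack([x, y, z], axis=1)
```

The resulting array is in the camera frame; a grasp planner running on a robot would still need the camera-to-base transform from hand-eye calibration.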

That's all for this walkthrough.
