Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation

BJHYZJ/DovSG


DovSG

Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation

1 Introduction

DovSG constructs a Dynamic 3D Scene Graph and leverages task decomposition with large language models, enabling localized updates of the 3D scene graphs during interactive exploration. This assists mobile robots in accurately executing long-term tasks, even in scenarios where human modifications to the environment are present.

Contributors: Zhijie Yan, Shufei Li, Zuoxu Wang, Lixiu Wu, Han Wang, Jun Zhu, Lijiang Chen, Jihong Liu

1.1 Our paper

Our paper is now available on arXiv: Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation.

If you use our code in your project, please cite our paper using the BibTeX entry below:

@misc{yan2024dynamicopenvocabulary3dscene,
      title={Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation}, 
      author={Zhijie Yan and Shufei Li and Zuoxu Wang and Lixiu Wu and Han Wang and Jun Zhu and Lijiang Chen and Jihong Liu},
      year={2024},
      eprint={2410.11989},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2410.11989}, 
}

1.2 Our demo

Our accompanying demos are now available on YouTube and the Project Page.

2 Prerequisites

  • We have set up all the necessary environments on a Lenovo Y9000K laptop running Ubuntu 20.04, equipped with an NVIDIA RTX 4090 GPU with 16GB of VRAM.

  • We used a real-world setup with a UFACTORY xARM6 robotic arm on an Agilex Ranger Mini 3 mobile base, equipped with a RealSense D455 camera for perception and a basket for item transport.

2.1 Ubuntu and ROS

Ubuntu 20.04 with ROS installed (see ROS Installation).

2.2 Environment Setup

3 Run DovSG

3.1 Run our demo

You can download the pre-recorded scenes we provide from Google Cloud. Place them in DovSG/data_example under the project's root directory, and pass the scene's name (your_name_of_scene, such as room1) as the --tags argument.
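
Before launching the demo, it can help to confirm that the scene folder is where the command expects it. The sketch below assumes each scene lives in its own subfolder of data_example named after its tag (for example data_example/room1). That layout and the helper names scene_dir and scene_is_ready are assumptions made for illustration, not something confirmed by the repository.

```python
from pathlib import Path


def scene_dir(project_root, tag):
    """Return the assumed folder for a scene: <project_root>/data_example/<tag>."""
    return Path(project_root) / "data_example" / tag


def scene_is_ready(project_root, tag):
    """True if the assumed scene folder exists and contains at least one entry."""
    folder = scene_dir(project_root, tag)
    return folder.is_dir() and any(folder.iterdir())


if __name__ == "__main__":
    # Run from the DovSG project root; "room1" is the example tag used above.
    print(scene_dir(".", "room1"), scene_is_ready(".", "room1"))
```

If the check reports False, the scene was probably extracted to a different folder or under a different name than the tag.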

python demo.py --tags room1 --preprocess --debug --task_scene_change_level "Minor Adjustment" --task_description "Please move the red pepper to the plate, then move the green pepper to plate."
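
The task description contains spaces and punctuation, so quoting matters when the command is assembled by a script rather than typed by hand. As a minimal sketch, the command above can be built as an argument list in Python. build_demo_command is a hypothetical helper, and it uses only the flags that appear in this README.

```python
import shlex


def build_demo_command(tag, change_level, description,
                       preprocess=True, debug=False, scanning_room=False):
    """Assemble the demo.py argument list using only flags shown in this README."""
    cmd = ["python", "demo.py", "--tags", tag]
    if scanning_room:
        cmd.append("--scanning_room")
    if preprocess:
        cmd.append("--preprocess")
    if debug:
        cmd.append("--debug")
    cmd += ["--task_scene_change_level", change_level,
            "--task_description", description]
    return cmd


if __name__ == "__main__":
    cmd = build_demo_command(
        "room1",
        "Minor Adjustment",
        "Please move the red pepper to the plate, then move the green pepper to plate.",
        debug=True,
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
    # To execute from the project root: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems entirely. Setting scanning_room=True adds the --scanning_room flag used when recording a new scene.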

3.2 Run on real world workstation

Refer to the guide linked here to configure the Agilex Ranger Mini.

3.2.1 Scan the room to build the memory

python demo.py --tags your_name_of_scene --scanning_room --preprocess --task_scene_change_level "your_task_scene_change_level" --task_description "your_task_description"

3.2.2 In one terminal, run the code in hardcode

cd hardcode
source ~/agilex_ws/devel/setup.bash
rosrun ranger_bringup bringup_can2usb.bash
roslaunch ranger_bringup ranger_mini_v2.launch

# You need to replace the port with your own.
python server.py  

3.2.3 In another terminal, run the Navigation and Manipulation Module

python demo.py --tags your_name_of_scene --preprocess --task_scene_change_level "your_task_scene_change_level" --task_description "your_task_description"
