Accepted at CVPR 2024 Workshop
Robots must efficiently manipulate objects in complex, unstructured environments. This entails identifying task-relevant objects, which include goal objects directly tied to the task and constraint objects that may cause collisions during robot execution. Leveraging foundation models such as vision-language models or CLIP holds promise, yet these models usually lack awareness of the robot's configuration and fail to recognize constraint objects, resulting in sub-optimal performance. Fine-grained object segments offer an alternative but are computationally expensive. Humans instinctively process object information in a manner aligned with the demands of the task and the trajectory. Inspired by this, we propose integrating an architectural bias into the imitation learning framework. By merging and pruning object tokens based on task relevance and importance, our method, named GoS, reduces computational burden and enhances task understanding, leading to higher success rates. Applied to the vision-based multi-task articulated object manipulation domain, our approach shows 1.7
Note that our code base is built upon RVT.
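The core mechanism described above, merging and pruning object tokens by task relevance, can be illustrated with a short sketch. The snippet below is a hypothetical illustration and not the GoS/RVT implementation: the function name `prune_and_merge_object_tokens`, the cosine-similarity scoring against a task embedding, and the top-k plus weighted-merge choices are all assumptions made for exposition.

```python
# A minimal sketch (not the authors' code) of scoring object tokens by task
# relevance, keeping the top-k, and merging the remainder into one summary
# token. The scoring and merging choices here are assumptions.
import torch
import torch.nn.functional as F


def prune_and_merge_object_tokens(
    object_tokens: torch.Tensor,   # (N, D) per-object token embeddings
    task_embedding: torch.Tensor,  # (D,) embedding of the language goal / task
    keep_k: int = 8,
) -> torch.Tensor:
    """Return up to (keep_k + 1, D): top-k relevant tokens plus one merged token."""
    # Task-relevance score per object token via cosine similarity.
    scores = F.cosine_similarity(object_tokens, task_embedding.unsqueeze(0), dim=-1)

    # Keep the most task-relevant tokens untouched.
    keep_k = min(keep_k, object_tokens.size(0))
    top_idx = scores.topk(keep_k).indices
    kept = object_tokens[top_idx]

    # Merge the remaining tokens into a single token, weighted by their scores,
    # so low-relevance objects are summarized cheaply rather than dropped.
    mask = torch.ones(object_tokens.size(0), dtype=torch.bool)
    mask[top_idx] = False
    if mask.any():
        rest = object_tokens[mask]
        weights = scores[mask].softmax(dim=0).unsqueeze(-1)
        merged = (weights * rest).sum(dim=0, keepdim=True)
        return torch.cat([kept, merged], dim=0)
    return kept


# Usage: 20 detected object tokens of width 256 reduced to 9 tokens.
tokens = torch.randn(20, 256)
task = torch.randn(256)
reduced = prune_and_merge_object_tokens(tokens, task, keep_k=8)
print(reduced.shape)  # torch.Size([9, 256])
```

The intent of the sketch is only to show how a fixed token budget can be enforced per scene, which is where the computational savings over processing all fine-grained object segments would come from.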