RoboKit

A toolkit for robotic tasks. This toolkit is compiled by Jishnu Jaykumar P.

Features

  • Zero-shot classification using OpenAI CLIP.
  • Zero-shot text-to-bbox approach for object detection using GroundingDINO.
  • Zero-shot bbox-to-mask approach for object segmentation using SegmentAnything (MobileSAM).
  • Zero-shot image-to-depth approach for depth estimation using Depth Anything.
  • Zero-shot feature upsampling using FeatUp.

Getting Started

Prerequisites

  • Python 3.9
  • torch (tested 2.0)
  • torchvision

Installation

Before installing, set the CUDA_HOME environment variable. Replace the path below with the location of your CUDA installation.

export CUDA_HOME=/usr/local/cuda
pip install -r requirements.txt
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
python setup.py install

Note: If you see the following error, check your GroundingDINO installation:

NameError: name '_C' is not defined
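
A common cause of this error is that GroundingDINO's compiled extension was built without CUDA, for example because CUDA_HOME was unset or pointed to the wrong place at install time. The sketch below is a small pre-install check, not part of RoboKit: `check_cuda_home` is a hypothetical helper name, and the `bin/nvcc` layout assumes a standard Linux CUDA toolkit install.

```python
import os
from pathlib import Path


def check_cuda_home(env=None):
    """Return (ok, message) describing whether CUDA_HOME looks usable.

    Hypothetical helper for illustration; RoboKit does not ship it.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return False, "CUDA_HOME is not set"

    root = Path(cuda_home)
    if not root.is_dir():
        return False, f"CUDA_HOME points to a missing directory: {root}"

    # Assumption: a standard Linux toolkit layout with the compiler at bin/nvcc.
    if not (root / "bin" / "nvcc").exists():
        return False, f"nvcc not found under {root / 'bin'}"

    return True, f"CUDA_HOME looks usable: {root}"


if __name__ == "__main__":
    ok, message = check_cuda_home()
    print(message)
```

If the check fails, fix CUDA_HOME and reinstall GroundingDINO so that its extension is rebuilt.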

Usage

Sample output of GroundingDINO + SAM

Input Image

Segmented Image
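
When chaining a text-to-bbox detector into a bbox-to-mask segmenter, the box formats usually need to match. The sketch below assumes the detector returns boxes as normalized (cx, cy, w, h) and the segmenter expects pixel (x0, y0, x1, y1); verify both against the respective repositories. `cxcywh_to_xyxy` is a hypothetical helper name, and RoboKit may already perform this conversion internally.

```python
def cxcywh_to_xyxy(box, image_width, image_height):
    """Convert one normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    Hypothetical helper for illustration; the input and output formats
    are assumptions about the detector and segmenter, not RoboKit's API.
    """
    cx, cy, w, h = box

    # Scale the corner coordinates from [0, 1] to pixel units.
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height

    # Clamp to the image so boxes that spill over the border stay valid.
    x0 = min(max(x0, 0.0), float(image_width))
    y0 = min(max(y0, 0.0), float(image_height))
    x1 = min(max(x1, 0.0), float(image_width))
    y1 = min(max(y1, 0.0), float(image_height))
    return (x0, y0, x1, y1)


if __name__ == "__main__":
    # A centered box covering half the width and height of a 640x480 image.
    print(cxcywh_to_xyxy((0.5, 0.5, 0.5, 0.5), 640, 480))
```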

Roadmap

Future goals for this project include:

  • Add a config to set the pretrained checkpoints dynamically
  • More: TODO

Acknowledgments

This project is based on the following repositories (license check mandatory):

License

This project is licensed under the MIT License. Before using this toolkit, however, check the licenses of the respective works it builds on.