Object Detection and Pose Estimation Pipeline

Overview

This repository provides an end-to-end pipeline for:

Preparing data for YOLO-based object detection using Ultralytics’ YOLO.
Training a YOLO model to detect a specific object.
Training a simple pose estimation model (SimplePoseNet) on BOP-format data.

Repository Structure

yolo/
  prepare_data.py   # Prepare data in YOLO format
  train.py          # Train YOLO using Ultralytics
  configs/         # YOLO configuration files (.yaml)
  models/          # YOLO trained weights (.pt)

pose/
  train.py         # Train the SimplePoseNet on BOP data
  checkpoints/     # Stores pose model checkpoints
  models/          # Network definitions, losses, etc.
  trainers/        # Training logic for pose estimation

utils/
  data_utils.py    # BOP dataset loading + transforms
  obj_match.py     # Example usage of matching pipeline

inference/
  # Scripts for epipolar matching, YOLO detection, and match pipelines

datasets/          # Contains BOP and YOLO data
runs/              # YOLO training runs, logs, etc.
output/            # Various output images and results

Environment Setup

To set up the environment, follow these steps (tested on Ubuntu with an NVIDIA GPU). The environment name is bop.

Build Docker Container

cd docker/
docker build . -t bpc:2025.1.31

Run Docker

# Run from the repository root (not docker/) so that the whole repository is mounted at /code
docker run -p 8888:8888 --shm-size=1g --runtime nvidia --gpus all -v "$(pwd)":/code -ti bpc:2025.1.31 bash
cd /code

Download Data

bash download_data.sh

Training Pipeline

Prepare YOLO Data

Convert BOP data to YOLO format:

python3 bpc/yolo/prepare_data.py \
    --dataset_path "datasets/train_pbr" \
    --output_path "datasets/yolo11/train_obj_11" \
    --obj_id 11
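
The core of a BOP-to-YOLO conversion is a change of box format. BOP annotations (scene_gt_info.json) store boxes as [x, y, width, height] in pixels with a top-left origin, while YOLO label files expect a class index followed by a normalized center and size. The sketch below illustrates that step only; it is not taken from prepare_data.py, and the function name, the clipping behavior, and the single class index 0 are assumptions made for illustration.

```python
def bop_bbox_to_yolo(bbox, img_w, img_h, class_id=0):
    """Convert a BOP [x, y, w, h] pixel box to a YOLO label line.

    Hypothetical helper: returns None when the box has no area inside
    the image (BOP marks fully invisible objects with [-1, -1, -1, -1]).
    """
    x, y, w, h = bbox
    # Clip the box corners to the image bounds.
    x0 = max(0.0, float(x))
    y0 = max(0.0, float(y))
    x1 = min(float(img_w), float(x + w))
    y1 = min(float(img_h), float(y + h))
    if x1 <= x0 or y1 <= y0:
        return None
    # YOLO format: class cx cy w h, all normalized to [0, 1].
    cx = (x0 + x1) / 2.0 / img_w
    cy = (y0 + y1) / 2.0 / img_h
    bw = (x1 - x0) / img_w
    bh = (y1 - y0) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}"


print(bop_bbox_to_yolo([160, 120, 320, 240], img_w=640, img_h=480))
```

Whether the repository uses the full box (bbox_obj) or the visible box (bbox_visib) is not stated here; either can be passed to a helper like this.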

Train YOLO Model

python3 bpc/yolo/train.py \
    --obj_id 11 \
    --data_path "yolo/configs/data_obj_11.yaml" \
    --epochs 20 \
    --imgsz 640 \
    --batch 16 \
    --task detection
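
The --data_path argument points at an Ultralytics dataset YAML, which names the dataset root, the train and validation image folders, and the class names. As a rough sketch, such a file could be generated as follows. The helper name, the images/train and images/val subfolders, and the class name obj_11 are assumptions; check them against what prepare_data.py actually writes.

```python
from pathlib import Path


def write_data_yaml(yaml_path, dataset_root, class_name,
                    train_dir="images/train", val_dir="images/val"):
    """Write a minimal single-class Ultralytics dataset YAML.

    Hypothetical helper; the folder names are assumed defaults.
    """
    text = (
        f"path: {dataset_root}\n"   # dataset root directory
        f"train: {train_dir}\n"     # training images, relative to path
        f"val: {val_dir}\n"         # validation images, relative to path
        "names:\n"
        f"  0: {class_name}\n"      # class index -> class name
    )
    yaml_path = Path(yaml_path)
    yaml_path.parent.mkdir(parents=True, exist_ok=True)
    yaml_path.write_text(text)
    return text
```

For example, write_data_yaml("yolo/configs/data_obj_11.yaml", "datasets/yolo11/train_obj_11", "obj_11") would produce a config for the object used throughout this README.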

Train Pose Model

python3 train_pose.py \
  --root_dir datasets/ \
  --target_obj_id 11 \
  --epochs 5 \
  --batch_size 32 \
  --lr 1e-3 \
  --num_workers 16 \
  --checkpoints_dir yolo_ckpts/
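
For orientation, BOP-format pose labels live in each scene's scene_gt.json, where every annotation carries an obj_id, a row-major 3x3 rotation (cam_R_m2c), and a translation in millimetres (cam_t_m2c). The sketch below shows one way to read those entries for a single object; the function names are hypothetical, and how SimplePoseNet actually consumes the labels is not described in this README.

```python
def bop_pose_to_matrix(gt_entry):
    """Build a 4x4 model-to-camera transform from one scene_gt.json entry."""
    r = gt_entry["cam_R_m2c"]  # 9 floats, row-major 3x3 rotation
    t = gt_entry["cam_t_m2c"]  # 3 floats, translation in millimetres
    if len(r) != 9 or len(t) != 3:
        raise ValueError("expected 9 rotation values and 3 translation values")
    return [
        [r[0], r[1], r[2], t[0]],
        [r[3], r[4], r[5], t[1]],
        [r[6], r[7], r[8], t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def poses_for_object(scene_gt, obj_id):
    """Collect (image_id, 4x4 pose) pairs for one object id.

    scene_gt is the parsed scene_gt.json: image id (string) -> list of annotations.
    """
    poses = []
    for im_id, entries in scene_gt.items():
        for entry in entries:
            if entry["obj_id"] == obj_id:
                poses.append((int(im_id), bop_pose_to_matrix(entry)))
    return poses
```

Here poses_for_object(json.load(f), 11) would return the poses matching --target_obj_id 11 for one scene.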

Download Pretrained Models

wget https://storage.googleapis.com/akasha-public/IBPC/baseline_solution/v1/models.zip
unzip models.zip
rm models.zip

Run Inference

jupyter notebook --ip=0.0.0.0 --allow-root --port=8888
# Open localhost:8888 in your browser
# Run "Inference Notebook.ipynb"

Notes

Ensure that an NVIDIA driver supporting CUDA 12.1 is installed (check with nvidia-smi) and that PyTorch recognizes the GPU.
The BOP dataset must follow the standard directory conventions (train_pbr, test, etc.).
Update yolo/configs/data_obj_11.yaml with the correct dataset paths.
If you encounter module import errors, try running the script as a module:
```
python -m idp_codebase.pose.train ...
```
or add __init__.py files where necessary.
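
The GPU check from the first note can also be run from Python inside the container. This is a minimal sketch with a hypothetical helper name; it falls back to a message when PyTorch is missing, so it is safe to run in any environment.

```python
def gpu_status():
    """Report whether PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "PyTorch is installed but cannot see a CUDA device"
    name = torch.cuda.get_device_name(0)
    return f"CUDA device 0: {name} (PyTorch built for CUDA {torch.version.cuda})"


print(gpu_status())
```

If nvidia-smi lists the GPU on the host but this reports no CUDA device, the container was probably started without GPU access.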

