#################### PROJECT IS IN DEVELOPMENT ####################
This repository contains a set of Python scripts for detecting objects using YOLOv8, capturing images of unknown objects, cropping those images based on detected bounding boxes, and automatically classifying the cropped images using clustering.
The repository includes three main scripts:
- object_detector.py: Detects objects in real-time using YOLOv8, captures images of unknown objects, and saves them along with metadata.
- crop_images.py: Crops the captured images based on the bounding boxes stored in the metadata.
- auto_classifier.py: Automatically classifies the cropped images using clustering and user input for labeling.
- Python 3.6 or higher
- OpenCV
- NumPy
- scikit-learn
- tqdm
- ultralytics (for YOLOv8)
- Clone the repository:
    git clone https://github.com/0xroyce/yolov8-dct.git
    cd yolov8-dct
- Create a virtual environment and activate it:
    python3 -m venv venv
    source venv/bin/activate
- Install the required packages:
    pip install -r requirements.txt
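After installation, a quick import check can confirm that the environment is complete. The sketch below is not part of the repository. It assumes the packages listed above are importable under their usual module names (`cv2` for OpenCV, `sklearn` for scikit-learn), and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Usual import names for the packages listed above.
# Assumption: requirements.txt installs exactly these distributions.
REQUIRED_MODULES = ["cv2", "numpy", "sklearn", "tqdm", "ultralytics"]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it inside the activated virtual environment. Any name it reports points to a package that still needs to be installed.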
Run object_detector.py to start detecting objects and capturing images of unknown objects:
    python object_detector.py
This script uses your webcam to detect objects in real-time. Known objects are highlighted in green, while unknown objects are highlighted in red and captured for further processing.
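The known/unknown split can be pictured as a small post-processing step on the detector's output. How object_detector.py actually decides that an object is unknown is not stated here, so this sketch assumes a detection counts as unknown when its confidence falls below a threshold. The `Detection` type, `CONF_THRESHOLD`, and the function names are hypothetical. Colors are BGR tuples, the channel order that OpenCV drawing functions expect.

```python
from dataclasses import dataclass
from typing import List, Tuple

GREEN = (0, 255, 0)   # BGR, drawn around known objects
RED = (0, 0, 255)     # BGR, drawn around unknown objects
CONF_THRESHOLD = 0.5  # assumed cut-off, not taken from the repository


@dataclass
class Detection:
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def is_unknown(det: Detection, threshold: float = CONF_THRESHOLD) -> bool:
    """Assumed rule: low-confidence detections are treated as unknown."""
    return det.confidence < threshold


def box_color(det: Detection, threshold: float = CONF_THRESHOLD) -> Tuple[int, int, int]:
    """Pick the highlight color for a detection."""
    return RED if is_unknown(det, threshold) else GREEN


def unknown_detections(dets: List[Detection], threshold: float = CONF_THRESHOLD) -> List[Detection]:
    """Keep only the detections that should be captured for later processing."""
    return [d for d in dets if is_unknown(d, threshold)]


if __name__ == "__main__":
    frame_detections = [
        Detection("person", 0.91, (10, 20, 200, 400)),
        Detection("cup", 0.32, (220, 150, 300, 260)),
    ]
    for det in frame_detections:
        print(det.label, box_color(det))
    print(len(unknown_detections(frame_detections)), "detection(s) to capture")
```

In a real loop, the color would be passed to `cv2.rectangle`, and the frame and boxes of the unknown detections would be written to disk.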
Run crop_images.py to crop the captured images based on the bounding boxes stored in the metadata:
    python crop_images.py
This script processes the images in the unknown_objects directory and saves the cropped images in the cropped_unknown_objects directory.
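Cropping from a bounding box mostly comes down to clamping the box to the image bounds and then slicing. The sketch below assumes each box is stored as (x1, y1, x2, y2) pixel coordinates. That is an assumption about the metadata format, not something this README specifies, and `clamp_box` is a hypothetical helper.

```python
def clamp_box(box, width, height):
    """Clamp an (x1, y1, x2, y2) box to the image bounds.

    Returns integer pixel coordinates, or None if nothing of the box
    remains inside the image.
    """
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


if __name__ == "__main__":
    # A box that extends past the left and right edges of a 640x480 image.
    print(clamp_box((-5, 10, 700, 300), 640, 480))
    # With an image loaded by OpenCV (a NumPy array indexed as
    # [row, column]), the crop would then be: image[y1:y2, x1:x2]
```

Clamping first avoids empty or wrapped-around slices when a detection touches the edge of the frame.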
Run auto_classifier.py to classify the cropped images using clustering:
    python auto_classifier.py
This script uses KMeans clustering to group similar images together. You will be prompted to label each cluster, and the images will be renamed accordingly.
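The labeling step can be pictured as follows: once each image has a cluster id, the label entered for a cluster becomes the prefix of its new file names. This sketch leaves out the KMeans step and the file-system calls. The naming scheme (`<label>_<index><extension>`) and the `plan_renames` helper are assumptions, not the script's documented behavior.

```python
import os
from collections import defaultdict


def plan_renames(filenames, cluster_ids, cluster_labels):
    """Map each file name to a new name based on its cluster's label.

    filenames: image file names, in the same order as cluster_ids
    cluster_ids: cluster index assigned to each image (e.g. by KMeans)
    cluster_labels: dict from cluster index to the user-supplied label
    """
    counters = defaultdict(int)  # running index per label
    renames = {}
    for name, cluster_id in zip(filenames, cluster_ids):
        label = cluster_labels[cluster_id]
        counters[label] += 1
        extension = os.path.splitext(name)[1]
        renames[name] = f"{label}_{counters[label]:03d}{extension}"
    return renames


if __name__ == "__main__":
    files = ["a.jpg", "b.jpg", "c.png"]
    ids = [0, 1, 0]
    labels = {0: "mug", 1: "pen"}
    for old, new in plan_renames(files, ids, labels).items():
        print(old, "->", new)
```

Counting per label rather than per cluster keeps names unique even if two clusters are given the same label.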
- unknown_objects/: Contains the images and metadata of unknown objects captured by object_detector.py.
- cropped_unknown_objects/: Contains the cropped images processed by crop_images.py.
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License. See the LICENSE file for details.