Skip to content

Commit

Permalink
Docs partial mdformat improvements (ultralytics#7378)
Browse files Browse the repository at this point in the history
Signed-off-by: Glenn Jocher <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  • Loading branch information
glenn-jocher and pre-commit-ci[bot] authored Jan 7, 2024
1 parent ed73c0f commit bb1326a
Show file tree
Hide file tree
Showing 52 changed files with 232 additions and 262 deletions.
5 changes: 3 additions & 2 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -50,10 +50,11 @@ repos:
name: MD formatting
additional_dependencies:
- mdformat-gfm
# - mdformat-black
# - mdformat-frontmatter
- mdformat-frontmatter
- mdformat-mkdocs
args:
- --wrap=no
- --number
exclude: 'docs/.*\.md'
# exclude: "README.md|README.zh-CN.md|CONTRIBUTING.md"

Expand Down
3 changes: 1 addition & 2 deletions docs/en/datasets/classify/imagenet10.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,8 +52,7 @@ To test a deep learning model on the ImageNet10 dataset with an image size of 22

The ImageNet10 dataset contains a subset of images from the original ImageNet dataset. These images are chosen to represent the first 10 classes in the dataset, providing a diverse yet compact dataset for quick testing and evaluation.

![Dataset sample images](https://user-images.githubusercontent.com/26833433/239689723-16f9b4a7-becc-4deb-b875-d3e5c28eb03b.png)
The example showcases the variety and complexity of the images in the ImageNet10 dataset, highlighting its usefulness for sanity checks and quick testing of computer vision models.
![Dataset sample images](https://user-images.githubusercontent.com/26833433/239689723-16f9b4a7-becc-4deb-b875-d3e5c28eb03b.png) The example showcases the variety and complexity of the images in the ImageNet10 dataset, highlighting its usefulness for sanity checks and quick testing of computer vision models.

## Citations and Acknowledgments

Expand Down
20 changes: 10 additions & 10 deletions docs/en/datasets/classify/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -104,16 +104,16 @@ In this example, the `train` directory contains subdirectories for each class in

Ultralytics supports the following datasets with automatic download:

* [Caltech 101](caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
* [Caltech 256](caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
* [CIFAR-10](cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
* [CIFAR-100](cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
* [Fashion-MNIST](fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
* [ImageNet](imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
* [ImageNet-10](imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
* [Imagenette](imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
* [Imagewoof](imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
* [MNIST](mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
- [Caltech 101](caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
- [Caltech 256](caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
- [CIFAR-10](cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
- [CIFAR-100](cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
- [Fashion-MNIST](fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
- [ImageNet](imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
- [ImageNet-10](imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
- [Imagenette](imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
- [Imagewoof](imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
- [MNIST](mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.

### Adding your own dataset

Expand Down
3 changes: 1 addition & 2 deletions docs/en/datasets/detect/coco8.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,7 @@ keywords: Ultralytics, COCO8 dataset, object detection, model testing, dataset c

[Ultralytics](https://ultralytics.com) COCO8 is a small, but versatile object detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.

This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com)
and [YOLOv8](https://github.com/ultralytics/ultralytics).
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com) and [YOLOv8](https://github.com/ultralytics/ultralytics).

## Dataset YAML

Expand Down
18 changes: 8 additions & 10 deletions docs/en/datasets/explorer/api.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,7 @@ dataframe = explorer.get_similar()(idx=0)

## 1. Similarity Search

Similarity search is a technique for finding similar images to a given image. It is based on the idea that similar images will have similar embeddings.
One the embeddings table is built, you can get run semantic search in any of the following ways:
Similarity search is a technique for finding similar images to a given image. It is based on the idea that similar images will have similar embeddings. One the embeddings table is built, you can get run semantic search in any of the following ways:

- On a given index / list of indices in the dataset like - `exp.get_similar(idx=[1,10], limit=10)`
- On any image/ list of images not in the dataset - `exp.get_similar(img=["path/to/img1", "path/to/img2"], limit=10)`
Expand Down Expand Up @@ -210,8 +209,7 @@ When using large datasets, you can also create a dedicated vector index for fast
table.create_index(num_partitions=..., num_sub_vectors=...)
```

Find more details on the type vector indices available and parameters [here](https://lancedb.github.io/lancedb/ann_indexes/#types-of-index)
In the future, we will add support for creating vector indices directly from Explorer API.
Find more details on the type vector indices available and parameters [here](https://lancedb.github.io/lancedb/ann_indexes/#types-of-index) In the future, we will add support for creating vector indices directly from Explorer API.

## 4. Embeddings Applications

Expand All @@ -221,15 +219,15 @@ You can use the embeddings table to perform a variety of exploratory analysis. H

Explorer comes with a `similarity_index` operation:

* It tries to estimate how similar each data point is with the rest of the dataset.
* It does that by counting how many image embeddings lie closer than `max_dist` to the current image in the generated embedding space, considering `top_k` similar images at a time.
- It tries to estimate how similar each data point is with the rest of the dataset.
- It does that by counting how many image embeddings lie closer than `max_dist` to the current image in the generated embedding space, considering `top_k` similar images at a time.

It returns a pandas dataframe with the following columns:

* `idx`: Index of the image in the dataset
* `im_file`: Path to the image file
* `count`: Number of images in the dataset that are closer than `max_dist` to the current image
* `sim_im_files`: List of paths to the `count` similar images
- `idx`: Index of the image in the dataset
- `im_file`: Path to the image file
- `count`: Number of images in the dataset that are closer than `max_dist` to the current image
- `sim_im_files`: List of paths to the `count` similar images

!!! Tip

Expand Down
3 changes: 1 addition & 2 deletions docs/en/datasets/pose/coco8-pose.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,7 @@ keywords: Ultralytics, YOLOv8, pose detection, COCO8-Pose dataset, dataset, mode

[Ultralytics](https://ultralytics.com) COCO8-Pose is a small, but versatile pose detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.

This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com)
and [YOLOv8](https://github.com/ultralytics/ultralytics).
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com) and [YOLOv8](https://github.com/ultralytics/ultralytics).

## Dataset YAML

Expand Down
3 changes: 1 addition & 2 deletions docs/en/datasets/pose/tiger-pose.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,7 @@ keywords: Ultralytics, YOLOv8, pose detection, COCO8-Pose dataset, dataset, mode

Despite its manageable size of 210 images, tiger-pose dataset offers diversity, making it suitable for assessing training pipelines, identifying potential errors, and serving as a valuable preliminary step before working with larger datasets for pose estimation.

This dataset is intended for use with [Ultralytics HUB](https://hub.ultralytics.com)
and [YOLOv8](https://github.com/ultralytics/ultralytics).
This dataset is intended for use with [Ultralytics HUB](https://hub.ultralytics.com) and [YOLOv8](https://github.com/ultralytics/ultralytics).

<p align="center">
<br>
Expand Down
3 changes: 1 addition & 2 deletions docs/en/datasets/segment/coco8-seg.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,7 @@ keywords: COCO8-Seg dataset, Ultralytics, YOLOv8, instance segmentation, dataset

[Ultralytics](https://ultralytics.com) COCO8-Seg is a small, but versatile instance segmentation dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging segmentation models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.

This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com)
and [YOLOv8](https://github.com/ultralytics/ultralytics).
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com) and [YOLOv8](https://github.com/ultralytics/ultralytics).

## Dataset YAML

Expand Down
4 changes: 2 additions & 2 deletions docs/en/datasets/segment/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -88,8 +88,8 @@ The `train` and `val` fields specify the paths to the directories containing the

## Supported Datasets

* [COCO](coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
* [COCO8-seg](coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.
- [COCO](coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
- [COCO8-seg](coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.

### Adding your own dataset

Expand Down
74 changes: 37 additions & 37 deletions docs/en/guides/hyperparameter-tuning.md
Original file line number Diff line number Diff line change
Expand Up @@ -116,37 +116,37 @@ This YAML file contains the best-performing hyperparameters found during the tun
- **Format**: YAML
- **Usage**: Hyperparameter results
- **Example**:
```yaml
# 558/900 iterations complete ✅ (45536.81s)
# Results saved to /usr/src/ultralytics/runs/detect/tune
# Best fitness=0.64297 observed at iteration 498
# Best fitness metrics are {'metrics/precision(B)': 0.87247, 'metrics/recall(B)': 0.71387, 'metrics/mAP50(B)': 0.79106, 'metrics/mAP50-95(B)': 0.62651, 'val/box_loss': 2.79884, 'val/cls_loss': 2.72386, 'val/dfl_loss': 0.68503, 'fitness': 0.64297}
# Best fitness model is /usr/src/ultralytics/runs/detect/train498
# Best fitness hyperparameters are printed below.

lr0: 0.00269
lrf: 0.00288
momentum: 0.73375
weight_decay: 0.00015
warmup_epochs: 1.22935
warmup_momentum: 0.1525
box: 18.27875
cls: 1.32899
dfl: 0.56016
hsv_h: 0.01148
hsv_s: 0.53554
hsv_v: 0.13636
degrees: 0.0
translate: 0.12431
scale: 0.07643
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.08631
mosaic: 0.42551
mixup: 0.0
copy_paste: 0.0
```
```yaml
# 558/900 iterations complete ✅ (45536.81s)
# Results saved to /usr/src/ultralytics/runs/detect/tune
# Best fitness=0.64297 observed at iteration 498
# Best fitness metrics are {'metrics/precision(B)': 0.87247, 'metrics/recall(B)': 0.71387, 'metrics/mAP50(B)': 0.79106, 'metrics/mAP50-95(B)': 0.62651, 'val/box_loss': 2.79884, 'val/cls_loss': 2.72386, 'val/dfl_loss': 0.68503, 'fitness': 0.64297}
# Best fitness model is /usr/src/ultralytics/runs/detect/train498
# Best fitness hyperparameters are printed below.

lr0: 0.00269
lrf: 0.00288
momentum: 0.73375
weight_decay: 0.00015
warmup_epochs: 1.22935
warmup_momentum: 0.1525
box: 18.27875
cls: 1.32899
dfl: 0.56016
hsv_h: 0.01148
hsv_s: 0.53554
hsv_v: 0.13636
degrees: 0.0
translate: 0.12431
scale: 0.07643
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.08631
mosaic: 0.42551
mixup: 0.0
copy_paste: 0.0
```
#### best_fitness.png
Expand All @@ -166,12 +166,12 @@ A CSV file containing detailed results of each iteration during the tuning. Each
- **Format**: CSV
- **Usage**: Per-iteration results tracking.
- **Example**:
```csv
fitness,lr0,lrf,momentum,weight_decay,warmup_epochs,warmup_momentum,box,cls,dfl,hsv_h,hsv_s,hsv_v,degrees,translate,scale,shear,perspective,flipud,fliplr,mosaic,mixup,copy_paste
0.05021,0.01,0.01,0.937,0.0005,3.0,0.8,7.5,0.5,1.5,0.015,0.7,0.4,0.0,0.1,0.5,0.0,0.0,0.0,0.5,1.0,0.0,0.0
0.07217,0.01003,0.00967,0.93897,0.00049,2.79757,0.81075,7.5,0.50746,1.44826,0.01503,0.72948,0.40658,0.0,0.0987,0.4922,0.0,0.0,0.0,0.49729,1.0,0.0,0.0
0.06584,0.01003,0.00855,0.91009,0.00073,3.42176,0.95,8.64301,0.54594,1.72261,0.01503,0.59179,0.40658,0.0,0.0987,0.46955,0.0,0.0,0.0,0.49729,0.80187,0.0,0.0
```
```csv
fitness,lr0,lrf,momentum,weight_decay,warmup_epochs,warmup_momentum,box,cls,dfl,hsv_h,hsv_s,hsv_v,degrees,translate,scale,shear,perspective,flipud,fliplr,mosaic,mixup,copy_paste
0.05021,0.01,0.01,0.937,0.0005,3.0,0.8,7.5,0.5,1.5,0.015,0.7,0.4,0.0,0.1,0.5,0.0,0.0,0.0,0.5,1.0,0.0,0.0
0.07217,0.01003,0.00967,0.93897,0.00049,2.79757,0.81075,7.5,0.50746,1.44826,0.01503,0.72948,0.40658,0.0,0.0987,0.4922,0.0,0.0,0.0,0.49729,1.0,0.0,0.0
0.06584,0.01003,0.00855,0.91009,0.00073,3.42176,0.95,8.64301,0.54594,1.72261,0.01503,0.59179,0.40658,0.0,0.0987,0.46955,0.0,0.0,0.0,0.49729,0.80187,0.0,0.0
```

#### tune_scatter_plots.png

Expand Down
Loading

0 comments on commit bb1326a

Please sign in to comment.