+

-> This is the first and only (for now) **`YOLO family variant with transformers!`** and more advanced YOLO with multi-tasking such as detect & segmentation at the same time!
+
YOLOv7 - Make YOLO Great Again
+[Documentation](https://github.com/jinfagang/yolov7) •
+[Installation Instructions](https://github.com/jinfagang/yolov7) •
+[Deployment](#deploy) •
+[Contributing](.github/CONTRIBUTING.md) •
+[Reporting Issues](https://github.com/jinfagang/yolov7/issues/new?assignees=&labels=&template=bug-report.yml)
-Just another yolo variant implemented based on **`detectron2`**. Be note that **YOLOv7 doesn't meant to be a successor of yolo family, 7 is just my magic and lucky number**. In our humble opinion, a good opensource project must have these features:
-- It must be reproduceble;
-- It must be simple and understandable;
-- It must be build with the weapon of the edge;
-- It must have a good maintainance, listen to the voice from community;
+[](https://pypi.org/project/alfred-py/)
+[](https://pepy.tech/project/yolort)
+[](https://img.shields.io/github/downloads/jinfagang/yolov7/total?color=blue&label=Downloads&logo=github&logoColor=lightgrey)
-However, we found many opensource detection framework such as YOLOv5, Efficientdet have their own weakness, for example, YOLOv5 is very good at reproduceable but really over-engineered, too many messy codes. What's more surprisingly, there were at least 20+ different version of re-implementation of YOLOv3-YOLOv4 in pytorch, 99.99% of them were totally **wrong**, either can u train your dataset nor make it mAP comparable with origin paper.(However, *doesn't mean this work is totally right, use at your own risk*.)
+[](https://codecov.io/gh/zhiqwang/yolov5-rt-stack)
+[](LICENSE)
+[](https://join.slack.com/t/yolort/shared_invite/zt-mqwc7235-940aAh8IaKYeWclrJx10SA)
+[](https://github.com/jinfagang/yolov7/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22)
-That's why we have this project! It's much more simpler to experiment different ARCH of YOLO build upon detectron2 with YOLOv7! Most importantly, more and more decent YOLO series model merged into this repo such as YOLOX (most decent in 2021). We also **welcome any trick/experiment PR on YOLOv7, help us build it better and stronger!!**. Please **star it and fork it right now!**.
+
+
+## Migration Warning!
+
+Since someone else created another YOLOv7 **after** us, and we don't want people to mix up the two (**nor do we want to chase meaningless AP numbers as a stunt**), we plan to move further development of YOLOv7 to a new home -> [YOLOvn link](https://github.com/jinfagang/yolovn). **The new framework will keep being developed forever!!** The unfinished PRs will be merged first, then the migration will start. Thanks for everyone's contributions! Again, the new framework is not only for re-implementing SOTA models but also for exploring new model designs: **we are exploring not only detection, but also multi-tasking and new transformer architecture designs**.
+
+
+
+> In short: **YOLOv7 adds instance segmentation to the YOLO architecture**, along with many transformer backbones and architectures. If you look carefully, you'll find our ultimate vision is to **make YOLO great again** through the power of **transformers** and **multi-task training**. With a convnext-tiny backbone, YOLOv7 achieves mAP 43 and its AP-s exceeds MaskRCNN by 10, at a speed similar to YOLOX-s. More models are listed below; it's more accurate and even lighter!
+
+> GPU resources wanted! The next version of yolov7 is upcoming, but I don't have enough GPUs to train pretrained models for everyone. If you have GPUs, please open a discussion and ping me, and I will guide you through training new models.
+
+Thanks to Aarohi for the YouTube vlog giving guidance on yolov7: https://www.youtube.com/watch?v=ag88beS_fvM . If you want a quick start, take a look at this nice introduction to yolov7 and detectron2.
+
+For those who still say we shouldn't use the name yolov7, here is the clarification: we created this repo much earlier than someone else's paper. We don't want to confuse you either, but as we said, we took this name a long time ago. Besides, our yolov7 is a framework: the whole **modeling is very intuitive**, unlike yolov5's yml-config way of defining models; it's pure Python, all under your control. Inside yolov7 we support a huge range of combinations such as YOLOX, YOLOX-Lite, YOLOX-Mask, YOLOX-Keypoint, YOLOv6 Head, YOLOv4, Mosaic Augmentation etc. **Which framework you use is your choice; please stop bothering us about the naming, and please take a look at the repo creation time screenshot below**. **WE ALREADY EXISTED LAST YEAR**.
+
+
+## New version coming soon!
+
+**YOLOv7** v2.0 will be released soon! We will release our ConvNeXt-tiny YOLO arch model, which achieves mAP 43.9 with very low latency! Features to be included in the next version:
+
+- Support for the EfficientFormer backbone;
+- Support for the new YOLO2Go model, which is lighter, much faster and much more accurate;
+- Support for the MobileOne backbone;
+
+For more details, refer to [the documentation](https://yolov7.readthedocs.io/en/latest).
+
+Just **fork and star!** You will be notified once we release the new version!
+
+🔥🔥🔥 Just another yolo variant implemented on top of **`detectron2`**. But note that **YOLOv7 isn't meant to be a successor of the yolo family; 7 is just a magic and lucky number. Instead, YOLOv7 extends yolo to many other vision tasks, such as instance segmentation, one-stage keypoint detection etc.**
The supported matrix in YOLOv7 are:
@@ -41,52 +74,148 @@ The supported matrix in YOLOv7 are:
- [x] YOLOv7 with Res2Net-v1d backbone, we **found res2net-v1d** have a better accuracy then darknet53;
- [x] Added PPYOLOv2 PAN neck with SPP and dropblock;
- [x] YOLOX arch added, now you can train YOLOX model (**anchor free yolo**) as well;
-- [ ] DETR: transformer based detection model and **onnx export supported, as well as TensorRT acceleration**;
+- [x] DETR: transformer based detection model and **onnx export supported, as well as TensorRT acceleration**;
+- [x] AnchorDETR: a faster-converging version of DETR, now supported!
+- [x] Almost all models can be exported to ONNX;
+- [x] Supports TensorRT deployment for DETR and other transformer models;
+- [ ] It will integrate with [wanwu](https://github.com/jinfagang/wanwu_release), a torch-free deployment framework that runs fastest on your target platform.
+
+
+> ⚠️ Important note: **YOLOv7 on GitHub is not the latest version; many features are closed-source, but you can get them from https://manaai.cn**
+
+Features that are ready but not open-sourced yet:
+
+- [x] Convnext training on YOLOX, higher accuracy than original YOLOX;
+- [x] GFL loss support;
+- [x] **MobileVit-V2** backbone available;
+- [x] CSPRep-Resnet: a repvgg style resnet used in PP-YOLOE but in pytorch rather than paddle;
+- [ ] VitDet support;
+- [ ] Simple-FPN support from VitDet;
+- [ ] PP-YOLOE head supported;
+
+If you want the full version of YOLOv7, either **become a contributor** or get it from https://manaai.cn .
+
+
+## 🆕 News!
+
+- ***2022.07.26***: We are now preparing to release a new pose model;
+- ***2022.06.25***: Meituan's YOLOv6 training is now supported in YOLOv7!
+- ***2022.06.13***: New model **YOLOX-Convnext-tiny** reaches ~~41.3~~ 43 mAP, beating yolox-s, with AP-small even higher!
+- ***2022.06.09***: **GFL** (Generalized Focal Loss) supported;
+- ***2022.05.26***: Added **YOLOX-ConvNext** config;
+- ***2022.05.18***: DINO, DNDetr and DABDetr are about to be added, with new records on coco of up to 63.3 AP!
+- ***2022.05.09***: Big new feature added! **We adopted YOLOX with a Keypoints Head!** The model is still training, but you can already check the code;
+- ***2022.04.23***: We finished int8 quantization of SparseInst! It works perfectly! Download the onnx and try it out yourself.
+- ***2022.04.15***: We now support `SparseInst` onnx export!
+- ***2022.03.25***: New instance seg supported! 40 FPS @ 37 mAP!! Which is fast;
+- ***2021.09.16***: First transformer based DETR model added, will explore more DETR series models;
+- ***2021.08.02***: **YOLOX** arch added, you can train YOLOX as well in this repo;
+- ***2021.07.25***: We found that **YOLOv7-Res2net50** beats res50 and darknet53 at the same speed level! 5% AP boost on a custom dataset;
+- ***2021.07.04***: Added YOLOF, so we have anchor-free support as well; YOLOF achieves a better trade-off between speed and accuracy;
+- ***2021.06.25***: this project first started.
+- more
-## Rules
+## 🌹 Contribution Wanted
-There are some rules you must follow to if you want train on your own dataset:
+If you have spare time or a GPU card, then help YOLOv7 become stronger! Here is the guide to contributing:
-- Rule No.1: Always set your own anchors on your dataset, using `tools/compute_anchors.py`, this applys to any other anchor-based detection methods as well (EfficientDet etc.);
-- Rule No.2: Keep a faith on your loss will goes down eventually, if not, dig deeper to find out why (but do not post issues repeated caused I might don't know either.).
-- Rule No.3: No one will tells u but it's real: *do not change backbone easily, whole params coupled with your backbone, dont think its simple as you think it should be*, also a Deeplearning engineer **is not an easy work as you think**, the whole knowledge like an ocean, and your knowledge is just a tiny drop of water...
-- Rule No.4: **must** using pretrain weights for **transoformer based backbone**, otherwise your loss will bump;
+1. **`Claim task`**: I have some ideas but not enough time to implement them. If you want to implement one, claim the task; **I will give you detailed advice on how to do it, and you can learn a lot from it**;
+2. **`Test mAP`**: When you have finished implementing a new idea, create a thread to report the experiment's mAP; if it works, it will be merged into our main master branch;
+3. **`Pull request`**: YOLOv7 is open and always tracks SOTA and **light** models. If a model is useful, we will merge it, deploy it, and distribute it to all users who want to try it.
-Make sure you have read **rules** before ask me any questions.
+Here are some tasks that need to be claimed:
+- [ ] VAN: Visual Attention Network, [paper](https://arxiv.org/abs/2202.09741), [VAN-Segmentation](https://github.com/Visual-Attention-Network/VAN-Segmentation), it was better than Swin and PVT and DeiT:
+ - [ ] D2 VAN backbone integration;
+ - [ ] Test with YOLOv7 arch;
+- [ ] ViDT: [code](https://github.com/naver-ai/vidt), a realtime transformer-based detector; Swin-Nano reaches mAP 40 at 20 FPS, and it can be integrated into YOLOv7;
+ - [ ] Integrate into D2 backbone, remove MSAtten deps;
+ - [ ] Test with YOLOv7 or DETR arch;
+- [ ] DINO: 63.3mAP highest in 2022 on coco.
+ - [ ] Code for [DINO](https://arxiv.org/abs/2203.03605) is available [here](https://github.com/IDEACVR/DINO).
+- [x] ConvNext: https://github.com/facebookresearch/ConvNeXt, combined convolution and transformer.
+- [ ] NASVit: https://github.com/facebookresearch/NASViT
+- [ ] MobileVIT: https://github.com/apple/ml-cvnets/blob/main/cvnets/models/classification/mobilevit.py
+- [ ] DAB-DETR: https://github.com/IDEA-opensource/DAB-DETR, WIP
+- [ ] DN-DETR: https://github.com/IDEA-opensource/DN-DETR
+- [ ] EfficientNetV2: https://github.com/jahongir7174/EfficientNetV2
+Just join our in-house contributor plan, and in return for your contribution you get access to our newest code!
-## News!
-- **2021.09.16**: First transformer based DETR model added, will explore more DETR series models;
-- **2021.08.02**: **YOLOX** arch added, you can train YOLOX as well in this repo;
-- **2021.07.25**: We found **YOLOv7-Res2net50** beat res50 and darknet53 at same speed level! 5% AP boost on custom dataset;
-- **2021.07.04**: Added YOLOF and we can have a anchor free support as well, YOLOF achieves a better trade off on speed and accuracy;
-- **2021.06.25**: this project first started.
-- more
+## 💁♂️ Results
+| YOLOv7 Instance | Face & Detection |
+:-------------------------:|:-------------------------:
+ | 
+ | 
+ | 
+ | 
+ | 
+ | 
+ | 
+ | 
-## Train
-For training, quit simple, same as detectron2:
+## 🧑🦯 Installation && Quick Start
-```
-python train_net.py --config-file configs/coco/darknet53.yaml --num-gpus 8
-```
+- See [docs/install.md](docs/install.md)
-If you want train YOLOX, you can using config file `configs/coco/yolox_s.yaml`. All support arch are:
+Special requirements (other versions may also work, but these are tested and give the best performance, including the best ONNX export support):
-- **YOLOX**: anchor free yolo;
-- **YOLOv7**: traditional yolo with some explorations, mainly focus on loss experiments;
-- **YOLOv7P**: traditional yolo merged with decent arch from YOLOX;
-- **YOLOMask**: arch do detection and segmentation at the same time (tbd);
-- **YOLOInsSeg**: instance segmentation based on YOLO detection (tbd);
+- torch 1.11 (stable version)
+- onnx
+- onnx-simplifier 0.3.7
+- alfred-py latest
+- detectron2 latest
+
+If you are using a lower version of torch, ONNX export might not work as expected.
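As a quick sanity check before training or exporting, the tested versions listed above can be compared against what is installed. The snippet below is only a minimal sketch and not part of this repo: the helper name `check_requirements` is made up for illustration, and the distribution names are assumed to match the names you would `pip install`.

```python
from importlib import metadata

# Tested versions from the list above; None means "latest" (no specific pin).
# The keys are assumed to be the pip distribution names.
REQUIREMENTS = {
    "torch": "1.11",
    "onnx": None,
    "onnx-simplifier": "0.3.7",
    "alfred-py": None,
    "detectron2": None,
}


def _matches(installed, wanted):
    # Compare only as many dotted components as the pin specifies,
    # ignoring local build tags such as "+cu113".
    parts = installed.split("+")[0].split(".")
    wanted_parts = wanted.split(".")
    return parts[:len(wanted_parts)] == wanted_parts


def check_requirements(requirements, get_version=metadata.version):
    """Return a list of (package, status) pairs (hypothetical helper)."""
    report = []
    for name, wanted in requirements.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report.append((name, "missing"))
            continue
        if wanted is None or _matches(installed, wanted):
            report.append((name, f"ok ({installed})"))
        else:
            report.append((name, f"installed {installed}, tested with {wanted}"))
    return report


if __name__ == "__main__":
    for name, status in check_requirements(REQUIREMENTS):
        print(f"{name:16s} {status}")
```

A mismatch reported here does not necessarily mean things will break; it only flags a combination that differs from the tested one.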
+
+
+
+## 🤔 Features
+
+Some highlights of YOLOv7 are:
+- A simple and standard training framework for any detection && instance segmentation task, based on detectron2;
+- Supports DETR and many transformer-based detection frameworks out of the box;
+- Supports an easy-to-deploy pipeline through ONNX;
+- **This is the only framework that supports YOLOv4 + InstanceSegmentation** in single-stage style;
+- Easily plugs into transformer-based detectors;
-## Demo
+We strongly recommend sending a PR if you do any further development on this project; **the only reason for open-sourcing it is to use the power of the community to make it stronger and take it further**. Anyone is very welcome to contribute any feature!
+
+## 🧙♂️ Pretrained Models
+
+| model | backbone | input | aug | AP