Cannot use builtin datasets for detection training #1830

Open
KenjiTakahashi opened this issue Jan 5, 2025 · 1 comment
Labels
ext: references (Related to references folder)
framework: pytorch (Related to PyTorch backend)
framework: tensorflow (Related to TensorFlow backend)
topic: documentation (Improvements or additions to documentation)
topic: text detection (Related to the task of text detection)
topic: text recognition (Related to the task of text recognition)
type: bug (Something isn't working)
type: enhancement (Improvement)
Comments

@KenjiTakahashi

Bug description

I've modified the reference (PyTorch) training code to use the SVHN dataset.
That did not work (see the traceback below).
By looking at what the DetectionDataset class does, plus some trial and error, I managed to get it working (or at least running) by restructuring the targets from the dataset as follows:

targets = [{"words": t} for t in targets]

I don't know if this is a "valid" fix, though.
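
For context, here is a minimal sketch of that workaround as a standalone helper. It assumes the built-in dataset yields one box array per image; the helper name `wrap_detection_targets` is hypothetical, and `"words"` is simply the key used above. The float32 cast mirrors the dtype comparison visible in the traceback; whether SVHN actually needs the cast is an assumption.

```python
import numpy as np


def wrap_detection_targets(targets, class_name="words"):
    """Wrap per-image box arrays into {class_name: boxes} dicts.

    Hypothetical helper: `class_name` and the float32 cast are assumptions
    based on the workaround and the dtype check shown in this issue.
    """
    return [{class_name: np.asarray(t, dtype=np.float32)} for t in targets]


# Two fake images: one with two boxes, one with a single box (relative coords).
raw_targets = [
    np.array([[0.1, 0.1, 0.4, 0.5], [0.5, 0.2, 0.9, 0.6]]),
    np.array([[0.2, 0.3, 0.7, 0.8]]),
]
wrapped = wrap_detection_targets(raw_targets)
```

Each element of `wrapped` is then a dict, so iterating over `.values()` in the loss code no longer hits a bare array.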

Code snippet to reproduce the bug

See https://gist.github.com/KenjiTakahashi/9bb22093d584bb2b203eb003a2bbb414.

As mentioned, this is mostly the same code as
https://github.com/mindee/doctr/blob/e6bf82d6a74a52cedac17108e596b9265c4e43c5/references/detection/train_pytorch.py
with slight modifications to work with the SVHN class instead of DetectionDataset.

Error traceback

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "ocr/src/ocr/main2.py", line 499, in main
    _main(args)
  File "ocr/src/ocr/main2.py", line 405, in _main
    fit_one_epoch(model, train_loader, batch_transforms, optimizer, scheduler, amp=args.amp)
  File "ocr/src/ocr/main2.py", line 126, in fit_one_epoch
    train_loss = model(images, targets)["loss"]
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "ocr/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ocr/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ocr/.venv/lib/python3.12/site-packages/doctr/models/detection/fast/pytorch.py", line 208, in forward
    loss = self.compute_loss(logits, target)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ocr/.venv/lib/python3.12/site-packages/doctr/models/detection/fast/pytorch.py", line 231, in compute_loss
    targets = self.build_target(target, out_map.shape[1:], False)  # type: ignore[arg-type]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ocr/.venv/lib/python3.12/site-packages/doctr/models/detection/fast/base.py", line 177, in build_target
    if any(t.dtype != np.float32 for tgt in target for t in tgt.values()):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ocr/.venv/lib/python3.12/site-packages/doctr/models/detection/fast/base.py", line 177, in <genexpr>
    if any(t.dtype != np.float32 for tgt in target for t in tgt.values()):
                                                            ^^^^^^^^^^
AttributeError: 'numpy.ndarray' object has no attribute 'values'
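
The failing line iterates over `tgt.values()`, so each per-image target is expected to be a dict of arrays, while here it is a bare `numpy.ndarray`. A small pre-flight check along the following lines could surface the mismatch before the forward pass. This is only a sketch: `validate_detection_targets` is a hypothetical helper, not part of doctr, and the expected format is inferred from the traceback alone.

```python
import numpy as np


def validate_detection_targets(targets):
    """Raise a descriptive TypeError unless every target is a dict of float32 arrays.

    Hypothetical helper; the expected structure is inferred from the
    `t.dtype != np.float32 ... tgt.values()` check in the traceback.
    """
    for idx, tgt in enumerate(targets):
        if not isinstance(tgt, dict):
            raise TypeError(
                f"target {idx} is {type(tgt).__name__}, expected a dict "
                "mapping class names to box arrays"
            )
        for name, boxes in tgt.items():
            if not isinstance(boxes, np.ndarray) or boxes.dtype != np.float32:
                raise TypeError(f"target {idx}[{name!r}] must be a float32 ndarray")


# A bare array (the shape of target that triggers the error above) is rejected.
bad = [np.zeros((1, 4), dtype=np.float32)]
try:
    validate_detection_targets(bad)
    rejected = False
except TypeError:
    rejected = True

# A dict-wrapped float32 array passes silently.
validate_detection_targets([{"words": np.zeros((1, 4), dtype=np.float32)}])
```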

Environment

The environment script does not seem to work well on (my) macOS; it mostly returns N/A's. Anyway, I am currently running this on macOS 14 with the latest doctr (tried both the 0.10 release and master at e6bf82d) and PyTorch.

Deep Learning backend

I'm using PyTorch, but the same problem happens on TF as well.

KenjiTakahashi added the "type: bug" label on Jan 5, 2025
@felixdittrich92
Contributor

Hi @KenjiTakahashi 👋,

Yeah, you are right: currently only the validation scripts provide direct support for the built-in datasets, so your "quick fix" is valid.

Support for the built-in datasets in the training scripts is something we could add; it would be especially useful for pretraining on SynthText.

The issue here is SVHN: it contains really small image crops of house numbers, and currently we resize everything to 1024x1024 while keeping the aspect ratio and applying symmetric padding, which is well chosen for the document case.
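
To make the size mismatch concrete, here is a rough sketch of the arithmetic behind an aspect-preserving resize with symmetric padding. The function name `letterbox_geometry`, the rounding, and the example image sizes are assumptions for illustration; this is not doctr's actual resize implementation.

```python
def letterbox_geometry(height, width, target=(1024, 1024)):
    """Compute scale, resized size and symmetric padding for a target canvas.

    Illustrative only: the rounding and padding split are assumptions,
    not taken from doctr's transforms.
    """
    target_h, target_w = target
    # Largest scale at which the whole image still fits inside the canvas.
    scale = min(target_h / height, target_w / width)
    new_h = min(target_h, round(height * scale))
    new_w = min(target_w, round(width * scale))
    pad_h, pad_w = target_h - new_h, target_w - new_w
    top, left = pad_h // 2, pad_w // 2
    return {
        "scale": scale,
        "resized": (new_h, new_w),
        "pad": (top, pad_h - top, left, pad_w - left),  # top, bottom, left, right
    }


# Hypothetical sizes: a small house-number crop vs. a scanned page.
small_crop = letterbox_geometry(32, 100)
page = letterbox_geometry(1400, 1000)
```

Under these assumptions the 32x100 crop is upscaled roughly tenfold and most of the canvas is padding, whereas the page is slightly downscaled and only lightly padded.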

Have you tried any of the other built-in datasets? All of the others should work (only SVHN is a bit "special").

Best regards,
Felix :)

felixdittrich92 added the "topic: documentation", "type: enhancement", "ext: references", "framework: pytorch", "framework: tensorflow", "topic: text detection", and "topic: text recognition" labels on Jan 30, 2025
felixdittrich92 added this to the 1.0.0 milestone on Jan 30, 2025