Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uncaught Exception for bad tag in dicom file. #16

Open
sgithens opened this issue May 29, 2015 · 12 comments
Open

Uncaught Exception for bad tag in dicom file. #16

sgithens opened this issue May 29, 2015 · 12 comments

Comments

@sgithens
Copy link
Contributor

While running a large batch of mammograms today, dicom-anon blew up with the following exception. It appears that the error is in the mammogram file, but I'm wondering if we shouldn't be blowing up when it occurs.

Would the appropriate action in this case be to quarantine the image? I can try to put together a patch if you think this is the appropriate behavior. (If not, what should we be doing in this case?)

Traceback (most recent call last):
  File "dicom_anon.py", line 915, in <module>
    white_list_file=white_list_file, log_file=options.log_file, rename=options.rename,profile=options.profile, overlay=options.overlay)
  File "dicom_anon.py", line 760, in driver
    ds = anonymize(ds, white_list, org_root, profile, overlay)
  File "dicom_anon.py", line 678, in anonymize
    ds.remove_private_tags()
  File "/home/sgithens/Envs/opal/lib/python2.7/site-packages/dicom/dataset.py", line 483, in remove_private_tags
    self.walk(RemoveCallback)
  File "/home/sgithens/Envs/opal/lib/python2.7/site-packages/dicom/dataset.py", line 595, in walk
    callback(self, data_element)  # self = this Dataset
  File "/usr/local/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/sgithens/Envs/opal/lib/python2.7/site-packages/dicom/tagtools.py", line 21, in tag_in_exception
    raise type(e)(err)
ValueError: Invalid tag (0018, 1152): invalid literal for int() with base 10: '147.3'
@cancan101
Copy link

It seems that most fields can just be blanked out. i.e. I think this exception is thrown when reading the value for the purposes of writing the audit db.

@jeffmax
Copy link
Contributor

jeffmax commented May 30, 2015

This appears to be thrown from a function in pydicom that is supposed to remove all non-standard tags (https://www.medicalconnections.co.uk/kb/DICOM_Private_Elements). I'll take a closer look.

@jeffmax
Copy link
Contributor

jeffmax commented May 30, 2015

So, interestingly, the field it is throwing the exception on is not a private data element, but the "Exposure" which has a VR of IS (Integer String). It looks like it is just an incorrect DICOM file that has placed a decimal value in an IS field. It looks like remove_prive_tags() recurses over each attribute looking for private tags. It is a shame to quarantine the file completely for such a minor issue, but taking on the task of fixing up bad DICOM files so pydicom will operate on them sounds out of scope for this script so it is probably the best we can do.

This seems to be a similar issue: https://code.google.com/p/pydicom/issues/detail?id=152

@jeffmax
Copy link
Contributor

jeffmax commented May 30, 2015

On the bright-side, looking into this I learned about this: https://docs.python.org/2/library/contextlib.html#contextlib.contextmanager

@sgithens
Copy link
Contributor Author

sgithens commented Jun 1, 2015

@jeffmax Thanks for the diagnostic. Do you think the best thing then is to wrap the anonymize at line 760 in a try/except and then quarantine it the same as on lines 745-749?

@cancan101
Copy link

What line do you think is actually throwing the exception? Because of the way the walk is called, the original call site of the exception is lost in the trace above.

@sgithens
Copy link
Contributor Author

sgithens commented Jun 1, 2015

The first site in dicom_anon or in the third party libraries? I'm pretty sure the trace above has the accurate point for that being dicom_anon.py:678. I was going to wrap it at at 760 though for that anonymize call, since at that point all the arguments necessary to call the quarantine_file method are available.

@cancan101
Copy link

So the issue looks like pydicom tried to parse the element even though you are just checking the is_private status:

pydicom/dataset.pyc in __getitem__(self, key)
    306                 character_set = default_encoding
    307             # Not converted from raw form read from file yet; do so now
--> 308             self[tag] = DataElement_from_raw(data_elem, character_set)
    309         return dict.__getitem__(self, tag)

One option would be to have DataElement_from_raw inject a default value:
https://github.com/darcymason/pydicom/blob/9a97a7955a6bcb5015bb65e5c55e42597eb164b4/source/dicom/dataelem.py#L319-L324

@sgithens
Copy link
Contributor Author

sgithens commented Jun 1, 2015

I think for this particular issue ( a bad tag in my dicom file), I probably would tend to concur with @jeffmax above, I don't really want to patch the pydicom project for this issue. I didn't run in to this until about mammogram 1500 of my 5000 I'm churning through right now, so I don't expect it to have too often, and I'm Ok with just quarantining it for now to deal with manually later. (Although if it does become more of a problem I'd likely dig further into pydicom! :p ) I'm syncing my directories for the ones that have already been processed right now, but after that am going to go ahead and wrap that ValueError and run it again. ( If everything is awesome after that will submit a pull. )

sgithens added a commit to sgithens/dicom-anon that referenced this issue Jun 2, 2015
@jeffmax
Copy link
Contributor

jeffmax commented Jun 2, 2015

@cancan101 started looking at pydicom tickets related to this issue and it seems you have been involved in the discussion

pydicom/pydicom#193 and pydicom/pydicom#197

I guess I will leave this ticket open for now in case there is an API change that allows us to handle this better.

@jeffmax
Copy link
Contributor

jeffmax commented Jun 4, 2015

Looks like there is a good solution for addressing this in pydicom/pydicom#198
Will keep this open until this is pulled into a release.
Thanks @kevlarkevin!

@sgithens
Copy link
Contributor Author

sgithens commented Jun 6, 2015

Sounds good thanks guys! If there is a fix in pydicom and we pull in a new version and you want me to test on any of my broken dcm's let me know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants