Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Migrate load, save, class_map, ext_map to class properties and methods #329

Merged
merged 30 commits into from
Oct 6, 2015
Merged

Conversation

bcipolli
Copy link
Contributor

This is to address #317 and #323 (and aims to supercede PR #319), both issues around image-specific code being distributed across files, rather than localized in the class definition.

Changes here include:

  • Removing image-specific code from nib.load and nib.save functions, moving the logic into ImageKlass.is_image and HeaderKlass.is_header class functions.
  • Deprecating class_map and ext_map objects, and moving necessary info to properties defined on the image klasses themselves.

Things needing to be done in this PR:

  • Migrate some of the old tests that were deleted into similar tests in the new coding

Things that need to be done in a future PR:

  • The inclusion of .mgz in files_types as an 'image' is a bit funky, but will take thought and work to do cleanly. I have an idea of what to do, but would prefer to do a subsequent PR on it as this one is already quite sizable.

if not klass.is_valid_extension(ext):
return False, sniff
elif (getattr(klass.header_class, 'sniff_size') is None or
getattr(klass.header_class, 'is_header') is None):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if None rather than do a hasattr, so that properties can be either unset, or reset to None (which is easier when subclassing).

@effigies
Copy link
Member

effigies commented Aug 4, 2015

Small cleanups in ab2081b. Given that all is_header methods except Minc do their own slicing, it makes sense to me to pass sniff directly and slice binaryblock in the Minc headers as well. Also _minctest seems to be deprecated, so I dropped it.

klass = next(iter(valid_klasses))
except StopIteration: # if iterator is empty
raise ImageFileError('Cannot work out file type of "%s"' %
filename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next(iter(filter(...))) is awkward. Alternatives:

valid_klasses = list(filter(lambda klass: klass.is_valid_extension(ext), all_image_classes))
valid_klasses = [klass for klass in all_image_classes if klass.is_valid_extension(ext)]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll go for the second, since we want a list anyway...

@matthew-brett
Copy link
Member

@effigies - just to check that you are happy for this PR to supersede your #319 ?

@effigies
Copy link
Member

Yes. I used that branch to make a suggested commit, but no need to have a PR for it.

from .minc2 import Minc2Image
from .freesurfer import MGHImage
from .filename_parser import splitext_addext
from .volumeutils import BinOpener
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be:

from .openers import ImageOpener

@bcipolli
Copy link
Contributor Author

Thanks @effigies -- I can see that I needed to be more careful about integrating my other PR into this one. I'll use your notes and go through things a bit myself to make sure!

@bcipolli
Copy link
Contributor Author

bcipolli commented Sep 1, 2015

Getting there. Added back some tests. Want to clean things up a bit further (e.g. two of the properties I put on the individual classes are really test-only, and should live in the associated tests). I also need to review the test coverage, to see why it's dropped so significantly.

@effigies
Copy link
Member

effigies commented Sep 2, 2015

@bcipolli A couple small fixes in a863949. Mostly the same as earlier, but rebased for convenience, and changed the MINC1 test to reduce the space of false positives to other NetCDF images.

Relatedly, I think we may want to add documentation to is_header indicating the level of confidence one should have in the result, in particular that one should not assume that a True result implies that a header may be successfully read.

@bcipolli
Copy link
Contributor Author

bcipolli commented Sep 2, 2015

@effigies That's great--thanks for the tweaks!

An easier workflow for me would be if you make PRs directly against my branch, and use your comments as the PR message. Then I could just iterate with you there and merge, rather than cherry picking manually (and potentially losing track of things).

Would that work for you?

@effigies
Copy link
Member

effigies commented Sep 2, 2015

Sure, that works. Was assuming we'd want to keep conversation in one place, but I'm fine with anything.

@bcipolli
Copy link
Contributor Author

bcipolli commented Sep 4, 2015

OK, got some updates from @effigies ; I think this is in a relatively good state now. With this code, I believe getting GIFTI into the load/save pipeline (see #316) should be cleaner.

@effigies anything further you'd like to see happen here?

@effigies
Copy link
Member

effigies commented Sep 4, 2015

I'm pretty happy with this. To reiterate this issue with is_header, though, the name suggests that it will return True if and only if the binary string will successfully load as a header, and we might want to indicate that it's a heuristic.

Also a couple style issues I'll PR on your branch in a minute.

none=None, # pass None as the sniff, should query in fn
empty='', # pass an empty sniff, should query in fn
irrelevant='a' * (sizeof_hdr - 1), # A too-small sniff, query
bad_sniff='a' * sizeof_hdr).items(): # Bad sniff, should fail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these be bytes objects (b'a')? They'll either fail or trigger a new sniff anyway, so maybe not a big deal, but if you want them to fail for the right reasons on Python 3...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note, @effigies added a change for this (and another test issue) that I've merged in.

@effigies
Copy link
Member

effigies commented Sep 4, 2015

@matthew-brett How do you feel about tests on deprecation warnings? Seem like they might be excessive, but I think that's part of the test coverage decrease.

@@ -892,6 +892,17 @@ def _chk_pixdims(hdr, fix=False):
rep.fix_msg = ' and '.join(fmsgs)
return hdr, rep

@classmethod
def is_header(klass, binaryblock):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about may_contain_header ?

@effigies
Copy link
Member

effigies commented Sep 4, 2015

@matthew-brett Incorporated your suggestions. Let me know if my approach to guessed_image_type doesn't work for you.

Looks like we're back up to positive test coverage, so that's good.

@@ -212,43 +203,3 @@ def read_img_data(img, prefer='scaled'):
if prefer == 'scaled':
return hdr.data_from_fileobj(fileobj)
return hdr.raw_data_from_fileobj(fileobj)


def which_analyze_type(binaryblock):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry man - I know this is anal - but this one's part of the API too, technically.

@effigies
Copy link
Member

No problem. I'm mostly writing English right now, so an excuse to procrastinate with code is great.

@matthew-brett Didn't get rid of the ValueError test, mainly because I'd need to figure out how to make an exception for it in this test. Can do it if it's a hang-up.

@effigies
Copy link
Member

@matthew-brett Just a bump, in case you've got a bit of time to look this over.

@matthew-brett
Copy link
Member

Yes, sorry, thanks for the bump.

Sorry to be ignorant, but won't the may_contain_header function(s) still raise some sort of predictable error if the byte strings are too short, even if you remove the explicit check.

The image API has now expanded to include path_maybe_image - I believe. It would be very nice to have some tests for the in test_image_api.py.

What do you think about the case where the sniff can't make any sense transferred from on image type to the next? At the moment, this can't happen, but sort of by accident. For example, if we are trying to load a .mnc file, then, because the previous files in the list don't have a .mnc extension as being valid, the file will not get sniffed. But, let's imagine that there are two image types with extension .foo., Image1 and Image2. Image1 has a .bar header, which, according to the Image1.path_maybe_image method, gets sniffed. But, Image1.path_maybe_image correctly rejects the sniff as not being valid for Image1. On we go to Image2. This doesn't have a .bar header, but we still pass in the sniff from the .bar header, which Image2 rejects. But we should not have passed in the sniff from the .bar header, because Image2 does not have .bar headers - we should have started again, and this time sniffed the .foo file.

Admittedly, this can't easily happen, but it does seem like an unstated assumption in the code, that it can't happen.

I was thinking of something like this, to make that more explicit

class InvalidateSniff(object):
    " Signals that previous sniff is no longer valid "


IMG_CLASS_LOAD_SEQ = [Nifti1Image, AnalyzeImage, InvalidateSniff, MINC1]
IMG_CLASS_SAVE_SEQ = [k for k in IMG_CLASS_LOAD_SEQ if not k is InvalidateSniff]

@bcipolli
Copy link
Contributor Author

bcipolli commented Oct 1, 2015

@matthew-brett Thanks for thinking so carefully. I understand the issue you're wishing to solve.

I'd prefer a solution that uses the class metadata to invalidate sniffs. Two ways to do this:

  1. Pass sniff as a tuple: sniff=(filename, binaryblob). Then classes check where the sniff is coming from.
  2. We could explicitly track the filename of the sniff in the load function, and automatically invalidate the sniff if it's from the wrong filename.

@matthew-brett
Copy link
Member

Passing the filename with the sniff seems a bit ugly. How would you keep track of the filename in the load function? I guess you'd have to return it from the path_maybe_image ?

@effigies
Copy link
Member

effigies commented Oct 1, 2015

Sorry to be ignorant, but won't the may_contain_header function(s) still raise some sort of predictable error if the byte strings are too short, even if you remove the explicit check.

No. It will return False because slicing beyond the end of a bytes simply returns something smaller than the slice.

>>> b''[:4]
b''
>>> x = b''
>>> x[:4] == b'ABCD'
False

I'll think a bit about your other concern as I can.

@matthew-brett
Copy link
Member

Ah - yes - but isn't False the right answer in this case?

@effigies
Copy link
Member

effigies commented Oct 1, 2015

It is. I think it was a desire for consistent behavior across image classes that led here. There's a commit in the stack that removed the .mnc exception in testing for ValueErrors, so re-adding it won't be a huge deal. Let me know if that's what you think should happen.

As to your other concern, I apparently have a couple thoughts already:

If passing the path back and forth is too ugly, what about switching to (sniff, ext), and checking whether ext is a valid header extension? If so, use the sniff, if not, drop it and replace:

is_valid, sniff, ext = image_klass.path_maybe_image(filename, sniff, ext)

On the off chance you have the sequence checks-foo, checks-bar, checks-foo, sniff could be a {ext: binaryblock} dictionary, so that sniffs aren't thrown away until they're known not to be needed. And we can add back path into this and make sniff be {path: binaryblock}.

I believe it's already pretty much strongly assumed that a header and image file are assumed to have the same basename, so the ext options seem safe until it becomes necessary to modify that assumption throughout the code.

@matthew-brett
Copy link
Member

Sorry - keep on getting swamped in nipy stuff.

Thinking about it, maybe the (sniff, filename) tuple isn't too bad. I guess then you'd have to check whether the filename is the correct metadata filename for each image type, but that seems fine. So it's just redfining a sniff as as (bytes, filename) pair, which seems OK.

I have a vague preference for not raising errors in the maybe_header checks, but it's vague, so - your call.

I'll be more attentive here now - but please feel free to remind me at short (one day) intervals if I'm slacking.

@effigies
Copy link
Member

effigies commented Oct 6, 2015

I don't have a strong opinion here, but I would say if we're going to get rid of the exception in Minc1, then checks like this should return False as well, for consistency.

@matthew-brett
Copy link
Member

Agree about the consistency of checks.

Currently doesn't occur, but possible that different images with a
common extension could have different header extensions. This change
checks that the header file name is the same before reusing a sniff.
@effigies
Copy link
Member

effigies commented Oct 6, 2015

Okay, just updated the doc string and killed the exceptions.

@matthew-brett
Copy link
Member

I think this is looking good now.

Last thing - would you mind adding the tests for the expanded Image API in test_image_api.py? Specifically, I think the image API has extended to include the path_maybe_image class method.

I realize that test function is a bit hairy, so let me know if I can help - I should have time today and tomorrow.

@effigies
Copy link
Member

effigies commented Oct 6, 2015

How about adding something like the following to LoadImageAPI?

def validate_path_maybe_image(self, imaker, params):
    if hasattr(imaker, 'path_maybe_image'):
        for img_params in self.example_images:
            test, sniff = imaker.path_maybe_image(img_params['fname'])[0]
            assert test == True
            if sniff is not None:
                assert isinstance(sniff[0], bytes)
                assert isinstance(sniff[1], basestring)

@matthew-brett
Copy link
Member

Aren't we insisting that every image class have path_maybe_image?

Should be assert_true rather than assert, and string_types rather than basestring but otherwise, that looks about right.

@effigies
Copy link
Member

effigies commented Oct 6, 2015

The PAR/REC test didn't have an image_maker attribute, so I added a separate attribute klass, which for the most part doubles up on image_maker. I didn't want to assert things to be makeable to test this method.

Not all tests have image_maker attributes, so add klass attribute to
test path_maybe_image. Checks structure of return values on calls to
example images.
@matthew-brett
Copy link
Member

Ah yes, good point.

I think the PR is ready, if the tests pass. Ben - any further comments before I merge?

@bcipolli
Copy link
Contributor Author

bcipolli commented Oct 6, 2015

👍

matthew-brett added a commit that referenced this pull request Oct 6, 2015
MRG: Migrate load, save, class_map, ext_map to class properties and methods

This is to address #317 and #323 (and aims to supercede PR #319), both issues around image-specific code being distributed across files, rather than localized in the class definition.

Changes here include:
* Removing image-specific code from `nib.load` and `nib.save` functions, moving the logic into `ImageKlass.is_image` and `HeaderKlass.is_header` class functions.
* Deprecating `class_map` and `ext_map` objects, and moving necessary info to properties defined on the image classes themselves.
@matthew-brett matthew-brett merged commit ec7926a into nipy:master Oct 6, 2015
@matthew-brett
Copy link
Member

Thanks a lot for your patience - sorry it took so long.

@bcipolli bcipolli deleted the reload branch October 31, 2015 13:51
grlee77 pushed a commit to grlee77/nibabel that referenced this pull request Mar 15, 2016
MRG: Migrate load, save, class_map, ext_map to class properties and methods

This is to address nipy#317 and nipy#323 (and aims to supercede PR nipy#319), both issues around image-specific code being distributed across files, rather than localized in the class definition.

Changes here include:
* Removing image-specific code from `nib.load` and `nib.save` functions, moving the logic into `ImageKlass.is_image` and `HeaderKlass.is_header` class functions.
* Deprecating `class_map` and `ext_map` objects, and moving necessary info to properties defined on the image classes themselves.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants