WIP: Migrate load, save, class_map, ext_map to class properties and methods #329

bcipolli · 2015-07-20T15:44:38Z

This is to address #317 and #323 (and aims to supercede PR #319), both issues around image-specific code being distributed across files, rather than localized in the class definition.

Changes here include:

Removing image-specific code from nib.load and nib.save functions, moving the logic into ImageKlass.is_image and HeaderKlass.is_header class functions.
Deprecating class_map and ext_map objects, and moving necessary info to properties defined on the image klasses themselves.

Things needing to be done in this PR:

Migrate some of the old tests that were deleted into similar tests in the new coding

Things that need to be done in a future PR:

The inclusion of .mgz in files_types as an 'image' is a bit funky, but will take thought and work to do cleanly. I have an idea of what to do, but would prefer to do a subsequent PR on it as this one is already quite sizable.

bcipolli · 2015-07-20T15:46:11Z

nibabel/spatialimages.py

+        if not klass.is_valid_extension(ext):
+            return False, sniff
+        elif (getattr(klass.header_class, 'sniff_size') is None or
+              getattr(klass.header_class, 'is_header') is None):


Check if None rather than do a hasattr, so that properties can be either unset, or reset to None (which is easier when subclassing).

effigies · 2015-08-04T19:53:52Z

Small cleanups in ab2081b. Given that all is_header methods except Minc do their own slicing, it makes sense to me to pass sniff directly and slice binaryblock in the Minc headers as well. Also _minctest seems to be deprecated, so I dropped it.

effigies · 2015-08-04T19:57:58Z

nibabel/loadsave.py

+            klass = next(iter(valid_klasses))
+        except StopIteration:  # if iterator is empty
+            raise ImageFileError('Cannot work out file type of "%s"' %
+                                 filename)


The next(iter(filter(...))) is awkward. Alternatives:

valid_klasses = list(filter(lambda klass: klass.is_valid_extension(ext), all_image_classes)) valid_klasses = [klass for klass in all_image_classes if klass.is_valid_extension(ext)]

I'll go for the second, since we want a list anyway...

matthew-brett · 2015-08-12T21:38:28Z

@effigies - just to check that you are happy for this PR to supersede your #319 ?

effigies · 2015-08-12T21:42:36Z

Yes. I used that branch to make a suggested commit, but no need to have a PR for it.

effigies · 2015-08-28T15:58:10Z

nibabel/loadsave.py

-from .minc2 import Minc2Image
-from .freesurfer import MGHImage
+from .filename_parser import splitext_addext
+from .volumeutils import BinOpener


Should be:

from .openers import ImageOpener

bcipolli · 2015-08-28T16:59:51Z

Thanks @effigies -- I can see that I needed to be more careful about integrating my other PR into this one. I'll use your notes and go through things a bit myself to make sure!

bcipolli · 2015-09-01T19:38:44Z

Getting there. Added back some tests. Want to clean things up a bit further (e.g. two of the properties I put on the individual classes are really test-only, and should live in the associated tests). I also need to review the test coverage, to see why it's dropped so significantly.

effigies · 2015-09-02T16:35:07Z

@bcipolli A couple small fixes in a863949. Mostly the same as earlier, but rebased for convenience, and changed the MINC1 test to reduce the space of false positives to other NetCDF images.

Relatedly, I think we may want to add documentation to is_header indicating the level of confidence one should have in the result, in particular that one should not assume that a True result implies that a header may be successfully read.

bcipolli · 2015-09-02T20:30:57Z

@effigies That's great--thanks for the tweaks!

An easier workflow for me would be if you make PRs directly against my branch, and use your comments as the PR message. Then I could just iterate with you there and merge, rather than cherry picking manually (and potentially losing track of things).

Would that work for you?

effigies · 2015-09-02T20:33:06Z

Sure, that works. Was assuming we'd want to keep conversation in one place, but I'm fine with anything.

bcipolli · 2015-09-04T18:51:37Z

OK, got some updates from @effigies ; I think this is in a relatively good state now. With this code, I believe getting GIFTI into the load/save pipeline (see #316) should be cleaner.

@effigies anything further you'd like to see happen here?

effigies · 2015-09-04T19:14:52Z

I'm pretty happy with this. To reiterate this issue with is_header, though, the name suggests that it will return True if and only if the binary string will successfully load as a header, and we might want to indicate that it's a heuristic.

Also a couple style issues I'll PR on your branch in a minute.

effigies · 2015-09-04T20:34:48Z

nibabel/tests/test_image_types.py

+                none=None,  # pass None as the sniff, should query in fn
+                empty='',  # pass an empty sniff, should query in fn
+                irrelevant='a' * (sizeof_hdr - 1),  # A too-small sniff, query
+                bad_sniff='a' * sizeof_hdr).items():  # Bad sniff, should fail.


Shouldn't these be bytes objects (b'a')? They'll either fail or trigger a new sniff anyway, so maybe not a big deal, but if you want them to fail for the right reasons on Python 3...

Just to note, @effigies added a change for this (and another test issue) that I've merged in.

effigies · 2015-09-04T21:20:19Z

@matthew-brett How do you feel about tests on deprecation warnings? Seem like they might be excessive, but I think that's part of the test coverage decrease.

matthew-brett · 2015-09-04T21:22:52Z

nibabel/analyze.py

@@ -892,6 +892,17 @@ def _chk_pixdims(hdr, fix=False):
            rep.fix_msg = ' and '.join(fmsgs)
        return hdr, rep

+    @classmethod
+    def is_header(klass, binaryblock):


How about may_contain_header ?

effigies · 2015-09-04T22:32:07Z

@matthew-brett Incorporated your suggestions. Let me know if my approach to guessed_image_type doesn't work for you.

Looks like we're back up to positive test coverage, so that's good.

matthew-brett · 2015-09-04T22:38:23Z

nibabel/loadsave.py

@@ -212,43 +203,3 @@ def read_img_data(img, prefer='scaled'):
        if prefer == 'scaled':
            return hdr.data_from_fileobj(fileobj)
        return hdr.raw_data_from_fileobj(fileobj)
-
-
-def which_analyze_type(binaryblock):


Sorry man - I know this is anal - but this one's part of the API too, technically.

effigies · 2015-09-22T19:53:31Z

No problem. I'm mostly writing English right now, so an excuse to procrastinate with code is great.

@matthew-brett Didn't get rid of the ValueError test, mainly because I'd need to figure out how to make an exception for it in this test. Can do it if it's a hang-up.

effigies · 2015-09-30T20:14:35Z

@matthew-brett Just a bump, in case you've got a bit of time to look this over.

matthew-brett · 2015-09-30T23:51:04Z

Yes, sorry, thanks for the bump.

Sorry to be ignorant, but won't the may_contain_header function(s) still raise some sort of predictable error if the byte strings are too short, even if you remove the explicit check.

The image API has now expanded to include path_maybe_image - I believe. It would be very nice to have some tests for the in test_image_api.py.

What do you think about the case where the sniff can't make any sense transferred from on image type to the next? At the moment, this can't happen, but sort of by accident. For example, if we are trying to load a .mnc file, then, because the previous files in the list don't have a .mnc extension as being valid, the file will not get sniffed. But, let's imagine that there are two image types with extension .foo., Image1 and Image2. Image1 has a .bar header, which, according to the Image1.path_maybe_image method, gets sniffed. But, Image1.path_maybe_image correctly rejects the sniff as not being valid for Image1. On we go to Image2. This doesn't have a .bar header, but we still pass in the sniff from the .bar header, which Image2 rejects. But we should not have passed in the sniff from the .bar header, because Image2 does not have .bar headers - we should have started again, and this time sniffed the .foo file.

Admittedly, this can't easily happen, but it does seem like an unstated assumption in the code, that it can't happen.

I was thinking of something like this, to make that more explicit

class InvalidateSniff(object):
    " Signals that previous sniff is no longer valid "


IMG_CLASS_LOAD_SEQ = [Nifti1Image, AnalyzeImage, InvalidateSniff, MINC1]
IMG_CLASS_SAVE_SEQ = [k for k in IMG_CLASS_LOAD_SEQ if not k is InvalidateSniff]

bcipolli · 2015-10-01T00:10:55Z

@matthew-brett Thanks for thinking so carefully. I understand the issue you're wishing to solve.

I'd prefer a solution that uses the class metadata to invalidate sniffs. Two ways to do this:

Pass sniff as a tuple: sniff=(filename, binaryblob). Then classes check where the sniff is coming from.
We could explicitly track the filename of the sniff in the load function, and automatically invalidate the sniff if it's from the wrong filename.

matthew-brett · 2015-10-01T00:14:03Z

Passing the filename with the sniff seems a bit ugly. How would you keep track of the filename in the load function? I guess you'd have to return it from the path_maybe_image ?

effigies · 2015-10-01T00:37:48Z

Sorry to be ignorant, but won't the may_contain_header function(s) still raise some sort of predictable error if the byte strings are too short, even if you remove the explicit check.

No. It will return False because slicing beyond the end of a bytes simply returns something smaller than the slice.

>>> b''[:4]
b''
>>> x = b''
>>> x[:4] == b'ABCD'
False

I'll think a bit about your other concern as I can.

matthew-brett · 2015-10-01T01:03:47Z

Ah - yes - but isn't False the right answer in this case?

effigies · 2015-10-01T01:10:21Z

It is. I think it was a desire for consistent behavior across image classes that led here. There's a commit in the stack that removed the .mnc exception in testing for ValueErrors, so re-adding it won't be a huge deal. Let me know if that's what you think should happen.

As to your other concern, I apparently have a couple thoughts already:

If passing the path back and forth is too ugly, what about switching to (sniff, ext), and checking whether ext is a valid header extension? If so, use the sniff, if not, drop it and replace:

is_valid, sniff, ext = image_klass.path_maybe_image(filename, sniff, ext)

On the off chance you have the sequence checks-foo, checks-bar, checks-foo, sniff could be a {ext: binaryblock} dictionary, so that sniffs aren't thrown away until they're known not to be needed. And we can add back path into this and make sniff be {path: binaryblock}.

I believe it's already pretty much strongly assumed that a header and image file are assumed to have the same basename, so the ext options seem safe until it becomes necessary to modify that assumption throughout the code.

matthew-brett · 2015-10-05T21:53:23Z

Sorry - keep on getting swamped in nipy stuff.

Thinking about it, maybe the (sniff, filename) tuple isn't too bad. I guess then you'd have to check whether the filename is the correct metadata filename for each image type, but that seems fine. So it's just redfining a sniff as as (bytes, filename) pair, which seems OK.

I have a vague preference for not raising errors in the maybe_header checks, but it's vague, so - your call.

I'll be more attentive here now - but please feel free to remind me at short (one day) intervals if I'm slacking.

effigies · 2015-10-06T16:16:19Z

I don't have a strong opinion here, but I would say if we're going to get rid of the exception in Minc1, then checks like this should return False as well, for consistency.

matthew-brett · 2015-10-06T16:23:09Z

Agree about the consistency of checks.

Currently doesn't occur, but possible that different images with a common extension could have different header extensions. This change checks that the header file name is the same before reusing a sniff.

effigies · 2015-10-06T16:41:41Z

Okay, just updated the doc string and killed the exceptions.

matthew-brett · 2015-10-06T19:16:35Z

I think this is looking good now.

Last thing - would you mind adding the tests for the expanded Image API in test_image_api.py? Specifically, I think the image API has extended to include the path_maybe_image class method.

I realize that test function is a bit hairy, so let me know if I can help - I should have time today and tomorrow.

effigies · 2015-10-06T20:03:19Z

How about adding something like the following to LoadImageAPI?

def validate_path_maybe_image(self, imaker, params):
    if hasattr(imaker, 'path_maybe_image'):
        for img_params in self.example_images:
            test, sniff = imaker.path_maybe_image(img_params['fname'])[0]
            assert test == True
            if sniff is not None:
                assert isinstance(sniff[0], bytes)
                assert isinstance(sniff[1], basestring)

matthew-brett · 2015-10-06T22:23:45Z

Aren't we insisting that every image class have path_maybe_image?

Should be assert_true rather than assert, and string_types rather than basestring but otherwise, that looks about right.

effigies · 2015-10-06T22:45:00Z

The PAR/REC test didn't have an image_maker attribute, so I added a separate attribute klass, which for the most part doubles up on image_maker. I didn't want to assert things to be makeable to test this method.

Not all tests have image_maker attributes, so add klass attribute to test path_maybe_image. Checks structure of return values on calls to example images.

matthew-brett · 2015-10-06T22:52:17Z

Ah yes, good point.

I think the PR is ready, if the tests pass. Ben - any further comments before I merge?

bcipolli · 2015-10-06T22:56:02Z

👍

MRG: Migrate load, save, class_map, ext_map to class properties and methods This is to address #317 and #323 (and aims to supercede PR #319), both issues around image-specific code being distributed across files, rather than localized in the class definition. Changes here include: * Removing image-specific code from `nib.load` and `nib.save` functions, moving the logic into `ImageKlass.is_image` and `HeaderKlass.is_header` class functions. * Deprecating `class_map` and `ext_map` objects, and moving necessary info to properties defined on the image classes themselves.

matthew-brett · 2015-10-06T23:21:10Z

Thanks a lot for your patience - sorry it took so long.

MRG: Migrate load, save, class_map, ext_map to class properties and methods This is to address nipy#317 and nipy#323 (and aims to supercede PR nipy#319), both issues around image-specific code being distributed across files, rather than localized in the class definition. Changes here include: * Removing image-specific code from `nib.load` and `nib.save` functions, moving the logic into `ImageKlass.is_image` and `HeaderKlass.is_header` class functions. * Deprecating `class_map` and `ext_map` objects, and moving necessary info to properties defined on the image classes themselves.

bcipolli reviewed Jul 20, 2015
View reviewed changes

effigies mentioned this pull request Jul 21, 2015

ENH: Add image_like function for SpatialImages #300

Closed

effigies reviewed Aug 4, 2015
View reviewed changes

effigies reviewed Aug 28, 2015
View reviewed changes

bcipolli mentioned this pull request Sep 1, 2015

Simplify the testing infrastructure for checking warnings #343

Closed

bcipolli mentioned this pull request Sep 3, 2015

Simplify interface for testing of warnings. #345

Merged

effigies reviewed Sep 4, 2015
View reviewed changes

matthew-brett reviewed Sep 4, 2015
View reviewed changes

STY: Remove misleading 'sizeof_hdr' from Minc1/2

c3518d1

RF: Tag sniffs with file name

5814435

Currently doesn't occur, but possible that different images with a common extension could have different header extensions. This change checks that the header file name is the same before reusing a sniff.

RF: Fail quietly on too-short header blocks

76440f2

TEST: path_maybe_image part of image API

8a926a4

Not all tests have image_maker attributes, so add klass attribute to test path_maybe_image. Checks structure of return values on calls to example images.

matthew-brett merged commit ec7926a into nipy:master Oct 6, 2015

This was referenced Oct 17, 2015

Push loadsave logic into individual classes, remove logic duplication #317

Closed

Push .imageclass.class_map logic into Image classes #323

Closed

RF: remove imageopener decorator #361

Merged

bcipolli deleted the reload branch October 31, 2015 13:51

WIP: Migrate load, save, class_map, ext_map to class properties and methods #329

WIP: Migrate load, save, class_map, ext_map to class properties and methods #329

Conversation

bcipolli commented Jul 20, 2015

Choose a reason for hiding this comment

effigies commented Aug 4, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matthew-brett commented Aug 12, 2015

effigies commented Aug 12, 2015

Choose a reason for hiding this comment

bcipolli commented Aug 28, 2015

bcipolli commented Sep 1, 2015

effigies commented Sep 2, 2015

bcipolli commented Sep 2, 2015

effigies commented Sep 2, 2015

bcipolli commented Sep 4, 2015

effigies commented Sep 4, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

effigies commented Sep 4, 2015

Choose a reason for hiding this comment

effigies commented Sep 4, 2015

Choose a reason for hiding this comment

effigies commented Sep 22, 2015

effigies commented Sep 30, 2015

matthew-brett commented Sep 30, 2015

bcipolli commented Oct 1, 2015

matthew-brett commented Oct 1, 2015

effigies commented Oct 1, 2015

matthew-brett commented Oct 1, 2015

effigies commented Oct 1, 2015

matthew-brett commented Oct 5, 2015

effigies commented Oct 6, 2015

matthew-brett commented Oct 6, 2015

effigies commented Oct 6, 2015

matthew-brett commented Oct 6, 2015

effigies commented Oct 6, 2015

matthew-brett commented Oct 6, 2015

effigies commented Oct 6, 2015

matthew-brett commented Oct 6, 2015

bcipolli commented Oct 6, 2015

matthew-brett commented Oct 6, 2015