ims load options

Donald Boyce edited this page Jun 9, 2020 · 1 revision

Keyword Options for imageseries

Each type of imageseries has its own keyword options for loading and saving.

Image Files

The format name is image-files.

This file is usually written by hand. Because it is a YAML-based format, the options are given in the file itself and the load function takes no keyword arguments. The file defines a list of image files, which may be either single-image files or multi-image files.

  • YAML keywords:

    • image-files: dictionary defining the image files
      • directory: the directory containing the images
      • files: the list of images, given as a space-separated list of file names or glob patterns
    • empty-frames: (optional) number of frames to skip at the beginning of each multiframe file; this is a commonly used option
    • max-total-frames: (optional) the maximum number of frames in the imageseries; this option can be used to test the processing on a small number of frames
    • max-file-frames: (optional) the maximum number of frames to read per file; this would be unusual (as far as I know --Don)
    • metadata: (required) it can be just about anything, including empty
  • on write: There is actually no write function for this type of imageseries.
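
As a rough illustration of how the three frame-count options could interact, here is a minimal sketch in plain Python. It is not the library's loader: the function name select_frames, the convention that 0 means "no limit", and the order in which the options are applied (skip empty frames first, then cap per file, then cap the total) are all assumptions made for this example.

```python
def select_frames(frames_per_file, empty_frames=0, max_file_frames=0,
                  max_total_frames=0):
    """Return (file_index, frame_index) pairs for the frames that are kept.

    Hypothetical helper: a value of 0 for either maximum is treated as
    "no limit", which is an assumption of this sketch.
    """
    selected = []
    for file_index, nframes in enumerate(frames_per_file):
        # empty-frames: skip the leading frames of every file.
        usable = list(range(empty_frames, nframes))
        # max-file-frames: optionally cap what is read from this file.
        if max_file_frames > 0:
            usable = usable[:max_file_frames]
        for frame_index in usable:
            # max-total-frames: stop once the whole series is long enough.
            if max_total_frames > 0 and len(selected) >= max_total_frames:
                return selected
            selected.append((file_index, frame_index))
    return selected


# Two multiframe files of 5 frames each, one empty frame at the start of each.
frames = select_frames([5, 5], empty_frames=1, max_file_frames=3,
                       max_total_frames=5)
print(frames)
```

With these settings the sketch keeps frames 1 to 3 of the first file and frames 1 and 2 of the second, since the total cap of 5 is reached partway through the second file.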

HDF5

The format name is hdf5.

This format is used at CHESS (Cornell High Energy Synchrotron Source), where raw data from the Dexela detectors comes out in HDF5 format. Dark subtraction and flipping still have to be done on the data.

  • on write:

    • path: (required) path to directory containing data group (data set is named images)
    • shuffle: (default=True) HDF5 write option
    • gzip: (default=1) compression level
    • chunk_rows: (default=all) sets HDF5 chunk size in terms of number of rows in image
  • on open:

    • path: (required) path to directory containing data group (data set is named images)
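
The write options above resemble standard HDF5 dataset settings. The sketch below shows one way they might translate into keyword arguments for h5py's create_dataset (chunks, compression, compression_opts and shuffle are real h5py arguments). The helper name hdf5_dataset_kwargs, the one-frame-per-chunk layout, and the 2048 x 2048 image shape are assumptions for illustration, not a description of what the writer actually does.

```python
def hdf5_dataset_kwargs(image_shape, shuffle=True, gzip=1, chunk_rows=None):
    """Translate the documented write options into h5py dataset keywords.

    Hypothetical helper: assumes one chunk spans `chunk_rows` rows of a
    single frame, and that the default chunk is a whole frame.
    """
    nrows, ncols = image_shape
    if chunk_rows is None:
        chunk_rows = nrows  # default=all: the chunk covers every row
    if not 1 <= chunk_rows <= nrows:
        raise ValueError("chunk_rows must be between 1 and the number of rows")
    if not 0 <= gzip <= 9:
        raise ValueError("gzip compression level must be between 0 and 9")
    return {
        "chunks": (1, chunk_rows, ncols),  # (frames, rows, columns)
        "compression": "gzip",
        "compression_opts": gzip,
        "shuffle": shuffle,
    }


# Illustrative image shape; these keywords could then be passed on, e.g.
#   group.create_dataset("images", shape=(nframes, 2048, 2048),
#                        dtype="uint16", **kwargs)
kwargs = hdf5_dataset_kwargs((2048, 2048), chunk_rows=256)
print(kwargs)
```

Smaller chunks make it cheaper to read part of a frame, while whole-frame chunks suit the common case of reading one complete image at a time.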

Frame Cache

The format name is frame-cache.

A better name might be sparse matrix format, because the images are stored as sparse matrices in a numpy npz file. There are actually two forms of the frame-cache. The original is a YAML-based format: its advantage is that metadata can be added naturally; its disadvantage is that it creates multiple files and needs special handling for numpy arrays (e.g. omega data). The other form is a single .npz file: its advantage is that everything is in one file; its disadvantage is that general metadata is awkward to work with.

  • on write:

    • threshold: (required) this is the main option; all data below the threshold is ignored; be careful, because a threshold that is too small creates huge files; normally, however, the savings in file size are massive, since the images are usually over 99% sparse
    • output_yaml: (default=False) selects the YAML-based form of the output rather than the single .npz file
  • on open: no options
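
To make the effect of the threshold concrete, here is a small pure-Python sketch of thresholded sparse storage. It is not the frame-cache writer: the helper names and the toy frame are made up for the example, and dropping values exactly equal to the threshold is an assumption.

```python
def to_sparse(frame, threshold):
    """Keep only pixels above `threshold` as (rows, cols, values) lists.

    Hypothetical helper; pixels equal to the threshold are dropped here,
    which is an assumption of this sketch.
    """
    rows, cols, vals = [], [], []
    for r, line in enumerate(frame):
        for c, value in enumerate(line):
            if value > threshold:
                rows.append(r)
                cols.append(c)
                vals.append(value)
    return rows, cols, vals


def stored_fraction(frame, threshold):
    """Fraction of pixels that survive the threshold and must be stored."""
    npixels = sum(len(line) for line in frame)
    return len(to_sparse(frame, threshold)[2]) / npixels


# Toy frame: low-level background with two bright pixels.
frame = [
    [0, 1, 0, 2],
    [1, 50, 0, 1],
    [0, 2, 90, 0],
]

print(to_sparse(frame, 10))       # only the two bright pixels remain
print(stored_fraction(frame, 10))
print(stored_fraction(frame, 0))  # a too-small threshold keeps the background
```

In this toy frame, a threshold of 10 stores 2 of 12 pixels, while a threshold of 0 stores 7 of 12. That is the same mechanism by which a too-small threshold inflates real frame-cache files.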