Directory Structure for MDF files #45

tknopp · 2016-09-11T06:49:01Z

As already discussed in #42:

It might make sense to develop a standardized folder structure for MDF files. We have not developed anything in that direction yet, but we should start thinking about it.

Here is my current thinking:

mainfolder/
-- measurements/
---- study1/
------ 000001.mdf
-- systemFunctions/
---- 000001.mdf
-- reconstructions/
---- study1/
------ 000001/       # this is the experiment number
-------- 000001.mdf  # this is the number of the reconstruction

The leading zeros might not be necessary. The study name could also encode the date, as Bruker does.
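
To make the proposed layout concrete, here is a minimal Python sketch that builds the three kinds of paths. The function names and the six-digit zero padding are assumptions for illustration only; nothing here is part of the MDF specification.

```python
from pathlib import Path


def measurement_path(root, study, experiment):
    """<root>/measurements/<study>/<experiment>.mdf"""
    return Path(root) / "measurements" / study / f"{experiment:06d}.mdf"


def system_function_path(root, number):
    """<root>/systemFunctions/<number>.mdf (global, not tied to a study)"""
    return Path(root) / "systemFunctions" / f"{number:06d}.mdf"


def reconstruction_path(root, study, experiment, reconstruction):
    """<root>/reconstructions/<study>/<experiment>/<reconstruction>.mdf"""
    return (Path(root) / "reconstructions" / study
            / f"{experiment:06d}" / f"{reconstruction:06d}.mdf")


print(reconstruction_path("mainfolder", "study1", 1, 1).as_posix())
# prints "mainfolder/reconstructions/study1/000001/000001.mdf"
```

If the leading zeros are dropped, only the format strings change; the folder hierarchy stays the same.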

Ping @MandyA @profix898 @hofmannmartin


tknopp · 2016-09-12T08:38:31Z

We should really do this. I am thinking of exchanging studies between groups: it would be very handy to just drop a study folder into the dataset directory without having to think about how others structure their data.

hofmannmartin · 2016-09-12T09:04:51Z

I like this idea. +1

@tknopp Do you plan to add a paragraph on this issue to the specifications?

MandyA · 2016-09-12T09:34:13Z

Personally, I like the idea. I will ask Anselm whether this would be a good solution for storing MPS study data.

One thing that might be inconvenient: system matrices are very often reused for different studies. With this format we would have to store the matrix several times. With big 3D matrices this might be an issue.

tknopp · 2016-09-12T09:36:39Z

No, system functions go into a dedicated global folder that is not linked to any study. This is, by the way, also one major difference from the way Bruker handles its data.

MandyA · 2016-09-12T09:41:15Z

Ah, ok - I didn't see this at first glance ;)
So in relation to #46 it would make sense to have an optional parameter that includes the information which matrix has been used for reconstruction.

hofmannmartin · 2016-09-12T09:47:58Z

> So in relation to #46 it would make sense to have an optional parameter that includes the information which matrix has been used for reconstruction.

Yes, that would be necessary. Instead of the location, one could also reference other MDF files via their UUID, which has the advantage that files can be renamed without losing the reference.
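
A rough Python sketch of how such UUID-based references could be resolved, assuming each file stores its own UUID. `build_uuid_index`, `resolve`, and the caller-supplied `read_uuid` are hypothetical names; how the UUID is actually read from an MDF file is deliberately left to the caller.

```python
from pathlib import Path


def build_uuid_index(root, read_uuid):
    """Map each file's UUID to its current path by scanning root recursively.

    read_uuid is a caller-supplied (hypothetical) function that returns the
    UUID stored inside one MDF file.
    """
    index = {}
    for path in sorted(Path(root).rglob("*.mdf")):
        index[read_uuid(path)] = path
    return index


def resolve(index, uuid):
    """Return the current location of the file with this UUID, or None."""
    return index.get(uuid)
```

Because the index is rebuilt from the files themselves, a renamed or moved system function is found again after the next scan, whereas a stored path would have gone stale.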

tknopp · 2016-09-12T10:22:04Z

Yes indeed, throughout this entire issue we should keep the UUID in mind.

tknopp · 2016-11-12T17:23:39Z

At the IBI group in HH we have implemented this. Works quite well. We will probably come up with something for the specification.

hofmannmartin · 2017-08-11T15:04:06Z

Should we move this issue to MPIFiles.jl? It is more related to the actual data handling than the specifications.

Neumann-A · 2017-08-25T08:42:55Z

Question: why isn't it:

mainfolder/
-- study1/
---- measurements/
------ 000001.mdf
---- reconstructions/
------ 000001/ # this is the experiment (measurement) number
-------- 000245.mdf # this is the number of the reconstruction (changed the number)
-- systemFunctions/
---- 000126.mdf # (changed the number to make clear it has nothing to do with the above)

My reasoning:
A study is a collection of measurements and reconstructions. Having an extra reconstruction folder in the main folder seems strange, since a reconstruction is nothing without the context of a study.

hofmannmartin · 2017-08-25T08:48:16Z

In our workflow we occasionally reconstruct measurements from different studies, which would be a hassle if reconstructions were assigned to studies.

Neumann-A · 2017-08-25T09:08:36Z

Let me guess: you start your reconstruction comparison script from the /reconstructions/ folder?
If you started it from the main folder, it would just be a reordering of the path string.

> if reconstructions are assigned to studies

They are assigned to different studies in your current layout due to the extra /study folder in reconstructions.
(So you need the study path anyway.)

Instead of having /reconstructions/study/ you will have /study/reconstructions,
which in my opinion is more logical because a study is a collection of measurements and reconstructions.
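
The "reordering of the path string" can be sketched in a few lines of Python. This assumes paths are given relative to the main folder and that only the first two components differ between the two layouts; `swap_layout` is a hypothetical helper, and the global systemFunctions folder is not covered by it.

```python
from pathlib import PurePosixPath


def swap_layout(relative_path):
    """Swap the first two path components.

    reconstructions/<study>/... <-> <study>/reconstructions/...
    The function is its own inverse, so it converts in both directions.
    """
    parts = PurePosixPath(relative_path).parts
    if len(parts) < 2:
        raise ValueError("expected at least two path components")
    first, second, *rest = parts
    return PurePosixPath(second, first, *rest)


print(swap_layout("reconstructions/study1/000001/000245.mdf"))
# prints "study1/reconstructions/000001/000245.mdf"
```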

hofmannmartin · 2017-08-25T09:29:05Z

Currently we have our measurement data and reconstruction data separated on different NAS systems. The NAS at the MPI scanner stores our measurement data, whereas the NAS at our workstation stores the reconstruction data. Your proposal requires one large file system where everything is stored and does not allow splitting off the reconstructed data.

Neumann-A · 2017-08-25T09:38:43Z

> Your proposal requires one large file system.

No. That's not necessary. Storage is an implementation detail; you can still store it anywhere you want. The only thing you have to do is present the data in the proposed directory structure. (It's a virtual structure.)
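
One way such a virtual structure could be set up is with symbolic links, sketched below in Python for a POSIX system. It assumes each store is already organized as `<store>/<study>/`; `present_study` and its parameters are hypothetical names, not part of any existing tool.

```python
import os
from pathlib import Path


def present_study(view_root, study, measurement_store, reconstruction_store):
    """Create <view_root>/<study>/{measurements,reconstructions} as symlinks
    to directories that may live on different storage systems."""
    study_dir = Path(view_root) / study
    study_dir.mkdir(parents=True, exist_ok=True)
    targets = {
        # Resolve to absolute paths so the links do not depend on the
        # directory the script was started from.
        "measurements": Path(measurement_store).resolve() / study,
        "reconstructions": Path(reconstruction_store).resolve() / study,
    }
    for name, target in targets.items():
        link = study_dir / name
        if not link.is_symlink() and not link.exists():
            os.symlink(target, link, target_is_directory=True)
    return study_dir
```

The data stays where it is; only the links live in the main folder, so the read-only measurement store is never written to.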

hofmannmartin · 2017-08-25T09:49:56Z

> No. That's not necessary. Storage is an implementation detail; you can still store it anywhere you want. The only thing you have to do is present the data in the proposed directory structure. (It's a virtual structure.)

That is true, but the proposal of @tknopp requires no mapping at all and can be written directly to a file system, which should be feasible regardless of whether someone is a programming expert or not.

In this case I would vote for simplicity.

Neumann-A · 2017-08-25T10:07:04Z

You just need to mount your filesystem correctly beforehand, which is just a configuration step. (It has nothing to do with being a programming expert.)

Then you can just as easily write to the filesystem.

> @tknopp requires no mapping

It also requires mapping if you have too many studies to store on one filesystem... You will sooner or later run into that issue. Possible solutions: archive old studies somewhere else by moving a lot of data, or add more HDDs or a new NAS. For the latter solution you will most likely need a mapping anyway. Currently you are just delaying the issue ;)

tknopp · 2017-08-25T10:53:04Z

The reason is much simpler: we use two stores. One store is the one from Bruker, the second is the one from MDF. The first is a pure measurement store (read only!!!), the second is the reconstruction store.

I can see that both systems are isomorphic but have different advantages. Bruker does it in a similar fashion to what @NeumannIMT proposes. They even put the reconstruction as a subfolder of the experiment (which also makes sense).

AvGladiss · 2017-08-25T21:01:50Z

I do not have a smart idea about storing the data, but I have a question about it: A system function is a simple measurement in the first place. Will one file in /systemFunctions be a post-processed version of one /measurements/study/file ?

Furthermore, a system function may reconstruct another system function by handling the different spatial positions as frames (for test purposes). This should be kept in mind when designing a directory structure (then, /reconstruction would need a subfolder /systemFunction ?).

hofmannmartin · 2017-08-25T21:17:15Z

> I do not have a smart idea about storing the data, but I have a question about it: A system function is a simple measurement in the first place. Will one file in /systemFunctions be a post-processed version of one /measurements/study/file ?

That should depend on what you want to do with it. If you perform a calibration measurement then it should always be stored in //systemFunctions/

> Furthermore, a system function may reconstruct another system function by handling the different spatial positions as frames (for test purposes). This should be kept in mind when designing a directory structure (then, /reconstruction would need a subfolder /systemFunction ?).

I don't see a problem here. The directory structure merely provides standard locations for your stuff. Your personal reconstruction framework working on that structure might then do whatever it wants. The process of reconstruction is not part of MDF, but up to the user.

Neumann-A · 2017-08-25T21:20:27Z

Nice catch @AvGladiss. You seem to have the more general view on this.

Translating this into a file structure:
mainfolder/
-- study1/
---- measurements/
------ 000001.mdf
---- processed/ (renamed from reconstruction)
------ 000001/ # this is the experiment (measurement) number
-------- 000245.mdf # this is the number of the processing (system matrix)
-------- 000123.mdf # this is the number of the processing (reconstructed image)

(-- systemFunctions/)
(---- Link to 000245.mdf)

tknopp · 2017-08-25T21:45:36Z

(Side note: our actual directory structure uses the name "calibration" instead of "systemFunction".)

tknopp · 2017-08-25T21:58:43Z

We have had very bad experiences with mixing calibration measurements and regular measurements. In our opinion it does not make sense for a calibration to belong to a study. Therefore we went with a "flat" structure for calibration scans. It has the advantage that all calibration scans are directly available without the need to search through a deep directory structure.

I can understand Anselm's use case. In that case "reconstruction/calibration" could be a good storage location. One could also move "calibration" into the "measurement" folder, in which case "reconstruction/calibration" would actually be no workaround.

tknopp mentioned this issue Sep 12, 2016

Store Reconstruction Parameters #46

Open

hofmannmartin mentioned this issue Sep 12, 2016

Request of extending specific parameters for use with patches #42

Closed

hofmannmartin added enhancement feature request labels May 18, 2017

Neumann-A mentioned this issue Aug 15, 2017

time stamp for reconstructed data #72

Open

hofmannmartin added MDF specifications post v2.0 and removed enhancement labels Aug 22, 2017

Neumann-A mentioned this issue Aug 25, 2017

Reconstructed images: Linkage between measurement and reconstruction (and calibration)? #92

Open
