Skip to content
This repository has been archived by the owner on Mar 24, 2021. It is now read-only.

Efield object #10

Merged
merged 34 commits into from
Jan 15, 2019
Merged

Efield object #10

merged 34 commits into from
Jan 15, 2019

Conversation

christophwelling
Copy link
Collaborator

Adds electricField class to store electric fields. They can be accessed by station.get_electric_fields and station.get_electric_fields_for_channels. Each electricField object stores a list of IDs of the channels it was reconstructed from, or for which it was simulated as well as it ray_path_type (for simulated E-fields).
The individual modules can mostly be used as before, with the following changes:

  • Station traces and sim_station channels should not be used anymore to store electric fields.
  • stationResampler, stationBandpassFilter and stationSignalReconstructor have been renamed to electricFieldResampler, electricFieldBandpassFilter and electricFieldSignalReconstructor.
  • efieldToVoltageConverter has been removed. Instead, the efieldToVoltageConverterPerChannel has been renamed to efieldToVoltageConverter and can now also handle cosmic rays.
  • The cosmic_ray_mode parameter has been removed from several modules. The station's is_cosmic_ray() flag is used instead.

…nt structure.

-There is now only one module for this job. It is the equivalent to the old efieldToVoltageConverterPerChannel, with the added capability to handle CR events
-The electric field class now also stores signal direction
…osmic_ray_mode parameter. Cosmic ray flag in station is used instead
Copy link
Contributor

@cg-laser cg-laser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the changes! I went through all of them and found a few things.
We should never reassign a enum number (same as in Offline). So in parameter.py, the parameters that stay in the file, should not be changed.
Also the file version needs to be bumped up, maybe even the major file version as this change is not backward compatible. If we anyway change that, we can also implement the change of the filename extension to 'nur'

In readCoreas.py: don't hand over the detector in the begin method. It is anyway unsave to buffer the number of channels as this is a time dependent quantity that might change with a different coreas event time (although currently never used). Instead, the give the detector as argument to the run method and request the number of channels for each event.
I think we should even hand over a list of the station ids to the make_sim_station function because channels might not named increasingly. For some reason, channel 2 could be missing as an example. Does the detector already has a function to return the ids of the channels of a station?

@cg-laser
Copy link
Contributor

For documentation: A discussion that we had via email before starting this pull request:

Hi,

I have added a zenith and azimuth parameter to the E-field parameter storage for now, so E_theta and E_phi are well-defined. As I wrote previously, if we want to we can later overwrite the set_trace and get_trace functions. I agree that a coordinate system tool would be very useful, but I would like to finish the changes to the electric field first. Messing around with that is already disruptive enough, so I would not want to make it more complicated by changing the coordinate system at the same time. We can then work on the coordinate system once we are reasonably sure that the new E-field works fine.

About not saving more than we need, I want to point out that right now, the electric field has an E_r component. It is of course 0, but it exists and takes up storage. Should we get rid of it? Unfortunately, I don't think the cs transformations in the radiotools can handle that right now, so maybe wait until coordinate system tools are implemented in NuRadioReco?

As for what these tools should look like, how about some vector class like it is done in Offline? That seemed to work pretty well (at least for me).

I also came across another detail that I am not quite happy with. If we assiciate E-fields with channels, we imply that these are what the field (and the signal direction) is at the antenna position, but this is not true for cosmic rays because of refraction at the air/ice boundary. It is not a big problem, but I can see how this could easily confuse people, so if anyone has an idea about how to clear this up, please tell me.

Cheers,

Christoph

On 08.01.19 10:32, Anna Nelles wrote:

Hi all,

I agree with Christian that we don't want to save more than is needed, but I also agree that we need to do something about coordinate systems and arrival directions.
I already started this #2 a while ago, because I was really confused, whether we store the arrival or the propagation direction and what coordinate system was going on.

I think we all agree that we need convenience getter and setter functions to help us around, but somewhere in the back we need a solid definition of basic units.

A couple of things to consider:

  1. Right now, we are implicitly assuming some cartesian local coordinate system in which everything is relative to a station center. From then we only store everything in the corresponding spherical coordinates. This won't be enough in the future anyway. Especially for multiple station coincidences, we will need some more global coordinate system. Think air shower detected in station A at the surface and in station B deep down. No one wants to do those geometry gymnastics in their head.

  2. There will be time dependent detector positions. The ice at SP has shear and flows, so everything will move around.

  3. Our coordinate system needs to convert nicely between whatever we choose locally and the let's say galactic coordinates. They are all interchangeable for a small detector, but not, if we end up having a portion at South Pole and one in Greenland. (If the government shutdown continues for years, this options becomes more and more realistic).

So I think we need to store the electric field and its arrival direction for every signal. I don't actually care what the default coordinate system is (I see natural vs. degeneracy at theta = 0), but we need a coordinate system tool that takes care of all the gymnastics in the background.

Does anyone have a good idea for this?

Anna

On 1/7/19 17:10, Christian Glaser wrote:

Hi Christoph,

what about just saving the direction of the wave in the efield class?

The thing about refraction is that nothing needs to be changed if the efield is represented in eTheta and ePhi, the two component stay exactly the same. The only thing that changes is the signal direction.

Another downside of storing xyz is that it stores redundant information as eR is always zero.

eTheta and ePhi is also what comes out of the efield reconstruction. Therefore, I'd say it is the more natural representation. Storing xyz requires an additional coordinate transformation after the reconstruction.

Christian

Am 07.01.19 um 16:21 schrieb Christoph Welling:

Hi all,

Daniel and I had a little chat about the E-field class and he suggested that we should store the electric field trace in cartesian coordinates but write the getter and setter functions so that they accept the coordinate system as an additional parameter and act accordingly. This was not really possible the way we stored electric fields before, but with a dedicated electric field class it is and I think it is a very good idea for several reasons:

  1. It's more convenient. We don't have to do coordinate transformations by hand but can just let the E-field class take care of it.

  2. We avoid the degeneracy at theta=0.

  3. Because of things like refraction at the air/ice boundary or during propagation it is very easy to get confused by which signal direction exactly to use. This implementation would avoid this confusion and is probably less error-prone.

The only downside I can think of right now is that it would create some overhead because of all the coordinate transformations, but I have no idea how big of a problem that would be.

So, what do you think? Right now the E-field class inherits from base_trace, so everything works as before. But later we might want to overwrite this. If we implement spherical coordinates as the default option, we should not even have to change anything in the modules.

Cheers

Christoph

On 05.01.19 10:43, Christian Glaser wrote:

Hi Christoph,

Am 04.01.19 um 16:14 schrieb Christoph Welling:

Hi all and happy new year,

for the parameter storage I think we should define a separate E-field parameter class. There are some parameters (like energy fluence and polarization) that only make sense in the context of an E-field while others (e.g. neutrino flavor) make no sense at all. I just realized that the way it is implemented right now is pretty weird, since those properties are part of the stationParameters (because that is where the E-field for cosmic rays is stored) and will be unusable when we move to neutrinos. One more reason for an E-field class I guess
good point

I would like to avoid differences between stations and sim_stations if possible. I think the different ray paths should be implemented as independent E-field objects that each have a parameter like the ray_path_type that is now in the channelParameters. This way the E-fields in stations and sim_stations look the same and modules to do things like resampling and filtering can treat them the same. I find it somewhat cumbersome to always have to implement these things for stations and sim_stations separately. Since this means that the number of E-fields can vary (I guess it would also vary in normal stations, depending on how well we can reconstruct the event) a list seems like the best way to store E-field objects. Do you have another way in mind? Of course, something like an iter_efields_of_channel function would be necessary.
sounds good to me. A list seems to be the best option with additional convenience functions.

If there are no serious concerns, I am just going to create a new branch and start implementing this.

yes please go ahead. When you start the pull request, can you add (the relevant part of) this email exchange to the discussion? Then it is well documented.

Cheers,

Christian

Cheers,

Christoph

On 04.01.19 14:55, Christian Glaser wrote:

Hi Anna, hi all,

I like the idea of a efield class. It will be very similar to the channel class and also inherit from the base_trace class. We would need to define a new parameter class for efield attributes though. Or we inherit the efield class from channel(?).

Especially the use case of the efield reconstruction will be useful, where we want to reconstruct the efield for different sets of channels and potentially with different methods.

The station class will then have a list of 'efield objects'.

But I'm not sure what the best implementation of the simStation is. Just a list of efields is cumbersome for the efieldtovoltage converter. Right now we have a function 'simstation.get_channel(1)' which returns all efields of channel 1. But we could probably implement a convenience function

def iter_efields_of_channel(channel_id):

for efield in self.__efields:

     if(channel_id in efield.get_participating_channels()):

           yield efield

This should do the job without any significant overhead.

Regarding the reuse of modules: Right now we already have a distinction between channel and station modules. This is inherited from Offline where channel and station was a strict distinction between voltage and efield traces. However, for NuRadioMC this is not true anymore which lead to these problems. Instead we should differentiate between voltage and efield modules. As the interface for efields will be the same for station and SimStation this should work fine. This is even an improvement and makes things easier than we have right now (I think). So thanks for thinking this through.

Why don't you or Christoph start a pull request, then we can add these emails as discussion and start with the implementation of the efield class. Not so much to do there, be we need to discuss and agree on the interfaces before we start modifying modules.

Cheers,

Christian

Am 04.01.19 um 13:29 schrieb Anna Nelles:

Hi all,

Christoph and I just discussed about a larger reorganizing of NuRadioReco, which I would like to hear your thoughts about.

Currently, you can ask the station for get_trace(), which returns the electric field of the station, a 3d array. Going forward, this only makes sense for cosmic rays, because for neutrinos there will be no such thing as one efield trace associated with one station. There might still be one at the surface, but as soon as we move deeper down, with V-pols and H-pols, we will need to combine different sets of antennas to reconstruct different electric fields. So, likely a station will soon have more than one electric field, which means that the get_trace() function will return something unclear for the whole station. Or a list, which needs additional attributes etc.

In addition, it has been causing some confusion that the sim_station has the electric fields on channel level. We don't really distinguish nicely between the channel, as in what is read-out at a station at physical channel level and an electric field at the position of a channel. Both are currently a channel. We all now know that the efield is set at channel level for the sim station, but together with the neutrino case, this won't be so clean in the future.

Christoph and I therefore thought about adding an efield class. It will store the electric field from simulations and also for a reconstruction the channels that have been used to reconstruct it. For example for the current ARIANNA station, the sim_station will only have one electric field for cosmic rays and one for the neutrino case. Adding a deeper component like it is envisioned for future cases, will have the station have three efields, rather than channels.

This of course means that sim_stations won't have channels typically, which means that a whole lot of modules won't work on them. However, we are already special casing in the channelResampler, so I am not sure that this is killing argument. In my view, it won't necessarily make things more elegant, but clearer for future users of the software.

I know that this is a major rewrite, but it might reduce the confusion going forward, since station.get_trace() is just not uniquely defined. One would then also account for the fact that an efield reconstruction might use more or fewer channels, for example reconstructing the efield for CRs with the current stations with LPDAs only and once using the dipoles, so one could have two reconstructed efields to compare.

Since this is such a big strategic decision, I wanted to hear your thoughts first, before we continue to starting a branch and suggesting an implementation.

So, what do you think? Christoph may add here from our discussion.

Anna

PS: It all started with the fact that station.get_sampling_rate() has been removed explicitly from the station (one can envision a station having channels with different sampling rates), but that it is implicitly still available since station inherits from trace. This is fine, but goes against what drove the decision to remove it in the first place.

PPS: Happy New Year, if we haven't talked yet

@cg-laser
Copy link
Contributor

don't we want to remove the channels from sim_station? Or is it needed for anything?

@anelles
Copy link
Collaborator

anelles commented Jan 14, 2019

I like the suggestion to use this opportunity to rename the default to .nur files. This will make it transparent, that we are breaking backwards compatibility with this change.

@christophwelling
Copy link
Collaborator Author

Update:

  • parameter indices reset to what they were originally
  • CoREASreader gets detector in the run() method. Channel-IDs are taken explicitly from the detector description.
  • SimStation channels removed. They are indeed not needed for anything anymore (I just forgot to remove them because I assumed they were inherited from BaseStation)
  • While I was on it I also removed Station.get_channels(). That one had a deprecation warning, so now is a good time to follow through with that.
  • About file endings: I think currently the only place in NuRadioReco where this is relevant is in the Eventbrowser, which will only look for .ari files (I now changed that to .nur). The eventWriter doesn't enforce any file endings. We may want to change this and just add a .nur to the filename if it is missing. This way we could also get rid of the eventWriter's annoying habit of producing files with names ending in things like .ari_part02
  • We talked about not wanting to store the E_r component of electric fields. I would suggest to overwrite the get and set functions for the traces and spectra. set_trace would check if the passed array has 2 or three components and only save E_theta and E_phi. The getter function would add the E_r component again, unless told not to do so (something like get_trace(n_dim=3), so it can be used the same way as before)
  • Increasing the version number sounds good. Is that done using the VERSION variable in modules/io/ARIANNAio.py?

@anelles
Copy link
Collaborator

anelles commented Jan 14, 2019

I am all in on making the EventWriter obey to .nur. And at least have the EventReader throw a warning, if an non .nur file is read in. Any thoughts, @cg-laser Christian?

@cg-laser
Copy link
Contributor

I think not saving E_r is not so critical but if it can be done transparently that the missing dimension is added on the fly in any getter function, I'm ok with it. We could even just do it in the (de)serialization function to save disk space. As we anyway don't consume a lot of memory, we could always have the three dimensions in memory.

how would you implement splitting files up into chunks without adding something like "part_02" at the end of the filename? The ability to split up the datafile into chunks of e.g. 1GB is a very useful feature.

And yes you need to change it in "modules/io/ARIANNAio.py".

@cg-laser
Copy link
Contributor

and fine for me to enforce the *.nur file ending in the eventWriter. Then the user only needs to specify the first part of the filename but not the extension. @christophwelling how do you suggest to split up events into multiple files.

@christophwelling
Copy link
Collaborator Author

I think adding _part02 to the filename is fine, it just looks weird that it is part of the file extension. I think "filename_part02.ari" makes more sense than "filename.ari_part02". Other than that I think it's working fine.

@christophwelling
Copy link
Collaborator Author

Update:

  • File/software version set to 2.0
  • eventWriter adds .nur to file name ending (unless the file already ends with .nur). If the user tries to save a file as .ari, it is changed to .ari.nur and a warning message asks the user to please use .nur next time

@cg-laser cg-laser merged commit 43ed653 into master Jan 15, 2019
@cg-laser
Copy link
Contributor

I leave this pull request open for a while in case we still need to add something.

@christophwelling
Copy link
Collaborator Author

We definitely need to fix the efieldToVoltageConverter. The way cable delays are handled is inconsistent and it does not work correctly for cosmic ray events because it does not take signal delay because of different travel times into account.

@cg-laser
Copy link
Contributor

this brings up actually another general problem: the signal time of the efield. For neutrinos, each efield has a start time that corresponds to the travel time from the vertex to the particular channel.
For cosmic rays we just have one efield for all channels and assume time zero is at the station center...
Given our new efield structure the reference point of the time is not properly defined. If we have only one corresponding channel, it is this channel, but what if we have several? We could add a 'reference point' property to the efield class which stores the position (xyz) relative to the station center. But for the cases with only one channel this will be cumbersome but doable.

Then the efieldToVoltageConverter can calculate additional delays from the displacement of the channel to the efield reference point if necessary.
Line 96 is currently t0 = electric_field.get_trace_start_time() + cab_delay and would need to be replaced by
t0 = electric_field.get_trace_start_time() + cab_delay + displacement_delay
and then again in 147

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants