[SCHEMA] Should 'measurement_units' be optional or required? #200

Open
stephenholleran opened this issue Dec 1, 2022 · 5 comments
Labels
question Further information is requested

Comments

@stephenholleran
Collaborator

Following on from issue #193, it emerged that the inconsistency in the measurement_units property under logger_measurement_config may stem from our original intent: we meant it to be 'required', since we don't allow 'null', yet it is listed as 'optional'. The question is, should it be 'optional' or 'required'?
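To make the inconsistency concrete, here is a minimal sketch in plain Python. The `SCHEMA` dict and the `check_config` helper are hypothetical, simplified stand-ins and not the actual schema file; they only assume that the property is typed as a string (no null) and is not listed as required.

```python
# Hypothetical, simplified stand-in for the logger_measurement_config rules;
# this is not the actual schema file.
SCHEMA = {
    "properties": {"measurement_units": {"type": "string"}},  # no "null" allowed
    "required": [],  # measurement_units is currently not listed here
}


def check_config(config, schema):
    """Return a list of problems for one config dict under the simplified schema."""
    problems = []
    for name in schema["required"]:
        if name not in config:
            problems.append(f"{name}: missing but required")
    for name, rule in schema["properties"].items():
        if name in config and rule["type"] == "string" and not isinstance(config[name], str):
            problems.append(f"{name}: must be a string, got {config[name]!r}")
    return problems


optional_schema = SCHEMA
required_schema = {**SCHEMA, "required": ["measurement_units"]}

print(check_config({}, optional_schema))  # an absent key is accepted while optional
print(check_config({"measurement_units": None}, optional_schema))  # null is rejected
print(check_config({}, required_schema))  # an absent key is rejected once required
```

Under these assumptions, a config can omit the units entirely but cannot state them as null, which is the mismatch in question.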

One to discuss in the New Year.

cc @abohara @kersting

@abohara
Collaborator

abohara commented Jan 6, 2023

@stephenholleran There are two ways I see it:

  1. If this is being used internally, most fields (including units) will have to be optional, because a piece of information may not be available at the moment but become available later.
  2. If it is being used for "exchange" between organizations, then units (and many other fields) should be required, based on what is minimally needed to do a resource assessment. For example, we need to know the height, but some property of the lattice structure may be less critical and hence optional.

In my view, what should be required or optional may need to be driven by the second use case. The internal usage policy can be set by each team, and they can modify the schema accordingly. Establishing a common understanding of the minimum data needed for a WRA would be another real "value" that the data model could bring about through what we mark as "required" / "optional".

@kersting
Collaborator

kersting commented Jan 9, 2023

@abohara and @stephenholleran, for me we need to think about the logger models that we have out there. Is there a logger on which we can set up a channel but not define the unit? If so, then we should make it optional. I think that the Campbell Scientific loggers allow one to set up a channel without units, but I'm not totally sure.

The other case that we may want to discuss is a column marked with is_ignored. In that case the units should not be required.

@abohara I'm not sure we need to make the distinction between internal and external, because what is typed into the logger is independent of the official documentation, in the same way that the slope and offset typed into the logger may not reflect the actual sensor slope and offset.
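The is_ignored exception could be sketched as a conditional check like the one below. This is only an illustration: the flat dict layout and the `units_problem` helper are hypothetical, and it assumes is_ignored defaults to false when absent.

```python
def units_problem(column):
    """Return an error message if units are missing on a column that is in use."""
    # Assumption: is_ignored defaults to False when the key is absent.
    if column.get("is_ignored", False):
        return None  # ignored columns are exempt from the units requirement
    if not column.get("measurement_units"):
        return "measurement_units is required unless is_ignored is true"
    return None


print(units_problem({"is_ignored": True}))            # exempt, no problem reported
print(units_problem({"measurement_units": "m/s"}))    # units present, no problem
print(units_problem({}))                              # in use but no units
```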

@stephenholleran
Collaborator Author

Hi both,

I think that the data model is definitely for "exchange" of data between organizations, and so I agree that the data model should mostly determine what is minimal for a wind resource assessment. It can't do this fully, as a sensor may not have a logger_offset (a pyranometer, for example). But we should do this as much as possible. I think we were thinking that way when developing it.

In terms of what the logger can send or not send, what we currently have, with it optional, satisfies that. However, I am not sure I've ever seen a data file without units, even from Campbell Scientific. Seeing as the units are a result of the slope and offset that are typed in, you could say that we should force the logger OEMs to always provide units. But we have no influence there.

Thinking about the logger more, even if the units are there they could be typed in incorrectly. E.g. incorrect units "deg_F" instead of "deg_C" or misspelled units "m/d" instead of "m/s".
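A small sketch of why a vocabulary check only solves half of this. The `ALLOWED_UNITS` set is a hypothetical controlled vocabulary for illustration; the real list of permitted units is whatever the schema defines.

```python
# Hypothetical controlled vocabulary; the real list of allowed units is
# whatever the schema defines.
ALLOWED_UNITS = {"m/s", "deg_C", "deg_F", "K", "deg", "mbar"}


def is_known_unit(unit):
    """True if the unit string is one of the recognised values."""
    return unit in ALLOWED_UNITS


print(is_known_unit("m/d"))    # False: the misspelling is caught
print(is_known_unit("deg_F"))  # True: a valid but wrong unit is not caught
```

So a misspelled unit can be rejected mechanically, but "deg_F" typed in place of "deg_C" passes any such check.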

Summarising below with my answers.

  1. Should measurement units be required for an assessment? Yes.
  2. Should measurement units be required from a logger? Not really.
  3. Can the measurement units be typed in incorrectly into the logger? Possibly.
    1. If so, then how do we know what the correct units are? We don't.

To satisfy all of that, I think we need the correct measurement units stored somewhere else, most likely at the measurement point level, and made required. That way a logger may or may not provide the units. If the logger does provide them, they can be incorrect and that is ok. For an assessment, the correct and required units are with the measurement point.

This will work, but I am not sure whether we should bother to implement it. What is the likelihood of the logger not providing the units, or providing incorrect units? I must say I've never come across it, but that doesn't mean it doesn't or can't happen. We would also want to think about the consequences of adding a field to the measurement point. The easy thing to do for now is to leave it as optional and, if the need arises, implement this solution.
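As a rough sketch of that idea, assuming a simplified structure: the class names, the `name` field and the helper functions below are hypothetical and do not reflect the current data model. The only point illustrated is that the measurement point carries required units while the logger's units stay optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggerConfig:
    # Whatever the logger reported, if anything; may be absent or wrong.
    measurement_units: Optional[str] = None


@dataclass
class MeasurementPoint:
    name: str
    measurement_units: str  # required; the units to use for the assessment
    logger_config: LoggerConfig


def units_for_assessment(point):
    """Always use the measurement point's units; the logger's are informational."""
    return point.measurement_units


def logger_units_disagree(point):
    """True when the logger reported units that differ from the measurement point's."""
    reported = point.logger_config.measurement_units
    return reported is not None and reported != point.measurement_units


point = MeasurementPoint("Spd_80m", "m/s", LoggerConfig(measurement_units="m/d"))
print(units_for_assessment(point), logger_units_disagree(point))
```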

@abohara
Collaborator

abohara commented Jan 20, 2023

@stephenholleran Thanks for your explanation.

One clarification in this context: by exchange between "organizations", what I had in mind is a developer (who commissioned the campaign) sending the data package to a consultant or another downstream entity to kick-start the financing, acquisition or other steps in the development cycle. Not necessarily the data coming out of a logger into the measurement campaign team.

So, while the logger OEM may not automatically report units correctly, and manual data entry errors are possible during logger deployment, the responsible entity (e.g. the developer or campaign monitoring team) needs to set the correct information in the data model to reflect the ground truth. In this case that means, for example, editing the logger_sensor_config to reflect the actual units used in the logger. This might mean putting '0' in the logger offset for a pyranometer (to make the assumption explicit), or correcting 'Celsius' to 'deg_C' (to make it consistent with the data model). Otherwise, we miss out on the benefits of the standardization that the data model provides. Also, moving units to the measurement point table would negate the purpose of the logger sensor config table and possibly create more misunderstandings (besides forcing a new measurement point to be created because of a unit change), with limited upside (the data entry error is recorded for posterity).

All of this is to say that if something is required for WRA, then we keep it as required.

@stephenholleran, @kersting thoughts?

@kersting
Collaborator

@stephenholleran and @abohara

@abohara thanks for the clarification on external use and exchange between organizations.

@stephenholleran mentioned:

  3. Can the measurement units be typed in incorrectly into the logger? Possibly.
    1. If so, then how do we know what the correct units are? We don't.

I'm not sure I agree with statement 3.1. For example, if the units for a given anemometer are typed into the logger as m/h, the analyst can compare the data with neighbouring channels to find that it was in fact m/s. If a mast has a single temperature sensor whose units show C but whose data, for whatever reason, is in K, one can use ERA5 data to infer the units.

If one is completing the data model, then I would assume that someone skilled in resource assessment has taken the time to find out the actual unit of the data columns where it is not clear. Even in the catastrophic case of wrong units, slope, and offset, the analyst dealing with the data model would set the column as is_ignored and post a comment that the channel is not usable because of problems with units, slope, and offset. I think that the analyst handling the data model will complete the metadata so that the data can actually be used, including being exchanged between organizations.

I agree with @stephenholleran that the right unit must be reported somewhere, and the measurement point level makes sense to me. Should we make the effort to implement this now? Perhaps we hold off on making a modification until someone comes up with an example of units with typos. Meanwhile, we could make the units required to allow an easy exchange of information between organizations (excluding the case of units for an ignored column).
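A minimal sketch of that kind of inference for the temperature case. The `infer_temperature_unit` function and the candidate list are hypothetical, and the reference mean is assumed to be a value in deg_C that the analyst has already extracted from a reference source such as ERA5; no reanalysis API is called here.

```python
from statistics import mean

# Converters from each candidate unit to deg_C.
TO_DEG_C = {
    "deg_C": lambda x: x,
    "K": lambda x: x - 273.15,
    "deg_F": lambda x: (x - 32.0) * 5.0 / 9.0,
}


def infer_temperature_unit(values, reference_mean_deg_c):
    """Pick the candidate unit whose converted mean lies closest to the reference."""
    observed = mean(values)
    return min(
        TO_DEG_C,
        key=lambda unit: abs(TO_DEG_C[unit](observed) - reference_mean_deg_c),
    )


# Logger says "C", but the readings only match the reference if treated as K.
print(infer_temperature_unit([283.15, 285.15], reference_mean_deg_c=11.0))
```

Comparing long-term means is crude; it is only meant to show that the stated unit can be cross-checked against independent data.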
