Advice on Parameter Settings
It is difficult to give a blanket recommendation of default values for the test parameters, because they depend so strongly on the specifics of the data being tested. The type of data, the density of the network, the prevalence of systematic errors, the climatology of the local area, and similar factors all greatly affect the ideal arguments.
The performance of the spatial quality control should be evaluated based on how the data are used. What constitutes "bad" data for your application? What is the cost to your application of flagging a good observation? What is the cost to your application of keeping a bad observation? Further details can be found in Alerskans, E., C. Lussana, T. N. Nipen, and I. A. Seierstad, 2022: Optimizing Spatial Quality Control for a Dense Network of Meteorological Stations. J. Atmos. Oceanic Technol., 39, 973–984. When using observations to post-process a model, one can conduct a cross-validation exercise and compute a validation score using 1) observations that are not quality controlled and 2) observations that are quality controlled. Comparing the two scores indicates whether the quality control actually benefits the application. Further details can be found in Nipen, T. N., I. A. Seierstad, C. Lussana, J. Kristiansen, and Ø. Hov, 2020: Adopting Citizen Observations in Operational Weather Prediction. Bull. Amer. Meteor. Soc., 101, E43–E57, especially Fig. 6.
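The two cost questions above can be made concrete with a small scoring function. The following is only a sketch: the function name `qc_cost`, its arguments, and the idea of weighting false alarms and misses with fixed per-observation costs are illustrative assumptions, not part of titanlib, and it presumes a reference dataset in which the truly bad observations are known.

```python
def qc_cost(flags, is_bad, cost_false_alarm=1.0, cost_miss=1.0):
    """Total application cost of a set of QC flags (hypothetical helper).

    flags[i]  -- True if the quality control flagged observation i
    is_bad[i] -- True if observation i is known to be bad
    """
    if len(flags) != len(is_bad):
        raise ValueError("flags and is_bad must have the same length")
    # Good observations that were flagged anyway.
    false_alarms = sum(f and not b for f, b in zip(flags, is_bad))
    # Bad observations that were kept.
    misses = sum(b and not f for f, b in zip(flags, is_bad))
    return cost_false_alarm * false_alarms + cost_miss * misses


# One false alarm and one miss, where keeping a bad observation is
# assumed to be five times as costly as discarding a good one.
cost = qc_cost(
    flags=[True, False, True, False],
    is_bad=[True, True, False, False],
    cost_false_alarm=1.0,
    cost_miss=5.0,
)
```

Choosing the two weights is the application-specific part: a data assimilation system may tolerate discarded good observations far better than a climatological archive would.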
Quality control may also be conducted differently depending on the amount of data available at the time of quality control, and the speed at which the quality control process needs to be completed.
In the Titanlib settings discussion, users can share the settings they are using. Settings used operationally at MET Norway for the seNorge and MET Nordic analyses are available for both temperature and precipitation.
Length scales in the Titanlib settings should be representative of the average station density. For instance, with one station every 10 km, the SCT inner radius should be larger than 3 times 10 km (30 km) and the outer radius 5 to 10 times 10 km (50 to 100 km). A denser network often permits correspondingly finer-grained quality control.
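The rule of thumb above is simple arithmetic, sketched below. The helper names are hypothetical, distances are assumed to be in metres, and the spacing estimate (square root of area per station) is a rough assumption that only makes sense for a reasonably uniform network.

```python
import math


def mean_station_spacing(area_m2, num_stations):
    """Rough average spacing, assuming stations are spread evenly (assumption)."""
    return math.sqrt(area_m2 / num_stations)


def suggest_sct_radii(station_spacing_m):
    """Apply the rule of thumb: inner radius > 3x spacing, outer 5x to 10x.

    Returns the lower bound for the inner radius and the (min, max)
    range for the outer radius, all in metres.
    """
    inner_radius_min = 3 * station_spacing_m
    outer_radius_range = (5 * station_spacing_m, 10 * station_spacing_m)
    return inner_radius_min, outer_radius_range


# 100 stations over a 100 km x 100 km domain gives roughly 10 km spacing.
spacing = mean_station_spacing(area_m2=100_000 * 100_000, num_stations=100)
inner_min, outer_range = suggest_sct_radii(spacing)
```

These values are starting points only; they would still need to be checked against the data, for example in titantuner.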
Titantuner provides a GUI for manually tuning parameters. Users can manipulate the parameters directly and see how their changes affect which data points are flagged. The parameters can be adjusted until the sets of flagged and unflagged data are sensible. Illustrations showing how the tests can be combined and tuned are given in the examples section of the titantuner wiki.
If you don't have the required expertise to determine the ideal values yourself, the next best thing is to automatically tune them using titantuner autotune.
Titantuner's autotune functionality can introduce perturbations into a reference dataset, then perform a gradient descent over the parameters to minimise a cost function. This eventually yields the set of parameters that produced the best ratio of correctly flagged perturbations to false alarms.
To get good results with automatic tuning, you should feed it a dataset that is representative of the data you would like to QC, but with as few errors as possible (for example, one that has been manually quality controlled). It is important to minimise the number of errors in the data: under well-adjusted parameters, any undetected errors in the reference data would be flagged and then counted as false alarms, skewing the cost function.
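The perturb-and-score idea behind automatic tuning can be illustrated with a toy example. This is not titantuner's implementation: the check is a simple distance-from-median test rather than a titanlib test, the search is a plain grid search rather than gradient descent, the cost is the unweighted sum of misses and false alarms, and all function names are hypothetical.

```python
import random
import statistics


def flag_outliers(values, threshold):
    """Toy check: flag values further than `threshold` from the median."""
    median = statistics.median(values)
    return [abs(v - median) > threshold for v in values]


def perturb(values, fraction, magnitude, rng):
    """Shift a random subset by +/- magnitude; return values and truth mask."""
    num_bad = max(1, round(fraction * len(values)))
    bad_idx = set(rng.sample(range(len(values)), num_bad))
    perturbed = [
        v + rng.choice((-1, 1)) * magnitude if i in bad_idx else v
        for i, v in enumerate(values)
    ]
    is_bad = [i in bad_idx for i in range(len(values))]
    return perturbed, is_bad


def tune_threshold(reference, candidates, fraction=0.1, magnitude=5.0, seed=0):
    """Return (threshold, cost) with the lowest misses + false alarms."""
    rng = random.Random(seed)
    perturbed, is_bad = perturb(reference, fraction, magnitude, rng)
    best = None
    for threshold in candidates:
        flags = flag_outliers(perturbed, threshold)
        misses = sum(b and not f for f, b in zip(flags, is_bad))
        false_alarms = sum(f and not b for f, b in zip(flags, is_bad))
        cost = misses + false_alarms
        # Keep the first candidate that achieves the lowest cost.
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# A clean reference series with small natural variability (10.0 to 10.4).
reference = [10.0 + 0.1 * (i % 5) for i in range(50)]
best_threshold, best_cost = tune_threshold(reference, [0.1, 1.0, 3.0, 10.0])
```

The sketch also shows why the reference data should be clean: every value in `reference` that is not deliberately perturbed is treated as good, so any genuine error left in it would be scored as a false alarm when a sensible threshold flags it.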
Copyright © 2019-2023 Norwegian Meteorological Institute