Skip to content

Spatial consistency test resistant

Cristian Lussana edited this page Dec 15, 2023 · 13 revisions

The sct_resistant function offers a more outlier-resistant spatial consistency test (SCT) compared to the standard sct, though it typically operates at a slower pace.

In this method, the p observed values are represented by a p-vector, values, and their geographical positions by points (latitude, longitude, and elevation). Users can select which observations to analyze using the obs_to_check vector, where a '1' indicates checking the corresponding observation and a '0' signifies using it without verification.

Like sct, sct_resistant employs a moving window approach across the area of interest. It identifies several 'centroid' observations, around which inner and outer circles are established. This approach allows for localized customization of the SCT, with many parameters being dependent on location.

An important aspect of SCT is determining a background or a preliminary guess for each observation. This is achieved by considering all observations within an outer circle, with several options available:

  • VerticalProfile. Estimates the background based on observation elevations, optimized for temperature profiles in complex terrains, as detailed in Frei (2014).
  • VerticalProfileTheilSen. Uses Theil-Sen linear regression (as per Wilks (2019)) to estimate the background, which is more resistant to data outliers than the VerticalProfile method.
  • MeanOuterCircle. Sets a constant background based on the mean value of observations within the outer circle.
  • MedianOuterCircle. Uses the median value of observations within the outer circle as a constant background.
  • External. Allows users to provide their own background values.

The output includes a quality flag p-vector for each observation and diagnostics for any observations flagged as unreliable, such as the z-score, in line with the core SCT algorithm described subsequently.

flag description
-999 missing flag (observation not checked)
0 good observation
1 bad observation
11 isolated observation, it is the only observation inside the inner circle
12 isolated observation, less than num_min_outer observations inside outer circle

Defining the Range of Valid and Admissible Values

Thesct_resistant function necessitates defining two distinct ranges around each observed value: the valid value range (values_minv, values_maxv) and the admissible value range (values_mina, values_maxa). Users must determine the minimum (valid range) and maximum (admissible range) levels of uncertainty for an independent estimate of a given observation. Consequently, if the discrepancies between observations and independent estimates fall within the valid value range in a predefined area, the observations are deemed reliable. Conversely, significant deviations beyond the admissible value range from the independent estimate cast doubt on the observed value. This concept is depicted in the figures Figure for the cross-section and Figure for the scheme.

To illustrate, consider setting ranges for temperature. A valid value range might be ±1°C around observed values, indicating that an independent estimate within this margin is acceptable. In contrast, an admissible value range of ±20°C implies that any independent estimate beyond this threshold potentially signifies an unreliable observation.

NOTE: Users must individually set the ranges for each observation using the "observations" vector. in the context of the example above, for the admissible values range, they should define values_mina as "observations - 20" and values_maxa as "observations + 20". A similar method should be used for determining the range of valid values.

Algorithm

This algorithm represents an iterative approach to assess the quality of observations, adjusting for potential outliers and varying conditions across different locations. It emphasizes the importance of local context in evaluating observation quality, combining statistical analysis with spatial considerations.

The primary algorithm for sct_resistant, along with the core SCT algorithm, is outlined as follows (see the Figures as references):

Main Algorithm

  • Initialization:
    • Define num_iterations for the number of sweeps through all observations.
    • In the first iteration, only flag bad observations. If none are found, exit the loop.
  • Loop Over Observations:
    • For each observation:
      • Determine if it's a suitable centroid for the concentric circles based on its status (neither previously flagged as good nor bad).
      • Select observations within the outer circle (p_out, with p_out greater or equal to num_min_outer and lesser or equal to num_max) and inner circle (p_in).
      • Choose observations within the inner circle not yet tested (p_test).
      • Check for isolated locations, which cannot be reliably tested and are thus flagged with specific codes. Conditions are 0 observations inside the inner circle OR p_out < num_minand the codes returned are 11 and 12.
      • Calculate or assign background values for p_out observations.
      • If observations and backgrounds for all p_test are close enough, flag them as good. The condition is that the background at each location to test is within the corresponding range of valid values.
      • Apply the SCT-core: flag the worst observation as bad or all as good based on statistics from p_in.
  • Two Additional Iterations are performed:
    • First iteration focuses on observations yet to be decided. Consider as centroids all the observations yet to be decided (this may happen when we reached num_iterations of the previous loop).
    • Second iteration revisits bad observations, potentially reconsidering them as good based on new information. Consider as centroids all bad observations and use good observations only. Bad observations can be saved if they pass the final round.

SCT-core algorithm

  • Correlation Estimation:
    • Use an analytical function to estimate error correlations at nearby locations.
    • Determine horizontal and vertical de-correlation lengths using adaptive estimates and predefined scales. kth_closest_obs_horizontal_scale determines the horizontal de-correlation length as the average distance between locations in the outer circle and their k-th nearest observation. This value is confined within the predefined min_horizontal_scale and max_horizontal_scale. The vertical de-correlation length is set by vertical_scale.
  • Loop Over Inner Circle Observations:
    • Compute analysis and cross-validation analysis for each observation.
    • Collect statistics of the variable chi = square.root[(observation - analysis) * (observation - cv-analysis)]. Only include values where the cross-validation analysis falls within the range of admissible values.
  • Analysis:
    • If in basic mode, use chi as the z-score.
    • Otherwise, compute normalized deviations z = (chi - median( chi )) / interquartile_range(chi).
    • If any p_test observation has a z value outside thresholds, flag the worst as bad.
    • Otherwise, flag all as good.

 Image  Image

Input parameters

Similar to spatial consistency test, but with some additional parameters.

Parameter Type Unit Description
max_horizontal_scale float m Maximum horizontal decorrelation length
kth_closest_obs_horizontal_scale int Number of closest observations to consider in the adaptive estimation of the horizontal decorrelation length
value_mina vec ou Minimum admissible value
value_maxa vec ou Maximum admissible value
value_minv vec ou Minimum valid value
value_maxv vec ou Maximum valid value

ou = Unit of the observation

Example (R-code)

# R code
obs_to_check <- rep(1, npoints)
background_values <- rep(0, npoints)
background_elab_type <- "MedianOuterCircle"
num_min_outer <- 3
num_max_outer <- 10
inner_radius <- 20000
outer_radius <- 50000
num_iterations <- 10
num_min_prof <- 1
min_elev_diff <- 100
min_horizontal_scale <- 250 
max_horizontal_scale <- 100000
kth_closest_obs_horizontal_scale <- 2
vertical_scale <- 200
tpos <- rep(1,N) * 16
tneg <- rep(1,N) * 16
eps2 <- rep(1,N) * 0.5
values_mina <- precip_obs - 20
values_maxa <- precip_obs + 20
values_minv <- precip_obs - 1
values_maxv <- precip_obs + 1
debug <- T
basic <- T
points <- Points(lats, lons, elevs)
res <- sct_resistant( points, precip_obs, obs_to_check, background_values, 
                      background_elab_type, num_min_outer, num_max_outer, 
                      inner_radius, outer_radius, num_iterations, num_min_prof, 
                      min_elev_diff, min_horizontal_scale, max_horizontal_scale, 
                      kth_closest_obs_horizontal_scale, vertical_scale, 
                      values_mina, values_maxa, values_minv, values_maxv, 
                      eps2, tpos, tneg, debug, basic)
# flags
print( res[[1]])
# scores
print( res[[2]])