From 795262981891be013d9b05f0ddafddf4cd8622b8 Mon Sep 17 00:00:00 2001
From: William Augustine McLean
This workflow assumes access to the tnc-dangermond bucket and its folders which contain the updated site data for the Dangermond
@@ -126,10 +126,11 @@
The TNC Site names get wonky so replacing any spaces with underscores, filtering out NA values, and just pulling out the Site and
-Divide (to be used later) fields will help.
+Divide (to be used later) fields will help. Active Divides will be used
+in the CleanWells() function below.
Additionally, creating a SiteBasins df that is just the distinct
-Basin IDs is necessary for the pull. This will be fixed in a later
-update
+Basin IDs is necessary if you want to pull well data from the TNC
+Bucket.
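The cleanup described above can be sketched in base R. This is only an illustration, not the package's own code: the toy `dangermond_sites` table, its `Basin` column, and the choice to filter NA values on `Site` are all assumptions made for the example.

```r
# Hypothetical stand-in for the site table; column names are assumptions.
dangermond_sites <- data.frame(
  Site   = c("Oak Creek 1", "Jalama Well", NA, "Oak Creek 2"),
  Divide = c("cat-101", "cat-102", "cat-103", "cat-101"),
  Basin  = c("B1", "B2", "B3", "B1"),
  stringsAsFactors = FALSE
)

# Drop rows with a missing site name, then replace spaces with underscores.
clean_sites <- dangermond_sites[!is.na(dangermond_sites$Site), ]
clean_sites$Site <- gsub(" ", "_", clean_sites$Site, fixed = TRUE)

# Keep only the Site and Divide fields.
ActiveDivides <- clean_sites[, c("Site", "Divide")]

# One row per distinct Basin ID.
SiteBasins <- unique(clean_sites["Basin"])
```

Base R is used here so the sketch runs without extra packages; the same steps translate directly to a dplyr pipeline.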
##Join Divides and Sites
#ActiveDivides <- dangermond_sites %>%
@@ -147,41 +148,67 @@
Clean up the data a bit.
Clean Well Data
CleanWells()
+For this example I will be using “all_combined_data” which is
+compiled from the TNC Sensor data at the Jack and Laura Dangermond
+Preserve. If you have the correct credentials you can pull this data as
+well; to do so you will need the SiteBasins df created above.
+
+#monthly_basin_average <- CleanWells(x = all_combined_data, y = ActiveDivides)
This function filters, transforms, and processes well water level
data from a combined dataset. It prepares the data for mapping and
analysis by calculating changes in well water levels, joining with basin
-information, and summarizing the data on a monthly basis.
+information from ActiveDivides, and summarizing the data on a monthly
+basis.
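As a rough illustration of the kind of processing described here, the sketch below computes the change between readings, joins on the divide, and averages by month. It is not the body of CleanWells(): the toy data and the `date`, `level_mm`, and `change_mm` column names are assumptions, and only `mean_change_mm` is a name taken from this document.

```r
# Hypothetical well readings and site-to-divide lookup; names are assumptions.
wells <- data.frame(
  Site     = rep(c("Well_A", "Well_B"), each = 3),
  date     = as.Date(rep(c("2023-01-15", "2023-02-15", "2023-03-15"), 2)),
  level_mm = c(100, 110, 105, 200, 220, 230)
)
divides <- data.frame(Site   = c("Well_A", "Well_B"),
                      Divide = c("cat-101", "cat-101"))

# Change in level from one reading to the next, within each site.
wells <- wells[order(wells$Site, wells$date), ]
wells$change_mm <- ave(wells$level_mm, wells$Site,
                       FUN = function(v) c(NA, diff(v)))

# Attach the divide for each site, then average the change by divide and month.
# The formula interface of aggregate() drops rows where change_mm is NA.
joined <- merge(wells, divides, by = "Site")
joined$month <- format(joined$date, "%Y-%m")
monthly <- aggregate(change_mm ~ Divide + month, data = joined, FUN = mean)
names(monthly)[names(monthly) == "change_mm"] <- "mean_change_mm"
```

The first reading for each site has no previous value to compare against, so it contributes nothing to the monthly averages.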
The output should look like this, time series data for all divides
containing sites that are measuring well water level, and the average
change in well water level from month to month:
+After running CleanWells() you should also create a UniqueDivides
+dataframe.
+This will be used in the next function as well.
+
+# UniqueDivides <- monthly_basin_average %>%
+# distinct(Divide)
Now that we have the site data, we can use the (model)_delta_str file from the WaterBalance workflow and compare the two.
-The Correlation() function calculates the correlation between well
-water level changes and storage changes (e.g., from cabcm_ or
-terra_delta_str) for different basins. It joins the data, computes
-correlation statistics, and prepares the data for further analysis.
-CRITICAL:
-Correlation() copies “model_correlation_by_divide” to the package
-environment, this will feed straight into ModelCorrMap(). You will still
-need the scatterplot from ModelScatter() though so don’t skip ahead.
-
-#cabcm_dangermond_join <- Correlation(x = cabcm_delta_str, y = monthly_basin_average)
The JoinHydroData() function merges well water level data with change
+in storage data from our models for further analysis. The storage data
+is filtered to include only non-negative values and specific date
+ranges. The function limits the data to basins present in the
+UniqueDivides
dataset.
+#cabcm_dangermond_join <- JoinHydroData(x = cabcm_delta_str, y = monthly_basin_average, z = UniqueDivides)
Next you can use the Correlation() function, which computes the
+correlation between changes in well water levels and modelled change in
+storage for each divide (basin). It calculates standard deviations of
+well water levels and storage, computes correlation coefficients, and
+merges these correlations back into the original dataset.
+
+#cabcm_corr_join <- Correlation(x = cabcm_dangermond_join)
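A per-divide correlation of this kind could look roughly like the base R sketch below. It is an illustration under assumptions rather than the actual Correlation() implementation: the toy data and the `sd_wwl`, `sd_str`, and `corr` column names are invented for the example, while `mean_change_mm` and `deltaSTR` are the field names mentioned in this document.

```r
# Hypothetical joined data: well water level change and modelled storage change.
joined <- data.frame(
  Divide         = rep(c("cat-101", "cat-102"), each = 4),
  mean_change_mm = c(1, 2, 3, 4, 4, 3, 2, 1),
  deltaSTR       = c(2, 4, 6, 8, 1, 2, 3, 4)
)

# One row of statistics per divide (output column names are assumptions).
per_divide <- do.call(rbind, lapply(split(joined, joined$Divide), function(d) {
  data.frame(
    Divide = d$Divide[1],
    sd_wwl = sd(d$mean_change_mm),
    sd_str = sd(d$deltaSTR),
    corr   = cor(d$mean_change_mm, d$deltaSTR)
  )
}))

# Merge the statistics back onto every row of the joined data.
corr_join <- merge(joined, per_divide, by = "Divide")
```

In this toy data the first divide is perfectly correlated and the second is perfectly anti-correlated, which makes the merged `corr` column easy to check by eye.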
The output should look like this, it even adds convenient plot titles!
-I’ve called this dataframe cabcm_dangermond_join, no need to get too
+I’ve called this dataframe “cabcm_corr_join”, no need to get too
fancy.
Now that our data is looking good we can finally make some plots.
Plot_WWLvsSTR() does exactly what you think it will. It plots the change in well water level “mean_change_mm” against deltaSTR from your model data.
-
-#Cabcm_Plot <- Plot_WWLvsSTR(x = cabcm_dangermond_join)
+#cabcm_Plot <- Plot_WWLvsSTR(x = cabcm_corr_join)
+
+#cabcm_Plot
This is a great visualization of the accuracy of the model to the basin in terms of both timing and magnitude of changes in well water
@@ -190,13 +217,16 @@
ModelScatter() uses the same data as above, but makes a scatterplot instead.
-
-#ModelScatterPlot <- ModelScatter(x = cabcm_dangermond_join)
+#ModelScatterPlot <- ModelScatter(x = cabcm_corr_join)
+
+#ModelScatterPlot
This is good, it shows us the general shape of the data and the correlation is easily visible as part of the title. But how does it
@@ -204,8 +234,8 @@
-#Corr_Map <- ModelCorrMap(x = model_correlation_by_divide, y = ModelScatterPlot)
+#Corr_Map <- ModelCorrMap(x = cabcm_corr_join, y = ModelScatterPlot, z = NewDivides)
Developed by Billy McLean, TNC, .
+Developed by Billy McLean, , .
-The first step will be to set AWS Credentials Globally so that you can access the data held in the tnc-dangermond bucket.
+model_clean() separates out the soil moisture storage “str” from the “var” field and calculates day to day change in storage before joining
-that data back into the primary dataset for analysis. The output is
-cabcm_delta_str or terra_delta_str.
+that data back into the primary dataset for analysis. Assign your output
+to [model name]_delta_str.
-#model_clean(new_cabcm_data)
#cabcm_delta_str <- model_clean(new_cabcm_data)
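The idea behind this step can be sketched in base R as follows. This is a guess at the shape of the operation, not model_clean() itself: the long-format layout, the `divide_id`, `date`, and `value` columns, and storing the result as a `deltaSTR` row in `var` are all assumptions made for the example.

```r
# Hypothetical long-format model output; column names are assumptions.
model_long <- data.frame(
  divide_id = rep("cat-101", 6),
  date      = as.Date(rep(c("2023-01-01", "2023-01-02", "2023-01-03"), 2)),
  var       = rep(c("str", "ppt"), each = 3),
  value     = c(50, 55, 52, 0, 8, 0),
  stringsAsFactors = FALSE
)

# Pull the storage rows out of the "var" field.
str_rows <- model_long[model_long$var == "str", ]
str_rows <- str_rows[order(str_rows$divide_id, str_rows$date), ]

# Day-to-day change in storage within each divide; the first day has no prior value.
str_rows$value <- ave(str_rows$value, str_rows$divide_id,
                      FUN = function(v) c(NA, diff(v)))
str_rows$var <- "deltaSTR"

# Append the change rows to the primary dataset (one possible way to "join back").
model_delta_str <- rbind(model_long, str_rows)
```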
From this:
To this:
@@ -230,7 +230,9 @@
ERR = ppt - aet - rch - run - str
-#result_cabcm <- process_model_data(cabcm_delta_str, NewDivides)
+
+#result_cabcm <- process_model_data(cabcm_delta_str, NewDivides)
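To make the error term concrete, here is a small base R sketch that applies ERR = ppt - aet - rch - run - str to already-widened data and averages it by season. It is illustrative only: the toy values, the `date` column, and the month-to-season mapping (meteorological seasons) are assumptions, and the package may define seasons differently.

```r
# Hypothetical widened model data, one column per water balance term.
wide <- data.frame(
  date = as.Date(c("2023-01-15", "2023-04-15", "2023-07-15", "2023-10-15")),
  ppt  = c(100, 60, 5, 40),
  aet  = c(20, 30, 10, 15),
  rch  = c(30, 10, 0, 10),
  run  = c(40, 15, 0, 10),
  str  = c(5, 5, -5, 0)
)

# Water balance error for each timestamp.
wide$ERR <- wide$ppt - wide$aet - wide$rch - wide$run - wide$str

# Assign a season from the month number (assumed mapping, index 1 = January).
month_num <- as.integer(format(wide$date, "%m"))
season_by_month <- c("Winter", "Winter", "Spring", "Spring", "Spring", "Summer",
                     "Summer", "Summer", "Fall", "Fall", "Fall", "Winter")
wide$season <- season_by_month[month_num]

# Average error per season.
season_error <- aggregate(ERR ~ season, data = wide, FUN = mean)
```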
The process_model_data() function does this and more.
It widens the dataframe, calculates the error for each timestamp, assigns seasons based on month, calculates average error per season, and
@@ -240,10 +242,6 @@
In addition to prepping the data for Seasonal Error plot, the
-function assigns the widened data, such as “cabcm_data_wide” to the
-global environment to be used in the balance_data() function below and
-create our water balance plots.
plot_seasons() generates standardized plots based on the above
seasonal data. The plots visualize the percent error across different
spatial features for a given season. This function is designed to be
-used within the GridSeasons() function to arrange the plots
+used within the GridSeasons() function to arrange the plots
in a grid layout.
You can customize it as suits you, but I have found this format to be very appealing visually.
@@ -281,12 +279,21 @@
process_model_data()
and customizes the plot
titles and captions based on the data source, be it CABCM, Terra, or
(pending) NextGen
-### Wide
+Data() We need to make the data wide before we can balance it
+and use it to create our water balance plot. This can be done with the
+widen_model_data() function
+
+#cabcm_data_wide <- widen_model_data(cabcm_delta_str)
use cabcm_delta_str as the x value and feed the output right into
+balance_data() below
+#cabcm_long_balance <- balance_data(cabcm_data_wide)
balance_data() relies on terra_ or cabcm_data_wide from the process_data() function above. It formats the data so that it can be used to create the Water Balance Plots.
@@ -300,8 +307,8 @@
CreateWaterBalancePlot() uses the output of balance_data(), terra_ or cabcm_long_balance, and creates a water balance plot comparing the inputs and outputs of the system.
-
-#combined_plot <- CreateWaterBalancePlot(x = terra_long_balance)
+#combined_plot <- CreateWaterBalancePlot(x = cabcm_long_balance)