---
title: "r2ogs6 Developer Guide"
date: "2023-04-05"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{r2ogs6 Developer Guide}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```r
library(r2ogs6)
```
## Hi there!
Welcome to my dev guide on `r2ogs6`. This is a collection of tips, useful info (and admittedly a few warnings) which will hopefully make your life a bit easier when developing this package.
## The basics
Before we dive into any implementation details, we will take a look at how exactly this package is structured first. `r2ogs6` was developed using the workflow described [here](https://r-pkgs.org/index.html). I strongly recommend keeping it that way as it will save you time and headaches.
...
In the main folder `R/` you will find a lot of scripts, most of which can be grouped into the following categories:
* `export_*.R` export functions
* `generate_*.R` code generation
* `read_in_*.R` import functions
* `ogs6_*.R` simulation class definitions
* `prj_*.R` class definitions for XML tags found in a `.prj` file
* `*_utils.R` utility functions used in multiple scripts
## The classes
`r2ogs6` is largely built on top of S3 classes at the moment. For reasons I will elaborate on later, switching to R6 classes is a viable option. But let's look at what we have first.
....
## Generating new classes
If you've familiarized yourself with OpenGeoSys 6, you know that there are a lot, and by a lot I mean a LOT of parameters and special cases regarding the `.prj` XML tags. For a nice new class based on such a tag, you will have to consider all of them.
To save me (and you) a bit of typing, I've written a few useful functions for this.
### analyse_xml()
The first and arguably most important one is `analyse_xml()`. It matches files in a folder, reads them in as XML and searches for XML elements of a given name. It then analyses those elements and returns useful information about them, namely the names of their attributes and child elements. It prints a summary of its findings and also returns a list which we will look at in a moment.
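To make the idea more concrete, here is a minimal sketch of the kind of analysis described above. It is not the implementation of `analyse_xml()`. It assumes the `xml2` package is installed, and the two in-memory documents, the `solver` tag and all variable names are made up for illustration.

```r
library(xml2)

# Two tiny in-memory documents stand in for .prj files (hypothetical content)
docs <- list(
  read_xml("<root><solver><name>a</name><type>x</type></solver></root>"),
  read_xml("<root><solver id='1'><name>b</name></solver></root>")
)
xpath <- "//solver"

# Number of matching elements across all documents
n_elements <- sum(vapply(docs, function(doc) {
  length(xml_find_all(doc, xpath))
}, integer(1)))

# For every matching element, collect the names of its direct children
child_names <- unlist(lapply(docs, function(doc) {
  nodes <- xml_find_all(doc, xpath)
  unlist(lapply(seq_along(nodes), function(i) {
    unique(xml_name(xml_children(nodes[[i]])))
  }))
}))

# ... and the names of its attributes
attr_names <- unlist(lapply(docs, function(doc) {
  nodes <- xml_find_all(doc, xpath)
  unlist(lapply(seq_along(nodes), function(i) {
    names(xml_attrs(nodes[[i]]))
  }))
}))

# How many of the matched elements had each child?
child_counts <- table(child_names)
```

Dividing `child_counts` by `n_elements` gives the share of elements that contain each child, which is the kind of figure the printed summary reports.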
I used this function for two things: Analysing ... . Secondly, as soon as I had decided which tags should be represented by a class, I used the function output for class generation.
### generate_*()
So say we have some `.prj` files stored in a folder. I will show the workflow on a small dataset here, that is, on a folder with only two `.prj` files. The path I usually passed to `analyse_xml()` was the directory containing all of the benchmark files for OpenGeoSys 6, which can be downloaded from [here](https://gitlab.opengeosys.org/ogs/ogs/-/tree/master/Tests/Data/).
```r
test_folder <- system.file("extdata/vignettes_data/analyse_xml_demo",
                           package = "r2ogs6")
```
Now say we have decided we are going to make a class based on the element with tag name `nonlinear_solver`. For readability reasons, I will store the results of `analyse_xml()` in a variable and pass it to our generator function. If you want, you can skip this step and call `analyse_xml()` in the generator function directly.
```r
analysis_results <- analyse_xml(path = test_folder,
                                pattern = "\\.prj$",
                                xpath = "//nonlinear_solver",
                                print_findings = TRUE)
#>
#> I parsed 2 valid XML files matching your pattern.
#>
#> I found at least one element named nonlinear_solver in the following file(s):
#> beam.prj
#> beam3d.prj
#>
#> In total, I found 5 element(s) named nonlinear_solver.
#>
#> These are the child elements I found:
#>                 name ex_occ p_occ total total_mean
#> 1               name      2   0.4     2        0.4
#> 2               type      2   0.4     2        0.4
#> 3           max_iter      2   0.4     2        0.4
#> 4      linear_solver      2   0.4     2        0.4
#> 5 maximum_iterations      1   0.2     1        0.2
#> 6    error_tolerance      1   0.2     1        0.2
#> 7            damping      1   0.2     1        0.2
```
First, I define my path and specify that only files with the ending `.prj` will be parsed. I'm looking for elements named `nonlinear_solver`, and I'm looking for them in the whole document. This often isn't the best option since sometimes nodes may have the same name but contain different things depending on their exact position in the document, which is also the case here. To narrow it down further, change `xpath` accordingly.
```r
analysis_results <- analyse_xml(path = test_folder,
                                pattern = "\\.prj$",
                                xpath = "/OpenGeoSysProject/nonlinear_solvers/nonlinear_solver",
                                print_findings = TRUE)
#>
#> I parsed 2 valid XML files matching your pattern.
#>
#> I found at least one element named nonlinear_solver in the following file(s):
#> beam.prj
#> beam3d.prj
#>
#> In total, I found 2 element(s) named nonlinear_solver.
#>
#> These are the child elements I found:
#>            name ex_occ p_occ total total_mean
#> 1          name      2   1.0     2        1.0
#> 2          type      2   1.0     2        1.0
#> 3      max_iter      2   1.0     2        1.0
#> 4 linear_solver      2   1.0     2        1.0
#> 5       damping      1   0.5     1        0.5
```
Now we can be sure our future class will be generated from the correct parameters.
`analyse_xml()` returns a named list invisibly. Let's have a short look at it.
```r
analysis_results
#> $xpath
#> [1] "/OpenGeoSysProject/nonlinear_solvers/nonlinear_solver"
#>
#> $children
#>          name          type      max_iter linear_solver       damping
#>          TRUE          TRUE          TRUE          TRUE         FALSE
#>
#> $attributes
#> logical(0)
#>
#> $both_sorted
#>          name          type      max_iter linear_solver       damping
#>          TRUE          TRUE          TRUE          TRUE         FALSE
```
You can see the list contains the `xpath` parameter passed to `analyse_xml()`, along with three named logical vectors called `children`, `attributes` and `both_sorted` respectively. They can be read like this: If an attribute or a child of the element specified by `xpath` occurred in every matched element, it is a required parameter for the new class. Otherwise, it is an optional parameter. The logical vectors are sorted by frequency of occurrence, so the rarest children and attributes go to the very end of their logical vector. Now, let's generate some code!
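The required/optional rule can be sketched in a few lines of base R. The counts below are typed in by hand to mirror the narrowed-down analysis above. The variable names are hypothetical, and this is not how `analyse_xml()` builds its result internally.

```r
# Occurrence counts for the children of the 2 matched elements,
# deliberately given in a scrambled order
n_elements <- 2
child_counts <- c(damping = 1, name = 2, type = 2,
                  max_iter = 2, linear_solver = 2)

# Sort so the most frequent children come first; ties keep their input order
sorted <- child_counts[order(-child_counts)]

# A child is required only if every matched element had it
is_required <- sorted == n_elements

required <- names(is_required)[is_required]
optional <- names(is_required)[!is_required]
```

Here `is_required` ends up as a named logical vector with `damping` as its last and only `FALSE` entry, matching the shape of `children` shown above.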
For S3 classes, we generate a constructor like this:
```r
generate_constructor(params = analysis_results,
                     print_result = TRUE)
#> new_prj_nonlinear_solver <- function(name,
#>                                      type,
#>                                      max_iter,
#>                                      linear_solver,
#>                                      damping = NULL) {
#>     structure(list(name = name,
#>                    type = type,
#>                    max_iter = max_iter,
#>                    linear_solver = linear_solver,
#>                    damping = damping,
#>                    xpath = "nonlinear_solvers/nonlinear_solver",
#>                    attr_names = c(),
#>                    flatten_on_exp = character()
#>     ),
#>     class = "prj_nonlinear_solver"
#>     )
#> }
#>
```
For S3 classes, we generate a helper like this:
```r
generate_helper(params = analysis_results,
                print_result = TRUE)
#> #'prj_nonlinear_solver
#> #'@description tag: nonlinear_solver
#> #'@param name
#> #'@param type
#> #'@param max_iter
#> #'@param linear_solver
#> #'@param damping Optional:
#> #'@export
#> prj_nonlinear_solver <- function(name,
#>                                  type,
#>                                  max_iter,
#>                                  linear_solver,
#>                                  damping = NULL) {
#>
#>     # Add coercing utility here
#>
#>     new_prj_nonlinear_solver(name,
#>                              type,
#>                              max_iter,
#>                              linear_solver,
#>                              damping)
#> }
#>
#>
```
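The `# Add coercing utility here` placeholder is where you come in. Below is one possible way to fill it, as a sketch only: the constructor is the stub printed above, while the coercion and the checks are assumptions of mine (for instance, that `max_iter` and `damping` should be numeric), and the argument values in the usage line are made up.

```r
# Constructor stub as printed by generate_constructor()
new_prj_nonlinear_solver <- function(name,
                                     type,
                                     max_iter,
                                     linear_solver,
                                     damping = NULL) {
    structure(list(name = name,
                   type = type,
                   max_iter = max_iter,
                   linear_solver = linear_solver,
                   damping = damping,
                   xpath = "nonlinear_solvers/nonlinear_solver",
                   attr_names = c(),
                   flatten_on_exp = character()),
              class = "prj_nonlinear_solver")
}

# Helper with a hand-written coercion step (assumed rules, adapt as needed)
prj_nonlinear_solver <- function(name,
                                 type,
                                 max_iter,
                                 linear_solver,
                                 damping = NULL) {
    # Values read from XML arrive as strings, so coerce the numeric ones
    max_iter <- as.numeric(max_iter)
    if (!is.null(damping)) {
        damping <- as.numeric(damping)
    }
    stopifnot(is.character(name), length(name) == 1, !is.na(max_iter))

    new_prj_nonlinear_solver(name, type, max_iter, linear_solver, damping)
}

# Usage with made-up values
solver <- prj_nonlinear_solver(name = "basic_newton",
                               type = "Newton",
                               max_iter = "50",
                               linear_solver = "general_linear_solver")
```

Keeping the coercion in the helper and the constructor free of checks follows the usual S3 split between a low-level constructor and a user-facing helper.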
For R6 classes, we generate a class definition like this:
```r
generate_R6(params = analysis_results,
print_result = TRUE)
#> OGS6_nonlinear_solver <- R6::R6Class("OGS6_nonlinear_solver",
#> public = list(
#> #'@description
#> #'Creates new OGS6_nonlinear_solverobject
#> #'@param name
#> #'@param type
#> #'@param max_iter
#> #'@param linear_solver
#> #'@param damping Optional:
#> initialize = function(name,
#> type,
#> max_iter,
#> linear_solver,
#> damping = NULL){
#> self$name <- name
#> self$type <- type
#> self$max_iter <- max_iter
#> self$linear_solver <- linear_solver
#> self$damping <- damping
#> }
#> ),
#>
#> active = list(
#> #'@field name
#> #'Access to private parameter '.name'
#> name = function(value) {
#> if(missing(value)) {
#> private$.name
#> }else{
#> private$.name <- value
#> }
#> },
#>
#> #'@field type
#> #'Access to private parameter '.type'
#> type = function(value) {
#> if(missing(value)) {
#> private$.type
#> }else{
#> private$.type <- value
#> }
#> },
#>
#> #'@field max_iter
#> #'Access to private parameter '.max_iter'
#> max_iter = function(value) {
#> if(missing(value)) {
#> private$.max_iter
#> }else{
#> private$.max_iter <- value
#> }
#> },
#>
#> #'@field linear_solver
#> #'Access to private parameter '.linear_solver'
#> linear_solver = function(value) {
#> if(missing(value)) {
#> private$.linear_solver
#> }else{
#> private$.linear_solver <- value
#> }
#> },
#>
#> #'@field damping
#> #'Access to private parameter '.damping'
#> damping = function(value) {
#> if(missing(value)) {
#> private$.damping
#> }else{
#> private$.damping <- value
#> }
#> },
#>
#> #'@field is_subclass
#> #'Access to private parameter '.is_subclass'
#> is_subclass = function() {
#> private$.is_subclass
#> },
#>
#> #'@field subclasses_names
#> #'Access to private parameter '.subclasses_names'
#> subclasses_names = function() {
#> private$.subclasses_names
#> },
#>
#> #'@field attr_names
#> #'Access to private parameter '.attr_names'
#> attr_names = function() {
#> private$.attr_names
#> }
#> ),
#>
#> private = list(
#> .name = NULL,
#> .type = NULL,
#> .max_iter = NULL,
#> .linear_solver = NULL,
#> .damping = NULL,
#> .is_subclass = TRUE,
#> .subclasses_names = character(),
#> .attr_names = c(),
#> )
#> )
```
Ta-daa, you now have some nice stubs. Copy them into a script in the `R` folder of this package, add some documentation and validation to them and you're almost done.
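For the R6 stub, one natural place for validation is the setter branch of an active field. Here is a minimal sketch, assuming the `R6` package is installed. `DemoSolver` is a toy class that is not part of `r2ogs6`, and the rule that `max_iter` must be a single positive number is an assumption for illustration.

```r
library(R6)

# Toy class with a single field, modelled on the generated stub
DemoSolver <- R6Class("DemoSolver",
    public = list(
        initialize = function(max_iter) {
            # Assigning through the active field triggers the validation
            self$max_iter <- max_iter
        }
    ),
    active = list(
        max_iter = function(value) {
            if (missing(value)) {
                private$.max_iter
            } else {
                # Hypothetical validation rule
                stopifnot(is.numeric(value), length(value) == 1, value > 0)
                private$.max_iter <- value
            }
        }
    ),
    private = list(
        .max_iter = NULL
    )
)

solver <- DemoSolver$new(max_iter = 50)
solver$max_iter <- 100

# An invalid assignment is rejected and leaves the stored value untouched
rejected <- inherits(try(solver$max_iter <- -1, silent = TRUE), "try-error")
```

Because `initialize()` assigns via `self$max_iter`, the same check runs at construction time and on every later assignment.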
## Integrating new classes
Now that we have a class, we need to tell the package it exists. That way, when we're reading in or exporting a `.prj` file, the package knows to automatically turn the content of our `nonlinear_solver` tag into an object of our new class and the other way around. To achieve this, execute the code in `data_raw/xpaths_for_classes.R`. This updates the `xpaths_for_classes` parameter, adding an entry for your class. Afterwards, run `xpaths_for_classes[["your_class_name"]]`. It should return the `xpath` parameter of your class like so:
```r
xpaths_for_classes[["prj_process"]]
# A class can have multiple xpaths if the represented node occurs at different positions.
xpaths_for_classes[["prj_convergence_criterion"]]
```
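If it helps to picture the lookup, here is a sketch of a structure that behaves the same way. I am assuming a plain named list here, and the class names and xpaths are invented. The real `xpaths_for_classes` object may be organised differently.

```r
# Hypothetical stand-in for the package data object
xpaths_demo <- list(
    prj_demo_single = "demo_wrapper/demo_single",
    prj_demo_multi = c("demo_a/demo_multi", "demo_b/demo_multi")
)

# One xpath for a class that occurs at a single position
single_paths <- xpaths_demo[["prj_demo_single"]]

# Several xpaths for a class whose node occurs at different positions
multi_paths <- xpaths_demo[["prj_demo_multi"]]

# Reverse lookup: which class represents the node at a given xpath?
class_for_xpath <- function(xpath, lookup) {
    hits <- vapply(lookup, function(paths) xpath %in% paths, logical(1))
    names(lookup)[hits]
}

found <- class_for_xpath("demo_b/demo_multi", xpaths_demo)
```

A reverse lookup of this kind is what lets an import routine decide which class to build for a node it has just encountered.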
If the class you've created is a `.prj` top level class or a child of a top level wrapper node like `processes`, add a corresponding `OGS6` private parameter and an active field. For example, the `processes` node is represented as a list, so I added the private parameter `.processes = list()` and the active field `processes`.
A lot of things in the `r2ogs6` package work in a way that is a bit "meta". Oftentimes, functions are called via `eval(parse(text = call_string))`, where `call_string` has, for example, been assembled from info about the parameter names of a certain class. This saves a lot of code regarding import, export and script generation but requires that you've made the respective info available as shown here.
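The `eval(parse(text = call_string))` pattern can be sketched with a toy constructor. Everything below (`new_demo()`, `param_names`, `values`) is hypothetical and only illustrates the mechanism, not the actual strings the package builds.

```r
# Toy constructor standing in for a generated class constructor
new_demo <- function(name, type) {
    structure(list(name = name, type = type), class = "demo")
}

# Info we know about the class, and the values we want to pass on
param_names <- c("name", "type")
values <- list(name = "basic_newton", type = "Newton")

# Assemble: new_demo(name = values[["name"]], type = values[["type"]])
args <- paste0(param_names, " = values[[\"", param_names, "\"]]")
call_string <- paste0("new_demo(", paste(args, collapse = ", "), ")")

# Parse the string into an expression and evaluate it
obj <- eval(parse(text = call_string))

# For comparison: the same call built without string manipulation
obj2 <- do.call(new_demo, values[param_names])
```

Both routes produce the same object. The string-based one is easier to print and paste into a generated script, which fits the script generation use case mentioned above.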
So we've analysed some files, generated some code, created a new class and registered it with the package... what now? That's it actually, that's the workflow. Well, at least it's supposed to be.
## Recursive function guide
If that wasn't it, I'm afraid you might have to take a look at the functions handling import, export and benchmark script generation. These are a bit tricky because they use recursion which so far has proven to be efficient structure-wise but not exactly fun to think about.
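As a warm-up for reading those functions, here is a toy recursion over a nested list. It has nothing to do with the actual code in the package. `to_xml_string()` and the example data are invented, and the sketch ignores attributes and escaping entirely.

```r
# Turn a (possibly nested) named list into an XML string
to_xml_string <- function(x, tag) {
    if (is.list(x)) {
        # Recursive case: render every named child, then wrap the result
        inner <- mapply(to_xml_string, x, names(x), USE.NAMES = FALSE)
        paste0("<", tag, ">", paste(inner, collapse = ""), "</", tag, ">")
    } else {
        # Base case: a leaf value becomes the text content of its tag
        paste0("<", tag, ">", x, "</", tag, ">")
    }
}

solver <- list(name = "basic_newton", max_iter = 50)
xml_string <- to_xml_string(list(nonlinear_solver = solver),
                            "nonlinear_solvers")
```

The pattern to look for in the real functions is the same: a base case for leaf values and a recursive case that walks the children and combines their results.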
### read_in
### to_node
### generate_benchmark_script
## Conclusion
I hope you've taken away some helpful information from this short guide. If you make changes to improve the workflow, please update this vignette for the next dev!