`data` argument to `setup_bootstrap_run` and remove `nm_join` requirement #707

barrettk · 2024-06-26T21:36:24Z

We currently use `nm_join()` under the hood to create the filtered starting dataset we resample from. There was previously no way to pass a `.join_col`, so the default `NUM` was always used.

The workaround was to add a `NUM` column to the original model and resubmit it before attempting to bootstrap, which is inconvenient and shouldn't be required. A support ticket has already been filed for this issue, and this PR was initially opened in response to it.

Rather than providing a new .join_col argument, this PR removes the nm_join() dependency by parsing the IGNORE and ACCEPT options in a $DATA record, returning a filtered dataset.
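For illustration only, here is a minimal base R sketch of the general idea: translate NONMEM-style filter expressions into R expressions and apply them to a data frame. This is not the implementation in this PR. `translate_expr_sketch()` and `filter_sketch()` are hypothetical names, only a handful of operators are mapped, and the sketch treats every comparison as a plain R comparison, which glosses over NONMEM's distinction between character and numeric comparisons (e.g. `EQN`/`NEN`).

```r
# Hypothetical helper (not part of bbr): translate one NONMEM filter
# expression such as "AGE.GT.60" into an R expression string.
translate_expr_sketch <- function(nm_expr) {
  # Map Fortran-style comparison operators to R operators
  ops <- c(
    ".EQ." = "==", ".NE." = "!=",
    ".GE." = ">=", ".LE." = "<=",
    ".GT." = ">",  ".LT." = "<"
  )
  r_expr <- nm_expr
  for (op in names(ops)) {
    r_expr <- gsub(op, paste0(" ", ops[[op]], " "), r_expr, fixed = TRUE)
  }
  r_expr
}

# Hypothetical helper: apply a list of expressions to a data frame.
# Assumption: IGNORE drops rows matching ANY expression, and
# ACCEPT keeps rows matching ANY expression.
filter_sketch <- function(data, nm_exprs, type = c("ignore", "accept")) {
  type <- match.arg(type)
  matches <- rep(FALSE, nrow(data))
  for (nm_expr in nm_exprs) {
    # Evaluate the translated expression using the data frame's columns
    hit <- eval(parse(text = translate_expr_sketch(nm_expr)), envir = data)
    matches <- matches | (!is.na(hit) & hit)
  }
  if (type == "ignore") {
    data[!matches, , drop = FALSE]
  } else {
    data[matches, , drop = FALSE]
  }
}

# Made-up example data
dat <- data.frame(ID = 1:4, AGE = c(25, 61, 40, 70), SEX = c(1, 2, 2, 1))
kept <- filter_sketch(dat, c("AGE.GT.60", "SEX.EQ.2"), type = "ignore")
accepted <- filter_sketch(dat, c("AGE.GT.60", "SEX.EQ.2"), type = "accept")
```

Keeping the translation of a single expression separate from applying the full list mirrors the split described in the commits, but the details here are assumptions rather than the actual helpers.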

Since we had to touch this function anyway, I reached out to @kylebaron about adding a `data` argument as well. There have been previous discussions about this, though mostly offline. I followed the same approach used for nmsim models (save the new data to the output directory after some checks), with some minor differences.

barrettk · 2024-06-28T18:32:26Z

@kylebaron I requested your review more from a conceptual and documentation perspective (though feel free to make any additional comments). I'm planning on discussing the technical details with @seth127 upon his return for a more formal review.

barrettk · 2024-06-28T18:33:52Z

Local Tests

> devtools::test(filter = "boot")
ℹ Testing bbr
✔ | F W  S  OK | Context
✔ |        115 | testing bootstrap functionality and running bbi [115.8s]                                                              

══ Results ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Duration: 115.8 s

[ FAIL 0 | WARN 0 | SKIP 0 | PASS 115 ]

- get_input_columns has to use the based_on model instead of .boot_run, as the referenced dataset will no longer exist when overwriting
- If a dataset was previously passed, but not when overwriting, we need to revert the data path back to the one specified in the original control stream
- add regression tests for passing in a .join_col or user-provided dataset

- After some internal discussion, it was suggested that the ability to create a full NM-TRAN dataset via parsing the $DATA and $INPUT records could be valuable.
- This commit adds the dropping of columns via `DROP` and `SKIP` options, and handles the renaming of columns and null mapping (changing null values to a particular character string).
- nm_data_filter was renamed from filter_nm_data. All the helper function names may change, but I was trying to be consistent across the new helper functions at a minimum.
- minor bug fix to nm_data_filter: can now handle `=` IGNORE/ACCEPT options

- adjust map call placement to improve clarity of helper functions
- abort if both IGNORE and ACCEPT expressions are found
- now return the `type` as part of `get_data_filter_exprs()`, which is then passed to `translate_nm_expr()`
- inform user of how many records are dropped as part of `nm_data(filter = TRUE)`
- adjust documentation and arguments

- Some of this was a bit out of scope for this PR, but would be OK within the scope of the parent PR (updating bootstrap functionality). The out-of-scope portions included capitalizing parameter documentation for all bootstrap-related functions
- Added examples and improved regex for IGNORE/ACCEPT list parsing

- This will read in the _starting_ dataset (i.e. does not take any resampling for each sub-model into account)
- This is done ahead of the next commit, which will change the control stream template that's used for parsing NONMEM filter expressions

…pressions

- previously used the parent model. However, we want any changes made in the bootstrap control stream template to be reflected during setup_bootstrap_run
- Add a warning if the parent model hasn't been submitted (and `data` is not provided), as this prevents us from performing certain checks (number of expected records).

R/nm-file.R

kylebaron

Just one request for less verbose information.

barrettk added the bug and bootstrap labels Jun 26, 2024

barrettk added 3 commits June 28, 2024 12:51

add .join_col argument to setup_bootstrap_run

4a0b7f9

- move param documentation to above details section for consistency

can_be_nm_joined: now check that .join_col or "ID" can be found in …

3d1ba62

…each of the table files

add support for passing in a starting dataset to resample from

8c4b5aa

barrettk force-pushed the bootstrap/starting-data branch from d58a0ac to 8c4b5aa Compare June 28, 2024 16:52

barrettk requested review from kylebaron and seth127 June 28, 2024 18:30

barrettk force-pushed the bootstrap/starting-data branch from 49d9f9b to ed2c258 Compare June 28, 2024 20:22

barrettk added 10 commits July 2, 2024 17:12

Parse $DATA record for the purpose of filtering the input data

bfae5ff

- Extracts and formats IGNORE and ACCEPT record options, transforming them into `dplyr::filter()` expressions

bug fix: revert modify_data_path_ctl change

594b070

- read_data_record and get_records() can only be used for reading the records, not overwriting. This slipped past me in a previous commit.
- don't apply `#` filter to all columns, only the first

rename nm_data_drop_skip_records --> nm_data_drop_records

7650823

modularize parsing of nonmem expressions

ea48228

- pull out the individual expression translation into a separate function
- move all functions related to this PR to a new script

remove new line addition to modify-records

0114008

fix method of inverting expressions

a4f5cd9

- (list) type expressions must be split up first. This wasn't necessary before the refactor, but is now
- add examples and documentation

fix: cols_rename when not dropping any columns

5c8aff6

Hook up filtering to setup_bootstrap_run and add tests

a77c84e

- removed other NM-TRAN setup functions. They were incomplete/not fully accurate, and we can just use NM-TRAN directly to create that dataset
- Hook up filter_nm_data to setup_bootstrap_run and update tests

add filter arg to nm_data() and add tests for filter_nm_data()

4857202

- pulled out additional logic for parsing the IGNORE and ACCEPT options
- fix: adjusted `@` IGNORE filter option to only apply to the first column

barrettk mentioned this pull request Jul 10, 2024

Parse $DATA record for the purpose of filtering the input data #711

Merged

barrettk changed the title ~~Add .join_col and data arguments to setup_bootstrap_run~~ data argument to setup_bootstrap_run and remove nm_join requirement Jul 11, 2024

barrettk added 6 commits July 11, 2024 11:28

add support for NONMEM 7.3 filter options EQN and NEN

6a6525f

- add tests for all NONMEM operators

update nm_data() filter parameter documentation for clarity

7442e7d

test fix: update referenced object

170bcf4

dont run examples for translate_nm_expr

b174664

- doesn't work by default for internal functions. I imagine there is a workaround for this, though the inclusion of these examples is not important and only for developer purposes

adjust regex for IGNORE=c1 type filtering

857b737

- don't look for digits, only look for one character

barrettk added 14 commits July 18, 2024 11:15

fix @ filtering: Look for first _non-blank_ character

16f60f8

- extra `\\` are added, which are escaped when parse() is called down the line

Change handling if parsing filter expressions fails

2807dab

If parsing the NONMEM filtering expressions fails, we now instruct users to provide a starting dataset instead of proceeding with an unfiltered `nm_data(mod)` dataset

- filter_nm_data no longer returns NULL on failure

documentation updates

aa5c54b

error out if any unsupported fortran logical operators are found

2531941

Check number of records for finished based on models

4e0eb9e

- If the based_on model has run, check the number of records in `starting_data` to make sure the filtering went ok. This introduces a `.bbi_args` argument passed to `model_summary()`, similar to the behavior in `nm_join()`.

update .bbi_args parameter documentation in setup_bootstrap_run

0e23943

fix warning from previous commit

7eddb3e

adjust existing test: overwrite bootstrap control stream instead of p…

47c8301

…arent model - minor bug fix

more test adjustments and bug fix

df3566e

- move data path fix to above parsing the control stream. This previously wasn't necessary since we parsed the parent model

Merge pull request #711 from metrumresearchgroup/parse-data-record

e2d2e15

Parse $DATA record for the purpose of filtering the input data

remove can_be_nm_joined()

89c2084

- This may have a use down the road within `nm_join()`, but removing it from `bbr` for now since it's no longer utilized.
- It's not used anymore because we now use `nm_data(mod, filter = TRUE)` instead of `nm_join()` within `setup_bootstrap_run()`

kylebaron reviewed Sep 19, 2024

R/nm-file.R (resolved)

barrettk mentioned this pull request Sep 19, 2024

Check if model can be passed to nm_join() #717

Open

kylebaron reviewed Sep 19, 2024

adjust nm_data() messages: dont show percentages

8ad89b9

kylebaron self-requested a review September 19, 2024 22:33

kylebaron approved these changes Sep 19, 2024

barrettk merged commit 03bec34 into main Sep 19, 2024
8 checks passed

barrettk deleted the bootstrap/starting-data branch September 19, 2024 22:37
