Minor metadata fixes #788

Merged
merged 52 commits into from
Nov 19, 2023

ehwenk
Collaborator

@ehwenk ehwenk commented Nov 19, 2023

endless small changes to match new tests

ehwenk and others added 30 commits November 17, 2023 10:24
- unsupported basis_of_value
code used:
subs <- read_metadata("data/ANBG_2019/metadata.yml")
# then manually removed all substitutions
to_read_in <- subs$substitutions %>% util_list_to_df2() %>% distinct(find, .keep_all = TRUE)
metadata_add_substitutions_list("ANBG_2019", to_read_in)
remove comma
- remove unneeded taxonomic updates
- add missing rows to Nano
- removing duplicate updates
- removing updates for taxa not in data.csv files. I am somewhat baffled how these got added; almost all are automatic adds. I think they represent secondary changes to a primary name change, where the primary name change eventually got changed itself, so there is no link. There are hundreds more of these and eventually I'll programmatically remove them, but I still want to check they really are all vestiges.
Yeah! Removed/changed lots of taxonomic updates that yielded errors and the only change adds an actual name!
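The two cleanup checks described in the commits above (dropping repeated `find` values and dropping updates that match no name in the data) can be sketched in base R. This is only an illustration under stated assumptions: the data frame, the `replace` column, the `names_in_data` vector, and every value are hypothetical, and it does not use or reproduce the project's own helper functions.

```r
# Hypothetical taxonomic updates for one dataset. Only `find` is named in the
# commits above; `replace` and all values here are illustrative assumptions.
taxonomic_updates <- data.frame(
  find = c("Acacia sp. A", "Banksia integrifola", "Banksia integrifola", "Eucalyptus sp. B"),
  replace = c("Acacia", "Banksia integrifolia", "Banksia integrifolia", "Eucalyptus"),
  stringsAsFactors = FALSE
)

# Hypothetical original names as they appear in the dataset's data.csv.
names_in_data <- c("Acacia sp. A", "Banksia integrifola")

# Updates whose `find` never occurs in the data: candidates for removal,
# to be reviewed before deleting.
orphaned <- taxonomic_updates[!taxonomic_updates$find %in% names_in_data, , drop = FALSE]

# Drop repeated `find` values, keeping the first occurrence of each.
deduplicated <- taxonomic_updates[!duplicated(taxonomic_updates$find), , drop = FALSE]

# Updates that survive both checks.
cleaned <- deduplicated[deduplicated$find %in% names_in_data, , drop = FALSE]

print(orphaned)
print(cleaned)
```

Listing the orphaned rows before deleting them keeps the manual review step that the commit messages ask for.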
yangsophieee and others added 22 commits November 17, 2023 17:52
removing taxonomic updates that don't match a taxon name in the data file - strange artifacts from long ago. Removing large numbers of these has no effect on taxon list rebuilding
...but now austraits$traits suddenly has 6214 fewer entries, compared with a few commits ago... will compare
- Lots of excluded data from mistyped substitutions
- Also taxonomic_updates
- many of these unneeded names exist because the same list of taxon updates was used for all WAH studies (& then Wenk_2022), even though some taxa didn't have data scored in some datasets
- this was the first set of "unneeded" taxon removals that deleted ~10-15 taxon names from taxon_list because of the deleted substitutions - these were taxon names for which there is data. Temporarily reinstating.
- all tests active again, all passing
- all original names in taxonomic_updates now match to a taxon name (first time ever!)
- bits of other cleaning
1 final change
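One way to track down a drop in row count like the one noted in these commits is to compare per-dataset row counts between two builds. The sketch below uses base R only; `traits_before`, `traits_after`, the `dataset_id` column, and all values are assumptions standing in for two versions of `austraits$traits`, not the approach actually used in this PR.

```r
# Hypothetical stand-ins for austraits$traits from two builds. The
# `dataset_id` column and all values are assumptions for illustration.
traits_before <- data.frame(
  dataset_id = c("Study_A", "Study_A", "Study_A", "Study_B", "Study_B"),
  stringsAsFactors = FALSE
)
traits_after <- data.frame(
  dataset_id = c("Study_A", "Study_A", "Study_B", "Study_B"),
  stringsAsFactors = FALSE
)

# Count rows per dataset in one traits table.
count_rows <- function(traits) {
  as.data.frame(table(dataset_id = traits$dataset_id),
                responseName = "n", stringsAsFactors = FALSE)
}

# Join the two sets of counts; a dataset missing from one build gets NA.
comparison <- merge(count_rows(traits_before), count_rows(traits_after),
                    by = "dataset_id", all = TRUE,
                    suffixes = c("_before", "_after"))
comparison[is.na(comparison)] <- 0
comparison$change <- comparison$n_after - comparison$n_before

# Datasets whose row count differs between builds.
changed <- comparison[comparison$change != 0, , drop = FALSE]
print(changed)
```

Narrowing the difference to specific datasets makes it easier to link lost rows to a particular substitution or taxonomic update that was removed.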
Member

@dfalster dfalster left a comment

Can you add a comment in the PR explaining why so many changes were made?

@ehwenk ehwenk merged commit 65145c5 into develop Nov 19, 2023
1 check passed
@ehwenk ehwenk deleted the minor-metadata-fixes branch November 19, 2023 06:46
3 participants