Minor metadata fixes (#788)
A number of new tests in traits.build captured metadata inconsistencies, which led to fixes in austraits.build metadata files and a few refinements to the tests.

* removed many taxonomic_updates that were no longer being used: a mix of duplicates and "artifacts" from long ago, arising when 1) additional updates were added for a secondary fix, rather than overwriting the first one; 2) fixes were made to standardised names rather than truly original names; and 3) for some of the flora scraping studies, the same list of taxonomic updates was used for all studies from the same flora, although some taxa didn't have any entries in one of the 2-3 datasets.
* changed process.R in traits.build so that `excluded_observations` looks for `original_name`, not the standardised name, and changed the affected `exclude_observations` entries in the metadata accordingly
* found and fixed a few values in basis_of_record that were delimited by commas where there should have been spaces
* removed duplicate substitutions in ANBG_2019
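
The `excluded_observations` change can be illustrated with a minimal base R sketch. This is not the process.R code from traits.build: the example data, the standardised name shown, and the helper `flag_excluded()` are all hypothetical. It only shows why an exclusion string needs to be compared against `original_name`: once a name has been standardised, the string recorded in the metadata may no longer match.

```r
# Hypothetical example data: names as they appear in data.csv (original_name)
# and after taxonomic_updates have been applied (taxon_name).
traits <- data.frame(
  original_name = c("Platanus xhispanica 'Acerifolia'", "Opuntia tomentosa"),
  taxon_name    = c("Platanus x hispanica", "Opuntia tomentosa"),
  stringsAsFactors = FALSE
)

# An exclude_observations `find` value: a comma-separated string of names.
exclude_find <- "Platanus xhispanica 'Acerifolia', Anacolosa sp."

# Hypothetical helper: split the `find` string on commas, trim whitespace,
# and flag rows whose value in `column` is on the exclusion list.
flag_excluded <- function(data, find, column) {
  targets <- trimws(strsplit(find, ",", fixed = TRUE)[[1]])
  data[[column]] %in% targets
}

# Matching on the original name flags the cultivar row.
excluded_by_original <- flag_excluded(traits, exclude_find, "original_name")

# Matching on the standardised name flags nothing, so the row would slip through.
excluded_by_standardised <- flag_excluded(traits, exclude_find, "taxon_name")
```

Under this assumption, any `find` string that had been written against a standardised name has to be rewritten to the name as it appears in the raw data.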

In the process of removing duplicate taxonomic_updates, kept rebuilding taxon_list & checking taxonomic_updates tibble. Taxon_list changed minimally and in expected ways.
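
A check of this kind might look like the following base R sketch. The two objects are hypothetical stand-ins for one dataset's `taxonomic_updates` and the original names in its data.csv; the actual checks were run against the compiled tibble and are not shown in this commit.

```r
# Hypothetical taxonomic_updates for one dataset (find/replace pairs).
taxonomic_updates <- data.frame(
  find    = c("Myrsine maccomishii", "Myrsine maccomishii", "Marsdenia hemi-ptera"),
  replace = c("Myrsine mccomishii", "Myrsine mccomishii", "Marsdenia hemiptera"),
  stringsAsFactors = FALSE
)

# Hypothetical original names present in that dataset's data.csv.
original_names <- c("Myrsine maccomishii", "Opuntia tomentosa")

# Updates whose `find` value repeats an earlier entry (candidate duplicates).
duplicate_updates <- taxonomic_updates[duplicated(taxonomic_updates$find), ]

# Updates whose `find` value never occurs among the original names
# (candidates for removal as unused).
unused_updates <- taxonomic_updates[!(taxonomic_updates$find %in% original_names), ]
```

Rows in `duplicate_updates` and `unused_updates` are only candidates; each still needs a manual look before it is deleted from metadata.yml.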

At the end:
* all original names in taxonomic_updates now match to a taxon name (first time ever!)
* `combined_table` had same number of rows as `austraits$traits` (i.e. no duplication)
* all tests pass - this also means all datasets pivot
* checked alignments in the taxa table carefully, filtering on different `taxon_rank` and `taxonomic_dataset` values, and they look good
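
The row-count check can be sketched in base R as follows. How `combined_table` was actually built is not stated here; this sketch assumes it is a join of `austraits$traits` to the taxa table on `taxon_name`, and the small `austraits` list and its values are hypothetical stand-ins for the compiled database.

```r
# Hypothetical stand-in for the compiled database.
austraits <- list(
  traits = data.frame(
    taxon_name = c("Opuntia polyacantha", "Salvinia x molesta", "Opuntia polyacantha"),
    trait_name = c("plant_growth_form", "life_history", "life_history"),
    stringsAsFactors = FALSE
  ),
  taxa = data.frame(
    taxon_name = c("Opuntia polyacantha", "Salvinia x molesta"),
    taxon_rank = c("species", "species"),
    stringsAsFactors = FALSE
  )
)

# Assumed construction: left join of traits onto taxa. A join adds rows when
# the lookup table has repeated keys, so an unchanged row count indicates
# that no trait row matched more than one taxon.
combined_table <- merge(austraits$traits, austraits$taxa,
                        by = "taxon_name", all.x = TRUE)

stopifnot(nrow(combined_table) == nrow(austraits$traits))

# Every trait row found a taxon, i.e. no unmatched names.
stopifnot(!anyNA(combined_table$taxon_rank))
```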
---------

Co-authored-by: yangsophieee <[email protected]>
ehwenk and yangsophieee authored Nov 19, 2023
1 parent 915df00 commit 65145c5
Showing 85 changed files with 1,423 additions and 16,327 deletions.
56 changes: 32 additions & 24 deletions config/taxon_list.csv

Large diffs are not rendered by default.

17 changes: 7 additions & 10 deletions data/ABRS_1981/metadata.yml
@@ -28,7 +28,7 @@ dataset:
units = ifelse(units %in% c("m", "cm"), "mm", units),
value = as.character(value)
);
data_categorical_other <- data %>%
filter(is.na(units) & !(trait %in% c("longevity", "flowering time start", "flowering time end"))
) %>%
@@ -44,14 +44,14 @@ dataset:
left_join(filter(data, trait == "flowering time start") %>%
select(species_name, i), by = "species_name");
data_longevity <- data %>%
data_longevity <- data %>%
filter( !is.na(value) & trait == "longevity") %>%
mutate(trait = "leaf_phenology",
value = ifelse(value %in% c("EP", "E", "EAP", "EPB", "EB", "EAB", "EA"), "evergreen", value),
value = ifelse(value %in% c("DP", "D", "DPB"), "deciduous", value)) %>%
filter(value %in% c("evergreen", "deciduous"));
data_lifehistory <- data %>%
data_lifehistory <- data %>%
filter( !is.na(value) & trait == "longevity" &
value %in% c("EP", "P", "DP", "EDP", "A", "EA", "EB", "B","AB", "EAB", "AP", "EAP", "EPB", "DPB", "PB")) %>%
mutate(trait = "life_history", value = ifelse(value %in% c("EP", "P", "DP", "EDP"), "perennial", value),
@@ -70,7 +70,7 @@ dataset:
value == "HA" ~ "aquatic")
);
bind_rows(data_numeric, data_categorical_other, data_flowering,
bind_rows(data_numeric, data_categorical_other, data_flowering,
data_longevity, data_lifehistory, data_substrate) %>%
arrange(i) %>%
select(-i) %>%
@@ -87,7 +87,7 @@ dataset:
group_by(species_name, trait) %>%
distinct(value, .keep_all = TRUE) %>%
ungroup()
'
'
collection_date: unknown/2015
taxon_name: species_name
trait_name: trait
@@ -517,8 +517,5 @@ taxonomic_updates:
reason: match_14. Automatic alignment with species-level canonical name in APC accepted
when notes are ignored (2022-11-12)
taxonomic_resolution: species
exclude_observations:
- variable: taxon_name
find: Platanus x hispanica 'Acerifolia'
reason: excluding cultivars
exclude_observations: .na
questions: .na
2 changes: 1 addition & 1 deletion data/ABRS_2022/data.csv
@@ -12250,7 +12250,7 @@ Opuntia,Opuntia microdasys," spreading shrub to 1 m high, often somewhat creepin
Opuntia,Opuntia polyacantha var. hystricina, ,perennial,inferred_from_family,,,shrub,inferred_from_species,,,,,,,woody,
Opuntia,Opuntia," succulent, erect or spreading shrubs or trees, sometimes creeping, often many-branched from the base; ",perennial,inferred_from_family,,,shrub tree,shrubs trees,,,,,erect creeping spreading,erect creeping spreading,woody,
Opuntia,Opuntia leucotricha," erect shrub 3-4 m high, to 4 m diam., trunk well-developed, to 1 m high and 15-30 cm wide in older plants. ",perennial,inferred_from_family,,,shrub,shrub,,,,,erect,erect,woody,
Opuntia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt, ,perennial,inferred_from_family,,,herb,inferred_from_life_history,,,,,,,herbaceous,
Opuntia,Opuntia polyacantha var. erinacea (Engelm . & J.M.Bigelow) B.D.Parfitt, ,perennial,inferred_from_family,,,herb,inferred_from_life_history,,,,,,,herbaceous,
Opuntia,Opuntia tomentosa," erect shrub or treelike, to 8 m high. ",perennial,inferred_from_family,,,shrub palmoid,shrub treelike,,,,,erect,erect,woody,
Opuntia,Opuntia ficus-indica," erect shrub to 5 m high and 8 m wide, older plants with well-developed trunk to 1.2 m high and 1 m diam. ",perennial,inferred_from_family,,,shrub,shrub,,,,,erect,erect,woody,
Oraniopsis,Oraniopsis," solitary, erect, dioecious palm. ",perennial,inferred_from_growth_form,,,palmoid,palm,,,dioecious,dioecious,erect,erect,woody,inferred_from_growth_form
70 changes: 4 additions & 66 deletions data/ABRS_2022/metadata.yml
@@ -24,9 +24,9 @@ dataset:
data %>%
filter(str_detect(taxon_name, " ")) %>%
mutate(
woodiness_a = stringr::str_replace(woodiness_a, "soft_wood","semi_woody")
woodiness_a = stringr::str_replace(woodiness_a, "soft_wood", "semi_woody")
)
'
'
collection_date: unknown/2022
taxon_name: taxon_name
description: Plant growth form data extracted from the Flora of Australia online
@@ -139,10 +139,6 @@ taxonomic_updates:
replace: Aristida ramosa x Aristida vagans
reason: Manual alignment with canonical species name in APC (Elizabeth Wenk, 2022-11-22)
taxonomic_resolution: species
- find: Blechnum pennamarina subsp. alpina
replace: Blechnum penna-marina subsp. alpina
reason: Manual alignment with canonical species name in APC (Elizabeth Wenk, 2022-11-21)
taxonomic_resolution: subspecies
- find: Brassica juncea
replace: Brassica x juncea
reason: match_14. Automatic alignment with species-level canonical name in APC accepted
@@ -173,10 +169,6 @@ taxonomic_updates:
reason: match_05. Automatic alignment with scientific name in APC accepted list
(2022-11-22)
taxonomic_resolution: species
- find: Crepidomanes aphleboides
replace: Crepidomanes aphlebioides
reason: match_07_fuzzy. Fuzzy alignment with accepted canonical name in APC (2022-11-21)
taxonomic_resolution: species
- find: Dactyliophora novaeguineae
replace: Dactyliophora novae-guineae
reason: match_07_fuzzy. Fuzzy alignment with accepted canonical name in APC (2022-11-22)
@@ -327,10 +319,6 @@ taxonomic_updates:
replace: Lindsaea terrae-reginae
reason: match_07_fuzzy. Fuzzy alignment with accepted canonical name in APC (2022-11-21)
taxonomic_resolution: species
- find: Marsdenia hemi-ptera
replace: Marsdenia hemiptera
reason: match_07_fuzzy. Fuzzy alignment with known canonical name in APC (2022-11-21)
taxonomic_resolution: species
- find: Melicytus novae-zelandae subsp. centurionis
replace: Melicytus novae-zelandiae subsp. centurionis
reason: match_07_fuzzy. Fuzzy alignment with accepted canonical name in APC (2022-11-22)
@@ -339,15 +327,6 @@ taxonomic_updates:
replace: Mesua sp. (Boonjee)
reason: match_08. Automatic alignment with synonymous name in APNI (2022-11-22)
taxonomic_resolution: species
- find: Monimiaceae Q3 (Mt. Hemmant)
replace: Monimiaceae Q3 (Mt. Hemmant) sp. Q1 (Mt. Hemmant)
reason: match_12. Automatic alignment with infraspecific canonical name in APC known
names when notes are ignored (2022-11-21)
taxonomic_resolution: species
- find: Myrsine maccomishii
replace: Myrsine mccomishii
reason: match_07_fuzzy. Fuzzy alignment with accepted canonical name in APC (2022-11-21)
taxonomic_resolution: species
- find: Neoalsomitra sp. A ( B.Hyland 10923)
replace: Neoalsomitra sp. A (B.Hyland 10923)
reason: match_07_fuzzy. Fuzzy alignment with known canonical name in APC (2022-11-22)
@@ -356,12 +335,7 @@ taxonomic_updates:
replace: Oligochaetochilus macrocalymma
reason: match_07_fuzzy. Fuzzy alignment with known canonical name in APC (2022-11-22)
taxonomic_resolution: species
- find: Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt
replace: Opuntia polyacantha
reason: match_14. Automatic alignment with species-level canonical name in APC accepted
when notes are ignored (2022-11-22)
taxonomic_resolution: species
- find: Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt
- find: Opuntia polyacantha var. erinacea (Engelm . & J.M.Bigelow) B.D.Parfitt
replace: Opuntia polyacantha
reason: Exact match of the first two words of the taxon name to an APC-accepted
canonical name (2023-11-02)
@@ -401,10 +375,6 @@ taxonomic_updates:
replace: Pterostylis x ralphcranei
reason: match_07_fuzzy. Fuzzy alignment with accepted canonical name in APC (2022-11-22)
taxonomic_resolution: species
- find: Rubiaceae Gen. Nov. sp. (Shute Harbour DAH Q811)
replace: Gynochthodes retropila
reason: Manual alignment with APC accepted name (E. Wenk, 2023-06-16)
taxonomic_resolution: species
- find: Salvinia molesta
replace: Salvinia x molesta
reason: match_14. Automatic alignment with species-level canonical name in APC accepted
@@ -481,21 +451,6 @@ taxonomic_updates:
reason: match_14. Automatic alignment with species-level canonical name in APC accepted
when notes are ignored (2022-11-22)
taxonomic_resolution: species
- find: x Cynochloris macivorii
replace: x Cynochloris macivorii
reason: match_06. Automatic alignment with synonymous term among accepted canonical
names in APC (2022-11-22)
taxonomic_resolution: species
- find: x Cynochloris reynoldensis
replace: x Cynochloris reynoldensis
reason: match_06. Automatic alignment with synonymous term among accepted canonical
names in APC (2022-11-22)
taxonomic_resolution: species
- find: x Glossadenia x tutelata
replace: Glossodia x tutelata
reason: match_06. Automatic alignment with synonymous term among known canonical
names APC (2022-11-21)
taxonomic_resolution: species
- find: xCyanthera glossodioides
replace: x Cyanthera glossodioides
reason: Manual alignment with canonical species name in APC (E. Wenk, 2023-11-02)
@@ -505,28 +460,11 @@ taxonomic_updates:
reason: Manual alignment with canonical species name in APC (E. Wenk, 2023-11-02)
taxonomic_resolution: species
exclude_observations:
- variable: taxon_name
find: Acacioides Group, Adenotricha Group, Aethiopicum Group, Agrifolia Group, Aspera
Group, Bulbiferum Group, Buxifolia Group, Capitisyork Group, Ceratophylla Group,
Cirsiifolia Group, Clavata Group, Cristata Group, Cucullata Group, Eriantha Group,
Eryngioides Group, Goodii Group, Hakeoides Group, Heliosperma Group, Hilliana
Group, Huegelii Group, Incrassata Group, Integrifolia Group, Linearifolia Group,
Linearis Group, Lissocarpha Group, Longistyla Group, Lorea Group, Marriottii Group,
Megalosperma Group, Microcarpa Group, Multilineata Group, Nodosa Group, Obliqua
Group, Obtusatum Group, Oncogyne Group, Parvum Group, Petiolaris Group, Petrophiloides
Group, Polyodon Group, Pteridifolia Group, Pythara Group, Quercifolia Group, Robusta
Group, Rudis Group, Ruscifolia Group, Salicifolia Group, Shiressii Group, Simplicifrons
Group, Strumosa Group, Thelemanniana Group, Trifida Group, Trifurcata Group, Triloba
Group, Trineura Group, Ulicina Group, Undulata Group, Wickhamii Group
reason: not species in APC/APNI
- variable: taxon_name
find: Platanus xhispanica 'Acerifolia'
reason: excluding cultivars
- variable: taxon_name
find: Asplenium lobulatum Mett. ex Kuhn, Cerastium pyrenaicum J.Gay, Passiflora miniata, Quercus ilex,
Selenicereus undatus, Silene gracilis DC.
reason: excluding non-native, non-naturalised species
- variable: taxon_name
find: Anacolosa sp., x Cynochloris, Sphaerocionium
find: Anacolosa sp., x Cynochloris
reason: excluding genera
questions: .na
10 changes: 5 additions & 5 deletions data/ABRS_2023/data.csv
@@ -8878,7 +8878,7 @@ Flora_of_Australia,Opuntia microdasys,fruit_dehiscence,,indehiscent,inferred_fro
Flora_of_Australia,Opuntia monacantha,fruit_dehiscence,,indehiscent,inferred_from_genus,,mode
Flora_of_Australia,Opuntia polyacantha,fruit_dehiscence,,indehiscent,inferred_from_genus,,mode
Flora_of_Australia,Opuntia polyacantha var. hystricina,fruit_dehiscence,,indehiscent,inferred_from_genus,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,fruit_dehiscence,,indehiscent,inferred_from_genus,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,fruit_dehiscence,,indehiscent,inferred_from_genus,,mode
Flora_of_Australia,Opuntia puberula,fruit_dehiscence,,dehiscent,splitting,,mode
Flora_of_Australia,Opuntia robusta,fruit_dehiscence,,indehiscent,inferred_from_genus,,mode
Flora_of_Australia,Opuntia schickendantzii,fruit_dehiscence,,indehiscent,inferred_from_genus,,mode
@@ -43769,7 +43769,7 @@ Flora_of_Australia,Opuntia microdasys,leaf_arrangement,,spiral,inferred_from_fam
Flora_of_Australia,Opuntia monacantha,leaf_arrangement,,spiral,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha,leaf_arrangement,,spiral,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. hystricina,leaf_arrangement,,spiral,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,leaf_arrangement,,spiral,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,leaf_arrangement,,spiral,inferred_from_family,,mode
Flora_of_Australia,Opuntia puberula,leaf_arrangement,,spiral,inferred_from_family,,mode
Flora_of_Australia,Opuntia robusta,leaf_arrangement,,spiral,inferred_from_family,,mode
Flora_of_Australia,Opuntia schickendantzii,leaf_arrangement,,spiral,inferred_from_family,,mode
@@ -54404,7 +54404,7 @@ Flora_of_Australia,Opuntia microdasys,leaf_compoundness,,simple,inferred_from_fa
Flora_of_Australia,Opuntia monacantha,leaf_compoundness,,simple,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha,leaf_compoundness,,simple,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. hystricina,leaf_compoundness,,simple,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,leaf_compoundness,,simple,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,leaf_compoundness,,simple,inferred_from_family,,mode
Flora_of_Australia,Opuntia puberula,leaf_compoundness,,simple,inferred_from_family,,mode
Flora_of_Australia,Opuntia robusta,leaf_compoundness,,simple,inferred_from_family,,mode
Flora_of_Australia,Opuntia schickendantzii,leaf_compoundness,,simple,inferred_from_family,,mode
@@ -89520,7 +89520,7 @@ Flora_of_Australia,Opuntia microdasys,leaf_margin,,entire,inferred_from_family,,
Flora_of_Australia,Opuntia monacantha,leaf_margin,,entire,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha,leaf_margin,,entire,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. hystricina,leaf_margin,,entire,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,leaf_margin,,entire,inferred_from_family,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,leaf_margin,,entire,inferred_from_family,,mode
Flora_of_Australia,Opuntia puberula,leaf_margin,,entire,inferred_from_family,,mode
Flora_of_Australia,Opuntia robusta,leaf_margin,,entire,inferred_from_family,,mode
Flora_of_Australia,Opuntia schickendantzii,leaf_margin,,entire,inferred_from_family,,mode
@@ -189437,7 +189437,7 @@ Flora_of_Australia,Opuntia microdasys,plant_photosynthetic_organ,,cladode,manual
Flora_of_Australia,Opuntia monacantha,plant_photosynthetic_organ,,cladode,manual,,mode
Flora_of_Australia,Opuntia polyacantha,plant_photosynthetic_organ,,cladode,manual,,mode
Flora_of_Australia,Opuntia polyacantha var. hystricina,plant_photosynthetic_organ,,cladode,manual,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,plant_photosynthetic_organ,,cladode,manual,,mode
Flora_of_Australia,Opuntia polyacantha var. erinacea (Engelm. & J.M.Bigelow) B.D.Parfitt,plant_photosynthetic_organ,,cladode,manual,,mode
Flora_of_Australia,Opuntia puberula,plant_photosynthetic_organ,,cladode,manual,,mode
Flora_of_Australia,Opuntia robusta,plant_photosynthetic_organ,,cladode,manual,,mode
Flora_of_Australia,Opuntia schickendantzii,plant_photosynthetic_organ,,cladode,manual,,mode