
Commit

Documentation: Update readme, contributing guide, remove website and vigenettes (#776)

* Remove vignettes, docs and website, now that traits.build has more developed documentation.
* Enhance Readme & contributing page

---------

Co-authored-by: ehwenk <[email protected]>
dfalster and ehwenk authored Oct 10, 2023
1 parent bfcbe7d commit be866d6
Showing 139 changed files with 162 additions and 36,938 deletions.
80 changes: 75 additions & 5 deletions .github/CONTRIBUTING.md
@@ -1,3 +1,5 @@


# Contributing to austraits.build

We envision AusTraits as an on-going collaborative community resource that:
@@ -7,10 +9,78 @@ We envision AusTraits as an on-going collaborative community resource that:
3. Aspires to fully transparent and reproducible research of highest standard, and
4. Builds a sense of community among contributors and users.

We'd love for you to contribute. You can read more about the ways you can contribute on our website.
We'd love for you to contribute. You can read more about the ways you can contribute below.

- [Contributing new data](#contributing-new-data)
- [Improving data quality and reporting errors](#improving-data-quality-and-reporting-error)
- [Improving documentation](#improving-documentation)
- [Development of `traits.build` package workflow](#development-of-traitsbuild-workflow)


Please note that the AusTraits project has adopted a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this project you agree to abide by its terms.

## Improving data quality and reporting error

All users can contribute to continual improvement by reporting the issues they encounter.

If you notice a possible error in AusTraits, please [post an issue on GitHub](https://github.com/traitecoevo/austraits.build/issues). If you can, please provide code illustrating the problem.
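
A short, self-contained example makes an issue much easier to act on. The sketch below shows one possible shape for such an example. It rests on assumptions: the toy table only mimics a long-format traits table, the column names (`dataset_id`, `taxon_name`, `trait_name`, `value`, `unit`) may not match the current release, and both the dataset identifier and the helper `flag_out_of_range()` are hypothetical.

```r
# Toy stand-in for a few rows of a traits table (hypothetical IDs and values).
traits <- data.frame(
  dataset_id = c("Example_2020", "Example_2020", "Example_2020"),
  taxon_name = c("Acacia longifolia", "Banksia serrata", "Eucalyptus saligna"),
  trait_name = c("leaf_area", "leaf_area", "leaf_area"),
  value      = c("1250", "4100", "-30"),  # assumed to be stored as text
  unit       = c("mm2", "mm2", "mm2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the rows of one trait whose numeric value
# lies outside the interval [min, max].
flag_out_of_range <- function(traits, trait, min, max) {
  rows <- traits[traits$trait_name == trait, ]
  numeric_value <- suppressWarnings(as.numeric(rows$value))
  rows[!is.na(numeric_value) & (numeric_value < min | numeric_value > max), ]
}

# A negative leaf area is physically impossible, so it is worth reporting.
suspect <- flag_out_of_range(traits, "leaf_area", min = 0, max = 1e6)
print(suspect)
```

Pasting a snippet like this into the issue, along with the version of AusTraits you used, lets the curators reproduce the problem directly.
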
## Improving documentation

All users can contribute to the continual improvement of AusTraits documentation by letting us know which parts of it are unclear.

If you have a suggestion, please [post an issue on GitHub](https://github.com/traitecoevo/austraits.build/issues).
## Development of `traits.build` workflow

AusTraits uses the `traits.build` package to harmonise different sources. Interested users can help us develop this package at its GitHub repository, <https://github.com/traitecoevo/traits.build/>.

## Contributing new data {#data}

We gladly accept new data contributions to AusTraits, including recently collected trait data, legacy trait data from your file archives, transcribed reference works, and transcribed datasets from the literature.

If you would like to contribute data, the requirements are:

- Data was collected for Australian plant species growing in Australia
- You collected data on one of the traits listed in the [trait definitions table](http://traitecoevo.github.io/austraits.build/articles/trait_definitions.html)
- You are willing to release the data under an open license for reuse by the scientific community
- You make it as easy as possible for us to incorporate your data by following the instructions.

### What do I need to do?

The AusTraits curators will merge each dataset into AusTraits. For each study we carefully check that units are accurate, continuous trait values fall within the expected range, categorical trait values map onto sensible terms, location data are accurate, taxon names are aligned to current standards, and all metadata are recorded.



As a first step, all we really require is a **Data Spreadsheet** and a copy of your **Manuscript**.

After completing a series of quality checks, we will send you a report to review that summarises the data and metadata. The reports include plots for each continuous trait, comparing values in your submission to those already in AusTraits. It plots your study locations (sites) on a map. It summarises your metadata and indicates the taxonomic alignments made. The report includes both targeted questions (sometimes) and automated questions, acting as prompts to review aspects of the report. Reviewing your report should not take long, and confirms the transparent, thorough process used to build AusTraits.

### Data

**Your dataset, preferably in a spreadsheet format.**

* **Traits:** Make sure the trait names used in your dataset are easy to interpret or, alternatively, provide a brief definition
* **Units:** Please make sure the units for each trait are provided as part of the trait name or in a separate spreadsheet/worksheet
* **Value type:** We prefer to incorporate raw values (or individual means) in AusTraits, but can use population or multi-site means if that is what is available. For mean values, please provide sample size.
* **Location:** For field studies, please provide location details (see more below).
* **Context:** Optional, but AusTraits can read in one (or more) column(s) with contextual information, such as canopy position, experimental manipulation, dry vs. wet season, etc.
* **Collection date:** Optional, but AusTraits can read in a column with sampling date (in any format)
* **Species/taxa:** Please provide complete species names or a look-up table to match species codes. Out-dated taxonomy is fine – we have name-matching algorithms.
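
The points above can be sketched as a small example table. This is an illustration only, not a required template: every column name and value below is hypothetical, chosen to show units carried in trait column names, one context column, a date column, and full species names.

```r
# Hypothetical data spreadsheet: one row per measured individual.
# Column names and values are illustrative, not a prescribed format.
dataset <- data.frame(
  species            = c("Acacia longifolia", "Acacia longifolia", "Banksia serrata"),
  location           = c("Site A", "Site B", "Site A"),
  collection_date    = c("2020-10-14", "2020-11-02", "2020-10-14"),
  canopy_position    = c("sun", "shade", "sun"),  # context column
  leaf_area_mm2      = c(1250, 1430, 4100),       # units given in the column name
  wood_density_g_cm3 = c(0.71, 0.69, 0.58),
  stringsAsFactors = FALSE
)

# Write to CSV so the table opens in any spreadsheet program.
out_file <- file.path(tempdir(), "example_dataset.csv")
write.csv(dataset, out_file, row.names = FALSE)
```

If your values are means rather than raw measurements, an extra column with the sample size for each row would cover the request under **Value type**.
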

### Metadata

The AusTraits structure has fields to input all metadata associated with your study, including methods, location details, and context. In detail:
* **Methods:** For published studies the necessary methods and study information can be extracted from a publication; just attach a copy of the manuscript or the DOI.
- The only commonly missing information is the general sampling period, such as ‘October-December 2020’; this is only required if your data file doesn't have a date column.
- For unpublished studies, provide brief methods for how each trait was measured; you can simply refer to a standard published protocol
* **Study locations:** Whenever possible, AusTraits includes location names, location coordinates (latitude/longitude), and any other location properties you have measured/recorded (vegetation description, soil chemistry, climate data, etc.). This information can be provided as a second spreadsheet or as additional columns in the main data spreadsheet. Just make sure the location name is the same in both spreadsheets.
* **Context:** If your study includes contextual variables, make sure the context values are included as columns in the data spreadsheet. Also, please make sure the contextual values are self-explanatory or provide the necessary explanation.
* **Authors:** Authorship is extended to anyone who played a key intellectual role in the experimental design and data collection. Most studies have 1-3 authors. For each author, please provide a **name**, **institutional affiliation**, **email address**, and their **ORCID** (if available). Please nominate a single contributor to be the dataset's point of contact; this person's email will not be listed in the metadata file, but is the person future AusTraits users are likely to seek out if they have questions. Additional field assistants can be listed.
* **Source:** The published manuscript is generally the source. If different traits or observations from a single dataset were published separately, please provide both references. If the dataset you are submitting is a compilation from many sources, please provide a complete list of sources and indicate which rows of data are attributable to which source.
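
Mismatched location names between the two spreadsheets are easy to miss by eye. A minimal check, assuming both sheets have been read into R and share a column named `location` (a hypothetical name, as are all the values below), might look like this:

```r
# Hypothetical data sheet and locations sheet; names and coordinates are made up.
data_sheet <- data.frame(
  species  = c("Acacia longifolia", "Banksia serrata", "Banksia serrata"),
  location = c("Site A", "Site B", "site a"),  # note the inconsistent capitalisation
  stringsAsFactors = FALSE
)
locations <- data.frame(
  location  = c("Site A", "Site B"),
  latitude  = c(-33.77, -33.60),
  longitude = c(151.11, 150.75),
  stringsAsFactors = FALSE
)

# Location names used in the data sheet that have no match in the locations sheet.
unmatched <- setdiff(unique(data_sheet$location), locations$location)
print(unmatched)
```

An empty result means every location in the data has a matching row of coordinates and properties.
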


### Common hang-ups

### Code of Conduct
Some common issues with contributions include:

Please note that the austraits project is released with a
[Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this
project you agree to abide by its terms.
* **Categorical trait values:** If you have categorical traits, please define any trait values (i.e. entries for that trait) that are not self-explanatory. A copy of our definitions file, including allowable values for each trait is available [here](http://traitecoevo.github.io/austraits.build/articles/trait_definitions.html). The definitions file is a work-in-progress and additional trait values can be added if needed to capture the exact meaning you intended.
* **Data sourced from others:** For numerical traits, AusTraits strives to only include data collected by you for this project, to avoid having multiple entries of the same measurement/observation. If you have certain trait values that were sourced from the literature, an online database, or colleagues, please indicate that clearly. If trait values for some species were collected by you and others were sourced, it is very helpful if you could add a column to your spreadsheet that indicates the source for different rows of data.
2 changes: 2 additions & 0 deletions .gitignore
@@ -13,6 +13,7 @@ temp
.local/
.config/
.vs/
man/*
*.Rproj
tmp*
reports
@@ -27,6 +28,7 @@ waiting_to_build
ignore
inst/doc
doc
docs
Meta
config/APC/*
config/NSL/*
131 changes: 85 additions & 46 deletions README.md
@@ -4,67 +4,106 @@
<!-- badges: start -->
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.3568417.svg)](https://doi.org/10.5281/zenodo.3568417)
[![build](https://github.com/traitecoevo/austraits.build/actions/workflows/check-build.yml/badge.svg)](https://github.com/traitecoevo/austraits.build/actions/workflows/check-build.yml)
[![R-CMD-check](https://github.com/traitecoevo/austraits.build/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/traitecoevo/austraits.build/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/traitecoevo/austraits.build/branch/develop/graph/badge.svg)](https://app.codecov.io/gh/traitecoevo/austraits.build?branch=develop)
<!-- badges: end -->

<img src="docs/figures/logo.png">
![](inst/figures/logo.png)

AusTraits is a transformative database, containing measurements on the
traits of Australia’s plant species, standardised from hundreds of
disconnected primary sources. So far, data have been assembled from \> 250
distinct sources, describing more than 400 plant traits and over 25k
taxa. The dataset and approach are documented in detail in the following publication:
AusTraits is a transformative database, containing measurements on the traits of Australia’s plant species, standardised from hundreds of disconnected primary sources. So far, data have been assembled from \> 300 distinct sources, describing > 500 plant traits for > 25k taxa. The dataset and approach are documented in detail in the following publication:

> Falster D, Gallagher R, Wenk, E et al. (2021) AusTraits, a curated plant trait
database for the Australian flora. Scientific Data 8: 254.
DOI: [10.1038/s41597-021-01006-6](http://doi.org/10.1038/s41597-021-01006-6)
> Falster D, Gallagher R, Wenk, E et al. (2021) AusTraits, a curated plant trait database for the Australian flora. Scientific Data 8: 254. DOI: [10.1038/s41597-021-01006-6](http://doi.org/10.1038/s41597-021-01006-6)
Those interested in simply using data from AusTraits should download the
compiled resource from the versioned releases archived on Zenodo at doi:
[10.5281/zenodo.3568417](https://doi.org/10.5281/zenodo.3568417).
The repo contains the data for rebuilding AusTraits, while the workflow to rebuild the dataset is on the [traits.build repo](https://github.com/traitecoevo/traits.build).

AusTraits is continually evolving, as new datasets are contributed. As such, there is no single canonical version. We are continually making new versions available. Over time, we expect that different versions will be released and used in different analyses.

## Accessing data

Those interested in simply using data from AusTraits should download the compiled resource from the versioned releases archived on Zenodo at DOI: [10.5281/zenodo.3568417](https://doi.org/10.5281/zenodo.3568417).

Users will want to read up on the [database structure, described in the `traits.build` manual](https://traitecoevo.github.io/traits.build-book/database_structure.html).

Definitions for the traits are described in the AusTraits Plant Dictionary (APD):

- Formalised vocabulary at <http://w3id.org/APD/>
- Preprint: Wenk et al. (2023), DOI: [10.1101/2023.06.16.545047](https://doi.org/10.1101/2023.06.16.545047)

There you will also find detailed information regarding appropriate use
of AusTraits. Further information about the AusTraits project is available at the project
website [austraits.org](https://austraits.org).
## Citation

Users of AusTraits are requested to cite the source publication, which documents the dataset and approach:

> Falster D, Gallagher R, Wenk, E et al. (2021) AusTraits, a curated plant trait database for the Australian flora. Scientific Data 8: 254. DOI: [10.1038/s41597-021-01006-6](http://doi.org/10.1038/s41597-021-01006-6)
## Rebuilding AusTraits from source

This repository (`austraits.build`) contains the raw data and code used to compile AusTraits from diverse, original sources.
This repository (`austraits.build`) contains the raw data and code used to compile AusTraits from diverse, original sources.

To handle the harmonising of diverse data sources, we use a reproducible workflow to implement the various changes required for each source to reformat it into a form suitable for incorporation in AusTraits. Such changes include restructuring datasets, renaming variables, changing variable units, and changing taxon names. For the sake of transparency and continuing development, the entire workflow is made available here.

![](inst/figures/Workflow.png)

We use the [`traits.build`](https://traitecoevo.github.io/traits.build/) R package and workflow to harmonise > 300 different sources into a unified dataset. The workflow is fully-reproducible and open, meaning it exposes the decisions made in the processing of data into a harmonised and curated dataset and can also be rerun by others. AusTraits is built so that the database can be rebuilt from its parts at any time. This means that decisions made along the way (in how data is transformed or encoded) can be inspected and modified, and new data can be easily incorporated.

To build the database, follow these steps:

***Install `traits.build`***

To handle the harmonising of diverse data sources, we use a reproducible
workflow to implement the various changes required for each source to
reformat it suitable for incorporation in AusTraits. Such changes
include restructuring datasets, renaming variables, changing variable
units, changing taxon names. For the sake of transparency and continuing
development, the entire workflow is made available here.
The first step is to install a copy of [traits.build](https://github.com/traitecoevo/traits.build/):

AusTraits is continually evolving, as new datasets are contributed. As
such, there is no single canonical version. We are continually making
new versions available. Over time, we expect that different versions will
be released and used in different analyses.
```{r, eval=FALSE, echo=TRUE}
remotes::install_github("traitecoevo/traits.build", quick = TRUE)
```
***Clone repository***

Those interested in building AusTraits from source or contributing to AusTraits
should see further information at this
http://traitecoevo.github.io/austraits.build/articles/austraits.build.html
Next, download a copy of this repository from GitHub. Then open the RStudio project, or start R in the repository's root directory.

***Compile via `remake`***

One of the packages that will be installed with `traits.build` is [`remake`](https://github.com/richfitz/remake). This package manages the compiling, and also helps reduce the amount of recompiling needed when new sources are added.

Running the following command will rebuild AusTraits and save the assembled database into an RDS file located in `export/data/curr/austraits.rds`.

```{r, eval=FALSE, echo=TRUE}
remake::make()
austraits <- readRDS("export/data/curr/austraits.rds")
```

Remake can also load the compiled dataset directly into R by calling:

```{r, eval=FALSE, echo=TRUE}
austraits <- remake::make("austraits")
```
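
After a build, a quick sanity check is to confirm that the output file exists and to list its top-level components. The sketch below assumes only the export path given above and uses base R. It guards against the file being absent, so it is safe to run before any build has completed.

```r
# Path taken from the build step above; the file exists only after a successful build.
rds_path <- "export/data/curr/austraits.rds"

if (file.exists(rds_path)) {
  austraits <- readRDS(rds_path)
  # List the top-level components of the compiled object.
  print(names(austraits))
} else {
  message("No compiled database found at ", rds_path, "; run remake::make() first.")
}
```
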

## Contributing to AusTraits

We envision AusTraits as an ongoing collaborative community resource that:

1. Increases our collective understanding of the Australian flora
2. Facilitates the accumulation and sharing of trait data
3. Builds a sense of community among contributors and users
4. Aspires to be fully transparent and reproducible research of the highest standard.

We'd love for you to contribute to the project. Below are some ways you can contribute:

- Contributing new data
- Improving data quality and reporting errors
- Improving documentation
- Development of `traits.build` workflow

For details on how to contribute, please see the file [CONTRIBUTING.md](https://github.com/traitecoevo/austraits.build/blob/develop/.github/CONTRIBUTING.md).

The AusTraits project is released with a [Contributor Code of Conduct](https://github.com/traitecoevo/austraits.build/blob/develop/.github/CODE_OF_CONDUCT.md). By contributing to this project you agree to abide by its terms.
## Acknowledgements

**Funding**: This work was supported via the following funding sources:
fellowship grants from Australian Research Council to Falster
(FT160100113), Gallagher (DE170100208) and Wright (FT100100910), a grant
from Macquarie University to Gallagher, and grants from the [Australian
Research Data Commons (ARDC)](https://ardc.edu.au), via their
“Transformation data collections” [doi:
10.47486/TD044](https://doi.org/10.47486/TD044) and “Data Partnerships”
[doi: 10.47486/DP720](https://doi.org/10.47486/DP720) programs. The ARDC
is enabled by NCRIS.

**Recognition**: Many people have contributed to AusTraits. A list of contributors
is provided on Zenodo at doi:
**Funding**: This work was supported via the following investments:

- Investment (https://doi.org/10.47486/TD044, https://doi.org/10.47486/DP720) from the Australian Research Data Commons (ARDC). The ARDC is funded by the National Collaborative Research Infrastructure Strategy (NCRIS),
- Fellowships from the Australian Research Council to Falster (FT160100113), Gallagher (DE170100208) and Wright (FT100100910),
- A UNSW Research Infrastructure Grant to Falster, and
- A grant from Macquarie University to Gallagher.

**Recognition**: Many people have contributed to AusTraits. A list of contributors is provided on Zenodo at DOI:
[10.5281/zenodo.3568417](https://doi.org/10.5281/zenodo.3568417).

**Reuse**: At this stage, only the compiled AusTraits dataset is available for reuse,
via Zenodo. The raw data sources provided in this repository are not available
for reuse in their current form, without further discussion from data contributors.
Further information about the AusTraits project is available at the project website [austraits.org](https://austraits.org).

**Reuse**: At this stage, only the compiled AusTraits dataset is available for reuse, via Zenodo. The raw data sources provided in this repository are not available for reuse in their current form without further discussion with data contributors.