Commit

Merge pull request GEO-BON#183 from GEO-BON/genes_from_space
Genes from space
JoryGriffith authored Oct 18, 2024
2 parents 608ee32 + 83e37ba commit e91cf15
Showing 30 changed files with 3,289 additions and 34 deletions.
Binary file added .DS_Store
Binary file not shown.
5 changes: 5 additions & 0 deletions .gitignore
@@ -4,6 +4,11 @@ runner.env
**/.config
.server

rosm.cache
.RData
*.Rproj
lib/

# R temporary files
.Rproj.user
.Rhistory
46 changes: 46 additions & 0 deletions pipelines/GenesFromSpace/Ne500.md
@@ -0,0 +1,46 @@
Species genetic diversity is a critical aspect of ecosystem health, but assessing it can be challenging due to the complexity of gathering and analyzing relevant data across large spatial scales. Traditional methods often require extensive fieldwork and labor-intensive sampling for DNA sequencing, which limits the frequency and scale of genetic diversity assessments. The [Genes From Space monitoring tool in BON in a Box](https://www.google.com/url?q=https://teams.issibern.ch/genesfromspace/monitoring-tool/) uses Earth Observations (EO) to track habitat changes over time and infer population trends as indicators of genetic diversity. Leveraging public EO data, the tool enables users to calculate two genetic diversity indicators adopted by the Convention on Biological Diversity:
1. the Ne > 500 indicator, indicating the fraction of populations with an effective population size (Ne) above 500 units. Populations with Ne below 500 units are at risk of genetic erosion. Ne > 500 is a headline indicator in the GBF.
2. the Populations Maintained indicator (PM), indicating the fraction of populations that are maintained (i.e., did not go extinct) over time. This is a complementary indicator in the GBF.
The tool provides an interface that simplifies the process of selecting EO datasets, running analyses, and interpreting genetic diversity indicators. Ultimately, this tool offers a more scalable and accessible solution for researchers, conservationists, and policymakers to monitor and protect biodiversity at local, regional, and global levels.

**Methods:**
The tool consists of three components: (1) a population input, which defines the spatial distribution of the species' populations; (2) a habitat input, which summarizes changes in the species' suitable habitat over time; and (3) a processing tool that combines the population and habitat inputs to calculate genetic diversity indicators. Populations are defined as polygons representing areas where distinct populations can potentially be found. The habitat input is a set of suitability maps describing the area in which the species can realistically exist over time. For example, the habitat map of a forest-dwelling species can show areas with tree cover and how they change over time. The pipeline uses the population polygons and the habitat suitability maps to calculate the habitat size of each population at each time point. Habitat size is then combined with the provided population estimates to calculate the genetic diversity indicators.
![Screenshot 2024-10-15 143938](https://github.com/user-attachments/assets/69818156-6a77-465e-87e0-d419e5d6f318)
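
The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual code: the function names and the example numbers are hypothetical, and the rule used for the Populations Maintained indicator (a population counts as maintained if it still has suitable habitat at the latest time point) is an assumption.

```python
from typing import List


def estimate_ne(habitat_km2: float, density_per_km2: float, ne_nc_ratio: float) -> float:
    """Census size Nc = habitat area x density; Ne = Nc x Ne:Nc ratio."""
    nc = habitat_km2 * density_per_km2
    return nc * ne_nc_ratio


def ne500_indicator(ne_values: List[float]) -> float:
    """Fraction of populations whose Ne is above 500."""
    return sum(ne > 500 for ne in ne_values) / len(ne_values)


def populations_maintained(first_km2: List[float], last_km2: List[float]) -> float:
    """Fraction of initially present populations that still have habitat.

    Assumption: 'maintained' means habitat area > 0 at the latest time point.
    """
    present = [(a, b) for a, b in zip(first_km2, last_km2) if a > 0]
    if not present:
        raise ValueError("no population has habitat at the first time point")
    return sum(b > 0 for _, b in present) / len(present)


# Hypothetical habitat sizes (km2) per population at two time points.
habitat = {
    "pop1": {2000: 120.0, 2020: 90.0},
    "pop2": {2000: 8.0, 2020: 0.0},
    "pop3": {2000: 40.0, 2020: 45.0},
}
density = 100.0  # individuals per km2 (hypothetical)
ratio = 0.1      # Ne:Nc ratio (hypothetical)

ne_latest = [estimate_ne(h[2020], density, ratio) for h in habitat.values()]
ne500 = ne500_indicator(ne_latest)
pm = populations_maintained(
    [h[2000] for h in habitat.values()],
    [h[2020] for h in habitat.values()],
)
print(ne_latest, ne500, pm)
```

With these numbers only `pop1` ends above the Ne threshold, and `pop2` loses all of its habitat, so one third of the populations pass Ne > 500 and two thirds are maintained.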

**BON in a Box Pipelines:**
There are several pipelines and sub-pipelines in BON in a Box to calculate the genetic diversity indicators. The pipelines contain the following inputs:
* **Species names:** The user defines the species of interest
* **Countries:** The user provides one or more countries for which to calculate the indicator
* **Population polygons:** The user can either input files of known population boundaries, input monitoring data to generate these polygons, or use GBIF occurrences to generate polygons of populations.
![Screenshot 2024-10-15 143951](https://github.com/user-attachments/assets/dc891ebb-2212-4c35-a8b5-3f9a9850fb28)
* **Start and end year for GBIF data:** If the user is using GBIF data, they can provide the start and end year for which they want to pull occurrences.
* **Size of buffer:** The user specifies the buffer distance for drawing polygons around populations (the distance between the points and the edge of the polygon).
* **Distance between populations:** The user sets the distance used to separate observations into distinct populations, based on estimates of dispersal distances.
* **Habitat types:** The user can either use forest cover data from the Global Forest Watch to measure habitat suitability or land cover data from the European Space Agency. If using land cover data, the user also specifies the land cover classes that are suitable for the species.
* **Years of interest - habitat change:** The user specifies the years for which they want to measure habitat change and estimate Ne.
* **Population density:** The user inputs the population density based on known population values, or can specify multiple densities. This will be used to estimate population size (Nc).
* **Ne:Nc ratio estimate:** The user specifies the effective population size (Ne) to Nc ratio. This will be used to estimate the Ne for different populations and calculate the Ne>500 indicator.
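
To illustrate how the "Distance between populations" input could split occurrence points into populations, here is a minimal sketch. It assumes single-linkage grouping on great-circle distance, which may differ from what the tool actually does; the function names and coordinates are hypothetical, and the buffering step that turns each group into a polygon is not shown.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def group_populations(points: List[Tuple[float, float]], pop_distance_km: float) -> List[int]:
    """Label each point with a population index.

    Assumption: two points belong to the same population if they are linked
    by a chain of points, each within pop_distance_km of the next.
    """
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if haversine_km(points[i], points[j]) <= pop_distance_km:
                parent[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Hypothetical occurrence coordinates (lat, lon): two nearby points, one far away.
occurrences = [(19.50, -97.00), (19.55, -97.05), (21.00, -99.00)]
population_ids = group_populations(occurrences, pop_distance_km=50)
print(population_ids)
```

The pairwise loop is quadratic in the number of occurrences, which is fine for a sketch but would need a spatial index for large GBIF downloads.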

The pipeline gives the following outputs:
* **Interactive plot:** This is an interactive plot that shows a map of the populations, the Ne for each population over the years of interest, a table of the effective population size over time, and plots of habitat size over time and changes in habitat size over time. The user can highlight different populations on the map to see the values in the plots.
* **Ne>500 indicator:** A number with the Ne>500 indicator, as a proportion of populations with an effective population size greater than 500.
* **Population maintained indicator:** A number with the proportion of populations that are still extant.
* **Effective population size:** A table of effective population sizes over time, which can be downloaded as a TSV.
See an example pipeline output here (coming soon).

**Contributors:**
Oliver Selmoni ([email protected])
Simon Pahls ([email protected])

**Citations:**
ESA. Land Cover CCI Product User Guide Version 2. Tech. Rep. (2017). Available at: maps.elie.ucl.ac.be/CCI/viewer/download/ESACCI-LC-Ph2-PUGv2_2.0.pdf

Hansen, M. C., Potapov, P. V., Moore, R., Hancher, M., Turubanova, S. A., Tyukavina, A., Thau, D., Stehman, S. V., Goetz, S. J., Loveland, T. R., Kommareddy, A., Egorov, A., Chini, L., Justice, C. O., & Townshend, J. R. G. (2013). High-Resolution Global Maps of 21st-Century Forest Cover Change. Science, 342(6160), 850–853. https://doi.org/10.1126/science.1244693

Schuman, M. C., Röösli, C., Mastretta-Yanes, A., Helfenstein, I. S., Vernesi, C., Selmoni, O., Millette, K. L., Tobón-Niedfeldt, W., Albergel, C., Leigh, D., Hebden, S., Schaepman, M. E., Laikre, L., & Asrar, G. R. (2024). Genes from space: Leveraging Earth Observation satellites to monitor genetic diversity. https://ecoevorxiv.org/repository/view/7274/






268 changes: 268 additions & 0 deletions pipelines/GenesFromSpace/Tool/Forest_cover_v_GBIF_countries.json
@@ -0,0 +1,268 @@
{
"nodes": [
{
"id": "114",
"type": "io",
"position": {
"x": 754,
"y": 155
},
"data": {
"descriptionFile": "GenesFromSpace>ToolComponents>GetHabitatMaps>GFS_Habitat_map_GFW_tree_canopy_2000-2023.json"
}
},
{
"id": "116",
"type": "io",
"position": {
"x": 1501,
"y": 52
},
"data": {
"descriptionFile": "GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json"
}
},
{
"id": "119",
"type": "output",
"position": {
"x": 2261.485593095917,
"y": 68.9412798016136
},
"data": {
"label": "Output"
}
},
{
"id": "122",
"type": "io",
"position": {
"x": 111.68393097643138,
"y": -98.6256400843074
},
"data": {
"descriptionFile": "GenesFromSpace>ToolComponents>GetPopulationPolygons>GFS_Population_polygons_from_GBIF_occurences_country.json"
}
},
{
"id": "123",
"type": "constant",
"position": {
"x": -210.01871928913155,
"y": -41.722232600012205
},
"dragHandle": ".dragHandle",
"data": {
"type": "text",
"value": "EPSG:4326"
}
}
],
"edges": [
{
"source": "114",
"sourceHandle": "GFS_IndicatorsTool>get_TCY.yml@23|tcyy",
"target": "116",
"targetHandle": "pipeline@101",
"id": "reactflow__edge-114GFS_IndicatorsTool>get_TCY.yml@23|tcyy-116pipeline@101"
},
{
"source": "114",
"sourceHandle": "GFS_IndicatorsTool>get_TCY.yml@23|time_points",
"target": "116",
"targetHandle": "pipeline@102",
"id": "reactflow__edge-114GFS_IndicatorsTool>get_TCY.yml@23|time_points-116pipeline@102"
},
{
"source": "116",
"sourceHandle": "GFS_IndicatorsTool>get_Indicators.yml@127|interactive_plot",
"target": "119",
"targetHandle": null,
"id": "reactflow__edge-116GFS_IndicatorsTool>get_Indicators.yml@127|interactive_plot-119"
},
{
"source": "116",
"sourceHandle": "GFS_IndicatorsTool>get_Indicators.yml@127|ne_table",
"target": "119",
"targetHandle": null,
"id": "reactflow__edge-116GFS_IndicatorsTool>get_Indicators.yml@127|ne_table-119"
},
{
"source": "116",
"sourceHandle": "GFS_IndicatorsTool>get_Indicators.yml@127|pm",
"target": "119",
"targetHandle": null,
"id": "reactflow__edge-116GFS_IndicatorsTool>get_Indicators.yml@127|pm-119"
},
{
"source": "116",
"sourceHandle": "GFS_IndicatorsTool>get_Indicators.yml@127|ne500",
"target": "119",
"targetHandle": null,
"id": "reactflow__edge-116GFS_IndicatorsTool>get_Indicators.yml@127|ne500-119"
},
{
"source": "122",
"sourceHandle": "GFS_IndicatorsTool>get_pop_poly.yml@5|population_polygons",
"target": "114",
"targetHandle": "GFS_IndicatorsTool>get_TCY.yml@23|population_polygons",
"id": "reactflow__edge-122GFS_IndicatorsTool>get_pop_poly.yml@5|population_polygons-114GFS_IndicatorsTool>get_TCY.yml@23|population_polygons"
},
{
"source": "122",
"sourceHandle": "GFS_IndicatorsTool>get_pop_poly.yml@5|population_polygons",
"target": "116",
"targetHandle": "pipeline@100",
"id": "reactflow__edge-122GFS_IndicatorsTool>get_pop_poly.yml@5|population_polygons-116pipeline@100"
},
{
"source": "123",
"sourceHandle": null,
"target": "122",
"targetHandle": "pipeline@16",
"id": "reactflow__edge-123-122pipeline@16"
}
],
"inputs": {
"GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json@116|GFS_IndicatorsTool>get_Indicators.yml@127|runtitle": {
"description": "Set a name for the pipeline run.",
"label": "Title of the run",
"weight": 0,
"type": "text",
"example": "Quercus sartorii, Mexico, Habitat decline by tree cover loss, 2000-2023"
},
"GenesFromSpace>ToolComponents>GetPopulationPolygons>GFS_Population_polygons_from_GBIF_occurences_country.json@122|pipeline@12": {
"description": "Scientific name of the species, used to look for occurrences in GBIF. ",
"label": "Species names",
"weight": 1,
"type": "text",
"example": "Quercus sartorii"
},
"GenesFromSpace>ToolComponents>GetPopulationPolygons>GFS_Population_polygons_from_GBIF_occurences_country.json@122|pipeline@22": {
"description": "Countries of interest, used to look for GBIF observations.",
"label": "Countries list",
"weight": 2,
"type": "text[]",
"example": [
"Mexico",
"Guatemala"
]
},
"GenesFromSpace>ToolComponents>GetPopulationPolygons>GFS_Population_polygons_from_GBIF_occurences_country.json@122|pipeline@14": {
"description": "Integer, 4 digit year, start date to retrieve occurrences.",
"label": "Start year - GBIF observations",
"weight": 3,
"type": "int",
"example": 1980
},
"GenesFromSpace>ToolComponents>GetPopulationPolygons>GFS_Population_polygons_from_GBIF_occurences_country.json@122|pipeline@15": {
"description": "Integer, 4 digit year, end date to retrieve occurrences.",
"label": "End year - GBIF observations",
"weight": 4,
"type": "int",
"example": 2000
},
"GenesFromSpace>ToolComponents>GetPopulationPolygons>GFS_Population_polygons_from_GBIF_occurences_country.json@122|GFS_IndicatorsTool>get_pop_poly.yml@5|buffer_size": {
"description": "Radius size [in km] to determine population presence around the coordinates of species observations.",
"label": "Size of buffer",
"weight": 5,
"type": "float",
"example": 10
},
"GenesFromSpace>ToolComponents>GetPopulationPolygons>GFS_Population_polygons_from_GBIF_occurences_country.json@122|GFS_IndicatorsTool>get_pop_poly.yml@5|pop_distance": {
"description": "Distance [in km] to separate species observations in different populations.",
"label": "Distance between populations",
"weight": 6,
"type": "float",
"example": 50
},
"GenesFromSpace>ToolComponents>GetHabitatMaps>GFS_Habitat_map_GFW_tree_canopy_2000-2023.json@114|GFS_IndicatorsTool>get_TCY.yml@23|yoi": {
"description": "List of years for which tree cover should be extracted (maximum range 2000 - 2023).",
"label": "Years of interest - habitat change",
"weight": 7,
"type": "int[]",
"example": [
2000,
2005,
2010,
2015,
2020
]
},
"GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json@116|GFS_IndicatorsTool>get_Indicators.yml@127|ne_nc": {
"description": "Estimated Ne:Nc ratio for the studied species. Multiple values can be provided, separated by a comma.",
"label": "Ne:Nc ratio estimate",
"weight": 8,
"type": "float[]",
"example": [
0.1,
0.2
]
},
"GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json@116|GFS_IndicatorsTool>get_Indicators.yml@127|pop_density": {
"description": "Estimated density of the population [number of individuals per km2]. Multiple values can be provided, separated by a comma.",
"label": "Population density",
"weight": 9,
"type": "float[]",
"example": [
50,
100,
1000
]
},
"GenesFromSpace>ToolComponents>GetHabitatMaps>GFS_Habitat_map_GFW_tree_canopy_2000-2023.json@114|GFS_IndicatorsTool>get_TCY.yml@23|res": {
"description": "Desired resolution of the tree cover map, obtained via resampling. Specified in decimal degrees (0.01 ~ 1 km). Minimum value 0.001 (~100 m).",
"label": "Resolution of tree cover map",
"weight": 10,
"type": "float",
"example": 0.01
}
},
"outputs": {
"GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json@116|GFS_IndicatorsTool>get_Indicators.yml@127|interactive_plot": {
"description": "An interactive interface to explore indicator trends across geographical space and time.",
"label": "Interactive plot",
"weight": 0,
"type": "text/html"
},
"GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json@116|GFS_IndicatorsTool>get_Indicators.yml@127|ne500": {
"description": "Estimated proportion of populations with Ne>500 at latest time point.",
"label": "Ne>500 indicator",
"weight": 1,
"type": "float"
},
"GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json@116|GFS_IndicatorsTool>get_Indicators.yml@127|pm": {
"description": "Estimated proportion of maintained populations, comparing earliest and latest time point.",
"label": "Population maintained indicator",
"weight": 2,
"type": "float"
},
"GenesFromSpace>ToolComponents>GetIndicators>GFS_Indicators.json@116|GFS_IndicatorsTool>get_Indicators.yml@127|ne_table": {
"description": "Estimated effective size of every population, based on the latest time point of the habitat cover map.",
"label": "Effective population size",
"weight": 3,
"type": "text/tab-separated-values"
}
},
"metadata": {
"name": "Forest cover loss by populations from GBIF occurrences (country)",
"description": "Genes from Space tool. The tool retrieves species occurrences from GBIF and uses them to define polygons of population distribution based on geographic proximity. The tool then draws a habitat suitability map over time, based on the presence of forest cover. Finally, the tool estimates the size of suitable habitat over time for every population, and computes genetic diversity indicators accordingly (Ne500 and Populations Maintained indicators). Population maps and genetic diversity indicators are displayed through an interactive interface. Forest cover loss data come from [Global Forest Watch](https://www.globalforestwatch.org/).",
"author": [
{
"name": "Oliver Selmoni",
"email": "[email protected]"
}
],
"external_link": "https://teams.issibern.ch/genesfromspace/",
"references": [
{
"text": "Schuman et al., EcoEvoRxiv.",
"doi": "https://doi.org/10.32942/X2RS58"
},
{
"text": "Hansen et al., Science (2013)",
"doi": "https://doi.org/10.1126/science.1244693"
}
]
}
}
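
A pipeline file like the one above can be inspected with a short script. The sketch below parses a small excerpt and lists the inputs ordered by their `weight` field; it assumes that `weight` is meant as a display order, and the shortened input keys in the excerpt are hypothetical stand-ins for the long keys in the real file.

```python
import json

# Shortened excerpt: labels, weights and types are taken from the file above,
# but the "example|..." keys are hypothetical abbreviations.
PIPELINE_EXCERPT = """
{
  "inputs": {
    "example|buffer_size": {"label": "Size of buffer", "weight": 5, "type": "float"},
    "example|runtitle": {"label": "Title of the run", "weight": 0, "type": "text"},
    "example|species": {"label": "Species names", "weight": 1, "type": "text"}
  }
}
"""


def list_inputs(pipeline: dict) -> list:
    """Return (weight, label, type) tuples for every input, sorted by weight."""
    return sorted(
        (spec["weight"], spec["label"], spec["type"])
        for spec in pipeline["inputs"].values()
    )


pipeline = json.loads(PIPELINE_EXCERPT)
for weight, label, input_type in list_inputs(pipeline):
    print(f"{weight:>2}  {label}  ({input_type})")
```

To run it against the full file, replace `json.loads(PIPELINE_EXCERPT)` with `json.load()` on the opened JSON file.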