Merge pull request #78 from sharifX/master
add archive
samleeflang authored Jan 25, 2024
2 parents 9185241 + afb295d commit 22d69c1
Showing 26 changed files with 2,316 additions and 0 deletions.
8 changes: 8 additions & 0 deletions archive/api-spec/api-intro.md
@@ -0,0 +1,8 @@
# Introduction to the openDS API

The API must comply with the requirements of the [Digital Object Interface Protocol (DOIP)](https://hdl.handle.net/0.DOIP/DOIPV2.0) specification published by the DONA Foundation.
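
As a rough illustration of what DOIP compliance involves at the message level, the sketch below builds a retrieve request as a JSON-serializable structure. The segment names (`requestId`, `clientId`, `targetId`, `operationId`) and the operation identifier `0.DOIP/Op.Retrieve` follow a reading of the DOIP v2.0 specification and should be checked against it. The target and client identifiers are hypothetical, and nothing here reflects DiSSCo-specific requirements.

```python
import json
import uuid


def build_retrieve_request(target_id: str, client_id: str) -> dict:
    """Build a DOIP-style request asking a service to retrieve one digital object.

    Assumption: segment names and the operation identifier follow DOIP v2.0;
    verify them against the published specification before relying on them.
    """
    return {
        "requestId": str(uuid.uuid4()),  # lets the client match the response
        "clientId": client_id,           # identifier of the caller
        "targetId": target_id,           # identifier of the object to act on
        "operationId": "0.DOIP/Op.Retrieve",
    }


# Hypothetical identifiers, for illustration only.
request = build_retrieve_request("example-prefix/EXAMPLE-SPECIMEN", "example-client")
print(json.dumps(request, indent=2))
```

The request is kept as a plain dictionary so that it can be serialized and inspected without committing to any particular transport or client library.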


*Specific requirements for DiSSCo to be written.*

END.
15 changes: 15 additions & 0 deletions archive/faqs/faq.md
@@ -0,0 +1,15 @@
# Frequently asked questions (FAQ) about Digital Specimens and the openDS approach

## FAQs organised into categories

We've provided answers to frequently asked questions about Digital Specimens, the openDS specification and the benefits of the openDS approach. The questions and answers are grouped into categories, each with its own page:

- [Digital Specimens](faqds.md)
- [The openDS specification](faqopends.md)
- [The benefits of the openDS approach](faqbenefits.md)
- [openDS initiative compared to the Extended Specimens Network](faqcompare.md)

For ease of reference, questions are numbered sequentially within each category, i.e., Q101, Q102, ...; Q201, Q202, ...; Q301, Q302, ...; etc.


END.
31 changes: 31 additions & 0 deletions archive/faqs/faqbenefits.md
@@ -0,0 +1,31 @@
# Frequently asked questions (FAQ) about the benefits of the openDS approach

**Q301. What are the benefits of the openDS approach?**

The openDS approach offers multiple benefits to all sectors of the scientific community, society and the general public. Many of these benefits are associated with specific use cases. We give some examples below, but this is not an exhaustive list.

- *Adding information to collections.*
In paleontology (as one example among many), some collections contain many specimens that have not been identified. This can be because the collection owner lacks the appropriate expertise, or simply because there is not enough time or resources to do the work. Publishing the bare details of available specimens as Digital Specimens and allowing external experts to carry out identification and curation work on a community basis can add valuable information to an otherwise less informative collection. Such work can be subject to moderation by a small number of responsible experts. A community co-identification and/or co-curation approach can be attractive for engaging citizen experts from the wider public, and it shows that scientists don’t yet know everything. Individuals can be acknowledged and recognised for the work they perform.

- *Enhancing value chains that begin in natural science collections.*
Open Digital Specimens (or ‘specimens on the Internet’) enhance the value chains that begin in natural science collections with the combination of the hard evidence (the physical specimens themselves) and the expertise that interprets that evidence. openDS extends these value chains from the initial gathering and organisation of specimens, through the conduct and commercialization of specific science based on specimens, to sharing the ensuing economic and social benefits in a fair and equitable way. Products of digital value chains provide the translational research evidence for regulatory processes in health, food, security, sustainability and environmental change, and for new educational uses. Digital Specimens are essential for the recently described notions of extended specimens [Webster 2017] and next generation collections [Schindel 2018], and for the visions of virtual collections underlying research infrastructure initiatives such as Europe’s Distributed System of Scientific Collections (DiSSCo) and the One World Collection [Owens 2019]. In the future, new software applications can work with and on Digital Specimen objects to provide more sophisticated computer assistance, both for the work tasks known today and for the as yet unimagined future work of collection specialists, scientists and other translational researchers working daily with specimens.

- *Representing a physical thing in cyberspace (i.e., on the Internet) with context and meaning that allows processing in vast numbers by machines i.e., machine actionable, human readable.*
As a new kind of philosophical object alongside natural objects and fabricated tools, machine-actionable open Digital Specimen objects on the Internet are amenable to processing by, and transport between, heterogeneous information systems. Interoperability difficulties are much reduced through the type and operations definition mechanisms underlying the digital object concept. These objects are unambiguously identified by a persistent identifier (PID) mechanism that transcends changes in underlying Internet and World Wide Web technologies. Such objects have the implicit capability to remain findable, accessible, interoperable and reusable (FAIR) over the timescales of many decades (100 years) that are familiar to collection-holding institutions, and in this way integrate collection data into the data-rich world of (*inter alia*) the Earth System Sciences and Life Sciences.

- *Easy exchange of data between different information/computer systems.*

- *Trusted, secure by design against unauthorised usage and/or modification.*
openDS, working together with the [Digital Object Interface Protocol (DOIP)](https://hdl.handle.net/0.DOIP/DOIPV2.0) specification, brings digital specimen data out into the community, making them systematically organized mutable objects (units) with a standardized structure and representation to be shared, interpreted, annotated and acted upon by many. Implicit role-based access ensures that only those with authority can curate and modify the core data (what, where, when, who), whilst a wider range of approved individuals can easily supplement the core data with links to third-party sources, interpretations and annotations.

- *Capturing and anchoring all the data derived from physical specimens.*
Digital Specimens differ from traditional databased specimen records in their ability, as container objects, to capture and anchor all the data derived from physical specimens by a multiplicity of processes, including morphological, DNA, chemical and imaging analysis and many more. The key lies in distributed curation (without institutional boundaries) of open, authoritative packages of information linking all the different data classes back to the physical specimen from which they were derived, and in ensuring that such packages, anchored with globally unique, persistent identifiers, become the efficient, reliable and trusted human- and machine-actionable source of scholarly data about a specimen.

- *Can be referenced unambiguously when needed i.e., easily citable.*
Assignment and registration of unambiguous persistent identifiers to digital specimen information can begin immediately. *Natural Science Identifier* (*NSId*) names can be easily registered to point immediately to locations where authoritative units of specimen data can be found, in the same manner as DOIs locate journal articles, no matter where those are published or stored.

- *Have the scale, form and precision required for modern data science (mining, analysis, inference).*

- *Enabling new freedoms and opportunities arising from digital transformation of natural science collections.*
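
The role-based access described under "Trusted, secure by design" above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of openDS or DOIP: the role names `curator` and `contributor`, the `DigitalSpecimen` class and both functions are hypothetical. Only the split between core data (what, where, when, who) and supplementary annotations comes from the text.

```python
from dataclasses import dataclass, field

# Core data fields named in the text; only authorised curators may change them.
CORE_FIELDS = {"what", "where", "when", "who"}


@dataclass
class DigitalSpecimen:
    core: dict
    annotations: list = field(default_factory=list)


def update_core(specimen: DigitalSpecimen, role: str, name: str, value: str) -> None:
    """Modify a core field; hypothetical rule: only the 'curator' role may do so."""
    if role != "curator":
        raise PermissionError(f"role {role!r} may not modify core data")
    if name not in CORE_FIELDS:
        raise KeyError(f"{name!r} is not a core field")
    specimen.core[name] = value


def add_annotation(specimen: DigitalSpecimen, role: str, text: str) -> None:
    """Supplement the core data; open to a wider set of approved roles."""
    if role not in {"curator", "contributor"}:
        raise PermissionError(f"role {role!r} may not annotate")
    specimen.annotations.append(text)


specimen = DigitalSpecimen(core={"what": "unidentified fossil", "where": "",
                                 "when": "", "who": ""})
add_annotation(specimen, "contributor", "Possibly a trilobite; see linked image.")
update_core(specimen, "curator", "what", "trilobite")
print(specimen)
```

Keeping annotations separate from the core fields means a contributor's input never overwrites the authoritative record, which is the property the text attributes to role-based access.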

END.
22 changes: 22 additions & 0 deletions archive/faqs/faqcompare.md
@@ -0,0 +1,22 @@
# Frequently asked questions about the openDS initiative compared to the Extended Specimens Network

**Q401. How is the openDS initiative part of the DiSSCo programme?**

openDS, the Digital Specimen concept, and the persistent identifiers that unambiguously identify them are part of the [Distributed System of Scientific Collections (DiSSCo)](https://www.dissco.eu/) programme in Europe. The initiative aims to implement the Extended Specimen concept described by Michael Webster and colleagues in the 2017 book, "[*The Extended Specimen: Emerging Frontiers in Collections-Based Ornithological Research (Studies in Avian Biology)*](https://isbnsearch.org/isbn/9781498729154)" ![image of the book cover](/images/es-bookcover.png) whilst at the same time providing an enabling mechanism for the transformational change of working practices in collections-based science that DiSSCo foresees.

**Q402. What is the Extended Specimen Network (ESN)?**

The [Extended Specimen Network (ESN)](https://doi.org/10.1093/biosci/biz140) is a broad initiative by the [Biodiversity Collections Network (BCoN)](https://bcon.aibs.org/) in the USA to strengthen US collections cyberinfrastructure by focusing on five main areas: collecting, digitization, integration/attribution, infrastructure, and workforce and education. The Extended Specimen concept is a key driving force behind the initiative, hence its emphasis on both physical and digital representations. In many respects, the ESN is akin to the overall DiSSCo programme, which aims to unify European natural science assets under common curation, access, policies and practices.

**Q403. What are the technical similarities and differences between openDS and ESN?**

DS and ES similarities and differences were explored in a [Birds of a Feather (BoF) session](https://www.tdwg.org/conferences/2020/working-sessions/#bof01:%20converging%20digital%20specimens%20and%20extended%20specimens%20-%20towards%20a%20global%20specification) convened during [TDWG 2020](https://www.tdwg.org/conferences/2020/). The session was recorded and is [available on YouTube](https://www.youtube.com/watch?v=8ljokNRkjeo).

DiSSCo explains a Digital Specimen (DS) as a curated and authoritative open package of links to data associated with or derived from a physical specimen. As such, it is a ‘twin’ on the Internet for a specimen in a physical collection anchoring all the information known about or derived from that physical specimen.

ESN explains an Extended Specimen (ES) as being the linkage between a physical specimen and all its derived preparation materials to all its local and externally derived digital products.

Thus, DS and ES are conceptually very alike. Technical notions of how DS and ES might be implemented have emerged independently in Europe and the USA, giving rise to the openDS initiative and the Extended Specimen Network respectively. Nevertheless, with their common origin in the concept work by [Webster and colleagues](https://isbnsearch.org/isbn/9781498729154), they share the same basic idea and goals of connecting all the data derived from or about a physical specimen to that specimen in order to extend what is known about it.


END.
38 changes: 38 additions & 0 deletions archive/faqs/faqds.md
@@ -0,0 +1,38 @@
# Frequently asked questions (FAQ) about Digital Specimens

![Illustration of the data types tied up in physical specimens](/images/whatsinabug.png)

**Q101. What is a Digital Specimen (DS)?**

A Digital Specimen, referenceable by its unique Natural Sciences Identifier (NSId), represents the sum of information on the Internet about a natural specimen object. The Digital Specimen acts as a processable digital twin on the Internet for the physical specimen in a natural sciences collection (for example, in a Natural History Museum).

This [DiSSCo Tech article](https://dissco.tech/2020/03/31/what-is-a-digital-specimen/) introduces the Digital Specimen concept in more depth.

**Q102. What is an _open_ Digital Specimen?**

The word ‘open’ is applied in the sense of an open format for Digital Specimens as defined by a published specification, the openDS specification. With the intention to improve interoperability, open formats can be used by anyone and their use is often encouraged within, by and for specific communities and the associated software applications and systems.

From the perspective of the commons and the [Open Definition](https://opendefinition.org/), open can be applied further to mean that open Digital Specimens can be “_freely used, modified, and shared by anyone for any purpose_.” However, initial publication of Digital Specimen content is constrained by applicable legislation, which in the bio- and geo-diversity domain is interpreted to mean that the aim with Digital Specimens is to be “_as open as possible, as closed as legally necessary_.”

‘Open’ is also used in the sense that a Digital Specimen is a mutable digital object, meaning that it can be manipulated and modified by appropriately authorized persons. Open Digital Specimens ultimately act as a common curation space where experts should be able to contribute above and beyond the data held at the local level of a specific natural science collection.

**Q103. How are open Digital Specimens represented?**

Within an information system, the design of the internal data structures used to hold data associated with Digital Specimens is a decision for the system’s designers. However, the importance of being FAIR (‘findable, accessible, interoperable and reusable’), with its emphasis on machine actionability, and the increasing use of object store infrastructures with their corresponding APIs argue strongly for using standardized, easily discoverable object type definitions such as the Digital Specimen object type. This is especially true at the external interfaces of systems. Knowing the (object type) definition of a Digital Specimen and what operations can be performed on it provides a shared and congruent understanding of the object context that is essential for correct functional operation and performance of systems, especially at the level of machine-to-machine interactions. Digital object type definitions and interface protocols directly support high levels of interoperability and reusability between systems and between applications. These are two of the four guiding principles of FAIR. To learn how Digital Object Architecture offers adherence to the FAIR principles as an integral characteristic, you may like to read this [paper by Lannom, Koureas and Hardisty](https://doi.org/10.1162/dint_a_00034).

When transfer between different information systems is necessary, an open Digital Specimen can be represented (serialized) as a [JavaScript Object Notation (JSON)](https://www.json.org/json-en.html) document in which the various attributes of the specimen are represented as related sets of name/value pairs. The details of this transfer representation (serialization) are specified by the openDS specification.
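
To make the idea of a JSON transfer representation concrete, here is a minimal sketch of serializing a specimen's attributes as name/value pairs and reading them back. Every attribute name and value below is hypothetical and chosen only for illustration; the actual names and structure are those defined by the openDS specification, which this example does not attempt to reproduce.

```python
import json

# Hypothetical attribute names and values; not taken from the openDS specification.
specimen = {
    "id": "example-nsid/ABC-123",
    "type": "DigitalSpecimen",
    "what": {"name": "Example species"},
    "where": {"locality": "Example locality"},
    "when": {"collected": "1900-01-01"},
    "who": {"collector": "Example Collector"},
    "links": ["https://example.org/images/abc-123"],
}

# Serialize for transfer to another system, then parse on the receiving side.
document = json.dumps(specimen, indent=2, sort_keys=True)
restored = json.loads(document)

print(document)
print("Round trip preserved the data:", restored == specimen)
```

Because the document consists only of strings, lists and nested name/value pairs, any system with a JSON parser can read it without knowing how the sending system stores the data internally.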

**Q104. How are Digital Specimens made FAIR (Findable, Accessible, Interoperable, Reusable)?**

Digital Specimens are first-class [FAIR Digital Objects](https://fairdo.org/) (FDO), complying with the generic guidelines and requirements of the [FAIR Digital Object Framework](http://bit.ly/fdof102) ([Version 1.02, November 2019, FDOF Technical Implementation Guideline](https://github.com/GEDE-RDA-Europe/GEDE/blob/master/FAIR%20Digital%20Objects/FDOF/FAIR%20Digital%20Object%20Framework-v1-02.docx)). In this respect, they draw on controlled vocabularies and ontologies to introduce meaning (context) into their structure and are represented by object type definitions deposited in one or more public type registries.

**Q105. Can a Digital Specimen contain data for multiple specimens?**

A Digital Specimen has a one-to-one relation with a physical specimen in a physical collection and corresponds to whatever the owning/holding institution deems to be an individual specimen. A specimen in this sense can be any object or entity (physical or conceptual) and is independent from categories such as preparation, sample or biological individual.

If a box container or spirit jar containing multiple organisms is catalogued as a single specimen, then it will have a single Digital Specimen as its digital surrogate. If each of the organisms in the box/jar is catalogued separately, then each should have its own Digital Specimen surrogate with appropriate linking relations to the others within the same container.

A Digital Specimen can correspond to a specimen that is no longer physically extant (e.g., lost, destroyed, split into specimens with separate identities, a fish released after taking measurements and samples, etc.). Such _status_ information should be made clear in the DS.
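
The cataloguing rules above can be sketched in code. This is an illustrative model only: the `DigitalSpecimen` class, its field names, the `status` values and the identifiers are all hypothetical and are not drawn from the openDS specification.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalSpecimen:
    identifier: str                 # one identifier per catalogued specimen
    status: str = "extant"          # e.g. "extant", "lost", "destroyed" (hypothetical values)
    related: list = field(default_factory=list)  # identifiers of linked specimens


def link_container_contents(specimens: list) -> None:
    """Link separately catalogued organisms that share one box or jar."""
    identifiers = [s.identifier for s in specimens]
    for s in specimens:
        s.related = [i for i in identifiers if i != s.identifier]


# Case 1: the whole jar is catalogued as a single specimen, so one surrogate.
jar = DigitalSpecimen("example/JAR-1")

# Case 2: each organism in the jar is catalogued separately, so one surrogate
# each, linked to the others in the same container.
organisms = [DigitalSpecimen(f"example/JAR-2-{n}") for n in (1, 2, 3)]
link_container_contents(organisms)

# A specimen that is no longer physically extant still has a surrogate,
# with its status recorded explicitly.
released_fish = DigitalSpecimen("example/FISH-9", status="released")

print(jar)
print(organisms[0])
print(released_fish)
```

The number of Digital Specimens follows the institution's cataloguing decision, not the number of organisms, which is why the same jar can map to one object or to several linked ones.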


END.