Commit 99edd25

Author: Joe Roe
Resolve several TODOs
1 parent 426f3db

29 files changed, +91 -34 lines changed

paper.pdf

4.49 KB
Binary file not shown.

paper.qmd

+18 -30
@@ -72,12 +72,13 @@ execute:
 cache: true
 ---

-```{r setup}
+```{r setup, cache=FALSE}
 #| include: false
 library("countrycode")
 library("cowplot")
 library("dplyr", warn.conflicts = FALSE)
 library("dm")
+library("english")
 library("giscoR")
 library("ggplot2")
 library("glue")
@@ -92,6 +93,7 @@ library("RPostgres")
 library("sf")
 library("spatstat")
 library("stars")
+library("stringr")
 library("tidyr")
 library("webshot2")

@@ -124,7 +126,7 @@ As a necessary prerequisite to understanding the context of any past event or pr
 If archaeology is to be an open science [@Lake2012], it is therefore critical that effective open access to chronological information be placed front and centre.

 Over the last two decades, archaeologists have answered this call by publishing an increasing number of compilations of dates from archaeological contexts as open data.
-These efforts have facilitated major reevaluations of previously-established chronologies [e.g. @HighamEtAl2014; @LoftusEtAl2019; @PratesEtAl2020; @KatsianisEtAl2020], important new insights into past processes [@RirisEtAl2024], and the development of novel ways of using chronological data [@Crema2022].
+These efforts have facilitated re-evaluations of chronologies themselves [e.g. @HighamEtAl2014; @LoftusEtAl2019; @PratesEtAl2020; @KatsianisEtAl2020] but also the development of novel ways of using chronological data [@MoodyEtAl2021; @Crema2022; @RirisEtAl2024].
 <!-- TODO: Would welcome more citations here -->
 The focus has been overwhelmingly on radiocarbon dating and most compilations focus on a single region and/or period.
 The profusion of open radiocarbon data in particular has prompted several initiatives towards a global synthesis [e.g. @SchmidEtAl2019; @BronkRamseyEtAl2019; @BirdEtAl2022].
@@ -147,7 +149,6 @@ Since we envisage both XRONOS as a dataset and XRONOS as software to be continua
 ## Compilations of radiocarbon dates {#sec-c14-compilation}

 Though an explicit emphasis on 'open data' is a relatively recent phenomenon in archaeology [@Lake2012], the open publication of compiled radiocarbon dates has a substantial prehistory.
-<!-- TODO: check and update this about date lists... -->
 Arnold and Libby [-@ArnoldLibby1951] initiated the tradition of regularly publishing all the dates they had obtained.
 This practice was subsequently continued as radiocarbon laboratories periodically shared and compiled their own 'date lists', published mainly in the journals *Radiocarbon* and *Archaeometry*.
 However, as the number of labs and volume of radiocarbon dates being produced grew, this paper-based format became impractical and mostly disappeared [with exceptions, e.g. @NdeyeEtAl2022] without being replaced by another form of systematic data-sharing or dissemination [@BronkRamseyEtAl2019].
@@ -169,7 +170,7 @@ Our review of the literature identified `r n_c14_datasets` published since 1994.
 This is almost certainly an undercount, because our firsthand knowledge of regional literature was limited to Europe and West Asia and many resources only ever existed in 'grey' formats (e.g. websites that were not indexed and no longer exist).
 We also restricted ourselves to structured datasets disseminated primarily in a digital format;
 'date lists' in printed periodicals and gazetteers were excluded.
-A full list of the datasets we identified is presented in appendix XXXX<!-- TODO -->.
+A full list of the datasets we identified can be found in the supplementary materials.

 ```{r fig-c14-datasets-time}
 #| fig-cap: Cumulative number of radiocarbon compilations published since 1995 according to our survey (see Supplementary Material).
@@ -194,7 +195,7 @@ Notable early examples include ANDES 14C in 1994 [Central Andes, @MichczynskiEtA
 From 2010, coinciding with broader shifts in scientific publishing [@TenopirEtAl2011], it became more common to publish standalone 'open data' products in the form of journal supplements, archives in repositories and/or data papers;
 the *[Journal of Open Archaeology Data](<https://openarchaeologydata.metajnl.com>)*, launched in 2012, has been a prominent venue for this latter category.
 Most recently there has been a trend towards providing version-controlled plain text data via platforms such as [GitHub](https://github.com), reflecting the broader adoption of these tools amongst computational archaeologists over the last decade [@BatistRoe2024].
-The shift from online databases towards more static but more preservable open data products is welcome, given how many databases from the first generation have subsequently ceased to be accessible. <!-- TODO: can this be quantified or visualised? -MH -->
+The shift from online databases towards more static but more preservable open data products is welcome, given how many databases from the first generation have subsequently ceased to be accessible.
 Version-controlled repositories are particular well-suited to data compilation projects because they allow for continued updates whilst still providing snapshot 'releases' that are citeable and can be archived in long-term repositories.

 ```{r data-basemap}
@@ -251,10 +252,11 @@ Most radiocarbon datasets we reviewed were compiled with a specific goal in mind
 Laboratory databases solve the problem of currency, but tend to have more arbitary coverage, since the inclusion of data is determined by who submits dates to that lab, not any form of principled curation.
 There are also comparatively few of them – most active labs no longer directly publish dates that they produce (if they ever did).

-In addition, the temporal and geographic coverage of these resources is uneven [@ChaputGajewski2016; @AlcantaraPedrozainpress], systematically biased [@ClistEtAl2023], and duplicative of each other.
-<!-- TODO: example, X databases for Europe, Y for the rest of the world? -->
-By our count, <!-- TODO: X --> of the databases are not 'open' according to the Open Knowledge Foundation's definition of data openness ["Open data and content can be freely used, modified, and shared by anyone for any purpose", @OpenKnowledgeFoundation], which both limits the access to and reuse potential of these datasets.
-<!-- TODO: X --> are not currently available in readily machine-readable formats (e.g. plain text or database files rather than PDFs or hypertext).
+The temporal and geographic coverage of these resources is uneven [@ChaputGajewski2016; @AlcantaraPedrozainpress], systematically biased [@ClistEtAl2023], and duplicative.
+For example, we identified `r english(sum(str_detect(c14_datasets$m49_region, "Western Europe"), na.rm = TRUE))` different databases covering Western Europe but none covering South Asia.
+The quality and accessibility of published compilations is also variable.
+`r str_to_sentence(english(sum(c14_datasets$open, na.rm = TRUE)))` of the `r english(nrow(c14_datasets))` resources we reviewed are not 'open' according to the Open Knowledge Foundation's definition of data openness ["Open data and content can be freely used, modified, and shared by anyone for any purpose", @OpenKnowledgeFoundation], which both limits the access to and reuse potential of these datasets.
+And even of these, many are not currently available in readily machine-readable formats (e.g. plain text or database files rather than PDFs or hypertext).

 The fragmentation of the radiocarbon record into regional datasets also hinders analysis at larger scales.
 Although the core elements of a radiocarbon date—laboratory identifier, radiocarbon age, measurement error—are more or less standardised, there is no such consistency in contextual information on the sample or site.
@@ -427,7 +429,8 @@ This is also typically present in many other forms of systematic compilation wor
 Aggregated typological information from such sources are often used in aoristic analysis and related methods [@Mischka2004; @Crema2024].
 What is lacking in this presentation of typological dating is metadata on how the determination was made and how exactly it is to be understood.
 Like any archaeological date, a typological date is derived from a physical sample – the object or set of object from which a chronological estimate was derived.
-Typological dates on one class of object may well clash with other classes of object, or for that matter with scientific dates, but without this kind of metadata such inconsistencies are difficult to resolve. <!-- TODO: clarify - MH -->
+Typological dates on one class of object may well clash with other classes of object, or for that matter with scientific dates – does one trust the date on pottery, the date on architecture, or the radiocarbon date?
+Without additional metadata on e.g. who made the typological determination or what the radiocarbon date was obtained on, such inconsistencies are difficult to resolve
 Similarly the absolute date range corresponding to a typological determination (e.g. "Late Neolithic") can be interpreted in multiple ways depending on the region and intentions of the expert making the determination.
 PeriodO [@RabinowitzEtAl2016] is a linked open data infrastructure that includes a shared vocabulary of typological periods and corresponding calendar age estimates, and an important step towards addressing the latter problem.
 However, it remains to be systematically linked to actual compilations of typological dates.
@@ -444,22 +447,7 @@ Our overall aims in developing XRONOS is to bring this model, which RADON has op
 ## Design goals

 XRONOS is our answer to Kintigh's call [@Kintigh2006] for digital infrastructures that don't just provide access to chronological data but enables researchers to "archive, access, integrate, and mine disparate data sets".
-It parallels and draws inspiration from several similar initiatives within and outwith archaeology, such as <!-- TODO examples
-Examples:
-https://link.springer.com/article/10.1007/s10816-018-9399-6
-https://link.springer.com/article/10.1007/s10816-010-9098-4
-https://www.cambridge.org/core/journals/latin-american-antiquity/article/public-database-of-archaeological-resources-on-easter-island-rapa-nui-using-google-earth/F30677F6FC99B4C2BC762BCC98FD0966
-https://www.cambridge.org/core/journals/advances-in-archaeological-practice/article/data-integration-in-the-service-of-synthetic-research/E297C57441239AB35053F987AA758EB9
-https://www.cambridge.org/core/journals/advances-in-archaeological-practice/article/tdar/FCE1056A210A452E5FE65560763D4CC7
-https://www.cambridge.org/core/journals/advances-in-archaeological-practice/article/googlebased-freeware-solution-for-archaeological-field-survey-and-onsite-artifact-analysis/7914F673D1E1CAD96E5270E794CDFFD9
-https://link.springer.com/article/10.1007/s10816-015-9272-9
-https://link.springer.com/article/10.1007/s10816-015-9251-1
-https://link.springer.com/article/10.1007/s10816-015-9240-4
-https://anatomypubs.onlinelibrary.wiley.com/doi/10.1002/ar.23130
-https://academic.oup.com/dsh/article/29/3/349/2938128
-https://www.sciencedirect.com/science/article/pii/S0167739X13000678?via%3Dihub
-
--->.
+It complements several similar open data infrastructures within and outwith archaeology, such as SEAD for environmental archaeology [@Buckland2014], IMPACT for mummified human remains [@NelsonWade2015], Neotoma for palaeoecological data [@WilliamsEtAl2018], IsoArcH for stable isotope data [@PlompEtAl2022], and the 'Big Interdisciplinary Archaeological Database' (BIAD), an ambitious new initiative to combine many of these individual domains, including chronology [@ReiterEtAl2024].
 To improve upon existing global syntheses of radiocarbon dates (see @sec-global-compilations), we aimed to develop a living infrastructure that both continually collected data from diverse sources and presented a seamless single database to the user.
 The Global Biodiversity Information Facility [GBIF, <https://gbif.org>, @CanhosEtAl2004]—which provides a single, consistent interface to many sources of global biodiversity data—has served as an exemplar for us in this regard.

@@ -526,7 +514,7 @@ xronos_dm_svg <- xronos_dm |>
 dm_add_fk("versions", "item_id", "sites") |>
 dm_add_fk("versions", "item_id", "taxons") |>
 dm_add_fk("versions", "item_id", "typos") |>
-# TODO: Self-references (primarily persuadable models), or too messy?
+# Exclude self-references (primarily supersedable models) for readability, e.g.
 # dm_add_fk("contexts", "superseded_by", "contexts") |>
 dm_draw("BT", view_type = "title_only", column_types = TRUE) |>
 DiagrammeRsvg::export_svg()
@@ -537,7 +525,7 @@ write(xronos_dm_svg, xronos_dm_svg_file)
 ggdraw() + draw_image(xronos_dm_svg_file)
 ```

-<!-- TODO: shorten -->
+<!-- TODO: shorten? -->

 At the base of the XRONOS data model (@fig-data-model) are sets of spatiotemporal coordinates or, as we call them, *chrons*.
 In an archaeological context, we conceptualise a chron as an assertion linking human activity with a particular point in space and time.
@@ -638,9 +626,9 @@ This basic REST pattern is augmented by seven 'actions' (following the standard
 The 'show' action represents interaction with a single resource, as described above.
 The 'index' action, which lists resources of a given type (e.g. <https://xronos.ch/c14s> for radiocarbon dates), is worth special mention because it is through this that the filtering logic at the core of XRONOS' two interfaces is implemented.
 By passing a query as HTTP GET parameters to the index action of a resource, the list returned the user is modified to only include records that match that query.
-For example, <https://xronos.ch/sites?site[country_code]=CH> (the part of the URL after the `?` character encodes the SQL WHERE clause `country = 'CH'` as a GET parameter) lists sites in Switzerland <!-- TODO: check this actually works -->.
+For example, <https://xronos.ch/sites?site[country_code]=CH> (the part of the URL after the `?` character encodes the SQL WHERE clause `country = 'CH'` as a GET parameter) lists sites in Switzerland.
 More complex queries can be executed using nested parameters.
-For example, <https://xronos.ch/c14s?c14[sample][material][name]=charcoal> (encoding that the `c14` table should be joined to the `material` table via `sample`, followed by the WHERE clause `material.name = 'charcoal'`) lists radiocarbon dates obtained from charcoal samples <!-- TODO: check that this actually works... -->.
+For example, <https://xronos.ch/c14s?sample[material][name]=charcoal> (encoding that the `c14` table should be joined to the `material` table via `sample`, followed by the WHERE clause `material.name = 'charcoal'`) lists radiocarbon dates obtained from charcoal samples.
 Uniquely, index actions can also respond with the result in a tabular data format (i.e. `.csv`).

 ## Data ingestion and curation {#sec-implementation-data}
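
The nested GET parameters described in the hunk above can be illustrated with a small base-R sketch that turns a nested list into the bracketed query strings quoted in the text. The helper names `flatten_params()` and `index_url()` are hypothetical (they are not part of XRONOS or of any package), and the code only builds URL strings; it does not send a request or assume anything about the response.

```r
# Flatten a nested named list into bracketed keys, e.g.
# list(sample = list(material = list(name = "charcoal")))
# becomes c("sample[material][name]" = "charcoal").
# (Hypothetical helper, for illustration only.)
flatten_params <- function(x, prefix = NULL) {
  out <- character(0)
  for (key in names(x)) {
    full_key <- if (is.null(prefix)) key else paste0(prefix, "[", key, "]")
    value <- x[[key]]
    if (is.list(value)) {
      # Recurse into nested lists, carrying the bracketed prefix along
      out <- c(out, flatten_params(value, full_key))
    } else {
      out <- c(out, stats::setNames(as.character(value), full_key))
    }
  }
  out
}

# Build the URL of a resource's index action with an optional query.
# Brackets in the keys are left unencoded to match the URLs quoted in the
# text; values are percent-encoded.
# (Hypothetical helper, for illustration only.)
index_url <- function(resource, query = list(), base = "https://xronos.ch") {
  params <- flatten_params(query)
  if (length(params) == 0) {
    return(paste0(base, "/", resource))
  }
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE,
                   USE.NAMES = FALSE)
  paste0(base, "/", resource, "?",
         paste0(names(params), "=", values, collapse = "&"))
}

sites_ch <- index_url("sites", list(site = list(country_code = "CH")))
c14_charcoal <- index_url(
  "c14s", list(sample = list(material = list(name = "charcoal")))
)
```

Each level of nesting in the list becomes one bracketed segment of the parameter name, which mirrors the join path described in the text (`sample`, then `material`, then the `name` column).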

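The inline R expressions added to paper.qmd in this commit (the sentences on Western Europe coverage and on openness) count rows of `c14_datasets` by condition. A minimal base-R sketch of that counting logic follows, on a toy table whose rows are invented for illustration. Only the column names `m49_region` and `open` come from the diff, and `open` is assumed here to be a logical column in which `TRUE` means the resource is open.

```r
# Hypothetical stand-in for c14_datasets; all values are invented.
c14_datasets <- data.frame(
  name = c("A", "B", "C", "D"),
  m49_region = c("Western Europe", "Western Europe; Northern Europe",
                 "Eastern Asia", NA),
  open = c(TRUE, FALSE, NA, TRUE),
  stringsAsFactors = FALSE
)

# Count datasets whose region string mentions a given region.
# grepl() returns FALSE for NA input, so rows with a missing region are
# simply not counted.
count_region <- function(data, region) {
  sum(grepl(region, data$m49_region, fixed = TRUE))
}

n_western_europe <- count_region(c14_datasets, "Western Europe")
n_south_asia <- count_region(c14_datasets, "Southern Asia")

# Under the assumption that TRUE means "open":
n_open <- sum(c14_datasets$open, na.rm = TRUE)       # resources flagged open
n_not_open <- sum(!c14_datasets$open, na.rm = TRUE)  # resources flagged not open
n_unknown <- sum(is.na(c14_datasets$open))           # openness not recorded
```

Under that assumption, `sum(open)` counts the open resources and `sum(!open)` the non-open ones, so which expression belongs in a sentence about resources that are not open depends on how the column is coded.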
paper_cache/pdf/__packages

+6 -4
@@ -2,6 +2,7 @@ countrycode
 cowplot
 dplyr
 dm
+english
 giscoR
 ggplot2
 glue
@@ -14,10 +15,6 @@ purrr
 readr
 RPostgres
 sf
-abind
-stars
-tidyr
-webshot2
 spatstat.data
 spatstat.geom
 spatstat.random
@@ -27,3 +24,8 @@ rpart
 spatstat.model
 spatstat.linnet
 spatstat
+abind
+stars
+stringr
+tidyr
+webshot2
