diff --git a/.nojekyll b/.nojekyll
index 49836f1..b2594ba 100644
--- a/.nojekyll
+++ b/.nojekyll
@@ -1 +1 @@
-64e27650
\ No newline at end of file
+472b07c3
\ No newline at end of file
diff --git a/index.html b/index.html
index f992f48..1a9f02a 100644
--- a/index.html
+++ b/index.html
@@ -492,7 +492,7 @@
-The CoCo Data Model is based on international standards, such as CIDOC CRM (Doerr 2003), Dublin Core, and ICA Records in Contexts to promote interoperability with other datasets. The model supports the modelling of the relevant properties of letter metadata (Letter, Actor, Place, and Time-Span, provenance i.e. MetadataRecord and archival/collection level information) from the source datasets. To represent actors in these different source datasets, we use an adaptation of the Open Archives Initiative Object Reuse and Exchange (OAI-ORE) proxy concept. The collected metadata are transformed into linked open data using an automated transformation pipeline (Drobac 2023), which consists of several steps. First, each received dataset is processed into an intermediate RDF format. Then the data are harmonised with the CoCo Data Model and enriched by linking the recognised actors and places to external resources, such as Wikidata, Finnish National Biographies, as well as the Finnish AcademySampo (Drobac 2023) and the BiographySampo (Tamper 2023). Finally, the transformation pipeline produces a harmonised dataset of correspondence metadata. The availability of the aggregated letter metadata in the linked open data format facilitates the use of the data for data exploration and answering humanities research questions by publishing the data in a semantic portal or by using SPARQL queries. The project’s semantic portal allows users to search, browse and analyse the letters, archives, actors, and places in the CoCo dataset. It is based on the Sampo model (Hyvönen 2022) and is implemented using the Sampo-UI programming framework (Ikkala 2022). The user interface works on a faceted search paradigm, which allows the user to search, e.g. for letters sent by a certain person, letters from a certain period or letters kept in a certain organisation.
Data such as the sending places can be visualised on a map, and other visualisations include the yearly distributions of letters, top correspondents, and correspondence networks. The portal also offers some network analysis figures.
+The CoCo Data Model is based on international standards, such as CIDOC CRM (Doerr 2003), Dublin Core, and ICA Records in Contexts to promote interoperability with other datasets. The model supports the modelling of the relevant properties of letter metadata (Letter, Actor, Place, and Time-Span, provenance i.e. MetadataRecord and archival/collection level information) from the source datasets. To represent actors in these different source datasets, we use an adaptation of the Open Archives Initiative Object Reuse and Exchange (OAI-ORE) proxy concept. The collected metadata are transformed into linked open data using an automated transformation pipeline (Drobac et al. 2023), which consists of several steps. First, each received dataset is processed into an intermediate RDF format. Then the data are harmonised with the CoCo Data Model and enriched by linking the recognised actors and places to external resources, such as Wikidata, Finnish National Biographies, as well as the Finnish AcademySampo (Drobac et al. 2023) and the BiographySampo (Tamper et al. 2023). Finally, the transformation pipeline produces a harmonised dataset of correspondence metadata. The availability of the aggregated letter metadata in the linked open data format facilitates the use of the data for data exploration and answering humanities research questions by publishing the data in a semantic portal or by using SPARQL queries. The project’s semantic portal allows users to search, browse and analyse the letters, archives, actors, and places in the CoCo dataset. It is based on the Sampo model (Hyvönen 2022) and is implemented using the Sampo-UI programming framework (Ikkala et al. 2022). The user interface works on a faceted search paradigm, which allows the user to search, e.g. for letters sent by a certain person, letters from a certain period or letters kept in a certain organisation. 
Data such as the sending places can be visualised on a map, and other visualisations include the yearly distributions of letters, top correspondents, and correspondence networks. The portal also offers some network analysis figures.
-We realised early on that it was extremely important to gather user experience before actually launching the CoCo portal. So, in early February 2024, we opened the portal to a test group of 17 people for a period of 2.5 months. The group was assembled partly through an open call on social media and at some conferences, and partly by asking specific people to join. The volunteers were mostly academic humanities researchers and invited specialists from two museums with 19th-century letter collections. We provided the testers with background material on the portal and the CoCo data endpoint documentation, the organisations whose data was currently in the portal, with user instructions, and with some questions or tasks to help them get started. We asked five specific questions to which we wanted answers, but we welcomed all kinds of feedback. Building an engaged and motivated community of test users proved to be a challenge. The opening online session, where the project’s portal experts taught how to use it, received only four participants. However, they were active in asking questions and satisfied with the introduction. We also offered the possibility of a joint online closing session to discuss the experiences, but only two people registered, so we cancelled it. All this shows that once the initial excitement of joining the test group had worn off, it was difficult to reach the testers and very difficult to engage them in the test or create a group spirit. People got lost in their own research and daily work. In the end, however, after a reminder, we received feedback from eight testers, i.e. from half of the group. We are still waiting for feedback from the later test group of archivists in the Finnish Literary Society. The feedback can be divided into two groups: comments on errors in the data, mostly errors in the disambiguation of actors, and comments on the functionalities and performance of the portal. The latter was what we had hoped for.
It is interesting to note that the testers, who work in CH organisations and are used to working with collection management systems and have even catalogued archival material themselves, commented more on the functionalities, while the researchers clearly focused on data errors. In our view, this shows how much hands-on experience with databases affects a humanist’s ability to study mass data material and turn into digital humanities. You have to change your mindset, as our own experience in the project has shown. We also noticed that working only with the metadata of letters was a barrier for some of the testers; in the wish list for the future development of the portal, digitised letters or their transliterations stood out. A positive result from the project’s point of view was that the testers found the portal easy to use, the user interface with its four perspectives clear and the data offered useful both for their own research and for information services in CH organisations. They were able to find unknown connections, relationships and people in the data. The metadata available seemed to respond well to their research questions and to encourage further research. The ability to do queries with places was particularly appreciated. The fact that we had to write ‘preliminary’ instructions and update the portal for the testers in advance stimulated internal discussion about the future development of the portal and how to create a smooth feedback system for data errors and functional problems. Ethical data work (Ahnert and Weingart 2020) requires us to be very open, precise and thorough in documenting what data we have received and how we have harmonised it, which organisations are included and how they are implemented in the portal. Some difficulties with functionalities, visualisations and data errors have either already been fixed or are in progress or at least under discussion for further action. 
The feedback has provided us with an insight into the difficulties that users, both ‘traditional’ humanists and members of the wider public, may encounter when using the portal, and some guidelines for us to help them learn to read digital. The feedback proves that we are on the right track towards our goal of creating a new research resource, a virtual archive that crosses organisational silos.
+We realised early on that it was extremely important to gather user experience before actually launching the CoCo portal. So, in early February 2024, we opened the portal to a test group of 17 people for a period of 2.5 months. The group was assembled partly through an open call on social media and at some conferences, and partly by asking specific people to join. The volunteers were mostly academic humanities researchers and invited specialists from two museums with 19th-century letter collections. We provided the testers with background material on the portal and the CoCo data endpoint documentation, the organisations whose data was currently in the portal, with user instructions, and with some questions or tasks to help them get started. We asked five specific questions to which we wanted answers, but we welcomed all kinds of feedback. Building an engaged and motivated community of test users proved to be a challenge. The opening online session, where the project’s portal experts taught how to use it, received only four participants. However, they were active in asking questions and satisfied with the introduction. We also offered the possibility of a joint online closing session to discuss the experiences, but only two people registered, so we cancelled it. All this shows that once the initial excitement of joining the test group had worn off, it was difficult to reach the testers and very difficult to engage them in the test or create a group spirit. People got lost in their own research and daily work. In the end, however, after a reminder, we received feedback from eight testers, i.e. from half of the group. We are still waiting for feedback from the later test group of archivists in the Finnish Literary Society. The feedback can be divided into two groups: comments on errors in the data, mostly errors in the disambiguation of actors, and comments on the functionalities and performance of the portal. The latter was what we had hoped for. 
It is interesting to note that the testers, who work in CH organisations and are used to working with collection management systems and have even catalogued archival material themselves, commented more on the functionalities, while the researchers clearly focused on data errors. In our view, this shows how much hands-on experience with databases affects a humanist’s ability to study mass data material and turn into digital humanities. You have to change your mindset, as our own experience in the project has shown. We also noticed that working only with the metadata of letters was a barrier for some of the testers; in the wish list for the future development of the portal, digitised letters or their transliterations stood out. A positive result from the project’s point of view was that the testers found the portal easy to use, the user interface with its four perspectives clear and the data offered useful both for their own research and for information services in CH organisations. They were able to find unknown connections, relationships and people in the data. The metadata available seemed to respond well to their research questions and to encourage further research. The ability to do queries with places was particularly appreciated. The fact that we had to write ‘preliminary’ instructions and update the portal for the testers in advance stimulated internal discussion about the future development of the portal and how to create a smooth feedback system for data errors and functional problems. Ethical data work (Ahnert et al. 2020) requires us to be very open, precise and thorough in documenting what data we have received and how we have harmonised it, which organisations are included and how they are implemented in the portal. Some difficulties with functionalities, visualisations and data errors have either already been fixed or are in progress or at least under discussion for further action. 
The feedback has provided us with an insight into the difficulties that users, both ‘traditional’ humanists and members of the wider public, may encounter when using the portal, and some guidelines for us to help them learn to read digital. The feedback proves that we are on the right track towards our goal of creating a new research resource, a virtual archive that crosses organisational silos.