OpenAtlas + ARCHE Meeting 2025-02-03, 12:00

Location: ACDH-CH, Bäckerstraße 13, room 2D
Information updated during the meeting is shown in color and/or marked with a ✅. Every participant is welcome to add and adapt.

Participants

  • OpenAtlas: Alexander Watzinger, Bernhard Koschiček-Krombholz, Nina Richards
  • ARCHE: Massimiliano Carloni, Mateusz Żółtak, Seta Štuhec

Topics

This meeting is about archiving concluded OpenAtlas projects in the long-term archiving system ARCHE.
Ideally, we can implement a function that exports OpenAtlas data in an ARCHE-friendly way to speed up this process for all future archiving endeavors.

  • Public protocol ok? ✅
  • Almost all information here was added during or after the meeting, so we refrained from turning everything green in this case

Project metadata

  • What would be the ideal workflow between us and the cooperation partners, e.g. regarding the deposition agreement?
    -> When a project expresses interest in archiving, simply refer it to the ARCHE team (in particular, Seta)
  • How these metadata are generated depends on the OpenAtlas workflow: either an Excel spreadsheet filled out manually by depositors, or RDF/CSV/... created automatically by extracting metadata from OpenAtlas

Model data

Currently, OpenAtlas has a PostgreSQL database as well as an API.
  • An SQL dump seems to go against the philosophy of making data independent of the application
  • An RDF dump would be the most interoperable way to deal with the data (#2467)
    • There is no high barrier regarding the content of the file; maybe just syntax validation will be implemented, but there is no best practice from the ARCHE side regarding semantics
    • Metadata can be provided as a single file or as many separate files; the important thing is that there is a mapping from the subject of a triple to the respective file/folder (e.g. when using RDF metadata, project namespace + path, where the path is relative to a base directory, example https://id.acdh.oeaw.ac.at/bitem/some_folder_name/some_file_name.pdf)
  • ARCHE can handle both (SQL and RDF) -> SQL is easier to re-import into OpenAtlas
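The mapping from a triple's subject to a file or folder (project namespace + path relative to a base directory) could be sketched as follows. This is only an illustration, not an agreed implementation: the function name `subject_uri` is hypothetical, the namespace is taken from the bITEM example above, and percent-encoding of special characters is an assumption that would need to be confirmed with the ARCHE team.

```python
from pathlib import PurePosixPath
from urllib.parse import quote

# Assumption: project namespace as in the bITEM example from the minutes.
PROJECT_NAMESPACE = "https://id.acdh.oeaw.ac.at/bitem/"


def subject_uri(relative_path: str, namespace: str = PROJECT_NAMESPACE) -> str:
    """Build a triple subject from a path relative to the base directory."""
    path = PurePosixPath(relative_path)
    # Reject paths that would point outside the base directory.
    if path.is_absolute() or ".." in path.parts:
        raise ValueError(f"path must stay inside the base directory: {relative_path}")
    # Assumption: percent-encode each segment so spaces or umlauts
    # do not produce an invalid URI.
    encoded = "/".join(quote(part) for part in path.parts)
    return namespace.rstrip("/") + "/" + encoded


print(subject_uri("some_folder_name/some_file_name.pdf"))
```

With the example path from the minutes, this yields the URI quoted above, so the same function could be used both when writing the RDF metadata and when laying out the directory to be deposited.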
One question is how much we want to provide on the metadata level. We could include named entities in the metadata to enhance findability (#2466).
  • Possible entity types in ARCHE would be:
    • Places
    • Persons
    • Organizations
  • There are properties in ARCHE to record entities like actors or places with GeoNames. However, a single file (= dump) might have hundreds or thousands of places
  • We should think about which entities might be used for search. Less relevant entities could still be deleted in the curation phase on the ARCHE side
  • There is a list of allowed namespaces (e.g. GeoNames); if needed, others can be added
  • There is the possibility to test data locally before ingestion into ARCHE (PHP tools that can be executed in Docker containers)
    • In case of errors a feedback loop with the depositors would be needed
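Before running the official local tests, depositors could do a rough pre-check of entity identifiers against the allowed namespaces. The sketch below is not part of the ARCHE tooling; `ALLOWED_NAMESPACES` and `split_by_namespace` are hypothetical names, and the list contents are assumptions (GeoNames is mentioned in the minutes, Wikidata only appears in the Appendix 2 example). The authoritative list is maintained by ARCHE.

```python
# Assumption: illustrative allow-list only; the real one is maintained by ARCHE.
ALLOWED_NAMESPACES = (
    "https://www.geonames.org/",
    "http://www.wikidata.org/entity/",
)


def split_by_namespace(uris, allowed=ALLOWED_NAMESPACES):
    """Separate identifiers into those inside an allowed namespace and the rest."""
    prefixes = tuple(allowed)
    accepted, rejected = [], []
    for uri in uris:
        if uri.startswith(prefixes):
            accepted.append(uri)
        else:
            rejected.append(uri)
    return accepted, rejected


accepted, rejected = split_by_namespace([
    "https://www.geonames.org/2761369",
    "https://example.org/places/42",
])
print("accepted:", accepted)
print("needs clarification:", rejected)
```

Identifiers that end up in the second list would be candidates for the feedback loop with the depositors, or for a request to add a further namespace.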

Files

  • Files can be anything (images, videos, PDFs...)
  • There are some recommended formats in ARCHE, but ARCHE can be flexible if needed
  • OpenAtlas has 3D models as GLB, which is preferred (easier to handle for dissemination, since everything is contained in one file)
  • There are some necessary properties (like license), but more data can be extracted to have richer metadata
  • Rights-related fields in OpenAtlas which can be used for the metadata in ARCHE:

Candidates (#2372)

  • bITEM - ready to begin
  • CONNEC - ready to begin
  • SHAHI - will still be working this year but preparation can begin
  • MoByz - to be clarified

Appendix 1. Example table summarizing requirements for binaries and metadata in ARCHE

Directory structure                 Binaries needed?   Metadata needed?
project_collection                  -                  +
project_collection/some_SQL_dump    +                  +
project_collection/some_RDF_dump    +                  +

Appendix 2. How to model exact matches in ARCHE (example in Turtle)

@prefix acdh: <https://vocabs.acdh.oeaw.ac.at/schema#> . # assumed to be the ARCHE schema namespace
<https://example.org/identifier_of_a_resource> acdh:hasSpatialCoverage <https://www.geonames.org/2761369> . # Vienna
<https://www.geonames.org/2761369> acdh:hasIdentifier <http://www.wikidata.org/entity/Q1741> .
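An export function in OpenAtlas could emit such statements automatically. The following is a minimal sketch using plain string formatting; the function name `spatial_coverage_turtle` is hypothetical, and the `acdh:` namespace is assumed to be the ARCHE schema namespace. A real export would more likely use an RDF library and take its values from the OpenAtlas database.

```python
# Assumption: namespace of the ARCHE schema behind the acdh: prefix.
ACDH = "https://vocabs.acdh.oeaw.ac.at/schema#"


def spatial_coverage_turtle(resource, place, exact_matches=()):
    """Return Turtle linking a resource to a place and the place to its exact matches."""
    lines = [f"@prefix acdh: <{ACDH}> .", ""]
    lines.append(f"<{resource}> acdh:hasSpatialCoverage <{place}> .")
    # Each additional identifier of the same place is modeled with acdh:hasIdentifier.
    for identifier in exact_matches:
        lines.append(f"<{place}> acdh:hasIdentifier <{identifier}> .")
    return "\n".join(lines) + "\n"


print(spatial_coverage_turtle(
    "https://example.org/identifier_of_a_resource",
    "https://www.geonames.org/2761369",  # Vienna
    ["http://www.wikidata.org/entity/Q1741"],
))
```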
