OpenAtlas + ARCHE Meeting 2025-02-03, 12:00¶
Location: ACDH-CH, Bäckerstraße 13, room 2D
Information updated in the course of the meeting is in color and/or marked with a ✅. Every participant is welcome to add and adapt.
Participants¶
- OpenAtlas: Alexander Watzinger, Bernhard Koschiček-Krombholz, Nina Richards
- ARCHE: Massimiliano Carloni, Mateusz Żółtak, Seta Štuhec
Topics¶
This meeting is about archiving concluded OpenAtlas projects in the long-term archiving system ARCHE.
Ideally, we can implement a function that exports OpenAtlas data in an ARCHE-friendly way to speed up this process for all future archiving endeavors.
- Public protocol ok? ✅
- Almost all information here was added in or after the meeting, so we refrained from turning everything green in this case
Project metadata¶
- What would be the ideal workflow between us and the cooperation partners, e.g. regarding the deposition agreement?
-> When a project expresses interest in archiving, simply refer them to the ARCHE team (in particular, Seta)
- How these metadata are generated depends on the OpenAtlas workflow: either an Excel spreadsheet manually filled out by the depositors, or RDF/CSV/... automatically created by extracting metadata from OpenAtlas (see the sketch below)
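For illustration, a minimal project-level metadata record, as it might be generated automatically from OpenAtlas, could look like the Turtle sketch below. The acdh: prefix follows the style of Appendix 2; the namespace URI, the class acdh:TopCollection, and the properties acdh:hasTitle and acdh:hasDescription are assumptions about the ARCHE schema, and bITEM is only used as an example project.
@prefix acdh: <https://vocabs.acdh.oeaw.ac.at/schema#> .
# Hypothetical top-level record for a deposited project, generated from OpenAtlas project metadata
<https://id.acdh.oeaw.ac.at/bitem>
    a acdh:TopCollection ;                                                    # assumed ARCHE class for a deposited project
    acdh:hasTitle "bITEM"@en ;                                                # project name taken from OpenAtlas
    acdh:hasDescription "Concluded OpenAtlas project archived in ARCHE."@en . # project description taken from OpenAtlas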
Model data¶
Currently, OpenAtlas has a PostgreSQL database as well as an API.
- A SQL dump seems to go against the philosophy of making data independent of the application
- An RDF dump would be the most interoperable way to deal with the data (#2467)
- There is no high barrier regarding the content of the file; possibly only syntax validation will be implemented, and there is no best practice from the ARCHE side regarding semantics
- Metadata can be provided as a single file or as many separate files; the important thing is that there is a mapping from the subject of a triple to the respective file/folder (e.g. when using RDF metadata, project namespace + path, where the path refers to a base directory; example: https://id.acdh.oeaw.ac.at/bitem/some_folder_name/some_file_name.pdf; see the sketch after this list)
- ARCHE can do both (SQL and RDF) --> SQL is easier to re-import into OpenAtlas
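A minimal sketch of the subject-to-file mapping mentioned above, reusing the bITEM example URI; the class acdh:Resource and the property acdh:hasTitle are assumptions about the ARCHE schema.
@prefix acdh: <https://vocabs.acdh.oeaw.ac.at/schema#> .
# The subject URI is project namespace + path, so these triples can be matched
# to the file some_folder_name/some_file_name.pdf below the base directory
<https://id.acdh.oeaw.ac.at/bitem/some_folder_name/some_file_name.pdf>
    a acdh:Resource ;
    acdh:hasTitle "some_file_name.pdf"@en .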
One question is how much we want to provide on the metadata level. We could include named entities in the metadata to enhance findability (#2466); see the sketch after the list below.
- Entities that could possibly be included in ARCHE:
- Places
- Persons
- Organizations
- There are properties in ARCHE to record entities like actors or places with GeoNames identifiers, but there might be hundreds or thousands of places for a single file (= dump)
- We should think about which entities might be used for search. Less relevant entities could still be deleted in the curation phase on the ARCHE side
- There is a list of allowed namespaces (e.g. GeoNames); if needed, others can be added
- There is the possibility to test data locally before ingestion into ARCHE (PHP tools that can be executed in Docker containers)
- In case of errors, a feedback loop with the depositors would be needed
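A sketch of named entities attached to a single dump file to enhance findability (#2466): acdh:hasSpatialCoverage and the Vienna GeoNames URI are taken from Appendix 2 (GeoNames is on the list of allowed namespaces), while acdh:hasActor and the actor URI are assumptions used only for illustration.
@prefix acdh: <https://vocabs.acdh.oeaw.ac.at/schema#> .
# Named entities recorded on the dump itself (place, person/organization)
<https://id.acdh.oeaw.ac.at/bitem/some_RDF_dump>
    acdh:hasSpatialCoverage <https://www.geonames.org/2761369> ;    # Vienna (place)
    acdh:hasActor <https://id.acdh.oeaw.ac.at/bitem/some_actor> .   # hypothetical person or organization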
Files¶
- Files can be anything (images, videos, PDFs...)
- There are some recommended formats in ARCHE, but ARCHE can be flexible if needed
- OpenAtlas has 3D models as GLB, which is preferred (easier to handle for dissemination, since GLB files contain everything in one file)
- There are some necessary properties (like license), but more data can be extracted to have richer metadata
- Rights-related fields in OpenAtlas which can be used for the metadata in ARCHE:
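The concrete OpenAtlas fields were not recorded in the protocol. As a hedged illustration of the ARCHE side only, license information on a file might be expressed as below; acdh:hasLicense is an assumption about the ARCHE schema, and the Creative Commons URI stands in for whatever value the ARCHE-controlled vocabulary prescribes.
@prefix acdh: <https://vocabs.acdh.oeaw.ac.at/schema#> .
# License derived from the rights-related fields in OpenAtlas
<https://id.acdh.oeaw.ac.at/bitem/some_folder_name/some_file_name.pdf>
    acdh:hasLicense <https://creativecommons.org/licenses/by/4.0/> .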
Candidates (#2372)¶
- bITEM - ready to begin
- CONNEC - ready to begin
- SHAHI - will still be working this year but preparation can begin
- MoByz - to be clarified
Appendix 1. Example table summarizing requirements for binaries and metadata in ARCHE
| Directory structure | Binaries needed? | Metadata needed? |
|---|---|---|
| project_collection | - | + |
| project_collection/some_SQL_dump | + | + |
| project_collection/some_RDF_dump | + | + |
Appendix 2. How to model exact matches in ARCHE (example in Turtle)
<https://example.org/identifier_of_a_resource> acdh:hasSpatialCoverage <https://www.geonames.org/2761369>. # Vienna
<https://www.geonames.org/2761369> acdh:hasIdentifier <http://www.wikidata.org/entity/Q1741>.