Feature #1848
Updated by Massimiliano Carloni about 2 years ago
For the INDIGO project we are trying to import data already stored in ARCHE to OpenAtlas.
h1. Info on INDIGO test collection on ARCHE
A test collection with data provided by Geert Verhoeven and Benjamin Wild was imported into the "staging instance":https://arche-curation.acdh-dev.oeaw.ac.at of ARCHE, hosted on Minerva. The test collection has the identifier https://id.acdh.oeaw.ac.at/indigo_test, which automatically resolves to the collection's details page on ARCHE staging (i.e., https://arche-curation.acdh-dev.oeaw.ac.at/browser/oeaw_detail/1390136).
h2. Arrangement of the collection
The main collection, *INDIGO Test Collection*, is what ARCHE calls a *Top Collection*: the main folder that contains all the data related to the collection (https://id.acdh.oeaw.ac.at/indigo_test).
This contains two *Collections*, i.e. two "sub-folders", which correspond to the two batches of data sent by Benjamin and Geert:
# *Large Ortophotos* (https://id.acdh.oeaw.ac.at/indigo_test/large_orthos)
Four large TIFF files sent by Benjamin Wild, with sizes ranging from 270 MB to 4 GB.
# *Test Photos* (https://id.acdh.oeaw.ac.at/indigo_test/test_photos)
Eight test photos sent by Geert Verhoeven, including two color checkers, in different formats and with accompanying metadata files.
Each file contained in these Collections is called a *Resource* in the ARCHE ontology.
h2. Test Photos
I would suggest starting with the *Test Photos* collection, since it is currently the most complete one, covering different file formats and metadata.
More precisely, each picture was provided by Geert in both *JPG* and *NEF* format (Nikon's proprietary RAW format) and is accompanied by an *XMP sidecar file* containing metadata about the picture. More information about the different metadata formats can be found in Geert's "info document":https://docs.google.com/document/d/1cFZLY2I8HV9FE9s18v883R3vboTQu6gp/edit?usp=share_link&ouid=103619898417018108798&rtpof=true&sd=true.
In addition, each picture was processed by means of "ExifTool":https://exiftool.org. All the metadata contained in the JPG file, NEF file, and XMP file were combined into one single JSON file, where each line contains a specific property with a tag identifying its metadata schema. For example: @"IPTC:Sub-location": "Donaukanal"@. These metadata files are identified by the suffix @_metadata@. When specific metadata properties coming from different files (JPG, NEF, XMP) did not have the same value in each of the sources, they were moved to a different metadata file, identified by the suffix @_not_unique_values@.
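To make the split between the two JSON files more concrete, here is a minimal Python sketch of how such a merge could work. It is not the script that was actually used for the test collection. The function names @split_metadata@ and @group_by_schema@ are hypothetical, and all sample values except @"IPTC:Sub-location": "Donaukanal"@ are made up. It also assumes that a property found in only one source counts as unique, which may differ from the actual procedure.

```python
from typing import Any, Dict, Tuple

Tags = Dict[str, Any]


def split_metadata(sources: Dict[str, Tags]) -> Tuple[Tags, Dict[str, Tags]]:
    """Merge per-file metadata into (unique, not_unique).

    `sources` maps a source label (e.g. "jpg", "nef", "xmp") to its
    "Schema:Property" -> value dictionary.
    """
    unique: Tags = {}
    not_unique: Dict[str, Tags] = {}
    all_keys = sorted({key for tags in sources.values() for key in tags})
    for key in all_keys:
        # Collect the value of this property from every source that has it.
        values = {name: tags[key] for name, tags in sources.items() if key in tags}
        distinct = []
        for value in values.values():
            if value not in distinct:
                distinct.append(value)
        if len(distinct) == 1:
            # Same value everywhere (ASSUMPTION: or present in one source only).
            unique[key] = distinct[0]
        else:
            # Conflicting values: keep them all, labelled by source.
            not_unique[key] = values
    return unique, not_unique


def group_by_schema(metadata: Tags) -> Dict[str, Tags]:
    """Group "Schema:Property" keys by their schema tag."""
    grouped: Dict[str, Tags] = {}
    for key, value in metadata.items():
        schema, sep, prop = key.partition(":")
        if not sep:
            # Key without a schema tag: file it under an empty schema name.
            schema, prop = "", key
        grouped.setdefault(schema, {})[prop] = value
    return grouped


# Hypothetical sample; only the IPTC value is taken from the text above.
SAMPLE_SOURCES = {
    "jpg": {"IPTC:Sub-location": "Donaukanal", "EXIF:ImageWidth": 4128},
    "nef": {"IPTC:Sub-location": "Donaukanal", "EXIF:ImageWidth": 8256},
    "xmp": {"IPTC:Sub-location": "Donaukanal", "XMP:Rating": 3},
}
```

With this sample, @split_metadata(SAMPLE_SOURCES)@ puts @IPTC:Sub-location@ and @XMP:Rating@ into the first dictionary (the @_metadata@ side) and @EXIF:ImageWidth@ into the second (the @_not_unique_values@ side), keeping one value per source.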
Each of these metadata files is of class *Metadata* in the ARCHE ontology, and it is linked to the original file through the property @acdh:isMetadataFor@. You can also see this relationship in the GUI, on the Details page of a metadata file (e.g., https://arche-curation.acdh-dev.oeaw.ac.at/browser/oeaw_detail/1390181).
Alternatively, if you view the Details of the original file (e.g., https://arche-curation.acdh-dev.oeaw.ac.at/browser/oeaw_detail/1390166), you can find the same information by switching to the *Expert-View* (which is generally very useful for viewing more metadata about a resource) and then scrolling to the *Inverse Data* section.
Therefore, given @INDIGO_2022-07-22_Z7II-A_0007@ as the name of one picture, you can find five different resources for this picture in the *Test Photos* collection:
# @INDIGO_2022-07-22_Z7II-A_0007.jpg@
# @INDIGO_2022-07-22_Z7II-A_0007.nef@
# @INDIGO_2022-07-22_Z7II-A_0007.xmp@
# @INDIGO_2022-07-22_Z7II-A_0007_metadata.json@
# @INDIGO_2022-07-22_Z7II-A_0007_not_unique_values.json@
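For the import into OpenAtlas it may help to group the resources of one picture by their shared name. The following Python sketch derives the five expected file names from a picture name, and recovers the picture name from a file name. The helper names @expected_resources@ and @picture_name_of@ are hypothetical, and the suffixes are assumed to be spelled exactly as in the list above (lowercase extensions).

```python
from typing import List, Optional

# Suffixes of the five resources listed above for each picture.
RESOURCE_SUFFIXES = (
    ".jpg",
    ".nef",
    ".xmp",
    "_metadata.json",
    "_not_unique_values.json",
)


def expected_resources(picture_name: str) -> List[str]:
    """Return the five resource file names expected for one picture."""
    return [picture_name + suffix for suffix in RESOURCE_SUFFIXES]


def picture_name_of(resource_name: str) -> Optional[str]:
    """Return the picture name a resource belongs to, or None if unknown."""
    # Try the longest suffixes first so the most specific one is matched.
    for suffix in sorted(RESOURCE_SUFFIXES, key=len, reverse=True):
        if resource_name.endswith(suffix):
            return resource_name[: -len(suffix)]
    return None
```

For example, @picture_name_of("INDIGO_2022-07-22_Z7II-A_0007_metadata.json")@ returns @INDIGO_2022-07-22_Z7II-A_0007@, so all five resources of a picture can be collected under the same key.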