Feature #1848

Updated by Alexander Watzinger about 1 year ago

For the INDIGO project we are trying to import data already stored in ARCHE into OpenAtlas. 

 h1. Info on INDIGO test collection on ARCHE 

 A test collection with data provided by Geert Verhoeven and Benjamin Wild was imported into the "staging instance":https://arche-curation.acdh-dev.oeaw.ac.at of ARCHE, hosted on Minerva. The test collection has identifier https://id.acdh.oeaw.ac.at/indigo_test, which automatically resolves to the page with the details of the collection on ARCHE staging (i.e., https://arche-curation.acdh-dev.oeaw.ac.at/browser/oeaw_detail/1390136). 

 h2. Arrangement of the collection 

 The main collection *INDIGO Test Collection* is what ARCHE calls a *Top Collection*: the main folder that contains all the data related to the collection (https://id.acdh.oeaw.ac.at/indigo_test). 

 This contains two *Collections*, i.e. two "sub-folders", which correspond to the two batches of data sent by Benjamin and Geert: 
 # *Large Orthophotos* (https://id.acdh.oeaw.ac.at/indigo_test/large_orthos) 
 Four large TIFF files sent by Benjamin Wild, with sizes ranging from 270 MB to 4 GB. 
 # *Test Photos* (https://id.acdh.oeaw.ac.at/indigo_test/test_photos) 
 Eight test photos sent by Geert Verhoeven, including two color checkers, in different formats and with accompanying metadata files. 

 Each file contained in these Collections is called a *Resource* in the ARCHE ontology. 

 h2. Test Photos 

 I would suggest starting with the *Test Photos* collection, since it is currently the most complete one, with different formats and metadata. 
 More precisely, each picture was provided by Geert in both *JPG* and *NEF* format (Nikon's proprietary RAW format) and is accompanied by an *XMP sidecar file* containing metadata about the picture. More information about the different metadata formats can be found in Geert's "info document":https://docs.google.com/document/d/1cFZLY2I8HV9FE9s18v883R3vboTQu6gp/edit?usp=share_link&ouid=103619898417018108798&rtpof=true&sd=true. 

 In addition, each picture was processed by means of "ExifTool":https://exiftool.org. All the metadata contained in the JPG file, NEF file, and XMP file were combined into one single JSON file, where each line contains a specific property with a tag identifying its metadata schema. For example: @"IPTC:Sub-location": "Donaukanal"@. These metadata files are identified by the suffix @_metadata@. When specific metadata properties coming from different files (JPG, NEF, XMP) did not have the same value in each of the sources, they were moved to a different metadata file, identified by the suffix @_not_unique_values@. 
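
 The split between the two JSON files can be sketched roughly as follows. This is only an illustration of the rule described above, not the script that was actually used: the function name @split_metadata@, the @EXIF:ISO@ values, the internal layout of the conflicting entries, and the choice to treat a tag found in only one source as unique are all assumptions.

```python
def split_metadata(sources):
    """Split per-source tag dictionaries into agreed and conflicting tags.

    `sources` maps a source label (e.g. "jpg") to its {tag: value} dict.
    Returns (unique, not_unique). Hypothetical helper, for illustration only.
    """
    unique, not_unique = {}, {}
    all_tags = sorted({tag for tags in sources.values() for tag in tags})
    for tag in all_tags:
        # Collect the value of this tag from every source that has it.
        values = {label: tags[tag] for label, tags in sources.items() if tag in tags}
        # repr() lets us compare unhashable values such as lists.
        distinct = {repr(value) for value in values.values()}
        if len(distinct) == 1:
            # All sources agree (assumption: a tag seen in one source counts too).
            unique[tag] = next(iter(values.values()))
        else:
            # Sources disagree: keep every value, keyed by its source.
            not_unique[tag] = values
    return unique, not_unique


# Made-up sample data; only "IPTC:Sub-location" comes from the text above.
sources = {
    "jpg": {"IPTC:Sub-location": "Donaukanal", "EXIF:ISO": 100},
    "nef": {"EXIF:ISO": 200},
    "xmp": {"IPTC:Sub-location": "Donaukanal"},
}
unique, not_unique = split_metadata(sources)
```

 In this sketch, @unique@ would correspond to the content of a @_metadata@ file and @not_unique@ to that of a @_not_unique_values@ file.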

 Each of these metadata files is of class *Metadata* in the ARCHE ontology and is linked to the original file through the property @acdh:isMetadataFor@. You can also see the relationship in the GUI by viewing the Details page of a metadata file (e.g., https://arche-curation.acdh-dev.oeaw.ac.at/browser/oeaw_detail/1390181): 

 !{width:800px}GUI_isMetadataFor.png! 

 Otherwise, if you view the Details of the original file (e.g., https://arche-curation.acdh-dev.oeaw.ac.at/browser/oeaw_detail/1390166), you can find the info by switching to the *Expert-View* (which is in general very useful for viewing more metadata about a resource): 

 !{width:800px}GUI_ExpertView.png! 

 and then scrolling to the *Inverse Data* section: 

 !{width:800px}GUI_InverseData.png! 

 Therefore, given @INDIGO_2022-07-22_Z7II-A_0007@ as the name of one picture, you can find five different resources about this picture in the *Test Photos* collection: 
 # @INDIGO_2022-07-22_Z7II-A_0007.jpg@ 
 # @INDIGO_2022-07-22_Z7II-A_0007.nef@ 
 # @INDIGO_2022-07-22_Z7II-A_0007.xmp@ 
 # @INDIGO_2022-07-22_Z7II-A_0007_metadata.json@ 
 # @INDIGO_2022-07-22_Z7II-A_0007_not_unique_values.json@ 
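
 For an import script, the naming scheme above could be captured with a small helper. This is a minimal sketch that assumes every picture follows exactly this five-file pattern; the names @RESOURCE_SUFFIXES@, @resource_names@ and @picture_name_of@ are hypothetical and not part of ARCHE or OpenAtlas.

```python
# Suffixes of the five resources listed above for each picture.
RESOURCE_SUFFIXES = (
    ".jpg",
    ".nef",
    ".xmp",
    "_metadata.json",
    "_not_unique_values.json",
)


def resource_names(picture_name):
    """Return the five expected resource file names for one picture."""
    return [picture_name + suffix for suffix in RESOURCE_SUFFIXES]


def picture_name_of(resource_name):
    """Return the picture name a resource belongs to, or None if no suffix matches."""
    # Try longer suffixes first so the most specific one wins.
    for suffix in sorted(RESOURCE_SUFFIXES, key=len, reverse=True):
        if resource_name.endswith(suffix):
            return resource_name[: -len(suffix)]
    return None


names = resource_names("INDIGO_2022-07-22_Z7II-A_0007")
```

 Grouping resources by @picture_name_of@ would let the JPG, NEF, XMP and both JSON files of one picture be handled together during the import.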
