Feature #2568

Updated by Bernhard Koschiček-Krombholz 5 months ago

Create an admin interface with the magical one-click button to create an all-inclusive ARCHE dump. 

 This dump should include: 
 * Files sorted by extension ✅ 
 * Metadata for files in a ttl (#2466) ✅ 
 * Lists of failed files due to ARCHE restrictions (no license, no creator, no license holder, etc.) ✅ 
 * Stripped SQL dump ✅ (currently it still contains all data)  
 * RDF dump (#2551) ✅ 
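
The "files sorted by extension" part of the dump could look roughly like the following. This is only a minimal sketch, not the actual OpenAtlas code: the function name @group_by_extension@ and the @no_extension@ bucket are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def group_by_extension(paths):
    """Map each lower-cased extension (without the dot) to its sorted file names."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        # Files without a suffix go into a hypothetical "no_extension" bucket.
        key = path.suffix.lower().lstrip(".") or "no_extension"
        groups[key].append(path.name)
    return {ext: sorted(names) for ext, names in sorted(groups.items())}
```

Each key of the returned mapping could then become one subfolder of the dump.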

 One major issue is how the ARCHE metadata is stored in or transferred into OpenAtlas. Currently, the data is stored in @production.py@, which is awkward to maintain.  

 Todo:  
 * change folder structure into @data@, @metadata@, and @debug@ 
 * maybe add statistics to debug (total size, number of files, number of folders, etc.) -> also used for #2580 
 * maybe check for duplicate files via hashes (but ARCHE will do the same with their file checker) 
 * SQL dump with only the project data 
 * if there is a reference, add the reference as a named entity of class acdh:Publication and link it to the media file with acdh:isSourceOf 
 * Enrich the description of files 
 ** if the linked entity has an external reference system URL, check it with "arche_assets":https://github.com/acdh-oeaw/arche-assets?tab=readme-ov-file#python and, if it is correct, add a link to a new entity (only for Actors and Places) 
 ** if there is no external reference system, just add a named entity 
 * convert URL to ASCII 
 * Add file checker 
 ** No license 
 ** No creator 
 ** No license holder 
 ** Maybe duplicated file hash 
 * write manual entry (what is needed, where and by whom metadata can be entered, who can export, which file checkers exist)
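
The file checker from the todo list could be sketched as below. This is a rough illustration under stated assumptions, not the OpenAtlas implementation: @FileRecord@, its field names, and @check_files@ are hypothetical, and SHA-256 is just one possible choice of hash for the duplicate check.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileRecord:
    # Hypothetical record; the field names are not taken from OpenAtlas.
    name: str
    content: bytes
    license: Optional[str] = None
    creator: Optional[str] = None
    license_holder: Optional[str] = None


def check_files(records):
    """Return a dict mapping each check name to the file names that fail it."""
    failed = {
        "no_license": [],
        "no_creator": [],
        "no_license_holder": [],
        "duplicate_hash": [],
    }
    by_hash = defaultdict(list)
    for record in records:
        if not record.license:
            failed["no_license"].append(record.name)
        if not record.creator:
            failed["no_creator"].append(record.name)
        if not record.license_holder:
            failed["no_license_holder"].append(record.name)
        # Group files by content hash to find possible duplicates.
        digest = hashlib.sha256(record.content).hexdigest()
        by_hash[digest].append(record.name)
    for names in by_hash.values():
        if len(names) > 1:
            failed["duplicate_hash"].extend(names)
    return failed
```

The resulting lists could be written into the @debug@ folder, one file per check, so that missing metadata can be fixed before the next export.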
