Feature #2568
Admin interface for generating ARCHE dumps
Status:
Closed
Priority:
Normal
Assignee:
Category:
Backend
Target version:
Start date:
2025-06-24
Estimated time:
Description
Create an admin interface with the magical one-click button that creates an all-inclusive ARCHE dump.
This dump should include:
- Files sorted by extension ✅
- Metadata for files in a ttl (#2466) ✅
- Lists of failed files due to ARCHE restrictions (no license, no creator, no license holder, etc.) ✅
- Stripped SQL dump ✅ (currently, all data is in there)
- RDF dump (#2551) ✅
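As an illustration of the "files sorted by extension" item, here is a minimal sketch. The function name and the "no_extension" bucket are assumptions made for this example and are not taken from the OpenAtlas code base.

```python
from collections import defaultdict
from pathlib import Path


def group_by_extension(paths: list[Path]) -> dict[str, list[Path]]:
    """Group file paths by their lower-cased extension (without the dot)."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in paths:
        # Assumption: files without a suffix go into a "no_extension" bucket
        extension = path.suffix.lower().lstrip('.') or 'no_extension'
        groups[extension].append(path)
    return dict(groups)
```

Each key could then be used as a subfolder name inside the dump, for example one folder per extension.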
One major issue is how the ARCHE metadata is stored in or transferred into OpenAtlas. Currently, the data is stored in production.py, which is cumbersome to maintain.
Done:
- change folder structure into data, metadata, and debug ✅
- add statistics to debug (how large, how many files, how many folders, etc.) ✅
- check for duplicate files via hashes ✅
- add the reference as named entity of class acdh:Publication and link it with acdh:isSourceOf ✅
- convert URL to ASCII ✅
- Enrich the description of files ✅
- if the linked entity has an external reference system URL, check it against arche_assets; if it is correct, add a link to a new entity (only for Actors and Places) ✅
- if there is no external reference system, just add a named entity ✅
- Add file checker ✅ --> functionality will be used in #2580
- create API endpoint for ARCHE metadata ✅
- write manual entry (what is needed, where and who can enter metadata, who can export, which file checkers are there) ✅
- SQL dump with only the project data -> #2613
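The duplicate check via hashes from the list above could look roughly like the following sketch. SHA-256 and both function names are assumptions; the issue does not state which hash algorithm or helpers are actually used.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_hash(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open('rb') as file:
        for chunk in iter(lambda: file.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directory: Path) -> dict[str, list[Path]]:
    """Map each hash that occurs more than once to the files sharing it."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(directory.rglob('*')):
        if path.is_file():
            by_hash[file_hash(path)].append(path)
    return {key: paths for key, paths in by_hash.items() if len(paths) > 1}
```

Reading in chunks keeps memory usage constant even for large media files.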
Files
Updated by Bernhard Koschiček-Krombholz 6 months ago
- Related to Feature #2551: Admin interface for generating RDF dumps added
- Related to Feature #2466: API: Export files with ARCHE RDF metadata added
Updated by Bernhard Koschiček-Krombholz 6 months ago
The view is completed. Things to discuss:
- Where to store the ARCHE relevant metadata
  - stored in config.py/production.py (current solution)
  - upload a json/toml/yml with the metadata, which is discarded at the end of the process
  - store it in the database
  - store different configuration in files/ with json/toml/yml
  - ...
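To make the "upload a json" option more concrete, here is a minimal sketch of reading and validating such an upload. The key names in REQUIRED_KEYS and the function name are purely hypothetical; the real set of ARCHE metadata fields is not specified in this issue.

```python
import json
from typing import Any

# Hypothetical required keys, chosen only for illustration
REQUIRED_KEYS = ('topCollection', 'license', 'licenseHolder')


def load_arche_metadata(raw: str) -> dict[str, Any]:
    """Parse uploaded JSON text and check that the expected keys exist."""
    metadata = json.loads(raw)
    if not isinstance(metadata, dict):
        raise ValueError('ARCHE metadata must be a JSON object')
    missing = [key for key in REQUIRED_KEYS if key not in metadata]
    if missing:
        raise ValueError(f'Missing ARCHE metadata keys: {", ".join(missing)}')
    return metadata
```

With this approach nothing has to be persisted: the parsed dictionary lives only for the duration of the export and is discarded afterwards.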
Updated by Bernhard Koschiček-Krombholz 6 months ago
- Status changed from In Progress to Resolved
Updated by Bernhard Koschiček-Krombholz 5 months ago
- File TopCollection.xlsx added
There is a CSV for the topCollection. Maybe use this instead of the config variables.
Updated by Bernhard Koschiček-Krombholz 5 months ago
- Status changed from Resolved to In Progress
Put it back to In Progress, there are still some things to do:
- topCollection information as csv: will be included manually
- change folder structure into data, metadata, and debug
- maybe add statistics to debug (how large, how many files, how many folders, etc.) -> also used for #2580
- maybe check for duplicate files via hashes (but ARCHE will do the same with their file checker)
- SQL dump with only the project data
- if there is a reference, add the reference as named entity of class acdh:Publication and link it to media file with acdh:isSourceOf
- Enrich the description of files
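The debug statistics mentioned in this list (how large, how many files, how many folders) could be collected with a sketch like the one below. The function name and the keys of the returned dictionary are assumptions for illustration.

```python
from pathlib import Path


def directory_statistics(directory: Path) -> dict[str, int]:
    """Count files and folders below a directory and sum the file sizes."""
    files = 0
    folders = 0
    size = 0
    for path in directory.rglob('*'):
        if path.is_dir():
            folders += 1
        elif path.is_file():
            files += 1
            size += path.stat().st_size
    # The root directory itself is not counted as a folder
    return {'files': files, 'folders': folders, 'bytes': size}
```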
Updated by Bernhard Koschiček-Krombholz 5 months ago
- Description updated (diff)
Again new todos:
- convert URL to ASCII
- Add file checker
  - No license
  - No creator
  - No license holder
  - Maybe duplicated file hash
- write manual entry (what is needed, where and who can enter metadata, who can export, which file checkers are there)
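A file checker for the three metadata restrictions named above might be sketched as follows. FileRecord is a hypothetical stand-in and not the OpenAtlas file model; the attribute names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file entity, for illustration only."""
    name: str
    license: Optional[str] = None
    creator: Optional[str] = None
    license_holder: Optional[str] = None


def check_file(record: FileRecord) -> list[str]:
    """Return the list of ARCHE restrictions the file does not meet."""
    problems = []
    if not record.license:
        problems.append('no license')
    if not record.creator:
        problems.append('no creator')
    if not record.license_holder:
        problems.append('no license holder')
    return problems
```

Files with a non-empty result would end up in the list of failed files, while an empty list means the file passes these checks.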
Updated by Bernhard Koschiček-Krombholz 5 months ago
- Description updated (diff)
Updated by Bernhard Koschiček-Krombholz 5 months ago
- Related to Feature #2580: Report generation for ARCHE import issues added
Updated by Bernhard Koschiček-Krombholz 4 months ago
- Description updated (diff)
- Status changed from In Progress to Resolved
Updated by Bernhard Koschiček-Krombholz 4 months ago
- Related to Feature #2613: ARCHE export: SQL dump added
Updated by Bernhard Koschiček-Krombholz 3 months ago
- Status changed from Resolved to Closed