Project

General

Profile

Actions

Feature #1617

closed

CONNEC: Transform sub events links to source links

Added by Alexander Watzinger almost 3 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
Normal
Category:
-
Target version:
-
Start date:
2021-12-09
Estimated time:

Description

In the course of the presentation site development we discovered relations between events and sub events in CONNEC that needs fixing to a "this letter mentioned another letter" relation.
Because they are too many to solve manually we will think about an automatic solution.

Test link provide by Becca: https://connec.acdh.oeaw.ac.at/entity/13064


Files

connec_letter.png (33.3 KB) connec_letter.png Alexander Watzinger, 2022-01-12 15:48
Actions #1

Updated by Alexander Watzinger almost 3 years ago

  • Description updated (diff)
  • Status changed from Assigned to In Progress
I took a look at it, there are 623 links to sub events so fixing them manually won't be an option.
After giving it some thought I think it would be best to:
  • Link the artifact of the sub event to the source of the artifact of the super event
  • Delete the sub event link

In the model this would than mean that in the source (of the artifact of the super event) other letters (the artifacts of the sub event) are mentioned.

I will have to dig deeper to see if all prerequisites are full filled, e.g. every event has an artifact, there are no sub events of sub events.
If everything looks ok it could be done with an SQL statement, otherwise we have to think about writing a Python script because this won't be doable just with plain SQL.

I'll keep you updated.

Actions #2

Updated by Alexander Watzinger almost 3 years ago

It took some time but I managed to come up with a pure SQL solution. We can look together into it at the next meeting.
This script can be used by an admin (e.g. me) at admin/SQL execute. The xxx have to be replaced (3 times) with the id of the "super" event and will than replace sub events links with links from the source of the super event to artifacts of the sub events.
It has it's limitation, e.g. that
  • We have to do one super event at a time
  • The super event must have sub events
  • The super event is connected to an (moved) artifact which is connected to a source
  • The sub events are connected to an (moved) artifact

In practice we can just try it out. Because code submitted in the backend SQL interface is in a transaction it will do everything (if it works ok) or nothing.

INSERT INTO model.link (domain_id, property_code, range_id)
SELECT domain.range_id, 'P67', range.range_id FROM

(SELECT event_artifact.range_id
FROM model.link sub_super
JOIN model.link event_artifact
  ON sub_super.domain_id = event_artifact.domain_id
  AND event_artifact.property_code = 'P25'
  AND sub_super.property_code = 'P117'
WHERE sub_super.range_id = xxx) domain
,
(SELECT artifact_source.range_id
FROM model.link artifact_source
JOIN model.link event_artifact 
    ON artifact_source.domain_id = event_artifact.range_id
    AND event_artifact.property_code = 'P25'
    AND artifact_source.property_code = 'P128'    
WHERE event_artifact.domain_id = xxx) range;

DELETE FROM model.link WHERE property_code = 'P117' AND range_id = xxx;

Actions #3

Updated by Alexander Watzinger almost 3 years ago

Actions #4

Updated by Alexander Watzinger almost 3 years ago

After showing how it works in today's meeting with decided:

  • We will run the script on the test server so that we can take a look at the changes without risking unwanted changes in the productive system
  • In cases where transformation isn't possible the "is sub event of" link will be kept and a list of them will be sent to check them
  • Once we are satisfied with the result we can run the script on the productive data
Actions #5

Updated by Alexander Watzinger almost 3 years ago

I wrote a script which transforms super events where possible but leaves the one that couldn't be transformed.
The transformed database was uploaded to the test server and a list of the problematic ones (233 of 627) was sent to the CONNEC team.

Actions #6

Updated by Alexander Watzinger almost 3 years ago

Becca looked into the provided test instance and the script seems to work out ok.
But because of problems with different PostgreSQL server versions we decided to wait for the server move before running the script on the productive data.

Actions #7

Updated by Alexander Watzinger almost 3 years ago

Closing this issue because script was already executed for CONNEC data on new server.

Actions #8

Updated by Alexander Watzinger almost 3 years ago

  • Status changed from In Progress to Closed
Actions

Also available in: Atom PDF