Bug #2357

closed

Wrong direction for reference links to files

Added by Stefan Eichert 2 months ago. Updated about 2 months ago.

Status: Closed
Priority: Normal
Category: CRM
Target version:
Start date: 2024-10-08
Estimated time:
Found in version:

Description

Adding a reference to a file should make the reference refer to the file (the file is the range of the link), e.g. to cite the publication an image was scanned from.
See for example https://thanados.openatlas.eu/entity/196385#tab-reference and the following query:

SELECT l.id, e.name, e.id, l.property_code, e2.name, e2.id
FROM model.link l
JOIN model.entity e ON e.id = l.domain_id
JOIN model.entity e2 ON e2.id = l.range_id
WHERE 196385 IN (domain_id, range_id)
AND property_code = 'P67'

This results in:

link_id | name1 | id1 | property_code | name2 | id2
671041 | Portrait of Ferdinand von Hochstetter | 196385 | P67 | Ferdinand von Hochstetter | 196330
671042 | Hochstetter 1885 | 196077 | P67 | Portrait of Ferdinand von Hochstetter | 196385
899449 | Portrait of Ferdinand von Hochstetter | 196385 | P67 | https://doi.org/10.57756/b291ht | 237493

The first link means that the portrait (196385, file) of Hochstetter shows him (196330); this is correct.
The second means that the book (196077, document) refers to the image (196385, file), i.e. the image is taken from the book.
The third one seems buggy, as it should look like the second: the external reference (in this case a random DOI) should refer to the file, not the file to the external reference.

Actions #1

Updated by Alexander Watzinger 2 months ago

  • Subject changed from wrong direction for external references with files to Wrong direction for external references with files
  • Category set to CRM
  • Status changed from New to Assigned
  • Assignee set to Alexander Watzinger
  • Target version set to 8.8.0

Thank you for reporting, especially for providing detailed information.

This is something I have to test and think through, but I will try to do it soon.

Actions #2

Updated by Stefan Eichert 2 months ago

Thanks for the quick reply.
I have just seen that in the UI the external reference is shown on the file page: https://thanados.openatlas.eu/entity/234690#tab-reference
but the reference page does not show the file: https://thanados.openatlas.eu/entity/237492#tab-file

Actions #3

Updated by Alexander Watzinger 2 months ago

We took a look at it at the developer meeting yesterday. If we understood correctly, the issue only affects links between files and references and could be solved by inverting the link direction.
If you (Stefan) could be so kind as to confirm this, we would change it in the application and write an SQL update for existing data.
In case we missed the point, we should discuss it further.

And I will also take a look at the entries missing from the reference detail view mentioned in your last comment.

Actions #4

Updated by Stefan Eichert 2 months ago

Yes, I can confirm. Basically there are two different cases:
1. The file/document is a depiction of a certain entity, e.g. the "self portrait of Vincent van Gogh" - P67 refers to - "Actor Vincent van Gogh".
2. The file/document is referred to by another document, e.g. the document "Gesammelte Werke von Vincent van Gogh - Exhibition Catalog" - refers to - "self portrait of Vincent van Gogh" on page 31. This works well in OpenAtlas, but not if the document is an external reference: e.g. "Wikimedia Commons Url for self portrait of Vincent van Gogh" - refers to - "self portrait of Vincent van Gogh (file in OA Database)" does not work properly (the link is inverted).

Actions #5

Updated by Alexander Watzinger 2 months ago

I took a closer look at the implementation and have to say, it's definitely a good catch by you, Stefan.
E.g. in the reference tab of the file detail view, linking to an existing bibliography creates the link in a different direction than creating (and linking) a new one. So it is definitely a bug which we should fix.

I can fix the code and also write an update SQL to fix existing entries, but before that I want to make sure we get it right, otherwise this could turn really messy.
So my question is: can we define a basic link strategy for "refers to", where "always used as" also covers combinations with all other entities (actor, event, place, ...)? E.g.:
  • Bibliography, Edition, Reference Systems (e.g. Wikidata, GeoNames) are always used as domain
  • Files are always used as domain except when in combination with Bibliography, Edition
  • External references are always used as domain except when in combination with a file

This should cover all combinations, but please take your time to think it through (I might have missed something); others are welcome to participate as well.
If we fail to define a basic strategy, e.g. because a file and an external reference should sometimes be linked with "refers to" in one direction and sometimes in the other, this will become a much longer discussion.
Anyway, you raised good points and we should discuss them thoroughly before making changes.

Actions #6

Updated by Alexander Watzinger 2 months ago

  • Status changed from Assigned to In Progress

I decided to prepare SQL statements according to the last post, check a few existing projects and post the results here.
That might help to decide how to proceed and also give hints on whether, e.g., presentation sites have to be adapted too.

Actions #7

Updated by Alexander Watzinger 2 months ago

I did some testing. The following SQL returns all P67 combinations of OpenAtlas classes for domain/range, with counts:

SELECT DISTINCT
    count(l.property_code),
    d.openatlas_class_name AS domain,
    l.property_code,
    r.openatlas_class_name AS range
FROM model.link l
JOIN model.entity d ON l.domain_id = d.id
JOIN model.entity r ON l.range_id = r.id
WHERE l.property_code = 'P67'
GROUP BY l.property_code, d.openatlas_class_name, r.openatlas_class_name
ORDER BY domain, range

Results for THANADOS are below:
  • The red ones are invalid under the rules defined before. There are a few, but it looks good so far.
  • I also found one file/file link, which we should look at separately.
  • Don't take my word for it and check for yourself; it's a little mind-boggling when staring at this for too long ;)
  • I will try to check other projects next week as well.

count domain property_code range
587 bibliography P67 acquisition
102 bibliography P67 activity
1976 bibliography P67 artifact
16 bibliography P67 event
997 bibliography P67 feature
20344 bibliography P67 file
69 bibliography P67 group
53 bibliography P67 human_remains
34 bibliography P67 modification
47 bibliography P67 move
784 bibliography P67 person
11179 bibliography P67 place
1544 bibliography P67 source
1250 bibliography P67 stratigraphic_unit
183 edition P67 acquisition
43 edition P67 activity
47 edition P67 group
1 edition P67 modification
308 edition P67 person
462 edition P67 place
464 edition P67 source
2 external_reference P67 artifact
2 external_reference P67 feature
17 external_reference P67 file
3 external_reference P67 modification
2 external_reference P67 move
15 external_reference P67 person
233 external_reference P67 place
26 external_reference P67 source
3 file P67 acquisition
57 file P67 activity
15682 file P67 artifact
8 file P67 bibliography
3 file P67 creation
3 file P67 event
21 file P67 external_reference
3457 file P67 feature
1 file P67 file
6 file P67 group
45 file P67 human_remains
33 file P67 modification
157 file P67 move
504 file P67 person
6126 file P67 place
1 file P67 production
15 file P67 source
1091 file P67 stratigraphic_unit
26 file P67 type
6 reference_system P67 acquisition
69 reference_system P67 activity
3808 reference_system P67 artifact
18 reference_system P67 creation
16 reference_system P67 event
95 reference_system P67 group
127 reference_system P67 modification
87 reference_system P67 move
128 reference_system P67 person
3639 reference_system P67 place
3 reference_system P67 production
13 reference_system P67 source
1769 reference_system P67 type
1063 source P67 acquisition
239 source P67 activity
3 source P67 artifact
638 source P67 group
1 source P67 modification
2610 source P67 person
3821 source P67 place

Actions #8

Updated by Stefan Eichert 2 months ago

In reply to Alexander Watzinger's proposal in #note-5:

I think it is even easier. All references as well as reference systems should be domain: Reference (bibliography, edition, external reference) "P67 - refers to" "whatever entity".

Files can be domain or range depending on a simple decision: Is the file showing a certain entity? -> Yes: File "P67 refers to" "certain entity". So the entities of all tabs/classes in the file view (except for the reference ones) should be linked as described above.
Other case:
Is the file from a certain source entity? -> Yes: Source entity (bibliography, external reference) "P67 refers to" "that one file".
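
The decision above can be sketched as a small Python helper. This is only an illustration of the proposed rule, not OpenAtlas code: `Entity`, `REFERENCE_CLASSES` and `order_p67` are hypothetical names, the class names are the `openatlas_class_name` values listed in note #7, and the class of entity 196330 is assumed to be `person`.

```python
from typing import NamedTuple, Tuple


class Entity(NamedTuple):
    """Hypothetical stand-in for an OpenAtlas entity (not the real model class)."""
    id: int
    class_name: str


# Classes that should always be the domain of "P67 refers to" (assumed grouping).
REFERENCE_CLASSES = frozenset(
    {"bibliography", "edition", "external_reference", "reference_system"})


def order_p67(a: Entity, b: Entity) -> Tuple[Entity, Entity]:
    """Return (domain, range) for a P67 link between a and b."""
    a_is_ref = a.class_name in REFERENCE_CLASSES
    b_is_ref = b.class_name in REFERENCE_CLASSES
    if a_is_ref != b_is_ref:
        # Exactly one side is a reference: the reference refers to the other.
        return (a, b) if a_is_ref else (b, a)
    a_is_file = a.class_name == "file"
    b_is_file = b.class_name == "file"
    if a_is_file != b_is_file:
        # Exactly one side is a file and no reference is involved:
        # the file shows (refers to) the entity.
        return (a, b) if a_is_file else (b, a)
    # file/file, reference/reference, or neither: not covered by the rule.
    raise ValueError(
        f"no P67 direction rule for {a.class_name}/{b.class_name}")


portrait = Entity(196385, "file")
doi = Entity(237493, "external_reference")
person = Entity(196330, "person")  # class assumed for illustration
print(order_p67(portrait, doi))     # reference first, file second
print(order_p67(person, portrait))  # file first, person second
```

The helper deliberately raises for combinations the rule does not decide (such as file/file), so those cases would have to be handled separately.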

Hope this helps,
best, Stefan

Actions #9

Updated by Alexander Watzinger 2 months ago

Thank you for your feedback. After discussing it with Bernhard we refined the rules:

  • Bibliography, edition, external reference, reference systems (e.g. Wikidata, GeoNames) are always used as domain
  • Files are always used as domain except when in combination with the above

Actions #10

Updated by Alexander Watzinger about 2 months ago

Just a few notes (mostly for myself).

The issue is less dramatic than assumed. The problem occurs only in a file detail view when adding (and linking) a new reference via the "+" buttons.
When adding a reference via the "link" functions, it works as expected. When using the "+" function in views other than files, it also works as expected.

So to solve the issue we need to:
  • Rewrite the "+" function (/insert/<class_>/<int:origin_id>) to behave differently when coming from a file and creating a reference.
  • Write an SQL update that fixes existing P67 links to references which may have the wrong direction caused by this bug (the file should be the range).

Actions #11

Updated by Alexander Watzinger about 2 months ago

  • Subject changed from Wrong direction for external references with files to Wrong direction for reference links to files
  • Status changed from In Progress to Closed

The code is fixed in the develop branch, and links between references and files should now always be made in the correct direction (file = range).
An upgrade SQL is also provided (taken care of by the upgrade script) which fixes links with wrong directions caused by this bug.

Once the upgrade script has run, it should also fix side effects such as a linked file not being shown in the reference view.
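
For illustration, such a direction fix could look roughly like the sketch below. It is not the actual OpenAtlas upgrade script: it runs against an in-memory SQLite database instead of PostgreSQL, `build_demo` and `fix_links` are hypothetical helpers, the table and column names are taken from the queries in this issue, the classes of entities 196330 and 196077 are assumed, and limiting the swap to bibliography, edition and external_reference is an assumption as well.

```python
import sqlite3

# Swap domain and range of P67 links that point from a file to a reference.
# Both assignments read the pre-update values, so this is a real swap.
SWAP_SQL = """
UPDATE model.link
SET domain_id = range_id, range_id = domain_id
WHERE property_code = 'P67'
  AND domain_id IN (
      SELECT id FROM model.entity WHERE openatlas_class_name = 'file')
  AND range_id IN (
      SELECT id FROM model.entity
      WHERE openatlas_class_name IN
          ('bibliography', 'edition', 'external_reference'))
"""


def build_demo() -> sqlite3.Connection:
    """Create a tiny in-memory copy of the two tables used in this issue."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so the "model." prefix resolves.
    conn.execute("ATTACH DATABASE ':memory:' AS model")
    conn.execute(
        "CREATE TABLE model.entity ("
        "id INTEGER PRIMARY KEY, name TEXT, openatlas_class_name TEXT)")
    conn.execute(
        "CREATE TABLE model.link ("
        "id INTEGER PRIMARY KEY, property_code TEXT, "
        "domain_id INTEGER, range_id INTEGER)")
    conn.executemany(
        "INSERT INTO model.entity VALUES (?, ?, ?)",
        [(196385, "Portrait of Ferdinand von Hochstetter", "file"),
         (196330, "Ferdinand von Hochstetter", "person"),  # class assumed
         (196077, "Hochstetter 1885", "bibliography"),     # class assumed
         (237493, "https://doi.org/10.57756/b291ht", "external_reference")])
    # The three links from the issue description; only 899449 is wrong.
    conn.executemany(
        "INSERT INTO model.link VALUES (?, ?, ?, ?)",
        [(671041, "P67", 196385, 196330),
         (671042, "P67", 196077, 196385),
         (899449, "P67", 196385, 237493)])
    conn.commit()
    return conn


def fix_links(conn: sqlite3.Connection) -> int:
    """Run the swap and return the number of links that were changed."""
    cursor = conn.execute(SWAP_SQL)
    conn.commit()
    return cursor.rowcount


conn = build_demo()
print(fix_links(conn), "link(s) swapped")
for row in conn.execute(
        "SELECT id, domain_id, range_id FROM model.link ORDER BY id"):
    print(row)
```

Because the condition only matches links whose domain is a file, running the statement a second time changes nothing, which is a useful property for an upgrade step.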

Thanks again Stefan for reporting and providing additional information.
