Feature #1997
closedPerformance: improve type views
Added by Stefan Eichert over 1 year ago. Updated over 1 year ago.
Description
When opening/editing/saving e.g. type entities, that have many connected other entities, the performance is very poor. It takes up to one minute or even more to load the respective page. E.g. https://thanados.openatlas.eu/entity/25
Updated by Alexander Watzinger over 1 year ago
- Tracker changed from Bug to Feature
- Subject changed from Performance Issues to Performance: improve type views
- Status changed from New to In Progress
- Assignee set to Alexander Watzinger
- Target version set to 7.12.0
- Found in version deleted (
7.12.0)
I can confirm this issue and will look into it.
Because THANADOS is using the develop branch I tested it with the main branch but it doesn't solve the issue so we can't just switch to the main branch as a quick fix.
I changed the tracker of this issue from bug to feature because technically it is not an error but a combined effect of adding features and THANADOS adding data. Performance improvement is an ongoing task where we are reacting (and depending) on reports of projects using OpenAtlas.
Nevertheless sorry for the inconvenience and thank you for reporting. We try to find a solution as fast as possible.
Updated by Alexander Watzinger over 1 year ago
Like suspected the reason was a combination of a feature and growing data. Because of #1752 (Add a place column for subunits to type view) and a lot of linked entities the view was slow because looking up the "root" place for every entry takes a while.
As a quick fix I disabled this feature and it's much faster now. I also switched THANADOS to the release_candidate branch which is safer for now.
I understand that the place column is very handy so I will look into it. But I'm not sure if I manage to solve the related performance issues before the next release in the upcoming weekend.
Worst case scenario is that we release without the #1752 feature which is still better than to have an unusable slow system.
I will keep you updated.
Updated by Alexander Watzinger over 1 year ago
- Related to Feature #1752: Add a place column for subunits to type view added
Updated by Alexander Watzinger over 1 year ago
- Target version changed from 7.12.0 to 7.15.0
I looked with Bernhard into this issues. It seems if me manage to get all "root" places for every entity in one go with a recursive function, it would still be fast enough.
While implementing we ran into some SQL troubles when trying a recursive function that only returns the "top" place which should be possible in theory, see: https://stackoverflow.com/questions/66281034/postgresql-select-only-last-row-from-each-recursion
Sadly this issue turned up too close to the release this weekend. So for now we have to leave the function to display root places at type views deactivated but will look into it before the next release.
Updated by Stefan Eichert over 1 year ago
If you could post details on the desired result and the data sources (tables, fields etc.) I will design the recursive query.
Updated by Alexander Watzinger over 1 year ago
What we have so far is something like this:
WITH RECURSIVE items AS (
SELECT range_id, domain_id
FROM model.link
WHERE range_id IN (177676, 145352) AND property_code = 'P46'
UNION
SELECT l.range_id, l.domain_id FROM model.link l
INNER JOIN items i ON
l.range_id = i.domain_id
AND l.property_code = 'P46'
) SELECT range_id, domain_id FROM items;
In this example we only use the id of an artifact (177676) and a human remains (145352) from the THANADOS data set which will than be replaced with all the ids from the table of connected entities at a type view.
The result would look like this:
range_id | domain_id |
---|---|
177676 | 176874 |
176874 | 176734 |
176734 | 176118 |
145352 | 81239 |
81239 | 63533 |
63533 | 47361 |
entity_id | place_id |
---|---|
177676 | 176118 |
145352 | 47361 |
This should be possible, we looked at an working example given here: https://thanados.openatlas.eu/entity/25. But table structure is different and we didn't manage to adapt it to work for our case too.
Help with that would be appreciated. Feel free to ask if more information is needed.
Bonus points if the result also contains the place name, than we could use that to build a link :)
Updated by Stefan Eichert over 1 year ago
This should do the job:
WITH RECURSIVE parent_tree AS (
SELECT p.parent_id, p.child_id, ARRAY [p.child_id] AS path, 1 AS depth
FROM (SELECT domain_id as parent_id, range_id as child_id FROM model.link WHERE property_code = 'P46') p
WHERE p.child_id IN (177676, 145352, 81239, 176118)
UNION ALL
SELECT t.parent_id, t.child_id, pt.path || ARRAY [t.child_id], pt.depth + 1
FROM (SELECT domain_id as parent_id, range_id as child_id FROM model.link WHERE property_code = 'P46') t
JOIN parent_tree pt ON pt.parent_id = t.child_id
),
root_nodes AS (
SELECT DISTINCT ON (path[1]) path[1] AS child_id, parent_id AS top_level
FROM parent_tree
WHERE parent_id IS NOT NULL
ORDER BY path[1], depth DESC
)
SELECT DISTINCT a.child_id AS start_node, r.top_level, e.name
FROM root_nodes r
JOIN parent_tree a ON a.child_id = r.child_id
JOIN model.entity e ON e.id = r.top_level
ORDER BY a.child_id
It gets the top level parent of a respective starting point entity and the name of that top level entity.
The array of child_ids, in this example "(177676, 145352, 81239, 176118)" should be modified to your needs as well as the property code.
Also if the property is inverse e.g. if the domain_id does not mark the parent but the range_id (e.g. for P127) then exchange domain_id and range_id in the subquery:
WITH RECURSIVE parent_tree AS (
SELECT p.parent_id, p.child_id, ARRAY [p.child_id] AS path, 1 AS depth
FROM (SELECT range_id as parent_id, domain_id as child_id FROM model.link WHERE property_code = 'P127') p
WHERE p.child_id IN (26469, 26467)
UNION ALL
SELECT t.parent_id, t.child_id, pt.path || ARRAY [t.child_id], pt.depth + 1
FROM (SELECT range_id as parent_id, domain_id as child_id FROM model.link WHERE property_code = 'P127') t
JOIN parent_tree pt ON pt.parent_id = t.child_id
),
root_nodes AS (
SELECT DISTINCT ON (path[1]) path[1] AS child_id, parent_id AS top_level
FROM parent_tree
WHERE parent_id IS NOT NULL
ORDER BY path[1], depth DESC
)
SELECT DISTINCT a.child_id AS start_node, r.top_level, e.name
FROM root_nodes r
JOIN parent_tree a ON a.child_id = r.child_id
JOIN model.entity e ON e.id = r.top_level
ORDER BY a.child_id
Updated by Alexander Watzinger over 1 year ago
- Status changed from In Progress to Closed
- Target version changed from 7.15.0 to 7.12.0
Thanks a lot Stefan, this is great and also cool that you got it so quick so that we can add it to the release.
It's already in branch release_candidate and also at THANADOS, so https://thanados.openatlas.eu/entity/25 is much faster now.
I implemented it in a way that you can pass a property, but will also implement "inverse" direction and the like when we have a need for it.
Updated by Alexander Watzinger over 1 year ago
I decided to make the function more generalized now when already at it.
So now we have the function get_roots() at model/entity which takes a property code, a list of ids and an inverse flag as parameters.
It returns a dict with the provided ids as keys and a dict with the root id and root name (needed to build links) as values. For root places it would look like this.
Entity.get_roots('P46', [123, 128], inverse=True)
{
123: {'id': 5, 'name': 'first_root_name'},
128: {'id': 8, 'name': 'second_root_name'}
}
Thanks again Stefan for support with the SQL statement, it works great!