Search for similar names
To prevent duplicates and catch spelling mistakes, a search for similar names will be implemented.
- Add the Python library fuzzywuzzy, which uses Levenshtein distance to calculate the differences between sequences
- Option to select the minimum similarity ratio
- Option to select class
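
A minimal sketch of how such a search could look, not the actual implementation. The function name find_similar_names, the (id, name, class) tuple layout and the default ratio of 80 are assumptions made for illustration. It uses fuzz.ratio from fuzzywuzzy when the library is installed and otherwise falls back to the standard library's difflib, which produces a comparable 0-100 score.

```python
from itertools import combinations

try:
    # fuzz.ratio returns an integer similarity score between 0 and 100
    from fuzzywuzzy import fuzz
    ratio = fuzz.ratio
except ImportError:
    # Stand-in when fuzzywuzzy is not installed (assumption for this sketch)
    from difflib import SequenceMatcher

    def ratio(a, b):
        return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def find_similar_names(entities, min_ratio=80, system_class=None):
    """Return (score, name1, name2) for each pair scoring at least min_ratio.

    entities is a list of (id, name, class) tuples; this layout is
    hypothetical and only serves the example.
    """
    if system_class is not None:
        entities = [e for e in entities if e[2] == system_class]
    pairs = []
    # combinations() compares each pair once, so no mirrored duplicates
    for (_, name1, _), (_, name2, _) in combinations(entities, 2):
        score = ratio(name1, name2)
        if score >= min_ratio:
            pairs.append((score, name1, name2))
    return sorted(pairs, reverse=True)  # highest similarity first


entities = [
    (1, "Vienna", "place"),
    (2, "Viena", "place"),
    (3, "Graz", "place"),
    (4, "Vienna", "person"),
]
similar_places = find_similar_names(entities, min_ratio=80, system_class="place")
print(similar_places)
```

Comparing every pair is quadratic in the number of entities, so restricting the search to one class also keeps the run time manageable.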
Ideas for next version
- Search for a manually entered string
- Add a check when inserting an entity and warn if a similar name already exists
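
Both ideas above come down to comparing one given string against the existing names. A rough sketch, assuming a hypothetical helper similar_existing_names and a cutoff of 0.8; it uses get_close_matches from the standard library's difflib as a stand-in, not fuzzywuzzy.

```python
from difflib import get_close_matches


def similar_existing_names(new_name, existing_names, cutoff=0.8):
    """Return up to five existing names that closely match new_name.

    cutoff is a similarity between 0 and 1; the comparison is case-sensitive.
    """
    return get_close_matches(new_name, existing_names, n=5, cutoff=cutoff)


existing = ["Viena", "Graz", "Linz"]  # example data, not from the database
matches = similar_existing_names("Vienna", existing)
if matches:
    print(f"Warning: similar names already exist: {', '.join(matches)}")
```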
Updated by Alexander Watzinger about 1 year ago
Or maybe use PostgreSQL's pg_trgm extension instead (requires the postgresql-contrib package to be installed):
CREATE EXTENSION pg_trgm;
CREATE INDEX trgm_idx ON model.entity USING GIST (name gist_trgm_ops);
-- list pairs of place names with a trigram similarity above 0.7
select similarity(n1.name, n2.name) as sim, n1.name, n2.name
from model.entity n1, model.entity n2
where n1.id != n2.id
and n1.system_type = 'place'
and n2.system_type = 'place'
and similarity(n1.name, n2.name) > .7
order by sim desc;