feature request #6452 (open)
Deduplicate equal subareas like TDWG level 4 -OO areas
To avoid duplicates in distribution representations, #4411#note-13 and #5286 suggest adding an option to DescriptionUtility.filterDistributions() that removes redundant subareas, i.e. subareas that are equal to their higher level area and have the same distribution status. All annotations, sources, (extensions, ...) of a removed subarea should be added to the corresponding higher level area.
This makes sense and should be implemented. The problem occurs especially in the TDWG areas vocabulary, where every TDWG level 4 area whose code ends with -OO is exactly equal (in terms of the area itself) to the corresponding TDWG level 3 area.
The open problem is how to recognize that such areas are equal. An equal label is not sufficient (and may not even be a necessary condition, as labels may be changed without semantic change).
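The intended behaviour can be sketched roughly as follows. This is only an illustration under stated assumptions: `Dist` is a hypothetical, simplified stand-in for a distribution (area id, parent area id, status, sources) and not the cdmlib class, and how equality with the parent area is detected is deliberately left open by passing it in as a predicate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class DistributionDeduplication {

    /** Hypothetical, simplified distribution; NOT the cdmlib Distribution class. */
    static final class Dist {
        final String areaId;
        final String parentAreaId; // null if the area has no higher level area
        final String status;
        final List<String> sources = new ArrayList<>();

        Dist(String areaId, String parentAreaId, String status, String... sources) {
            this.areaId = areaId;
            this.parentAreaId = parentAreaId;
            this.status = status;
            Collections.addAll(this.sources, sources);
        }
    }

    /**
     * Drops every distribution whose area is equal to its parent area (as decided
     * by the given predicate) if the parent is present with the same status.
     * The sources of a dropped distribution are added to the parent distribution.
     */
    static List<Dist> deduplicate(List<Dist> input, Predicate<Dist> areaEqualsParent) {
        // Index the distributions by area id so parents can be looked up.
        Map<String, Dist> byArea = new LinkedHashMap<>();
        for (Dist d : input) {
            byArea.put(d.areaId, d);
        }
        List<Dist> result = new ArrayList<>();
        for (Dist d : input) {
            Dist parent = d.parentAreaId == null ? null : byArea.get(d.parentAreaId);
            boolean redundant = parent != null
                    && areaEqualsParent.test(d)
                    && parent.status.equals(d.status);
            if (redundant) {
                parent.sources.addAll(d.sources); // keep the information on the parent
            } else {
                result.add(d);
            }
        }
        return result;
    }
}
```

In this sketch a subarea is kept when its parent is missing from the input or has a different status, so no information is lost. Annotations and extensions would have to be moved in the same way as the sources.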
One solution for the future might be to create term relationships of type "congruent". Term relationships are a planned feature. However, as long as this is not yet implemented we may need other solutions.
The simplest workaround, for the TDWG areas only, might be to check whether the idInVocabulary ends with "-OO", the level is TDWG level 4 and the vocabulary is TDWG areas. This can be implemented immediately.
Another option is a flag/marker stating that the area is equal to the next higher area. If handled as a marker, we need to add the marker type and add a "true" marker of this type to all affected areas. An update script is required for this.
If handled as a flag on areas, a model change is needed (including an update script for existing data).
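A minimal sketch of this check, assuming the three properties are available as plain strings. The class name, the method name and the two constants are hypothetical; a real implementation would compare against the actual level and vocabulary terms rather than against labels.

```java
final class TdwgAreaCheck {

    // Assumed identifiers for illustration only; the real code would
    // compare term instances or UUIDs, not labels.
    static final String TDWG_VOCABULARY = "TDWG areas";
    static final String TDWG_LEVEL_4 = "TDWG level 4";

    /**
     * Returns true if the area is a TDWG level 4 area whose idInVocabulary
     * ends with "-OO", i.e. an area equal to its TDWG level 3 parent.
     */
    static boolean isLevel4EqualToLevel3(String idInVocabulary, String level, String vocabulary) {
        return idInVocabulary != null
                && idInVocabulary.endsWith("-OO")
                && TDWG_LEVEL_4.equals(level)
                && TDWG_VOCABULARY.equals(vocabulary);
    }
}
```

For example, `isLevel4EqualToLevel3("GER-OO", "TDWG level 4", "TDWG areas")` is true, while the same id in any other vocabulary or level is not matched.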
For now I would suggest implementing only the TDWG workaround and waiting to see whether this problem regularly applies to other terms/vocabularies too. If not, a general but more complex solution is probably not worth implementing.
Once this ticket is solved, remove the hotfix implemented for #5286.
Updated by Andreas Müller almost 5 years ago
To me there is one open question: for maps it sometimes makes sense to have areas from only one level, and therefore we may want to prefer TDWG level 4 over level 3. Is this still true, and if yes, will it create problems when computing the areas for both maps and textual representation in the same service call?