feature request #6794

open

Improve term structure

Added by Andreas Müller over 5 years ago. Updated about 1 month ago.

Status: In Progress
Priority: Highest
Category: cdm
Target version:
Start date:
Due date:
% Done: 20%
Estimated time:
Severity: normal
Tags:

Description

Terms, term structures and term collections need to be improved. Currently the only structures we have for terms are vocabularies (ordered and unordered) and part-of and kind-of relations, where each term is allowed exactly one kind-of parent and one part-of parent. Furthermore, we have FeatureTree and FeatureNode as structures, but only for features.

Generally what we need are

  1. containers to group terms in collections, where each term might be part of many collections. However, it is probably a good idea to let a term belong to exactly one vocabulary.
  2. structures to sort terms, also within the above containers
  3. structures to build term trees according to relationship types like part-of and kind-of (do we also need mixed trees? does that make sense?)
  4. concept relationships between terms, similar to taxon relationships, with a relationship type based on set semantics (taxon relationship types can be reused) or other semantics

To achieve this we could do the following

  • Open FeatureTree to all terms, rename it TermTree, and add a termtype to it to define what kind of term is handled. We should discuss whether a tree could also include different types, e.g. leaves could be of a different type than inner nodes (SDD); in that case we could refine termtype further, e.g. by making it a set or by defining different types for root, inner nodes and leaves. These trees can also be used for sorted lists and maybe even for unordered sets of terms. FeatureNode may be renamed to TermNode. (1, 2, 3)
  • Common base class (or at least interface) TermCollection for vocabularies and TermTree, to easily retrieve the term collections available to choose from. (1)
  • New class TermRelationship that combines terms with a relationship and relationship type, similar to taxon relationships. (4)
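
The first two proposals can be illustrated with a minimal sketch. It is not CDM code: the Python classes, the `term_type` strings and the helper `collections_for` are assumptions made up for this illustration, loosely named after the classes proposed above. It shows a tree that accepts only terms of its declared type, and a shared base class that lets vocabularies and trees be offered together as selectable collections.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    label: str
    term_type: str  # hypothetical type marker, e.g. "Feature" or "NamedArea"


@dataclass
class TermNode:
    term: Optional[Term] = None  # None marks the artificial root node
    children: List["TermNode"] = field(default_factory=list)


class TermCollection:
    """Assumed common base for vocabularies and trees."""

    def __init__(self, title, term_type):
        self.title = title
        self.term_type = term_type

    def check(self, term):
        # Reject terms whose type does not match the collection's type.
        if term.term_type != self.term_type:
            raise ValueError(f"{term.label} is not of type {self.term_type}")

    def terms(self):
        raise NotImplementedError


class Vocabulary(TermCollection):
    def __init__(self, title, term_type):
        super().__init__(title, term_type)
        self._terms = []

    def add(self, term):
        self.check(term)
        self._terms.append(term)

    def terms(self):
        return list(self._terms)


class TermTree(TermCollection):
    def __init__(self, title, term_type):
        super().__init__(title, term_type)
        self.root = TermNode()

    def add(self, term, parent=None):
        self.check(term)
        node = TermNode(term)
        (parent if parent is not None else self.root).children.append(node)
        return node

    def terms(self):
        # Depth-first, children in insertion order.
        result, stack = [], [self.root]
        while stack:
            node = stack.pop()
            if node.term is not None:
                result.append(node.term)
            stack.extend(reversed(node.children))
        return result


def collections_for(collections, term_type):
    """Return all collections (of any kind) that handle the given term type."""
    return [c for c in collections if c.term_type == term_type]


leaf = Term("Leaf", "Feature")
leaf_shape = Term("Leaf shape", "Feature")
europe = Term("Europe", "NamedArea")

features = Vocabulary("Features", "Feature")
features.add(leaf)
features.add(leaf_shape)

tree = TermTree("Default feature tree", "Feature")
leaf_node = tree.add(leaf)
tree.add(leaf_shape, parent=leaf_node)

areas = Vocabulary("Areas", "NamedArea")
areas.add(europe)

available = collections_for([features, tree, areas], "Feature")
```

A single `term_type` per collection is the simplest reading of the proposal; the open question about different types for root, inner nodes and leaves would require a richer type definition than this sketch uses.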

A general problem is when a term should be referenced directly and when we should use a term node, which is more or less a wrapper around a term representing the term within a given collection. We need this somehow for 2 and 3: even if for 2 we only have a Set/List on the TermCollection side, on the relational level we still need an M:N table, as a term can be part of more than one such collection. If such a table is needed, we can also make it a class and attach some more attributes if required.
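
To make the M:N argument concrete, here is a small sketch in which each row of the join table becomes an object of its own. The class name `Membership` and the extra attribute `sort_index` are hypothetical; they only illustrate that a row turned into a class can carry per-collection information such as the sort position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str


@dataclass
class Membership:
    """One row of the assumed M:N table between terms and collections."""
    collection: str   # collection title, standing in for a foreign key
    term: Term
    sort_index: int   # example of an extra attribute attached to the row


def collections_of(memberships, term):
    """All collections a term is part of."""
    return sorted({m.collection for m in memberships if m.term == term})


def ordered_terms(memberships, collection):
    """Terms of one collection, sorted by their position in that collection."""
    rows = [m for m in memberships if m.collection == collection]
    return [m.term for m in sorted(rows, key=lambda m: m.sort_index)]


leaf, flower = Term("Leaf"), Term("Flower")
memberships = [
    Membership("Key features", leaf, 1),
    Membership("Key features", flower, 2),
    Membership("Portal features", flower, 1),
    Membership("Portal features", leaf, 2),
]
```

The same term appears in both collections but at different positions, which is exactly the information a plain reference from collection to term could not hold.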

Term relationships could work without such a TermNode, but then we have different structures to define relationships between terms: one that uses term nodes as source and target of the relationships, one linking directly to terms, and, if we keep the current part-of and kind-of structure, a third one linking directly from one term to another (source information, i.e. "who says this is a child", is difficult to handle in this).

The simplest solution would be to use only TermNode for all relations. But this might be overhead, and the handling in tree structures still differs from that of undefined relationships, as the tree structure has a term as source but a node as target, while general relationships have either terms or nodes on both sides.

We could also consider nodes and relationships to be very similar: one handles a structure where each node can be part of a relationship only once within a given graph (however, in theory a term can be the source of many relationships in this graph, as terms may have multiple nodes), but these graphs are difficult to traverse. Relationships require alternating traversal, from rel to term to rel to term ...
But as they differ only in the target type, we may give them a common base class (but then TermRelationship cannot inherit from RelationshipBase!) and store them in the same table.
This makes it easy to query all terms belonging to a TermCollection, as such a base class may have the attribute "termCollection", similar to the current "featureTree" in FeatureNode or "classification" in TaxonNode. This is a unidirectional relationship, as it may be a huge set. An exception might be terms belonging to a vocabulary, as they directly point to their vocabulary (or can this be handled differently? We could also use something like TermNode, but this is overhead and not necessary).
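
The alternating traversal and the unidirectional "termCollection" lookup might look roughly like the following sketch. Everything here is an assumption for illustration: relationships are plain records carrying a `collection` field, terms are strings, and the flat list stands in for the shared table.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TermRelationship:
    collection: str  # plays the role of the "termCollection" attribute
    from_term: str
    to_term: str
    rel_type: str


def terms_in_collection(relationships, collection):
    """Query by collection only; the collection itself holds no back-reference."""
    found = set()
    for rel in relationships:
        if rel.collection == collection:
            found.update((rel.from_term, rel.to_term))
    return found


def reachable(relationships, collection, start):
    """Breadth-first walk that alternates term -> relationship -> term."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        term = queue.popleft()
        order.append(term)
        for rel in relationships:
            if (rel.collection == collection
                    and rel.from_term == term
                    and rel.to_term not in seen):
                seen.add(rel.to_term)
                queue.append(rel.to_term)
    return order


relationships = [
    TermRelationship("Morphology", "Petal", "Flower", "part-of"),
    TermRelationship("Morphology", "Flower", "Plant", "part-of"),
    TermRelationship("Other graph", "Petal", "Corolla", "part-of"),
]

morphology_terms = terms_in_collection(relationships, "Morphology")
walk = reachable(relationships, "Morphology", "Petal")
```

Each step of the walk has to scan relationships to find the next term, which is the traversal cost the paragraph above refers to; a node structure with direct child links avoids that scan.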

Another open question is whether we want to allow any kind of structure within vocabularies. Probably it is better not to allow semantic structure, as this could lead to inconsistency. But purely to make terms easier to find, it might be helpful to use hierarchies in large vocabularies.
For consistency it might be helpful to also use a term tree for this; algorithms then need to be implemented only once. However, the model then becomes more complex, with each term being represented by 2 objects instead of one: the first representing the term, the second representing its position in the vocabulary.

If we don't have a hierarchy structure within vocabularies, vocabularies with existing structure such as TDWG areas may require an extra graph to represent the hierarchy, which is maybe not wanted. Hierarchy might become an optional attribute of a vocabulary, with each term still linking directly to the vocabulary but optionally being part of a tree for which the vocabulary holds the root.
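
A rough sketch of the "optional hierarchy" variant, under the assumption that the vocabulary simply holds a nullable root node. The names `hierarchy_root`, `new_term` and `path_to` are invented for this example, and the three area labels are only sample data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class Vocabulary:
    def __init__(self, title):
        self.title = title
        self.terms = []
        self.hierarchy_root = None  # optional: stays None for flat vocabularies


@dataclass(eq=False)
class Term:
    label: str
    vocabulary: Vocabulary  # every term still links directly to its vocabulary


@dataclass(eq=False)
class TermNode:
    term: Optional[Term] = None  # None marks the root held by the vocabulary
    children: List["TermNode"] = field(default_factory=list)


def new_term(vocabulary, label):
    term = Term(label, vocabulary)
    vocabulary.terms.append(term)
    return term


def path_to(vocabulary, term):
    """Labels from the top of the hierarchy down to the term.

    Falls back to the term's own label if the vocabulary has no hierarchy
    or the term is not placed in it.
    """
    def search(node, trail):
        if node.term is not None:
            trail = trail + [node.term.label]
        if node.term is term:
            return trail
        for child in node.children:
            found = search(child, trail)
            if found is not None:
                return found
        return None

    if vocabulary.hierarchy_root is None:
        return [term.label]
    return search(vocabulary.hierarchy_root, []) or [term.label]


areas = Vocabulary("Areas")
europe = new_term(areas, "Europe")
northern = new_term(areas, "Northern Europe")
denmark = new_term(areas, "Denmark")

denmark_node = TermNode(denmark)
northern_node = TermNode(northern, [denmark_node])
europe_node = TermNode(europe, [northern_node])
areas.hierarchy_root = TermNode(None, [europe_node])

ranks = Vocabulary("Ranks")  # flat vocabulary, no hierarchy attached
genus = new_term(ranks, "Genus")
```

Code that only needs the flat term set can ignore `hierarchy_root` entirely, while the hierarchy costs one extra node object per placed term, as noted above.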

Questions

What about inapplicableif and onlyapplicableif? Can these be used for other term types too, or do we still need special FeatureNodes, or do we simply not use these fields for other term nodes?


Related issues

Related to EDIT - feature request #8123: Make available features a DB preference (Closed, Katja Luther)
Related to EDIT - task #7515: TypeDesignationStatusComparator to sort by vocabulary first and then by term order (Resolved, Andreas Müller)
Related to EDIT - feature request #8146: Adapt taxeditor UI to structure and property term type (Closed, Patrick Plitzner)
Related to EDIT - task #8166: Adapt dataportal to term structure changes (Closed, Andreas Kohlbecker)
Related to EDIT - bug #8251: LazyInitializationException (LIE) in featureTree webservice (Closed, Andreas Müller)
Related to EDIT - feature request #8241: Rename FeatureTreeEditor to TermTreeEditor (Closed, Patrick Plitzner)
Related to EDIT - task #8405: Adapt cdm-vaadin to term structure changes (Rejected, Andreas Kohlbecker)
Related to EDIT - bug #8407: Fix "FeatureTest" (New, Andreas Müller)
Related to EDIT - task #8434: Change FeatureTree contollers to TermTree controllers (Closed, Andreas Kohlbecker)
Related to EDIT - feature request #8432: Improve "default" feature tree handling (In Progress, Andreas Müller)
Related to EDIT - feature request #8474: Make TermCollection.orderRelevant usable for TermTrees (Closed, Katja Luther)
Related to EDIT - feature request #8476: Implement support for TermCollection.isFlat in TaxEditor (Closed, Katja Luther)
Related to EDIT - feature request #8477: Implement support for TermCollection.allowDuplicates in TaxEditor (Closed, Patrick Plitzner)
Related to EDIT - task #8466: Delete FeatureTree tables (Closed, Andreas Müller)
Related to EDIT - feature request #8647: Allow selecting named areas, ranks, presence-absence terms from term trees/collections (New, Katja Luther)
Related to EDIT - feature request #6849: [DISCUSS] How to handle "kind of" term creation in TermEditor (In Progress, Andreas Müller)
Related to EDIT - feature request #9409: [Decision] Which terms should be editable in term tree editor (and vocabulary editor) (New, Andreas Müller)
Related to EDIT - feature request #9502: Implement subarea preference rule and fallback areas for areas with complex hierarchy (New, Andreas Müller)
Related to EDIT - bug #9784: Update script for term order of ordered CDM terms (New, Andreas Müller)
Related to EDIT - bug #10198: Termremoval for OrderedTermVocabularies does not work correctly due to incorrect compareTo and orderIndex decrement (Closed, Andreas Müller)
Related to EDIT - task #10201: Make synonym type an enum (Closed, Andreas Müller)
Related to EDIT - feature request #10196: Hybrid structure-state terms (Closed, Andreas Müller)
Related to EDIT - bug #9987: Handle areas of different vocabularies in comparator (New, Katja Luther)
Related to EDIT - bug #6343: TermVocabularies of OrderedTerms must be OrderedVocabularies (In Progress, Andreas Müller)
Related to EDIT - bug #8270: TermDTOs with an orderIndex are ordered in unordered term vocabularies (New, Katja Luther)
Blocks EDIT - feature request #9305: Include additional TypeDesignationStatus sort order (term tree) in CDM terms (New, Andreas Müller)

Actions #1

Updated by Andreas Müller over 5 years ago

  • Description updated (diff)
Actions #2

Updated by Andreas Müller over 5 years ago

  • Description updated (diff)
Actions #3

Updated by Andreas Müller over 5 years ago

  • Description updated (diff)
Actions #4

Updated by Andreas Müller over 5 years ago

  • Target version changed from Unassigned CDM tickets to CDM UML 5.0
Actions #5

Updated by Andreas Müller almost 5 years ago

  • Priority changed from New to Highest
Actions #6

Updated by Andreas Müller over 4 years ago

  • Target version changed from CDM UML 5.0 to CDM UML 5.5
Actions #7

Updated by Andreas Müller almost 4 years ago

Actions #8

Updated by Andreas Müller almost 4 years ago

  • Related to task #7515: TypeDesignationStatusComparator to sort by vocabulary first and then by term order added
Actions #9

Updated by Patrick Plitzner almost 4 years ago

Actions #11

Updated by Andreas Müller almost 4 years ago

Current idea for a solution:

Collections:

  • Common base class TermCollection(Base) for all Vocabularies and Graphs (Trees, Lists, Relationship Graphs)
    • TermVocabulary: has a Set of Terms
    • OrderedTermVocabulary: like TermVocabulary but with additional TermTree functionality, where all terms are at the same time terms within the TermTree, allows to define a Vocabulary and its default feature tree at the same time
    • TermTree: like current FeatureTree, for sets, lists and ordered and unordered trees
    • TermGraph: Set of TermRelationships
  • Attributes for trees: orderRelevant, allowsDuplicates, isFlat; for graphs: directed
  • common method: getElements

Relationship:

  • common base class TermRelationshipBase
    • Relates to exactly 1 TermCollection
  • TermNode (or TermTreeNode) for hierarchical and flat structures, relates a child term to the parent/root node, allows tree index
  • TermRelationship for Graph structure, relates 2 terms (child, relatedToTerm)

Terms:

  • remove OrderedTermBase
  • term belongs to single vocabulary
  • term has multiple TermRelationshipBases

Others:

  • Collections and TermNode do have count cache

Open issues:

  • root = collection? => difficult due to inheritance from IdentifiableEntity
  • how to call "child" in TermRelationshipBase (fromTerm, child, baseTerm)
  • need to distinguish Term.fromRelationships and Term.toRelationships?
  • interface for subtrees as they are also term collections
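
The idea outlined in this note could be sketched as follows. This is only an illustration under assumptions, not the CDM implementation: the Python names mirror the classes listed above, and the way the flags `is_flat` and `allow_duplicates` are enforced is a guess at their intended meaning (`order_relevant` is stored but not interpreted here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str


class TermRelationshipBase:
    """Common base: every node or relationship relates to exactly one collection."""

    def __init__(self, collection, term):
        self.collection = collection
        self.term = term


class TermNode(TermRelationshipBase):
    """Relates a child term to its parent node (or to the root)."""

    def __init__(self, collection, term, parent=None):
        super().__init__(collection, term)
        self.parent = parent
        self.children = []


class TermRelationship(TermRelationshipBase):
    """Relates two terms within a graph."""

    def __init__(self, collection, term, related_to, rel_type):
        super().__init__(collection, term)
        self.related_to = related_to
        self.rel_type = rel_type


class TermCollection:
    def get_elements(self):
        raise NotImplementedError


class TermTree(TermCollection):
    def __init__(self, order_relevant=False, allow_duplicates=False, is_flat=False):
        self.order_relevant = order_relevant
        self.allow_duplicates = allow_duplicates
        self.is_flat = is_flat
        self.root = TermNode(self, None)
        self._nodes = []

    def add(self, term, parent=None):
        parent = parent if parent is not None else self.root
        if self.is_flat and parent is not self.root:
            raise ValueError("flat tree: terms may only be added below the root")
        if not self.allow_duplicates and any(n.term == term for n in self._nodes):
            raise ValueError(f"duplicate term: {term.label}")
        node = TermNode(self, term, parent)
        parent.children.append(node)
        self._nodes.append(node)
        return node

    def get_elements(self):
        return list(self._nodes)


class TermGraph(TermCollection):
    def __init__(self, directed=True):
        self.directed = directed
        self._relationships = []

    def relate(self, term, related_to, rel_type):
        rel = TermRelationship(self, term, related_to, rel_type)
        self._relationships.append(rel)
        return rel

    def get_elements(self):
        return list(self._relationships)


leaf, flower = Term("Leaf"), Term("Flower")

flat_list = TermTree(order_relevant=True, is_flat=True)
first = flat_list.add(leaf)
flat_list.add(flower)

graph = TermGraph(directed=True)
graph.relate(leaf, flower, "hypothetical-relation")
```

With these flags one tree class covers sets (flat, order not relevant), lists (flat, order relevant) and ordered or unordered trees, matching the "TermTree" bullet above.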
Actions #12

Updated by Andreas Kohlbecker almost 4 years ago

  • Related to task #8166: Adapt dataportal to term structure changes added
Actions #13

Updated by Andreas Müller almost 4 years ago

  • Related to bug #8251: LazyInitializationException (LIE) in featureTree webservice added
Actions #14

Updated by Andreas Müller almost 4 years ago

Actions #15

Updated by Andreas Müller almost 4 years ago

  • Target version changed from CDM UML 5.5 to CDM UML 5.15
Actions #16

Updated by Andreas Müller over 3 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 20
Actions #17

Updated by Andreas Müller over 3 years ago

  • Target version changed from CDM UML 5.15 to CDM UML 5.8
Actions #18

Updated by Andreas Kohlbecker over 3 years ago

  • Related to task #8405: Adapt cdm-vaadin to term structure changes added
Actions #19

Updated by Andreas Müller over 3 years ago

  • Related to bug #8407: Fix "FeatureTest" added
Actions #20

Updated by Andreas Kohlbecker over 3 years ago

  • Related to task #8434: Change FeatureTree contollers to TermTree controllers added
Actions #21

Updated by Andreas Kohlbecker over 3 years ago

Actions #22

Updated by Andreas Kohlbecker over 3 years ago

  • Description updated (diff)
Actions #23

Updated by Andreas Müller over 3 years ago

Actions #24

Updated by Andreas Müller over 3 years ago

Actions #25

Updated by Andreas Müller over 3 years ago

  • Related to feature request #8477: Implement support for TermCollection.allowDuplicates in TaxEditor added
Actions #26

Updated by Andreas Müller over 3 years ago

  • Target version changed from CDM UML 5.8 to CDM UML 5.15
Actions #27

Updated by Andreas Müller over 3 years ago

  • Related to task #8466: Delete FeatureTree tables added
Actions #28

Updated by Andreas Müller over 3 years ago

  • Related to feature request #8647: Allow selecting named areas, ranks, presence-absence terms from term trees/collections added
Actions #29

Updated by Andreas Müller about 3 years ago

Actions #30

Updated by Andreas Müller over 2 years ago

  • Target version changed from CDM UML 5.15 to CDM UML 5.36
Actions #31

Updated by Andreas Müller about 2 years ago

  • Blocks feature request #9305: Include additional TypeDesignationStatus sort order (term tree) in CDM terms added
Actions #32

Updated by Andreas Müller about 2 years ago

  • Related to feature request #9409: [Decision] Which terms should be editable in term tree editor (and vocabulary editor) added
Actions #33

Updated by Andreas Müller almost 2 years ago

  • Related to feature request #9502: Implement subarea preference rule and fallback areas for areas with complex hierarchy added
Actions #37

Updated by Andreas Müller over 1 year ago

  • Related to bug #9784: Update script for term order of ordered CDM terms added
Actions #39

Updated by Andreas Müller 2 months ago

  • Tags set to terms
Actions #41

Updated by Andreas Müller 2 months ago

  • Related to bug #10198: Termremoval for OrderedTermVocabularies does not work correctly due to incorrect compareTo and orderIndex decrement added
Actions #44

Updated by Andreas Müller about 2 months ago

  • Description updated (diff)
Actions #45

Updated by Andreas Müller about 2 months ago

  • Related to task #10201: Make synonym type an enum added
Actions #46

Updated by Andreas Müller about 2 months ago

Actions #47

Updated by Andreas Müller about 2 months ago

  • Related to bug #9987: Handle areas of different vocabularies in comparator added
Actions #48

Updated by Andreas Müller about 1 month ago

Issues with orderindex in existing databases can be found with the following queries, which search for records that have no direct predecessor (query1) or that share their orderindex with another record of the same vocabulary (query2):

-- query1: terms with orderindex > 1 that have no direct predecessor (orderindex - 1) in the same vocabulary
SELECT dtb.DTYPE, dtb.id, dtb.titleCache, dtb.vocabulary_id, dtb.orderindex, voc.titleCache, dtb.*
FROM DefinedTermBase dtb INNER JOIN TermCollection voc ON voc.id = dtb.vocabulary_id
WHERE dtb.orderindex IS NOT NULL AND dtb.orderindex > 1
 AND NOT EXISTS (SELECT * FROM DefinedTermBase dtb2 WHERE dtb2.orderindex = dtb.orderindex - 1 AND dtb.vocabulary_id = dtb2.vocabulary_id)
 ORDER BY dtb.vocabulary_id, dtb.orderindex;

-- query2: terms whose orderindex is already used by a term with a lower id in the same vocabulary
SELECT dtb.DTYPE, dtb.id, dtb.titleCache, dtb.vocabulary_id, dtb.orderindex, voc.titleCache, dtb.*
FROM DefinedTermBase dtb INNER JOIN TermCollection voc ON voc.id = dtb.vocabulary_id
WHERE dtb.orderindex IS NOT NULL
 AND EXISTS (SELECT * FROM DefinedTermBase dtb2 WHERE dtb2.orderindex = dtb.orderindex AND dtb.vocabulary_id = dtb2.vocabulary_id AND dtb2.id < dtb.id)
 ORDER BY dtb.vocabulary_id, dtb.orderindex;

For query2 there are many issues, e.g. in cyprus production (but cyprus seems to be an exception; other old DBs do not have so many issues). Query2 is the more critical one, as in these cases the "correct" order cannot be defined via SQL when transforming the data to the new data model.
Query1 is not critical, as it keeps the existing order.
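
To see what the two checks report, they can be tried on a toy database. The sketch below uses Python's built-in sqlite3 module with a drastically reduced, assumed schema (only the columns the queries touch) and invented rows; the real CDM tables have many more columns and the queries were written for the production database, not for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TermCollection (id INTEGER PRIMARY KEY, titleCache TEXT);
CREATE TABLE DefinedTermBase (
    id INTEGER PRIMARY KEY,
    DTYPE TEXT,
    titleCache TEXT,
    vocabulary_id INTEGER REFERENCES TermCollection(id),
    orderindex INTEGER
);
INSERT INTO TermCollection VALUES (1, 'Vocabulary with gap'), (2, 'Vocabulary with duplicate');
INSERT INTO DefinedTermBase VALUES
    (1, 'Rank', 'a', 1, 1),
    (2, 'Rank', 'b', 1, 2),
    (3, 'Rank', 'c', 1, 4),  -- gap: no term with orderindex 3 in vocabulary 1
    (4, 'Rank', 'd', 2, 1),
    (5, 'Rank', 'e', 2, 1),  -- duplicate orderindex 1 in vocabulary 2
    (6, 'Rank', 'f', 2, 2);
""")

# Same conditions as query1 and query2 above, reduced to the id column.
QUERY_GAP = """
SELECT dtb.id
FROM DefinedTermBase dtb INNER JOIN TermCollection voc ON voc.id = dtb.vocabulary_id
WHERE dtb.orderindex IS NOT NULL AND dtb.orderindex > 1
  AND NOT EXISTS (SELECT * FROM DefinedTermBase dtb2
                  WHERE dtb2.orderindex = dtb.orderindex - 1
                    AND dtb.vocabulary_id = dtb2.vocabulary_id)
ORDER BY dtb.vocabulary_id, dtb.orderindex
"""

QUERY_DUPLICATE = """
SELECT dtb.id
FROM DefinedTermBase dtb INNER JOIN TermCollection voc ON voc.id = dtb.vocabulary_id
WHERE dtb.orderindex IS NOT NULL
  AND EXISTS (SELECT * FROM DefinedTermBase dtb2
              WHERE dtb2.orderindex = dtb.orderindex
                AND dtb.vocabulary_id = dtb2.vocabulary_id
                AND dtb2.id < dtb.id)
ORDER BY dtb.vocabulary_id, dtb.orderindex
"""

gap_ids = [row[0] for row in conn.execute(QUERY_GAP)]
duplicate_ids = [row[0] for row in conn.execute(QUERY_DUPLICATE)]
```

In this toy data the gap check flags only term 3 and the duplicate check only term 5. The gap leaves the relative order a, b, c intact, whereas for terms 4 and 5 nothing in the data says which one comes first.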

Actions #50

Updated by Andreas Müller about 1 month ago

  • Status changed from In Progress to Duplicate
Actions #53

Updated by Andreas Müller about 1 month ago

  • Status changed from Duplicate to In Progress
Actions #56

Updated by Andreas Müller about 1 month ago

  • Related to bug #6343: TermVocabularies of OrderedTerms must be OrderedVocabularies added
Actions #57

Updated by Andreas Müller about 1 month ago

  • Related to bug #8270: TermDTOs with an orderIndex are ordered in unordered term vocabularies added