Discussion on how to handle defined terms in the CDM Library

Discussion

Ticket with the email exchange between Ben and Andreas: #598

Existing Terms

|| Class|count|user-defined|ordered|multiple|needed in model|needed where|
|1| Language|485|2|0||DEFAULT is needed||
|2| Continent|9|1|0||0||
|3| Rank|62|1|1|0|1|NonViralNameParserImpl.parseFullName|
|4| TypeDesignationStatus|16|1|1||0||
|5| NomenclaturalStatusType|24|1|1||?|needed in getStatusByAbbrev etc. -> needed by Formatter/CacheStrategy|
|6| SynonymRelationshipType|3|0|1|0|1||
|7| HybridRelationshipType|4|1|1|0|?||
|8| NameRelationshipType|10|1-2|1||1|equals and addBasionym();|
|9| TaxonRelationshipType|27|1|1|0|1|Taxon.getTaxonomicChildren, etc.|
|10| MarkerType|4|2|0||0||
|11| AnnotationType|2|1|0|0|0||
|12| NamedAreaType|2|1|0||?|only in TdwgArea.addTdwgArea(NamedArea)|
|13| NamedAreaLevel|9|1|0||?|only in TdwgArea.addTdwgArea(NamedArea)|
|14| NomenclaturalCode|5|0|0||1|getNomenclaturalCode() in TaxonNameBase-derived classes|
|15| Feature|26|3|0||1|in constructor of some DescriptionElementBase classes|
|16| TdwgArea|1040|1|1|1|0||
|17| NamedArea|0|3|1|2|1||
|18| WaterbodyOrCountry|250|2|0||0||
|19| PresenceTerm|18|2|1||0||
|20| AbsenceTerm|1|2|1||0||
|21| Sex|2|1|1||0||
|22| DerivationEventType|8|2|0||0||
|23| PreservationMethod|0|2|0||0||
|24| DeterminationModifier|0|3|0||0||
|25| StatisticalMeasure|8|1|0|0|0||
|26| RightsTerm|3|?|0||0||
|27| BibtexEntryType|0|?|0|0|0||
|28| ExtensionType|0|?|0||0||
|29| InstitutionType|0|?|0||0||
|30| MeasurementUnit|0|2|0|0|0||
|32| ReferenceSystem|0|2|0|0|0||
|33| TextFormat|0|2|0||0||
|34| Keyword|0|3|1||0||
|35| Modifier|0|3|1||0||
|36| State|0|3|1||0||
|40| Scope|0|2|1||0||
|41| Stage|0|2|1||0||
|42| NameTypeDesignationStatus|9|1|0|0|0||

*count:* number of existing terms in the csv files

*user-defined:* how often user-defined instances are needed (0: never; 1: very seldom; 2: sometimes; 3: often)

*who:* who may add a new term (list may be incomplete)

*ordered:* 1 if the vocabulary is ordered, 0 otherwise

*multiple:* whether multiple vocabularies should be allowed (e.g. different NamedArea vocabularies): 0: same vocabulary in all applications; 1: one vocabulary per application; 2: multiple vocabularies per application

*needed in model:* 1 if the 'static' methods are used in other methods in the model, 0 otherwise

*needed where:* description of how the 'static' methods are used

Solutions

1. Mixed Model

  • Terms that need to be updated often and that are not used in the model become ordinary classes (saved via @Cascade). These classes offer no static methods; instances are retrieved via the service layer.

  • Terms that never or only very seldom need to be updated and that are used in the model become enums. Extending an enum is only possible by changing the code.

    • List of enums:
      • NomenclaturalCode
      • SynonymRelationshipType
      • HybridRelationshipType
      • TaxonRelationshipType
    • List of unclear classes:
      • Rank (tendency: make it an enum and think later about possibilities to extend it)
      • NomenclaturalStatusType ()
      • NameRelationshipType (tendency: make it an enum, as all other relationship types are also enums)
    • List of ordinary classes:
      • all other classes
    • Optional: keep some classes as classes that have to be initialized (see Existing Implementation). No cascading is implemented for these classes, so that objects stay unique, etc.
    • Define an interface IDefinedTerm that both enums and ordinary classes implement
    • Problems to be solved:
      • Representations for the enums
      • Vocabularies for enums, including different vocabularies for different codes
      • t.b.c.
  • Discussion on single classes

    • Ranks:
      • Pro enum:
        • Domain logic is based on the ranks. E.g. parsers and formatters need to know whether a rank is suprageneric, infrageneric, etc., or which abbreviation is generally used for it
        • Additions to the vocabulary are expected to be very seldom
      • Con enum:
        • The order of ranks may differ slightly between applications. This applies to old ranks or infra-ranks like tax.infrasp. or tax.infragen. Possible solution: have different vocabularies that store the order separately, like the feature trees. Different vocabularies for the different codes are needed anyway.
        • Static methods do not have to be available before either connecting to the database or using the model in a non-persistent way (e.g. a web service for parsing names)
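
The possible solution for ranks (keeping the order in separate vocabularies rather than in the enum itself) could look roughly like the following sketch. Everything in it is an assumption made for illustration: the reduced `Rank` enum, its constant names, and the `RankVocabulary` class are hypothetical and not part of the existing library.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical, heavily reduced rank enum (the table lists 62 rank terms).
enum Rank {
    GENUS, SUBGENUS, INFRAGENERIC_TAXON, SPECIES, SUBSPECIES, INFRASPECIFIC_TAXON
}

// Hypothetical ordered vocabulary: the order lives here, not in the enum,
// so each application or nomenclatural code can supply its own sequence.
final class RankVocabulary implements Comparator<Rank> {
    private final String label;
    private final List<Rank> order; // highest rank first

    RankVocabulary(String label, Rank... order) {
        this.label = label;
        this.order = Arrays.asList(order.clone());
    }

    String getLabel() { return label; }

    boolean contains(Rank rank) { return order.contains(rank); }

    // A negative result means 'a' is the higher rank in this vocabulary.
    @Override
    public int compare(Rank a, Rank b) {
        int ia = order.indexOf(a);
        int ib = order.indexOf(b);
        if (ia < 0 || ib < 0) {
            throw new IllegalArgumentException(
                "Rank not in vocabulary '" + label + "': " + (ia < 0 ? a : b));
        }
        return Integer.compare(ia, ib);
    }

    boolean isHigher(Rank a, Rank b) { return compare(a, b) < 0; }
}
```

Two applications could then share the same enum constants while one vocabulary places `INFRAGENERIC_TAXON` below `SUBGENUS` and another places it above. Domain logic that only needs to know which rank it is dealing with keeps using the enum directly.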

    • NomenclaturalStatusType:
      • Pro enum:
        • Domain logic is based on them. E.g. parsers and formatters need to know their abbreviated representation
      • Con enum:
        • Additions to the vocabulary are expected to occur very seldom, but users do want them (experience from the Berlin Model)
        • Static methods do not have to be available before either connecting to the database or using the model in a non-persistent way (e.g. a web service for parsing names)
        • Order is not so important (thus an implementation as a class is easier)
    • NameRelationshipType:
      • Pro enum:
        • Domain logic is based on them. E.g. TaxonNameBase and TaxonBase have methods like addBasionym or @nameRelation.equals(NameRelationshipType.BASIONYM())@
        • Additions to the vocabulary are expected to be very seldom and not so urgent
        • Other relationship types are also implemented as enums
      • Con enum:
        • Additions to the vocabulary are expected to occur, although very seldom
        • Static methods do not have to be available before either connecting to the database or using the model in a non-persistent way (e.g. a web service for parsing names)
    • Feature:
      • Pro enum:
        • More or less needed in the constructor of some subclasses of DescriptionElementBase (e.g. Distribution, CommonName)
      • Con enum:
        • Many terms may be added, so a pure enum is impossible
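
To make the mixed model more concrete, here is a minimal sketch of an enum and an ordinary class sharing one interface. Only the name `IDefinedTerm` and the class names come from the text above; the interface methods, the enum constant names, and the labels are assumptions for illustration, and persistence annotations such as @Cascade are left out so the example stays self-contained.

```java
import java.util.Objects;

// Assumed shape of the shared interface; only its name is taken from the text.
interface IDefinedTerm {
    String getLabel();

    // true if new terms can be added without changing the code
    boolean isExtensible();
}

// A term type the model depends on and that is never user-defined: an enum.
// Constant names and labels are illustrative.
enum SynonymRelationshipType implements IDefinedTerm {
    SYNONYM_OF("synonym of"),
    HOMOTYPIC_SYNONYM_OF("homotypic synonym of"),
    HETEROTYPIC_SYNONYM_OF("heterotypic synonym of");

    private final String label;

    SynonymRelationshipType(String label) { this.label = label; }

    @Override public String getLabel() { return label; }

    @Override public boolean isExtensible() { return false; }
}

// A term type that users extend and the model does not depend on:
// an ordinary class whose instances would come from the service layer.
final class MarkerType implements IDefinedTerm {
    private final String label;

    MarkerType(String label) { this.label = Objects.requireNonNull(label); }

    @Override public String getLabel() { return label; }

    @Override public boolean isExtensible() { return true; }
}
```

Code that only displays or lists terms can work against `IDefinedTerm`, while model logic can compare enum constants directly, e.g. `type == SynonymRelationshipType.HOMOTYPIC_SYNONYM_OF`. The open problems listed above (representations and vocabularies for enums) are not addressed by this sketch.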

2. Existing Implementation

  • Keep the defined terms as classes that are not cascaded via @Cascade.

  • Make the application developer responsible for using and initializing defined terms correctly (e.g. adding new ranks only when definitely no other application is using the library)

  • Write documentation on how to use defined terms correctly

  • Implement robust exception handling with meaningful warnings
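
As an illustration of the last point, the following sketch shows how access to a term that has not been initialized could fail with a message that tells the developer what to do. The `DefinedTermRegistry` class, its methods, and the string keys are hypothetical; the existing implementation may organize its terms quite differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry for terms that must be loaded before first use.
final class DefinedTermRegistry {
    private static final Map<String, String> LABELS = new ConcurrentHashMap<>();
    private static volatile boolean initialized = false;

    private DefinedTermRegistry() {}

    // Called once by the application after the terms have been loaded.
    static void initialize(Map<String, String> loadedTerms) {
        LABELS.clear();
        LABELS.putAll(loadedTerms);
        initialized = true;
    }

    // Returns the registry to its uninitialized state (useful in tests).
    static void reset() {
        LABELS.clear();
        initialized = false;
    }

    // Fails early with an explanatory message instead of a late, obscure error.
    static String labelOf(String termKey) {
        if (!initialized) {
            throw new IllegalStateException(
                "Defined terms are not initialized; call "
                + "DefinedTermRegistry.initialize(...) before requesting '"
                + termKey + "'.");
        }
        String label = LABELS.get(termKey);
        if (label == null) {
            throw new IllegalArgumentException(
                "Unknown defined term '" + termKey + "'; "
                + LABELS.size() + " terms are loaded.");
        }
        return label;
    }
}
```

The design choice here is to check the initialization state at the point of access, so that a wrongly initialized application gets a warning that names the missing step.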

Problems that occur when using the existing implementation

Not all of them are unsolvable.

  • "Transient object xxx" exception when two CdmApplicationControllers are created in the same JVM (though not at the same time). The cause is the constructor of Distribution, which adds Feature.DISTRIBUTION; Hibernate throws the error the second time.

  • "Uninitialized collection" exception when saving taxon names. The cause is a call to getFullTitleCache, which needs the nomenclatural status abbreviation.

  • t.b.c.
