Taxonomic Integrity Rule Levels
There are four levels of integrity to be checked during or after the import of source checklists into the CDM:
Level 1 - Syntax of terms
At the lowest level, the syntactical correctness of terms occurring in the data will be checked. Examples of syntax rules to be applied include:
A genus-group or higher taxon name must start with an upper-case character followed by lower-case characters and cannot contain any diacritics.
A URL must follow the syntax defined at http://www.w3.org/Addressing/URL/5_BNF.html.
Date and time information must follow the format specified in ISO 8601.
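The three syntax rules above can be sketched as simple predicates. This is a minimal illustration, not the CDM implementation: the function names are hypothetical, the URL check only tests for a scheme and a host rather than the full BNF grammar, and `datetime.fromisoformat` accepts only a subset of ISO 8601 on older Python versions.

```python
import re
from datetime import datetime
from urllib.parse import urlparse

# One upper-case ASCII letter followed by lower-case ASCII letters.
# Restricting the pattern to ASCII also rules out diacritics.
GENUS_OR_HIGHER = re.compile(r"[A-Z][a-z]+")


def is_valid_higher_name(name: str) -> bool:
    """Hypothetical check for a genus-group or higher taxon name."""
    return GENUS_OR_HIGHER.fullmatch(name) is not None


def is_plausible_url(url: str) -> bool:
    """Loose approximation: requires a scheme and a host, not the full BNF."""
    parts = urlparse(url)
    return bool(parts.scheme) and bool(parts.netloc)


def is_iso8601(value: str) -> bool:
    """Accepts the ISO 8601 forms understood by datetime.fromisoformat."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False
```

A stricter URL or date validator could replace either helper without changing how the rule is applied.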
Level 2 - Structural integrity
The second level of integrity checks will focus on the completeness and appropriateness of information belonging to individual objects. Examples of rules for detecting structural problems include:
A scientific species name must have a genus name and a species epithet.
A scientific name should have an authority.
A bibliographic reference must have a year of publication.
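As a sketch of how such structural rules might be checked on individual objects, the example below uses two hypothetical record types (`SpeciesName` and `Reference`, not the CDM classes). It assumes that "must" rules map to errors and "should" rules map to warnings, which is one possible reading of the wording above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeciesName:
    # Hypothetical, simplified stand-in for a scientific species name.
    genus: Optional[str] = None
    epithet: Optional[str] = None
    authority: Optional[str] = None


@dataclass
class Reference:
    # Hypothetical, simplified stand-in for a bibliographic reference.
    title: str
    year: Optional[int] = None


def check_species_name(name: SpeciesName) -> List[str]:
    """Return the structural problems found in a single species name."""
    problems = []
    if not name.genus:
        problems.append("error: missing genus name")
    if not name.epithet:
        problems.append("error: missing species epithet")
    if not name.authority:
        # "should" rule, so reported as a warning only (assumption).
        problems.append("warning: missing authority")
    return problems


def check_reference(ref: Reference) -> List[str]:
    """Return the structural problems found in a single reference."""
    if ref.year is None:
        return ["error: missing year of publication"]
    return []
```

For instance, `check_species_name(SpeciesName("Quercus", "robur"))` reports only the missing authority.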
Level 3 - Relational integrity
At the third level, the correctness of relations between objects will be analysed. Examples of rules enforcing relational integrity include:
A synonym must be linked to an accepted taxon.
A genus must have at least one species.
A URL must refer to accessible content at the given address.
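The first two relational rules can be illustrated with plain dictionaries and sets. The data layout (identifiers as strings, a synonym-to-accepted mapping, a species-to-genus mapping) is an assumption made for this sketch only. The URL accessibility rule is left out because it requires a network request.

```python
from typing import Dict, List, Optional, Set


def find_unlinked_synonyms(
    synonym_to_accepted: Dict[str, Optional[str]],
    accepted_ids: Set[str],
) -> List[str]:
    """Synonyms with no link, or with a link to an unknown accepted taxon."""
    return sorted(
        synonym
        for synonym, accepted in synonym_to_accepted.items()
        if accepted is None or accepted not in accepted_ids
    )


def find_empty_genera(
    genera: Set[str],
    species_to_genus: Dict[str, str],
) -> List[str]:
    """Genera that no species record refers to."""
    populated = set(species_to_genus.values())
    return sorted(genera - populated)
```

Both helpers return identifiers rather than raising, so the findings can be passed on for reporting.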
Level 4 - Dataset integrity
The highest level of integrity checks detects contradictions between different datasets (checklists). The set of rules may include:
The same name appearing in different checklists should not have different status values.
Reference strings should not use different spellings when referring to identical references.
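A rough sketch of both cross-dataset checks follows. The record layout `(checklist, name, status)` is hypothetical, and the normalisation used to group reference strings (case-folding and collapsing punctuation and whitespace) is an assumption that only catches trivial spelling variants. Abbreviated versus spelled-out titles would need a more elaborate comparison.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def conflicting_statuses(
    records: List[Tuple[str, str, str]],
) -> Dict[str, Set[str]]:
    """Map each name to its status values if more than one value occurs.

    Each record is a hypothetical (checklist, name, status) tuple.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for _checklist, name, status in records:
        seen[name].add(status)
    return {name: statuses for name, statuses in seen.items() if len(statuses) > 1}


def normalise_reference(ref: str) -> str:
    """Case-fold and replace runs of non-alphanumerics with one space."""
    return re.sub(r"[^a-z0-9]+", " ", ref.casefold()).strip()


def spelling_variants(refs: List[str]) -> List[Set[str]]:
    """Group reference strings that differ only in case or punctuation."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for ref in refs:
        groups[normalise_reference(ref)].add(ref)
    return [group for group in groups.values() if len(group) > 1]
```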
Detecting a rule violation will not necessarily lead to enforcement, for example by rejecting the data that do not conform to the rule. Instead, in many cases the problem (or potential problem) will be highlighted and reported back to the data provider for further consideration.
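One way to picture this report-rather-than-reject behaviour is sketched below. The `Violation` type, its `reject` flag and the `split_import` function are hypothetical names introduced for illustration. Which rules, if any, would actually cause rejection is not specified here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Violation:
    level: int            # integrity level 1 to 4
    record_id: str
    message: str
    reject: bool = False  # hypothetical flag: only set for enforced rules


def split_import(
    record_ids: List[str],
    violations: List[Violation],
) -> Tuple[List[str], List[str]]:
    """Return the records to import and a report for the data provider.

    Every violation is reported, but only those flagged with reject=True
    keep a record out of the import.
    """
    rejected = {v.record_id for v in violations if v.reject}
    accepted = [r for r in record_ids if r not in rejected]
    report = [f"Level {v.level}: {v.record_id}: {v.message}" for v in violations]
    return accepted, report
```

Keeping the report separate from the accept/reject decision lets the same rule set be run in a strict or a lenient mode.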