Anton Güntsch, 09/18/2009 02:03 PM
Inserted paragraph about integrity levels from PESI data quality report

Integrity Checks

Taxonomic Integrity Rule Levels

There are four levels of integrity to be checked during or after the import of source checklists into the CDM:

Level 1 - Syntax of terms

At the lowest level, the syntactical correctness of terms occurring in the data will be checked. Examples of syntax rules to be applied include:

  • A genus-group or higher taxon name must start with an upper-case character followed by lower-case characters and cannot contain any diacritics.

  • A URL must follow the syntax defined at

  • Date and time information must follow the format specified in ISO 8601.
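
As an illustration of Level 1 rules, the following minimal Python sketch checks the name and date rules from the list above. The function names are hypothetical and not part of the CDM. It assumes that "no diacritics" can be approximated by allowing only ASCII letters, and it relies on `fromisoformat()`, which accepts only a subset of ISO 8601 that varies between Python versions. The URL rule is left out because the syntax reference is not given here.

```python
import re
from datetime import date, datetime

# Assumption: one upper-case ASCII letter followed by one or more lower-case
# ASCII letters. Restricting to ASCII is how this sketch excludes diacritics.
GENUS_OR_HIGHER = re.compile(r"[A-Z][a-z]+")


def is_valid_uninomial(name: str) -> bool:
    """Return True if a genus-group or higher name passes the syntax rule."""
    return GENUS_OR_HIGHER.fullmatch(name) is not None


def is_iso8601(value: str) -> bool:
    """Return True if value parses as an ISO 8601 date or date-time.

    Only the subset of ISO 8601 understood by fromisoformat() is recognised.
    """
    for parser in (date.fromisoformat, datetime.fromisoformat):
        try:
            parser(value)
            return True
        except ValueError:
            continue
    return False
```

A date written as `09/18/2009 02:03 PM` would fail this check, while `2009-09-18T14:03:00` would pass.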

Level 2 - Structural integrity

The second level of integrity checks will focus on the completeness and appropriateness of information belonging to individual objects. Examples of rules for detecting structural problems include:

  • A scientific species name must have a genus name and a species epithet.

  • A scientific name should have an authority.

  • A bibliographic reference must have a year of publication.
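
The structural rules above can be sketched as checks on individual objects. The classes `SpeciesName` and `Reference` are hypothetical stand-ins, not CDM classes. Treating "must" rules as errors and "should" rules as warnings is an assumption made for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeciesName:
    # Hypothetical, simplified record; not the CDM data model.
    genus: Optional[str] = None
    epithet: Optional[str] = None
    authority: Optional[str] = None


@dataclass
class Reference:
    title: str
    year: Optional[int] = None


def check_species_name(name: SpeciesName) -> List[str]:
    """Return findings for one species name; an empty list means no problem."""
    findings = []
    if not name.genus:
        findings.append("error: missing genus name")
    if not name.epithet:
        findings.append("error: missing species epithet")
    if not name.authority:
        # "should" rule: reported, but weaker than a "must" rule.
        findings.append("warning: missing authority")
    return findings


def check_reference(ref: Reference) -> List[str]:
    """Return findings for one bibliographic reference."""
    if ref.year is None:
        return ["error: missing year of publication"]
    return []
```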

Level 3 - Relational integrity

At the third level, the correctness of relations between objects will be analysed. Examples of rules enforcing relational integrity include:

  • A synonym must be linked to an accepted taxon.

  • A genus must have at least one species.

  • A URL must refer to accessible content at the given address.
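
A minimal sketch of the first two relational rules, assuming the relations are available as plain dictionaries and sets (a hypothetical simplification; the CDM stores them differently). The URL accessibility rule is omitted because it would require network access.

```python
from typing import Dict, List, Optional, Set


def find_unlinked_synonyms(
    synonym_to_accepted: Dict[str, Optional[str]],
    accepted_taxa: Set[str],
) -> List[str]:
    """Return synonyms whose target is missing or is not an accepted taxon."""
    return sorted(
        synonym
        for synonym, target in synonym_to_accepted.items()
        if target not in accepted_taxa
    )


def find_empty_genera(genera: Set[str], species_to_genus: Dict[str, str]) -> List[str]:
    """Return genera that have no species assigned to them."""
    genera_with_species = set(species_to_genus.values())
    return sorted(genera - genera_with_species)
```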

Level 4 - Dataset integrity

The highest level of integrity checks detects contradictions between different datasets (checklists). The set of rules may include:

  • The same name appearing in different checklists should not have different status values.

  • Reference strings should not use different spellings when referring to identical references.
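
The two dataset-level rules above could be approximated as follows. This is a sketch under stated assumptions: checklists are represented as hypothetical name-to-status dictionaries, and two reference strings are taken to denote the same reference when they are equal after ignoring case, punctuation and extra whitespace. That heuristic will miss many real variants, such as abbreviations.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def find_status_conflicts(
    checklists: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, str]]:
    """Return names whose status differs between checklists.

    Input:  checklist id -> {name: status}
    Output: name -> {checklist id: status}, only for conflicting names.
    """
    seen: Dict[str, Dict[str, str]] = defaultdict(dict)
    for checklist_id, names in checklists.items():
        for name, status in names.items():
            seen[name][checklist_id] = status
    return {
        name: by_checklist
        for name, by_checklist in seen.items()
        if len(set(by_checklist.values())) > 1
    }


def normalise_reference(ref: str) -> str:
    """Heuristic key: case-folded, punctuation and repeated spaces removed."""
    return re.sub(r"[\W_]+", " ", ref.casefold()).strip()


def find_spelling_variants(refs: Iterable[str]) -> Dict[str, List[str]]:
    """Group reference strings that differ only in case or punctuation."""
    groups = defaultdict(set)
    for ref in refs:
        groups[normalise_reference(ref)].add(ref)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}
```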

A recognized violation will not necessarily lead to enforcement of the rule, for example by rejecting data that do not conform to it. Instead, in many cases the problem (or potential problem) will be highlighted and reported back to the data provider for further consideration.
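
The difference between enforcing a rule and only reporting a violation can be sketched as an import loop with an optional rejection switch. All names here (`Finding`, `import_records`, the `provider` and `id` keys) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Finding:
    provider: str   # data provider to report back to
    record_id: str
    message: str


def import_records(
    records: Iterable[Dict],
    check: Callable[[Dict], List[str]],
    reject_on_violation: bool = False,
) -> Tuple[List[Dict], List[Finding]]:
    """Run check on every record and return (accepted records, findings).

    By default violating records are still accepted and only reported;
    with reject_on_violation=True they are dropped from the import.
    """
    accepted: List[Dict] = []
    findings: List[Finding] = []
    for record in records:
        messages = check(record)
        findings.extend(
            Finding(record["provider"], record["id"], message) for message in messages
        )
        if messages and reject_on_violation:
            continue
        accepted.append(record)
    return accepted, findings
```

In both modes the findings list is complete, so it can be grouped by provider and sent back for consideration regardless of whether the data were rejected.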
