Markus Döring, 02/28/2008 10:45 AM


CDM Versioning

The CdmLibrary is supposed to support versioning of CDM data. The simplest approach to creating a new version would be to copy every current object in the CDM and leave the old ones untouched. Because nearly all objects in the CDM are related to each other in some way, this would effectively mean copying the entire database for every single version, which obviously does not scale. We will therefore have to address versioning at the level of the atomic domain objects. The following graphic shows the complexity and dependencies of a single change to a Person in the CDM:

Version In Time. A View

For a single point in time a consistent complex object can be assembled based on the multiple versions stored. The following example shows different versions of a taxon made up of a name that in turn is based on an author. ????

tbd
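
Until the example is written, here is a minimal sketch of how a view for a single point in time could be resolved from stored versions. The class VersionedHistory, its methods and the plain numeric timestamp are assumptions made for illustration only; nothing here is part of the CdmLibrary.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the CdmLibrary: the stored versions of one object. */
final class VersionedHistory<T> {

    /** One stored version and the time from which it is valid. */
    private static final class Entry<T> {
        final long validFrom; // assumption: a plain timestamp, e.g. epoch milliseconds
        final T value;

        Entry(long validFrom, T value) {
            this.validFrom = validFrom;
            this.value = value;
        }
    }

    private final List<Entry<T>> entries = new ArrayList<>(); // ordered by validFrom

    /** Appends a version; versions must be added in chronological order. */
    void add(long validFrom, T value) {
        if (!entries.isEmpty() && entries.get(entries.size() - 1).validFrom > validFrom) {
            throw new IllegalArgumentException("versions must be added in chronological order");
        }
        entries.add(new Entry<>(validFrom, value));
    }

    /** Returns the version that was valid at the given time, or null if none existed yet. */
    T at(long time) {
        T result = null;
        for (Entry<T> e : entries) {
            if (e.validFrom > time) {
                break; // all later entries are newer than the requested time
            }
            result = e.value;
        }
        return result;
    }

    /** Returns the most recent version, or null if the history is empty. */
    T latest() {
        return entries.isEmpty() ? null : entries.get(entries.size() - 1).value;
    }
}
```

With one such history per atomic object (author, name, taxon), calling at(t) with the same t on each of them would yield the parts of one consistent complex object for that point in time.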

Versioning = Denormalisation?

A new version of an object will not automatically be applied to all other previously related objects. A modified name will therefore not be applied to all taxa using that name without manual interaction, i.e. agreement from the user that owns the other taxon.

Versioning and this non-automatic propagation of updated objects therefore lead to effects very similar to denormalisation, with the exception that there is a clear link between the versions, so an automatic update could be done at any time.
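
The difference to plain denormalisation can be sketched as follows. A taxon stays pinned to the name version it was created with, but because the versions are linked, it can detect a newer version and adopt it once its owner agrees. All class and method names (NameVersion, revise, acceptNameUpdate and so on) are hypothetical and only serve this illustration.

```java
/** Hypothetical sketch of linked versions with manual propagation; not CdmLibrary code. */
final class PropagationSketch {

    /** One version of a name, linked to its successor if it has been revised. */
    static final class NameVersion {
        final String label;
        private NameVersion next; // successor version, null while this is the latest

        NameVersion(String label) {
            this.label = label;
        }

        /** Creates a successor version and links it to this one. */
        NameVersion revise(String newLabel) {
            if (next != null) {
                throw new IllegalStateException("this version has already been revised");
            }
            next = new NameVersion(newLabel);
            return next;
        }

        /** Follows the version links to the most recent version. */
        NameVersion latest() {
            NameVersion v = this;
            while (v.next != null) {
                v = v.next;
            }
            return v;
        }
    }

    /** A taxon that keeps using the name version it was pinned to. */
    static final class Taxon {
        private NameVersion name;

        Taxon(NameVersion name) {
            this.name = name;
        }

        NameVersion getName() {
            return name;
        }

        /** True if a newer version of the name exists that this taxon has not adopted. */
        boolean hasPendingNameUpdate() {
            return name.latest() != name;
        }

        /** Called only after the owner of the taxon has agreed to the update. */
        void acceptNameUpdate() {
            name = name.latest();
        }
    }
}
```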

Accessing versions. DAO Implementations

Just notes so far. Need to be cleaned.

It would be elegant to implement versioning entirely outside of the domain model, only in the Data Access Objects (DAOs).

Access to the latest version should be the default and as fast as possible.

Historic versions can and maybe even should be read-only.

Domain class methods like getTaxonName() should not need to know which view they are working on, i.e. they should not take a view parameter. If a taxon has different name versions over time, how does the taxon object know which one to return? The method would either have to accept a view parameter, or the view would have to be stored in the taxon object. But as the identical, persistent taxon object is used for several views, the view cannot be stored in the persistent taxon object.

It therefore seems best to create transient copies of historic versions and only allow the current view to be persistent, i.e. updateable. Such transient deep copies would have to be assembled in the DAO layer. As those copies are transient, no further lazy loading will be possible, so the boundaries of the complex object have to be defined as part of the DAO method too. Domain method calls to related objects which were not loaded up front will return null!
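
A minimal in-memory sketch of this idea follows. The domain classes are heavily simplified stand-ins, and TaxonDao, findAsOf and the withAuthor flag are assumptions for illustration; a real DAO would read the versions from the database rather than from a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a DAO that assembles transient views of historic versions. */
final class HistoricViewSketch {

    /** Simplified stand-ins for CDM domain classes; not the real CdmLibrary classes. */
    static final class Person {
        final String title;
        Person(String title) { this.title = title; }
    }

    static final class TaxonName {
        final String label;
        private final Person author; // null if the author was outside the loaded boundary
        TaxonName(String label, Person author) {
            this.label = label;
            this.author = author;
        }
        Person getAuthor() { return author; }
    }

    static final class Taxon {
        private final TaxonName name;
        Taxon(TaxonName name) { this.name = name; }
        /** No view parameter needed: the transient copy already fixes the point in time. */
        TaxonName getTaxonName() { return name; }
    }

    /** One stored version of a name, valid from the given time. */
    static final class NameVersion {
        final long validFrom;
        final String label;
        final String authorTitle;
        NameVersion(long validFrom, String label, String authorTitle) {
            this.validFrom = validFrom;
            this.label = label;
            this.authorTitle = authorTitle;
        }
    }

    static final class TaxonDao {
        private final List<NameVersion> nameVersions = new ArrayList<>();

        void saveNameVersion(long validFrom, String label, String authorTitle) {
            nameVersions.add(new NameVersion(validFrom, label, authorTitle));
        }

        /**
         * Builds a transient deep copy of the taxon as of the given time, or returns
         * null if no version existed yet. The withAuthor flag defines the object
         * boundary: without it, getAuthor() on the copy returns null.
         */
        Taxon findAsOf(long time, boolean withAuthor) {
            NameVersion match = null;
            for (NameVersion v : nameVersions) {
                if (v.validFrom <= time && (match == null || v.validFrom >= match.validFrom)) {
                    match = v;
                }
            }
            if (match == null) {
                return null;
            }
            Person author = withAuthor ? new Person(match.authorTitle) : null;
            return new Taxon(new TaxonName(match.label, author));
        }
    }
}
```

The returned objects are fresh copies with no connection to the persistence layer, which is why the caller has to decide up front how deep the copy should go.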

Changes To Datamodel

Many-Many Relations

This solution modifies the referencing side of a domain model and keeps a list of all versions for a certain property. That means converting all properties to lists or adding an additional list property for each existing property.

It allows for relatively fast database access to a complex object view.

It means a lot of extra coding, as all properties have to be duplicated and the regular getter/setter methods probably have to be adapted.
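
As a rough illustration of what this could mean for a single property, the sketch below turns a name property into a list of versions and adapts the getter and setter accordingly. The classes and the method getNameVersions are hypothetical, simplified stand-ins, not the actual CDM classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a property that keeps all of its versions. */
final class VersionedPropertySketch {

    static final class TaxonName {
        final String label;
        TaxonName(String label) { this.label = label; }
    }

    static final class Taxon {
        // The former single-valued property becomes a list, oldest version first.
        private final List<TaxonName> nameVersions = new ArrayList<>();

        /** Adapted getter: returns the latest version, or null if none was set. */
        TaxonName getName() {
            return nameVersions.isEmpty() ? null : nameVersions.get(nameVersions.size() - 1);
        }

        /** Adapted setter: appends a new version instead of overwriting the old one. */
        void setName(TaxonName name) {
            nameVersions.add(name);
        }

        /** Additional accessor: read-only access to all versions. */
        List<TaxonName> getNameVersions() {
            return Collections.unmodifiableList(nameVersions);
        }
    }
}
```

Access to the latest version stays cheap, but every versioned property needs this kind of treatment, which is where the extra coding comes from.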

Linked Lists

Version Array

Unversioned CDM Classes

Parts of the CommonDataModel would not need to be versioned. This probably applies to Cdm:common:DefinedTermBase and Cdm:common:RelationshipBase.
