<?xml version="1.0" encoding="UTF-8"?>
    xsi:schemaLocation="http://docbook.org/ns/docbook http://docbook.org/xml/5.0/xsd/docbook.xsd"
    xml:id="cdm-reference-guide" xmlns="http://docbook.org/ns/docbook"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:ns5="http://www.w3.org/1999/xhtml"
    xmlns:ns4="http://www.w3.org/2000/svg"
    xmlns:ns3="http://www.w3.org/1998/Math/MathML"
    xmlns:ns="http://docbook.org/ns/docbook">
<title>EDIT Common Data Model Library</title>
<subtitle>Reference Documentation (Work in Progress)</subtitle>
<inlinegraphic fileref="./resources/images/logo.png" />
<!-- Please add your names here -->
<personname>Ben Clark</personname>
<releaseinfo>2.1</releaseinfo>
<holder>EDIT - European Distributed Institute of Taxonomy -
  http://www.e-taxonomy.eu</holder>
<para>The contents of this file are subject to the Mozilla Public
  License Version 1.1. See LICENSE.TXT at the top of this package for the
  full license terms.</para>
<preface id="preface">
<title>Preface</title>
<para>EDIT's Internet Platform for Cybertaxonomy is a distributed
  computing platform that helps taxonomists do revisionary taxonomy and
  taxonomic field work efficiently and expediently via the web. At the core
  of the platform lies a common data model that enables interoperability
  between the different components. The model describes all the commonly
  used data dealt with in the platform, and therefore covers
  taxonomic names and concepts; literature references; authors; (type)
  specimens; structured descriptive data; molecular data; related (binary)
  files such as images or compiled keys; controlled vocabularies and terms;
  and species-related content of any kind, like economic use or conservation
<para>The cyberplatform consists of interoperable but independent
  components. Platform components can take the form of software applications
  (desktop or web-based) for human users or (web) services intended to be
  used by other software applications. The platform as envisioned does not
  have a single user interface or website; rather, it is a collection of
  interacting components which may be combined and assembled according to
  the task in hand. To facilitate the development of core CDM applications
  such as the CDM Community Server, the CDM Dataportals, and the Taxonomic
  Editor, an implementation of the CDM has been created in the Java
  programming language. In addition to CDM model classes being modelled as
  plain-old-Java-objects (<link
  xlink:href="http://en.wikipedia.org/wiki/Plain_Old_Java_Object">POJOs</link>),
  a set of Java components has been created that provides common services
  across all Java applications using the CDM. In addition to serving as the
  basis of core components of the Internet Platform for Cybertaxonomy, these
  components also allow the development of other applications using the CDM by
  providing basic functionality that can be extended for a particular
<para>The CDM Library, as it is known, consists of four major modules that
  can be used by any Java application based on the CDM. These libraries are
  used as the foundation of the Taxonomic Editor and the CDM Community
  Server. In addition, a web application (the CDM Community Server) is
  documented here, as its components can be re-purposed or extended by other
  web applications based on the CDM.</para>
<imageobject role="html">
  <imagedata fileref="resources/images/cdmlib-arch3.png" format="png" />
<imageobject role="fo">
  <imagedata contentwidth="160mm"
    fileref="resources/images/cdmlib-arch3.png" format="png"
<caption>The overall architecture of the EDIT Internet Platform for
  Cybertaxonomy, showing the core components of the CDM Java Library,
  and their use by desktop (Taxonomic Editor) and web-based (CDM
  Dataportal, CATE) applications.</caption>
<title>Common Data Model</title>
<para>The Common Data Model (CDM) is the domain model for the core EDIT
  cyberplatform components. The CDM is primarily based on the <link
  xlink:href="http://wiki.tdwg.org/twiki/bin/view/TAG/LsidVocs">TDWG
  Ontology</link> and in most cases there is concordance with relevant
  TDWG standards such as <link linkend="???"
  xlink:href="http://www.tdwg.org/standards/117/">Taxon Concept Transfer
  Schema (TCS)</link>, <link linkend="???"
  xlink:href="http://www.tdwg.org/standards/117/">Structured Descriptive
  Data (SDD)</link> and <link linkend="???"
  xlink:href="http://www.tdwg.org/standards/115/">Access to Biological
  Collections Data (ABCD)</link>.</para>
<para>The CDM differs from the TDWG standards in its purpose: it is
  intended to serve as the basis of software applications in the
  cyberplatform (e.g. the Taxonomic Editor, the CDM Dataportals) rather
  than being a standard for data exchange between any resource containing
  biodiversity information. Whilst it is certainly possible to exchange
  data as CDM domain objects serialized as XML or JSON (the CDM Server and
  the CDM Dataportals do this), the common data model is not intended to
  replace existing TDWG standards as a general-purpose exchange standard.
  It is possible to convert data held in a CDM store into a relevant TDWG
  standard for exchange and in some cases this may be the desired route
  for data held in the CDM (e.g. for exchange with an application that is
  not part of the cyberplatform, but which is capable of understanding
  data in a TDWG standard).</para>
<para>Thus the CDM is intended for use as:</para>
<para>A domain model for applications, particularly those that
  enable taxonomists to do revisionary taxonomy and taxonomic field
<para>A standard for exchange between applications that are part of
  the EDIT Internet Platform for Cybertaxonomy</para>
<para>In terms of scope, the CDM covers information core to the vision
  of the cyberplatform, i.e. descriptive and revisionary taxonomy,
  including taxonomic fieldwork:</para>
<para>Taxonomic names and nomenclature, typification</para>
<para>Taxonomic concepts and relationships between accepted names
  and synonyms, including the placement of the same taxonomic concept
  in different taxonomic hierarchies.</para>
<para>Specimens and Observations of individual organisms, their
  collection, location, processing and taxonomic determination.</para>
<para>Structured and unstructured information about names, taxa, and
<para>In addition to this core area, the CDM covers some important
  related domains:</para>
<para>Literature</para>
<para>People, teams of people and institutions in various roles
  (e.g. as authors, collectors, artists or rights holders)</para>
<para>Media (images, video and audio files, plus more
  taxonomy-specific media such as phylogenies and compiled
<para>Molecular data, such as DNA sequences and loci</para>
<para>As you might expect, there are also a number of data entities
  representing controlled vocabularies, identity of users (and their roles
  and permissions), and ancillary data common to all major classes such as
  multilingual text content, annotations and markers.</para>
<title>A UML Package diagram showing the CDM packages and their
<imageobject role="html">
  <imagedata fileref="resources/images/ModelOverview20.gif" />
<imageobject role="fo">
  <imagedata contentwidth="160mm"
    fileref="resources/images/ModelOverview20.gif"
<xi:include href="base-classes.xml" />
<!--<xi:include href="annotation-and-markers.xml" />-->
<!--<xi:include href="extensions.xml" />-->
<!--<xi:include href="identifiable-entities.xml" />-->
<!--<xi:include href="validation.xml" />-->
<title>Persistence Layer</title>
<para>Even the most basic of taxonomic applications have a requirement
  for users to be able to save the information that they create. In
  addition, a common component of taxonomic applications is the use of a
  database to provide users with the ability to filter or search their
  data in one way or another. Some applications will require more advanced
  functionality, such as auditing or versioning of data. All of this logic
  is contained in the persistence layer, providing clean separation
  between data access and more taxonomy-centric business logic in the
  service layer.</para>
<para>Persistence is not a simple problem to solve, especially in
  applications developed in object-oriented languages, with large amounts
  of data, or with many users accessing data at the same time. The CDM
  Library uses the <link
  xlink:href="http://www.hibernate.org">Hibernate</link> object/relational
  persistence and query service as the basis of its persistence layer.
  Several member projects of the Hibernate stable, including <link
  xlink:href="http://annotations.hibernate.org">Hibernate
  Annotations</link>, <link linkend="???"
  xlink:href="http://search.hibernate.org">Hibernate Search</link> and
  <link linkend="???">Hibernate Envers</link> (part of Hibernate Core)
  provide the basis of the more advanced persistence-related functionality
  in the CDM Library. As a consequence, some of the behaviour of the CDM
  Library is constrained by the underlying object/relational mapping (ORM)
  technology. The advantage of using an ORM is that the same software can
  be used with multiple database systems with (almost) no changes to the
  application. Currently the CDM Library has been tested with (version
  numbers and platforms in
<!--I don't know how many of these have been tested, on which platforms, but it would be good to include some measure of which platform / database combinations
  have been used and how, so that potential users can evaluate the technology. In an ideal world, we would pick some databases as "supported" and ensure that
  the test suite runs on that platform / db combination (i.e. you don't release until the tests pass). For the others, we still might want to say: "We tested
  the CDM on this platform and it seemed to work".-->
  xlink:href="http://www.ibm.com/software/data/db2/">DB2</link></para>
<para><link xlink:href="???">H2</link> (default local database used
  by the Taxonomic Editor, 1.0.73)</para>
<link xlink:href="http://hsqldb.org">HSQLDB</link>
<para><link xlink:href="http://www.mysql.com">MySQL</link> (4.1.20:
  Linux; 5.1.32: Windows)</para>
<link xlink:href="???">ODBC</link>
  xlink:href="http://www.oracle.com/database/index.html">Oracle
  Database 11<emphasis>g</emphasis></link>
<link xlink:href="http://www.postgresql.org/">PostgreSQL</link>
<link xlink:href="???">Microsoft SQL Server 2000</link>
  xlink:href="http://www.microsoft.com/sqlserver/2005/">Microsoft
  SQL Server 2005</link>
<link linkend="???" xlink:href="http://www.sybase.co.uk/">Sybase
  Advantage Database Server</link>
<para>In theory, application developers should not need to use the
  persistence layer directly, but should instead use the <link
  linkend="api">API</link>, which provides a <emphasis>facade</emphasis>
  over the persistence layer and extra business logic that most
  applications using the CDM will require.</para>
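<para>The facade arrangement described above can be illustrated with a
  minimal, self-contained sketch. It is not taken from the CDM Library: the
  names <code>Taxon</code>, <code>TaxonDao</code>,
  <code>InMemoryTaxonDao</code> and <code>TaxonService</code> are
  hypothetical stand-ins, and a hash map takes the place of the
  Hibernate-backed store. The only point is that application code talks to a
  service, which adds business rules and delegates data access to the
  persistence layer.</para>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

/** Hypothetical sketch of a service facade over a persistence layer. */
public class FacadeSketch {

    /** Stand-in domain object; NOT a real CDM class. */
    static final class Taxon {
        private final UUID uuid = UUID.randomUUID();
        private final String name;

        Taxon(String name) { this.name = name; }
        UUID getUuid() { return uuid; }
        String getName() { return name; }
    }

    /** Persistence layer: data access only, no business rules. */
    interface TaxonDao {
        void save(Taxon taxon);
        Optional<Taxon> findByUuid(UUID uuid);
    }

    /** In-memory DAO standing in for an ORM-backed implementation (assumption). */
    static final class InMemoryTaxonDao implements TaxonDao {
        private final Map<UUID, Taxon> store = new HashMap<>();

        @Override
        public void save(Taxon taxon) { store.put(taxon.getUuid(), taxon); }

        @Override
        public Optional<Taxon> findByUuid(UUID uuid) {
            return Optional.ofNullable(store.get(uuid));
        }
    }

    /** Service facade: validates input, then delegates to the DAO. */
    static final class TaxonService {
        private final TaxonDao dao;

        TaxonService(TaxonDao dao) { this.dao = dao; }

        /** Business rule (illustrative): a taxon must have a non-blank name. */
        UUID createTaxon(String name) {
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("A taxon needs a non-empty name");
            }
            Taxon taxon = new Taxon(name.trim());
            dao.save(taxon);
            return taxon.getUuid();
        }

        /** Returns the stored name, or an empty Optional for an unknown UUID. */
        Optional<String> findName(UUID uuid) {
            return dao.findByUuid(uuid).map(Taxon::getName);
        }
    }

    public static void main(String[] args) {
        // Application code sees only the service, never the DAO's storage.
        TaxonService service = new TaxonService(new InMemoryTaxonDao());
        UUID id = service.createTaxon("  Bellis perennis ");
        System.out.println(service.findName(id).orElse("not found"));
    }
}
```

<para>Because the service depends only on the <code>TaxonDao</code>
  interface, the in-memory implementation in this sketch could be swapped
  for a database-backed one without touching the calling code, which is the
  separation the persistence and service layers are meant to
  provide.</para>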
<xi:include href="basic-persistence.xml" />
<!--<xi:include href="listing-sorting-initializing.xml" />-->
<!--<xi:include href="versioning.xml" />-->
<!--<xi:include href="free-text-search.xml" />-->
<title>API Methods</title>
<para>This part discusses the service layer:</para>
<!--<xi:include href="service.xml" />-->
<!--<xi:include href="paging-resultsets.xml" />-->
<!--<xi:include href="application-controller.xml" />-->
<!--<xi:include href="transactions.xml" />-->
<!--<xi:include href="guid-resolution.xml" />-->
<!--<xi:include href="security.xml" />-->
<title>CDM Input / Output Layer</title>
<para>This part describes the input/output routines:</para>
<!--<xi:include href="base-io-usage.xml" />-->
<!--<xi:include href="cdm-xml-input-output.xml" />-->
<!--<xi:include href="abcd-input-output.xml" />-->
<!--<xi:include href="berlinmodel-input-output.xml" />-->
<!--<xi:include href="excel-input-output.xml" />-->
<!--<xi:include href="sdd-input-output.xml" />-->
<!--<xi:include href="taxonx-input-output.xml" />-->
<!--<xi:include href="tcsrdf-input-output.xml" />-->
<!--<xi:include href="tcsxml-input-output.xml" />-->
<title>CDM Server</title>
<para>This part describes the cdm-server application:</para>
<!--<xi:include href="cdm-server.xml" />-->
<!--<xi:include href="instalation.xml" />-->
<!--<xi:include href="configuration.xml" />-->