Part II. Common Data Model

The Common Data Model (CDM) is the domain model for the core components of the EDIT cyberplatform. It is based primarily on the TDWG Ontology and, in most cases, agrees with the relevant TDWG standards such as the Taxonomic Concept Transfer Schema (TCS), Structured Descriptive Data (SDD) and Access to Biological Collections Data (ABCD).

The CDM differs from the TDWG standards in its purpose: it is intended to serve as the basis of software applications in the cyberplatform (e.g. the taxonomic editor, the CDM Dataportals) rather than as a standard for data exchange between arbitrary resources containing biodiversity information. Whilst it is certainly possible to exchange data as CDM domain objects serialized as XML or JSON (the CDM Server and the CDM Dataportals do this), the CDM is not intended to replace existing TDWG standards as a general-purpose exchange standard. Data held in a CDM store can be converted into a relevant TDWG standard for exchange, and in some cases this is the preferred route (e.g. for exchange with an application that is not part of the cyberplatform but can understand data in a TDWG standard).

Thus the CDM is intended for use as:

  • A domain model for applications, particularly those that enable taxonomists to do revisionary taxonomy and taxonomic field work

  • A standard for exchange between applications that are part of the EDIT Internet Platform for Cybertaxonomy

In terms of scope, the CDM covers the information that is core to the vision of the cyberplatform, i.e. descriptive and revisionary taxonomy, including taxonomic fieldwork:

  • Taxonomic names, nomenclature and typification.

  • Taxonomic concepts and relationships between accepted names and synonyms, including the placement of the same taxonomic concept in different taxonomic hierarchies.

  • Specimens and Observations of individual organisms, their collection, location, processing and taxonomic determination.

  • Structured and unstructured information about names, taxa, and specimens.
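The second item in the list above, one taxonomic concept placed in several hierarchies, can be easier to picture as a data structure. The following is a minimal sketch only: the classes `TaxonConceptSketch`, `HierarchySketch` and `ScopeSketch` are hypothetical names invented for this illustration and are not the CDM's own classes, and the taxon names ("Aus", "Aaceae", "Baceae") and references are placeholders. It assumes a concept is a name used in the sense of a particular reference, and that each hierarchy stores its own parent links, so the concept object itself is shared rather than copied.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration classes; these are NOT the CDM's own types.

/** A taxonomic concept: a name used in the sense of a particular reference. */
final class TaxonConceptSketch {
    final String name;
    final String accordingTo;
    final List<String> synonyms = new ArrayList<>();

    TaxonConceptSketch(String name, String accordingTo) {
        this.name = name;
        this.accordingTo = accordingTo;
    }
}

/** One taxonomic hierarchy, stored as child-to-parent links between concepts. */
final class HierarchySketch {
    final String title;
    private final Map<TaxonConceptSketch, TaxonConceptSketch> parents = new HashMap<>();

    HierarchySketch(String title) {
        this.title = title;
    }

    /** Records that, in this hierarchy only, child sits directly below parent. */
    void place(TaxonConceptSketch child, TaxonConceptSketch parent) {
        parents.put(child, parent);
    }

    /** Returns the parent of child in this hierarchy, or null if it is not placed. */
    TaxonConceptSketch parentOf(TaxonConceptSketch child) {
        return parents.get(child);
    }
}

final class ScopeSketch {
    /** Places one genus concept under different families in two hierarchies. */
    static String describePlacements() {
        // Placeholder names and references, chosen only for the example.
        TaxonConceptSketch genus = new TaxonConceptSketch("Aus", "Example 2001");
        genus.synonyms.add("Xus");
        TaxonConceptSketch familyA = new TaxonConceptSketch("Aaceae", "Example 2001");
        TaxonConceptSketch familyB = new TaxonConceptSketch("Baceae", "Example 2005");

        HierarchySketch first = new HierarchySketch("Hierarchy 1");
        HierarchySketch second = new HierarchySketch("Hierarchy 2");
        first.place(genus, familyA);   // same concept object ...
        second.place(genus, familyB);  // ... placed differently elsewhere

        return genus.name + " sec. " + genus.accordingTo
                + " | " + first.title + ": " + first.parentOf(genus).name
                + " | " + second.title + ": " + second.parentOf(genus).name;
    }

    public static void main(String[] args) {
        System.out.println(describePlacements());
    }
}
```

Keeping the parent links in the hierarchy rather than on the concept is the design choice that matters here: the concept, its name and its synonyms exist once, while each hierarchy is free to disagree about where the concept belongs.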

In addition to this core area, the CDM covers some important related domains:

  • Literature

  • People, teams of people and institutions in various roles (e.g. as authors, collectors, artists or rights holders)

  • Media (images, video and audio files, plus more taxonomy-specific media such as phylogenies and compiled keys)

  • Molecular data, such as DNA sequences and loci

As you might expect, there are also a number of data entities representing controlled vocabularies, identity of users (and their roles and permissions), and ancillary data common to all major classes such as multilingual text content, annotations and markers.

Figure 2. A UML Package diagram showing the CDM packages and their members.
