The CDM supports high-performance free-text ("Google-like") searching
of the data that it stores. It uses the hibernate-search library to
integrate the popular Apache Lucene search software into the CDM. The
persistence layer includes hibernate-search integration by default, so
objects are added to the Lucene index when applications save entities,
and the indices are updated when applications update or delete objects.
All fields are converted to lowercase during indexing, and queries are
converted to lowercase during parsing. Several properties are indexed
per object type, and it is possible to search individual fields or
combinations of fields. The basic syntax used for free-text queries is
described on the Lucene website.
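The effect of lowercasing at both indexing and query-parsing time can be illustrated with a minimal sketch. This is a toy in-memory inverted index written only with the Java standard library; it is not the Lucene or hibernate-search API, and the class name ToyIndex and its methods are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (hypothetical; not the Lucene or hibernate-search API).
final class ToyIndex {
    // Maps each lowercased term to the ids of the entries that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index time: split on whitespace and lowercase every term.
    public void add(int id, String text) {
        for (String term : text.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                    .add(id);
        }
    }

    // Query time: lowercase the term in the same way before the lookup.
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(1, "Acherontia atropos");
        index.add(2, "Acherontia styx");
        // The case of the query does not matter, because both sides are lowercased.
        System.out.println(index.search("ACHERONTIA"));
        System.out.println(index.search("Atropos"));
    }
}
```

Because the same normalisation is applied on both sides, a query for "ACHERONTIA" and a query for "acherontia" look up the same term.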
All classes have a default field that is searched when a field is not
specified. In the case of classes that extend IdentifiableEntity, the
titleCache field is used. By default, query strings are broken into
individual terms, and objects are returned that match any of the terms
(e.g. Acherontia atropos). To return objects that match all terms, in
any order, the AND operator can be used (e.g. Acherontia AND atropos).
By enclosing a sequence of terms in double quotes, you can specify that
the terms must appear together in that order (e.g. "Acherontia atropos").
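The three matching modes described above (any term, all terms, exact phrase) can be sketched as follows. This is an illustrative approximation over whitespace-separated, lowercased tokens using only the Java standard library; it does not call Lucene, and the class MatchModes and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustrates the three matching modes on plain strings (hypothetical helper).
final class MatchModes {
    // Lowercase the text and split it into whitespace-separated terms.
    public static List<String> tokens(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.<String>emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Default behaviour: at least one query term occurs in the text.
    public static boolean matchesAny(String text, String query) {
        List<String> doc = tokens(text);
        return tokens(query).stream().anyMatch(doc::contains);
    }

    // AND: every query term occurs in the text, in any order.
    public static boolean matchesAll(String text, String query) {
        return tokens(text).containsAll(tokens(query));
    }

    // Quoted phrase: the terms occur next to each other, in the given order.
    public static boolean matchesPhrase(String text, String phrase) {
        List<String> wanted = tokens(phrase);
        return !wanted.isEmpty()
                && Collections.indexOfSubList(tokens(text), wanted) >= 0;
    }

    public static void main(String[] args) {
        String title = "atropos Acherontia";
        System.out.println(matchesAny(title, "Acherontia styx"));
        System.out.println(matchesAll(title, "Acherontia atropos"));
        System.out.println(matchesPhrase(title, "Acherontia atropos"));
    }
}
```

In the usage example, the reversed title satisfies the "any" and "all" checks but not the phrase check, which mirrors the difference between Acherontia AND atropos and "Acherontia atropos".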
To search a specific property, prepend the name of the property, followed by a colon, to the query (e.g. nameCache:"Acherontia atropos"). Properties of related entities can be searched too, provided that they have been indexed, using JavaBeans-like dot notation. For example, to return all references written by Schott you could use authorTeam.titleCache:Schott, and to return all publications written in the 1940s you could use either datePublished.start:194* or datePublished.start:[1940* TO 1949*] (to specify a range).
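When queries like these are assembled in application code, a small helper can keep the field syntax consistent. The following is a minimal sketch that only builds query strings; QueryStrings and its methods are hypothetical names, not part of the CDM, hibernate-search or Lucene, and the escaping covers only backslashes and double quotes inside a quoted phrase.

```java
// Builds field-qualified query strings (hypothetical helper, not a CDM API).
final class QueryStrings {
    // field:"value" with backslashes and double quotes in the value escaped.
    public static String phrase(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    // field:prefix* matches values that start with the given prefix.
    public static String prefix(String field, String prefix) {
        return field + ":" + prefix + "*";
    }

    // field:[from TO to] is the inclusive range form used in the text.
    public static String range(String field, String from, String to) {
        return field + ":[" + from + " TO " + to + "]";
    }

    // Joins clauses so that all of them must match.
    public static String and(String... clauses) {
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(phrase("nameCache", "Acherontia atropos"));
        System.out.println(prefix("datePublished.start", "194"));
        System.out.println(range("datePublished.start", "1940*", "1949*"));
        System.out.println(and("authorTeam.titleCache:Schott",
                prefix("datePublished.start", "194")));
    }
}
```

The resulting strings would then be passed to whatever search method the application uses; which properties are actually indexed depends on the entity mapping.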