<?xml version="1.0" encoding="UTF-8"?>
<chapter version="5.0" xml:id="free-text-search"
         xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:ns5="http://www.w3.org/1999/xhtml"
         xmlns:ns4="http://www.w3.org/2000/svg"
         xmlns:ns3="http://www.w3.org/1998/Math/MathML"
         xmlns:ns="http://docbook.org/ns/docbook">
  <info>
    <title>Free Text Search</title>
  </info>

  <section>
    <para>The CDM supports high-performance free-text ("Google-like")
    searching of the data that it stores. It uses the Hibernate Search library
    to integrate the popular Apache Lucene search software into the CDM. The
    persistence layer includes Hibernate Search integration by default, so
    objects are added to the Lucene index when applications
    <methodname>save</methodname> entities, and the indices are updated when
    applications <methodname>update</methodname> or
    <methodname>delete</methodname> objects. All fields are converted to
    lowercase during indexing, and queries are converted to lowercase during
    parsing, so matching is case-insensitive. Several properties are indexed
    per object type, and it is possible to search individual fields or
    combinations of fields. The basic syntax used for free text queries is
    described on the <link xlink:href="http://lucene.apache.org/java/2_4_1/queryparsersyntax.html">Lucene
    website</link>.</para>
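    <para>The effect of lowercasing both the indexed text and the query can be
    illustrated with a toy in-memory index. This is only a sketch under stated
    assumptions: the class <classname>ToyIndex</classname> and its methods are
    hypothetical and are not part of the CDM, Hibernate Search, or Lucene,
    whose analysis and storage are far more elaborate.</para>

    <programlisting language="java"><![CDATA[
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy in-memory index for illustration; NOT the Hibernate Search or Lucene API. */
final class ToyIndex {
    // lowercase term -> ids of the entities whose text contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Index (or re-index) an entity: split on whitespace and lowercase each term. */
    void save(int id, String text) {
        delete(id); // an update replaces whatever was indexed before
        for (String term : text.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new TreeSet<>()).add(id);
        }
    }

    /** Remove the entity from every term it was indexed under. */
    void delete(int id) {
        postings.values().forEach(ids -> ids.remove(id));
    }

    /** Look up one term; the query is lowercased just like the indexed text. */
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.save(1, "Acherontia atropos");
        System.out.println(index.search("ACHERONTIA")); // prints [1]
        index.delete(1);
        System.out.println(index.search("acherontia")); // prints []
    }
}
]]></programlisting>

    <para>Because both sides are normalized in the same way, the query
    <emphasis>ACHERONTIA</emphasis> finds an entity that was saved as
    <emphasis>Acherontia atropos</emphasis>.</para>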

    <para>All classes have a default field that is searched when no field is
    specified. For classes that extend
    <classname>IdentifiableEntity</classname>, the
    <parameter>titleCache</parameter> field is used. By default, query strings
    are broken into individual terms, and objects are returned that match any
    of the terms (e.g. <emphasis>Acherontia atropos</emphasis>). To return
    objects that match all terms, in any order, the AND operator can be
    used (e.g. <emphasis>Acherontia AND atropos</emphasis>). By enclosing
    several terms in double quotes, you can specify that the terms must appear
    together, in the given order (e.g. <emphasis>"Acherontia atropos"</emphasis>).
    </para>
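    <para>The three query forms described above can be assembled
    programmatically. The following is a minimal sketch: the helper class
    <classname>FreeTextQueries</classname> is hypothetical, not part of the
    CDM API, and it only builds the query strings. It does not escape special
    characters and does not execute a search.</para>

    <programlisting language="java"><![CDATA[
/** Hypothetical helper that assembles free-text query strings; not part of the CDM API. */
final class FreeTextQueries {
    private FreeTextQueries() {
    }

    /** Terms separated by spaces: matches objects that contain any of the terms. */
    static String anyTerm(String... terms) {
        return String.join(" ", terms);
    }

    /** Terms joined with AND: matches objects that contain all terms, in any order. */
    static String allTerms(String... terms) {
        return String.join(" AND ", terms);
    }

    /** Terms wrapped in double quotes: matches the terms as a phrase, in the given order. */
    static String phrase(String... terms) {
        return "\"" + String.join(" ", terms) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(anyTerm("Acherontia", "atropos"));  // Acherontia atropos
        System.out.println(allTerms("Acherontia", "atropos")); // Acherontia AND atropos
        System.out.println(phrase("Acherontia", "atropos"));   // "Acherontia atropos"
    }
}
]]></programlisting>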

    <para>To search a specific property, prepend the name of the property,
    followed by a colon, to the query (e.g. <emphasis>nameCache:"Acherontia
    atropos"</emphasis>). Properties of related entities can also be searched,
    provided that they have been indexed, using JavaBeans-like dot notation.
    For example, to return all references written by Schott you could use
    <emphasis>authorTeam.titleCache:Schott</emphasis>, and to return all
    publications written in the 1940s you could use either
    <emphasis>datePublished.start:194*</emphasis> or
    <emphasis>datePublished.start:[1940* TO 1949*]</emphasis> (to specify a
    range).</para>
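
    <para>The field-qualified queries shown above follow a simple pattern
    (property path, colon, expression), which the sketch below makes explicit.
    The class <classname>FieldQueries</classname> and its methods are
    hypothetical helpers introduced for illustration only. They build the
    query strings from this section and do not check whether a property is
    actually indexed.</para>

    <programlisting language="java"><![CDATA[
/** Hypothetical helper for field-qualified query strings; not part of the CDM API. */
final class FieldQueries {
    private FieldQueries() {
    }

    /** Restrict an expression to one property, e.g. nameCache or authorTeam.titleCache. */
    static String field(String propertyPath, String expression) {
        return propertyPath + ":" + expression;
    }

    /** Wrap text in double quotes so that it is matched as a phrase. */
    static String phrase(String text) {
        return "\"" + text + "\"";
    }

    /** Append a trailing wildcard, e.g. 194 becomes 194*. */
    static String prefix(String start) {
        return start + "*";
    }

    /** Build an inclusive range expression in square brackets. */
    static String range(String from, String to) {
        return "[" + from + " TO " + to + "]";
    }

    public static void main(String[] args) {
        // nameCache:"Acherontia atropos"
        System.out.println(field("nameCache", phrase("Acherontia atropos")));
        // authorTeam.titleCache:Schott
        System.out.println(field("authorTeam.titleCache", "Schott"));
        // datePublished.start:194*
        System.out.println(field("datePublished.start", prefix("194")));
        // datePublished.start:[1940* TO 1949*]
        System.out.println(field("datePublished.start", range("1940*", "1949*")));
    }
}
]]></programlisting>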
  </section>
</chapter>