eu.etaxonomy.taxeditor.help/.project -text
eu.etaxonomy.taxeditor.help/META-INF/MANIFEST.MF -text
eu.etaxonomy.taxeditor.help/build.properties -text
-eu.etaxonomy.taxeditor.help/html/concepts/maintopic.html -text
-eu.etaxonomy.taxeditor.help/html/concepts/subtopic.html -text
-eu.etaxonomy.taxeditor.help/html/concepts/subtopic2.html -text
eu.etaxonomy.taxeditor.help/html/gettingstarted/a_succesful_parsed_taxon_record.html -text
eu.etaxonomy.taxeditor.help/html/gettingstarted/about_bulk_editing.html -text
eu.etaxonomy.taxeditor.help/html/gettingstarted/about_the_manual.html -text
eu.etaxonomy.taxeditor.help/html/img/fileicon.jpg -text
eu.etaxonomy.taxeditor.help/html/img/orangewarning.jpg -text
eu.etaxonomy.taxeditor.help/html/img/redwarning.jpg -text
-eu.etaxonomy.taxeditor.help/html/reference/maintopic.html -text
-eu.etaxonomy.taxeditor.help/html/reference/subtopic.html -text
-eu.etaxonomy.taxeditor.help/html/reference/subtopic2.html -text
-eu.etaxonomy.taxeditor.help/html/samples/maintopic.html -text
-eu.etaxonomy.taxeditor.help/html/samples/subtopic.html -text
-eu.etaxonomy.taxeditor.help/html/samples/subtopic2.html -text
-eu.etaxonomy.taxeditor.help/html/tasks/maintopic.html -text
-eu.etaxonomy.taxeditor.help/html/tasks/subtopic.html -text
-eu.etaxonomy.taxeditor.help/html/tasks/subtopic2.html -text
-eu.etaxonomy.taxeditor.help/html/toc.html -text
+eu.etaxonomy.taxeditor.help/html/nameparser/authorship_part.html -text
+eu.etaxonomy.taxeditor.help/html/nameparser/name_part.html -text
+eu.etaxonomy.taxeditor.help/html/nameparser/nomenclatural_status_part.html -text
+eu.etaxonomy.taxeditor.help/html/nameparser/overview.html -text
+eu.etaxonomy.taxeditor.help/html/nameparser/reference_part.html -text
eu.etaxonomy.taxeditor.help/original_document/Taxonomic_Editor_User_Manual_Version_4.doc -text
eu.etaxonomy.taxeditor.help/plugin.xml -text
eu.etaxonomy.taxeditor.help/pom.xml -text
eu.etaxonomy.taxeditor.help/src/eu/etaxonomy/taxeditor/help/Activator.java -text
eu.etaxonomy.taxeditor.help/toc.xml -text
eu.etaxonomy.taxeditor.help/tocgettingstarted.xml -text
+eu.etaxonomy.taxeditor.help/tocnameparser.xml -text
eu.etaxonomy.taxeditor.navigation/.classpath -text
eu.etaxonomy.taxeditor.navigation/.project -text
eu.etaxonomy.taxeditor.navigation/META-INF/MANIFEST.MF -text
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Main Topic</title>
-</head>
-
-<body>
-<h1>Main Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic</title>
-</head>
-
-<body>
-<h1>Sub Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic 2</title>
-</head>
-
-<body>
-<h1>Sub Topic 2</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
--- /dev/null
+<h3>Authorship Part</h3>
+
+<p>The authorship part is divided into the original combination authorship and the combination authorship.
+The former is placed in parentheses.</p>
+
+<pre>
+Example (bot.): (L.) Mill.
+Example (zoo.): (XXX, 1830) XXX, 1845
+</pre>
+
+<p>The authorship may be omitted entirely (only if no other part follows), or it may consist of the original
+combination authorship, the combination authorship, or both.</p>
+
+<p>The parser differentiates between botanical and zoological authorship. The latter has a year following the
+author, separated by a comma. Botanical names only have authors.
+Authorship may include single persons and teams. Team members are separated by <code>&</code>. A placeholder <code>al.</code>
+may be used for further team members. Both authorships may include ex-authors, separated by <code>ex</code> or <code>ex.</code>.
+Some valid author strings are:</p>
+<pre>
+Example (bot.): (Greuther & L'Hiver & al. ex Müller & Schmidt) Clark ex Ciardelli
+Example (zoo.):
+</pre>
+
+<p>The set of special characters currently allowed in author names, such as <code>'</code> or <code>-</code>, is beyond the scope of this
+documentation and will change in the future.</p>
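+<p>The authorship rules above can be illustrated with a minimal Python sketch. It is not the parser's
+actual implementation; the function names <code>split_authorship</code>, <code>split_ex</code> and
+<code>split_year</code> are hypothetical, and the regular expressions only cover the cases described on this page.</p>

```python
import re


def split_authorship(text):
    """Split '(original) combination' into (original, combination); either may be None."""
    match = re.fullmatch(r"\s*(?:\(([^()]*)\))?\s*([^()]*)", text)
    if match is None:
        raise ValueError(f"unrecognized authorship: {text!r}")
    original = (match.group(1) or "").strip() or None
    combination = match.group(2).strip() or None
    return original, combination


def split_ex(authorship):
    """Split one authorship at 'ex' or 'ex.' into (team_before_ex, team_after_ex)."""
    parts = re.split(r"\s+ex\.?\s+", authorship, maxsplit=1)
    # Team members are separated by '&'; 'al.' is kept as an ordinary member.
    teams = [[member.strip() for member in part.split("&")] for part in parts]
    return (teams[0], teams[1]) if len(teams) == 2 else ([], teams[0])


def split_year(authorship):
    """Zoological style 'Author, 1830': return (author, year), or (author, None)."""
    match = re.fullmatch(r"(.*?),\s*(\d{4})", authorship)
    return (match.group(1), int(match.group(2))) if match else (authorship, None)


original, combination = split_authorship("(L.) Mill.")
print(original, "|", combination)
```

+<p>In this sketch a botanical string is first split into its two authorships and then, if needed, at
+<code>ex</code>; a zoological string is split into its two authorships and each is passed to <code>split_year</code>.</p>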
+
--- /dev/null
+<h3>Name Part</h3>
+
+<p>The name part recognizes uninomials, binomials and trinomials. The first word must start
+with a capital letter; all other words (except for infrageneric epithets) may contain only
+lower-case letters. Only Latin letters are allowed in names (with the exception of <strong>ï</strong>).
+The name part parser distinguishes six different syntaxes.</p>
+
+<h4>Uninomials</h4>
+<p>A uninomial is a single word starting with a capital letter. Because the rank of a uninomial is usually
+ambiguous, the assigned rank is the parser's best guess, and a warning is returned asking the user to check it.</p>
+<pre>Example: Cichorieae</pre>
+
+
+<h4>Infrageneric Names</h4>
+<p>A capitalized word followed by an infrageneric marker and the infrageneric epithet.
+Valid markers are:
+ <ul>
+ <li><code>subgen.</code></li>
+ <li><code>subg.</code></li>
+ <li><code>sect.</code></li>
+ <li><code>subsect.</code></li>
+ <li><code>ser.</code></li>
+ <li><code>subser.</code></li>
+ <li><code>t.infgen.</code></li>
+ </ul>
+</p>
+<pre>Example: Desmometopa subg. LitoXXX</pre>
+
+
+<h4>Species Aggregates</h4>
+<p>Species aggregates are recognized similarly to species except they are followed by a group
+marker. Valid markers are:
+ <ul>
+ <li><code>aggr.</code></li>
+ <li><code>agg.</code></li>
+ <li><code>group</code></li>
+ </ul>
+</p>
+<pre>Example: XXX</pre>
+
+
+<h4>Species</h4>
+<p>Species names have a genus part (starting with a capital letter) and a species part (lower-case letters only).</p>
+<pre>Example: Abies alba</pre>
+
+<h4>Infraspecific names</h4>
+<p>Infraspecific names have four parts: the genus part, the species part, the infraspecific
+marker and the infraspecific part. Only the first part may start with a capital letter.
+Recognized markers are:
+ <ul>
+ <li><code>subsp.</code></li>
+ <li><code>convar.</code></li>
+ <li><code>var.</code></li>
+ <li><code>subvar.</code></li>
+ <li><code>f.</code></li>
+ <li><code>subf.</code></li>
+ <li><code>f.spec.</code></li>
+ <li><code>tax.infrasp.</code></li>
+ <li><code>tax. infrasp.</code></li>
+ </ul>
+
+</p>
+<pre>Example:</pre>
+
+<h4>Infraspecific names (old markers)</h4>
+<p>Some older names (not valid according to the nomenclatural code) use other infraspecific
+markers. The recognition of these older names is not yet implemented.</p>
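+<p>As a rough illustration of the syntaxes listed above, the following Python sketch classifies a name part
+with regular expressions. It is an approximation, not the parser's code: the names <code>classify</code>
+and <code>PATTERNS</code> are hypothetical, and the sketch assumes that an infrageneric epithet starts
+with a capital letter.</p>

```python
import re

CAP = r"[A-Z][a-zï]+"  # capitalized word (Latin letters plus ï)
LOW = r"[a-zï]+"       # lower-case word


def alternatives(markers):
    """Build a regex alternation, longest marker first ('f.spec.' before 'f.')."""
    ordered = sorted(markers, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(m) for m in ordered) + ")"


INFRAGENERIC = alternatives(
    ["subgen.", "subg.", "sect.", "subsect.", "ser.", "subser.", "t.infgen."])
AGGREGATE = alternatives(["aggr.", "agg.", "group"])
INFRASPECIFIC = alternatives(
    ["subsp.", "convar.", "var.", "subvar.", "f.", "subf.", "f.spec.",
     "tax.infrasp.", "tax. infrasp."])

PATTERNS = [
    ("uninomial", CAP),
    # Assumption: the infrageneric epithet is capitalized.
    ("infrageneric", rf"{CAP}\s+{INFRAGENERIC}\s+{CAP}"),
    ("species aggregate", rf"{CAP}\s+{LOW}\s+{AGGREGATE}"),
    ("species", rf"{CAP}\s+{LOW}"),
    ("infraspecific", rf"{CAP}\s+{LOW}\s+{INFRASPECIFIC}\s+{LOW}"),
]


def classify(name):
    """Return the label of the first syntax that matches the whole name, else None."""
    for label, pattern in PATTERNS:
        if re.fullmatch(pattern, name.strip()):
            return label
    return None


print(classify("Abies alba"))
```

+<p>Names with old infraspecific markers fall through to <code>None</code> here, mirroring the fact that
+their recognition is not yet implemented.</p>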
+
+
--- /dev/null
+<h3>Nomenclatural Status</h3>
+
+<p>The nomenclatural status is separated from the preceding text by a comma. The currently valid status values are:</p>
+
+<ul>
+ <li><code>nom. superfl.</code></li>
+ <li><code>nom. nud.</code></li>
+ <li><code>nom. illeg.</code></li>
+ <li><code>nom. inval.</code></li>
+ <li><code>nom. cons.</code></li>
+ <li><code>nom. alternativ.</code></li>
+ <li><code>nom. subnud.</code></li>
+ <li><code>nom. rej.</code></li>
+ <li><code>nom. prop.</code></li>
+ <li><code>nom. provis.</code></li>
+ <li><code>orth. var.</code></li>
+</ul>
+
+<p>Multiple status values may be given, separated by commas.</p>
\ No newline at end of file
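+<p>The comma-separated status values can be peeled off the end of a name string as in the following
+Python sketch. This only illustrates the rule described above; <code>split_status</code> and
+<code>VALID_STATUS</code> are hypothetical names, and the set simply repeats the values listed on this page.</p>

```python
VALID_STATUS = {
    "nom. superfl.", "nom. nud.", "nom. illeg.", "nom. inval.", "nom. cons.",
    "nom. alternativ.", "nom. subnud.", "nom. rej.", "nom. prop.",
    "nom. provis.", "orth. var.",
}


def split_status(text):
    """Return (remaining_text, statuses) by removing trailing status values."""
    parts = [part.strip() for part in text.split(",")]
    statuses = []
    # Walk backwards while the last comma-separated item is a known status.
    while len(parts) > 1 and parts[-1] in VALID_STATUS:
        statuses.insert(0, parts.pop())
    return ", ".join(parts), statuses


print(split_status("Abies alba L., nom. illeg."))
```

+<p>Anything that is not a known status value stays in the remaining text, so a reference part ending in a
+comma-separated year is left untouched.</p>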
--- /dev/null
+<h2>Name Parser Documentation</h2>
+
+
+<p>The taxonomic name parser analyzes a free text taxonomic reference for the following four components:</p>
+
+<ul>
+ <li><a href="name_part.html">Name Part</a></li>
+ <li><a href="authorship_part.html">Authorship Part</a></li>
+ <li><a href="reference_part.html">Reference Part</a></li>
+ <li><a href="nomenclatural_status_part.html">Nomenclatural Status</a></li>
+</ul>
+
+<p>Not all of them are required.</p>
+
+<p>The four parts are separated by the following separators:</p>
+
+<table border="1">
+ <thead>
+ <tr>
+ <td>part</td>
+ <td>separator</td>
+ <td>example</td>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>authorship</td>
+ <td>any whitespace</td>
+ <td><code>Abies alba_L.</code></td>
+ </tr>
+ <tr>
+ <td>reference</td>
+ <td>comma followed by whitespace, OR whitespace + <code>in</code> + whitespace</td>
+ <td><code>Abies alba L.,_Sp. Pl... or Pinus alba_in_Bull. Soc....</code></td>
+ </tr>
+ <tr>
+ <td>nom. status</td>
+ <td>comma followed by whitespace</td>
+ <td><code>in Bull. Bot. 3: 99. 1987., nom. illeg.</code></td>
+ </tr>
+ </tbody>
+</table>
+
+<p>Some valid name texts fully recognized by the parser are:</p>
+
+<pre>
+Abies alba (L.) Mill., Sp. Pl.: 105. 1846., nom. illeg.
+Abies alba (L.) Mill. in Bull. Bot. 3: 99. 1987., nom. illeg.
+</pre>
+
+<p>The name part is required. The authorship part is required only if a reference part follows. The reference part and the status part are optional. The following sections describe the four parts in detail.</p>
\ No newline at end of file
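+<p>The reference separator rule from the table above can be sketched in Python as follows. This is a
+simplified illustration for botanical strings only (zoological authorships contain a comma before the year
+and would need extra handling); <code>split_reference</code> is a hypothetical name.</p>

```python
import re


def split_reference(text):
    """Split at the first reference separator.

    Returns (name_and_authorship, separator, reference); separator is ',' or
    'in', or None when no reference part is found.
    """
    # Either a comma followed by whitespace, or 'in' surrounded by whitespace.
    match = re.search(r",\s+|\s+in\s+", text)
    if match is None:
        return text, None, None
    return text[:match.start()], match.group().strip(), text[match.end():]


print(split_reference("Abies alba (L.) Mill., Sp. Pl.: 105. 1846."))
```

+<p>A trailing nomenclatural status would have to be removed before this step, because it is introduced
+by the same comma separator.</p>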
--- /dev/null
+<h3>Reference Part</h3>
+
+<p>The reference part follows the syntax: <code>{separator}{authorship{,}}{titleEditionVolume}{:}{detail}{.}{year}</code></p>
+
+<p>Zoological new combinations should not have a reference part, since in zoology, it is not common
+to mention the new combination reference.</p>
+
+<h4>Separator</h4>
+
+<p>The separator between the reference part and the preceding authorship may be a comma (<code>,</code>) or
+an <code>in</code> surrounded by whitespace. The comma indicates a book, whereas the <code>in</code> stands either
+for a journal article or a book section. If the <code>in</code> is not followed by a comma, the parser
+interprets the reference as an article; otherwise, as a book section. Reference type parsing
+is expected to improve in future versions.</p>
+
+<h4>Reference Authorship</h4>
+
+<p>An author is only available for book sections. Articles and book sections are differentiated
+from each other by examining the first four words that follow the separator. If these words
+include a comma and the words before the comma are likely to represent an author, the reference
+is recognized as a book section. Otherwise, it is treated as an article. In both cases,
+a warning is issued that the two types cannot be reliably differentiated.</p>
+
+<h4>TitleEditionVolume</h4>
+
+<p>The TitleEditionVolume part includes the title itself as well as optional edition
+and volume parts. The title itself allows most character combinations, but care must be taken
+if a <code>:</code> is included, as this is the separator for the subsequent detail part. Special characters
+like <code>&</code> and <code>-</code> are only allowed if immediately preceded and followed by ordinary characters.
+Ordinary brackets are allowed. If only one of edition and volume is present, it is separated from the title
+by whitespace. If both are present, the volume follows the edition, separated by a comma. Both are optional, so all
+four of the following formats are valid:</p>
+
+<pre>
+Sp. Pl.
+Sp. Pl. ed. 3
+Sp. Pl. ed. 3, 4
+Sp. Pl. 4
+</pre>
+
+<p>As can be seen, the edition is recognized by a preceding <code>ed.</code>, whereas the volume is just
+a number (or a number followed by another number in brackets, e.g. <code>4(5)</code>).</p>
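+<p>The four formats above can be matched with a single regular expression, as in this Python sketch. It is
+an illustration under the assumptions stated on this page, not the parser's implementation;
+<code>parse_title_edition_volume</code> is a hypothetical name.</p>

```python
import re

TITLE_EDITION_VOLUME = re.compile(
    r"(?P<title>.+?)"                       # title, as short as possible
    r"(?:\s+ed\.\s*(?P<edition>\d+))?"      # optional edition, introduced by 'ed.'
    r"(?:,?\s+(?P<volume>\d+(?:\(\d+\))?))?"  # optional volume, e.g. '4' or '4(5)'
)


def parse_title_edition_volume(text):
    """Return (title, edition, volume); edition and volume are strings or None."""
    match = TITLE_EDITION_VOLUME.fullmatch(text.strip())
    if match is None:
        return None
    return match.group("title"), match.group("edition"), match.group("volume")


for example in ["Sp. Pl.", "Sp. Pl. ed. 3", "Sp. Pl. ed. 3, 4", "Sp. Pl. 4"]:
    print(parse_title_edition_volume(example))
```

+<p>Because the title is matched lazily, a trailing number is always taken as the volume, which is one reason
+titles that end in a number need care.</p>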
+
+<p>The detail part is separated by a colon (<code>:</code>) from the preceding TitleEditionVolume part and
+is separated from the year by <code>.</code> (botanical names only). Typical detail information
+is recognized, such as single page numbers (<code>345</code>) or page ranges (<code>345-348</code>). Page numbers may be
+preceded by <code>p.</code> (e.g. <code>p. 345</code>) or <code>pp.</code> (e.g. <code>pp. 345-348</code>). Abbreviations indicating special parts of a
+reference such as <code>fig.</code> or <code>tab.</code> are recognized as well. Roman numerals are not detected
+at the moment.</p>
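+<p>The page-number forms of the detail part can be sketched in Python as follows. The sketch covers only
+plain pages and ranges with an optional <code>p.</code> or <code>pp.</code> prefix; <code>parse_pages</code>
+is a hypothetical name, and abbreviations such as <code>fig.</code> or <code>tab.</code> are deliberately left out
+because their exact syntax is not specified here.</p>

```python
import re

PAGES = re.compile(r"(?:pp?\.\s*)?(\d+)(?:-(\d+))?")


def parse_pages(detail):
    """Return (first_page, last_page), or None if the detail is not a page or range."""
    match = PAGES.fullmatch(detail.strip())
    if match is None:
        return None  # e.g. Roman numerals, which are not detected
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    return first, last


print(parse_pages("pp. 345-348"))
```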
+
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Main Topic</title>
-</head>
-
-<body>
-<h1>Main Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic</title>
-</head>
-
-<body>
-<h1>Sub Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic 2</title>
-</head>
-
-<body>
-<h1>Sub Topic 2</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Main Topic</title>
-</head>
-
-<body>
-<h1>Main Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic</title>
-</head>
-
-<body>
-<h1>Sub Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic 2</title>
-</head>
-
-<body>
-<h1>Sub Topic 2</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Main Topic</title>
-</head>
-
-<body>
-<h1>Main Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic</title>
-</head>
-
-<body>
-<h1>Sub Topic</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Sub Topic 2</title>
-</head>
-
-<body>
-<h1>Sub Topic 2</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
+++ /dev/null
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
-
-<html>
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <title>Table of Contents</title>
-</head>
-
-<body>
-<h1>Table of Contents</h1>
-Please enter your text here.
-</body>
-</html>
\ No newline at end of file
primary="true">
</toc>
<toc
- file="tocgettingstarted.xml">
+ file="tocgettingstarted.xml"
+ primary="false">
+ </toc>
+ <toc
+ file="tocnameparser.xml"
+ primary="false">
</toc>
</extension>
<?xml version="1.0" encoding="UTF-8"?>
<?NLS TYPE="org.eclipse.help.toc"?>
-<toc label="EDIT Taxonomic Editor" topic="html/toc.html">
+<toc label="EDIT Taxonomic Editor">
<topic label="Getting Started">
<anchor id="gettingstarted"/>
</topic>
+ <topic label="Name Parser">
+ <anchor id="nameparser"/>
+ </topic>
</toc>
--- /dev/null
+<toc label="Name Parser" link_to="toc.xml#nameparser">
+ <topic href="html/nameparser/overview.html" label="Overview">
+ </topic>
+ <topic href="html/nameparser/name_part.html" label="Name Part">
+ </topic>
+ <topic href="html/nameparser/authorship_part.html" label="Authorship Part">
+ </topic>
+ <topic href="html/nameparser/reference_part.html" label="Reference Part">
+ </topic>
+ <topic href="html/nameparser/nomenclatural_status_part.html" label="Nomenclatural Status Part">
+ </topic>
+</toc>