Andreas Müller, 08/03/2017 09:48 PM
CoL Import Documentation¶
Download¶
- The download is available from http://www.catalogueoflife.org/DCA_Export/archive.php (see also http://www.catalogueoflife.org/DCA_Export/index.php for partial downloads)
- Copy the download to \bgbm-pesihpc\CoL or any other place you have access to
Prepare database¶
As the import takes very long (>2 days), it is highly recommended not to run it directly against production. Instead, use a local database or one of the 2 col instances on edit-test (note: edit-test is relatively slow).
Launch¶
- The import is launched by ColDwcaImportActivator in cdmlib-apps (https://dev.e-taxonomy.eu/gitweb/cdmlib-apps.git)
- Before launch, adapt:
  - the filename (URI) in ColDwcaImportActivator.dwca_col_All()
  - the path to the mapping file in databaseMappingFile
    - The mapping file stores the mapping of CoL DwC-A data to CDM. The database-based mapping is required for running the import in parts (next step). It is a temporary folder that can be removed once all data is imported.
  - classificationName
- The import is split into multiple parts for performance and memory reasons. The classification-creating parts (higher taxa and lower taxa) are especially memory sensitive, so it is recommended to run them separately. Run taxa first; lower taxa must run after higher taxa; everything else is order independent.
  - taxa
  - extensions
  - higher taxa
  - lower taxa
  - synonymy
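The ordering rules above can be checked before starting a multi-day run. The following is a minimal sketch, not part of cdmlib-apps: the function name check_part_order and the part labels (taxa, extensions, higher_taxa, lower_taxa, synonymy) are hypothetical and only mirror the list above.

```shell
#!/bin/sh
# Hypothetical helper: validates a planned run order of the import parts.
# Rules taken from the text: "taxa" runs first and "lower_taxa" runs after
# "higher_taxa"; every other part may appear anywhere.
check_part_order() (
    seen_higher=no
    if [ "$1" != "taxa" ]; then
        echo "taxa must be the first part" >&2
        exit 1
    fi
    for part in "$@"; do
        case $part in
            higher_taxa)
                seen_higher=yes
                ;;
            lower_taxa)
                if [ "$seen_higher" != yes ]; then
                    echo "lower_taxa must run after higher_taxa" >&2
                    exit 1
                fi
                ;;
        esac
    done
    exit 0
)
```

For example, `check_part_order taxa extensions higher_taxa lower_taxa synonymy` succeeds, while `check_part_order taxa lower_taxa higher_taxa` reports an error and returns a non-zero status.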
Configuration¶
- Give the JVM enough memory, e.g. -Xmx9000M
- Consider defining your own log file and log properties e.g. by -Dlog4j.configuration=file:///C:/Users/a.mueller/.cdmLibrary/log/properties/log4j_col.properties
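The two settings above are ordinary JVM options. As a minimal sketch, they can be assembled in one place so that every part of the import runs with the same values. The function name col_jvm_opts is hypothetical, and the heap size and log4j properties location are placeholders to adjust.

```shell
#!/bin/sh
# Hypothetical helper: builds the JVM options named in the text.
#   $1  maximum heap size (defaults to 9000M, the value from the text)
#   $2  optional URL of a log4j properties file (placeholder, adjust)
col_jvm_opts() (
    heap=${1:-9000M}
    log4j_config=$2
    opts="-Xmx$heap"
    if [ -n "$log4j_config" ]; then
        opts="$opts -Dlog4j.configuration=$log4j_config"
    fi
    printf '%s\n' "$opts"
)
```

The printed string can be pasted into the VM arguments of an IDE run configuration, or passed on a command line ahead of the classpath and the main class, both of which depend on the local cdmlib-apps setup.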
Installation¶
- When ready, move the DB to edit-database (production) and install it:
mysql -h localhost -u edit -p cdm_production_col < {filename}
- compute the freetext index by either xxx or using jobber
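Moving the database involves a dump on the import machine and the install command shown above. A minimal sketch, assuming the import ran against a local MySQL database: it only prints both commands for review and does not execute them. The function name col_install_cmds, the local user, and the local database name are assumptions; the edit user and cdm_production_col come from the text.

```shell
#!/bin/sh
# Hypothetical helper: prints the dump and install commands without running them.
#   $1  local MySQL user (assumption)
#   $2  local database that received the import (assumption)
#   $3  dump file to create and then load
col_install_cmds() (
    local_user=$1
    local_db=$2
    dump_file=$3
    # Step 1: dump the finished import on the local machine.
    printf 'mysqldump -u %s -p %s > %s\n' "$local_user" "$local_db" "$dump_file"
    # Step 2: load the dump into the production database (run on edit-database).
    printf 'mysql -h localhost -u edit -p cdm_production_col < %s\n' "$dump_file"
)
```

Printing instead of executing keeps the production step a deliberate manual action, which seems appropriate given the size of the database.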