CoL Import Documentation¶
Download¶
- The download is available from http://www.catalogueoflife.org/DCA_Export/archive.php (see also http://www.catalogueoflife.org/DCA_Export/index.php for partial downloads)
- Copy the download to \bgbm-pesihpc\CoL or any other place you have access to
Prepare database¶
As the import takes a very long time (more than 7 days), it is strongly recommended not to run it directly against production. Instead, use a local database or one of the two CoL instances on edit-test (note: edit-test is relatively slow).
Launch¶
- The import is launched by ColDwcaImportActivator in cdmlib-apps (https://dev.e-taxonomy.eu/gitweb/cdmlib-apps.git)
- Before launch, adapt the following:
  - the filename (URI) in ColDwcaImportActivator.dwca_col_All()
  - the path to the mapping file in databaseMappingFile
    - The mapping file stores the mapping of CoL DwC-A data to CDM. This database-based mapping is required for running the import in parts (next step). The mapping is only needed temporarily; its folder can be removed once all data is imported.
  - classificationName
- The import is split into multiple parts for performance and memory reasons. The parts that create the classification (higher taxa and lower taxa) are especially memory sensitive, so it is recommended to run them separately. Taxa must run first, and lower taxa must run after higher taxa; all other parts are order independent. The parts are:
- taxa
- extensions
- higher taxa
- lower taxa
- synonymy
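The ordering rules above can be written down as a small check. This is a minimal shell sketch and not part of cdmlib-apps: the function name check_part_order and the part labels (taxa, higher-taxa, lower-taxa, ...) are made up for illustration, and the function only encodes the two constraints stated above.

```shell
# check_part_order PART...
# Succeeds if the given sequence respects the two constraints from the text:
#   1. "taxa" is the first part
#   2. "lower-taxa" comes after "higher-taxa"
# The labels are illustrative, not identifiers used by ColDwcaImportActivator.
check_part_order() {
  if [ "${1:-}" != "taxa" ]; then
    echo "taxa must be the first part" >&2
    return 1
  fi
  seen_higher=no
  for part in "$@"; do
    case "$part" in
      higher-taxa)
        seen_higher=yes
        ;;
      lower-taxa)
        if [ "$seen_higher" != yes ]; then
          echo "lower-taxa must run after higher-taxa" >&2
          return 1
        fi
        ;;
    esac
  done
  return 0
}
```

For example, `check_part_order taxa extensions higher-taxa lower-taxa synonymy` succeeds, while `check_part_order taxa lower-taxa higher-taxa` fails with a message on stderr.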
Configuration¶
- Give the JVM enough memory, e.g. -Xmx9000M
- Consider defining your own log file and log properties, e.g. with -Dlog4j.configuration=file:///C:/Users/a.mueller/.cdmLibrary/log/properties/log4j_col.properties
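The two settings above can be collected in one place. The following is a sketch under assumptions: the helper col_jvm_options and the variables HEAP_MB and LOG4J_PROPS are hypothetical, and the properties path is a placeholder to replace with your own file. How the activator is started (IDE launch configuration or command line) is not stated here, so the helper only prints the options.

```shell
# Heap size in MB, taken from the example value in the text.
HEAP_MB=9000
# Placeholder path to a custom log4j properties file (assumption: adjust it).
LOG4J_PROPS="$HOME/.cdmLibrary/log/properties/log4j_col.properties"

# Print the JVM options, one per line.
col_jvm_options() {
  printf '%s\n' "-Xmx${HEAP_MB}M"
  # An absolute path starting with "/" yields a file:///... URL.
  printf '%s\n' "-Dlog4j.configuration=file://${LOG4J_PROPS}"
}
```

The printed lines can be pasted into the VM arguments of whatever launch configuration is used to run ColDwcaImportActivator.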
Installation¶
- when the import is finished, move the DB to edit-database (production) and install it:
mysql -h localhost -u edit -p cdm_production_col<{filename}
- archive the file on edit-database in /var/backup/db_mysql_manual
- compute the free-text index, for example by using the Jenkins job at http://160.45.63.176/jenkins/job/REINDEX-col-catalogue-services/
- archive index afterwards on production (160.45.63.173)
cd /var/lib/cdmserver
tar -cjf col_2017-08-04.tar.bz2 index/col
- optionally, also install CoL on edit-test (not strictly required)
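The installation steps above could be wrapped in a few helper functions so that dump and archive names always carry the current date. This is only a sketch: all function names, the LOCAL_DB name and the use of mysqldump with the user edit for the export are assumptions, since the text does not say how the dump is produced. The mysql and tar calls mirror the commands given above.

```shell
# Placeholders (assumptions): adjust to the actual setup.
LOCAL_DB="col_local"
TARGET_DB="cdm_production_col"
BACKUP_DIR="/var/backup/db_mysql_manual"
INDEX_ROOT="/var/lib/cdmserver"

today() { date +%Y-%m-%d; }

# Dated file names; an explicit date can be passed as the first argument.
dump_file_name()     { printf '%s_%s.sql\n' "$TARGET_DB" "${1:-$(today)}"; }
index_archive_name() { printf 'col_%s.tar.bz2\n' "${1:-$(today)}"; }

# On the machine that ran the import: export the finished database.
export_local_db() {
  mysqldump -u edit -p "$LOCAL_DB" > "$(dump_file_name)"
}

# On edit-database: load the dump, then keep a copy in the backup folder.
install_dump() {
  mysql -h localhost -u edit -p "$TARGET_DB" < "$1" &&
    cp "$1" "$BACKUP_DIR/"
}

# On production, after reindexing: archive the index.
# The subshell keeps the caller's working directory unchanged.
archive_index() {
  ( cd "$INDEX_ROOT" && tar -cjf "$(index_archive_name)" index/col )
}
```

Nothing runs when the block is loaded; each step is called explicitly on the relevant host, e.g. `install_dump cdm_production_col_2017-08-04.sql` on edit-database.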
Tickets¶
Updated by Andreas Müller about 1 year ago · 17 revisions