Revision 5 (Andreas Müller, 08/03/2017 09:45 PM) → Revision 6/17 (Andreas Müller, 08/03/2017 09:47 PM)
# CoL Import Documentation

{{>toc}}

## Download

* The download is available from http://www.catalogueoflife.org/DCA_Export/archive.php (see also http://www.catalogueoflife.org/DCA_Export/index.php for partial downloads).
* Copy the download to \\bgbm-pesihpc\CoL or any other place you have access to.

## Prepare database

The import takes very long (more than 2 days), so it is highly recommended not to run it directly against production. Instead, use a local database or one of the 2 col instances on edit-test (note: edit-test is relatively slow).

## Launch

* The import is launched by ColDwcaImportActivator in cdmlib-apps (https://dev.e-taxonomy.eu/gitweb/cdmlib-apps.git).
* Before launch, adapt
  * the filename (URI) in `ColDwcaImportActivator.dwca_col_All()`
  * the path to the mapping file, `databaseMappingFile`
    * The mapping file stores the mapping of CoL DwC-A data to CDM. The database-based mapping is required for running the import in parts (next step). It is a temporary folder that can be removed once all data is imported.
  * classificationName
* The import is split into multiple parts for performance and memory reasons. The classification-creating parts (*higher taxa* and *lower taxa*) are especially memory-sensitive, so it is recommended to run them separately. *taxa* must run **first**, and *lower taxa* must run after *higher taxa*; everything else is order-independent.
  * taxa
  * extensions
  * higher taxa
  * lower taxa
  * synonymy

## Configuration

* Give the JVM enough memory, e.g. `-Xmx9000M`.
* Consider defining your own log file and log properties, e.g. with `-Dlog4j.configuration=file:///C:/Users/a.mueller/.cdmLibrary/log/properties/log4j_col.properties`.

## Installation

* When ready, move the DB to edit-database (production) and install it with `mysql -h localhost -u edit -p cdm_production_col<{filename}`.
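
## Checking the run order (sketch)

The ordering rule for the import parts in the Launch section (*taxa* first, *lower taxa* only after *higher taxa*, everything else in any order) can be checked before a multi-day run with a small helper. This is only a minimal sketch: the function name `valid_order` is hypothetical and not part of cdmlib-apps, and the part names are taken as plain strings from the list in the Launch section.

```shell
#!/bin/sh
# Hypothetical helper, not part of cdmlib-apps.
# Returns 0 if the given sequence of import parts respects the documented rule:
#   - "taxa" must be the first part
#   - "lower taxa" must not run before "higher taxa"
valid_order() {
    # The first part has to be "taxa".
    [ "$1" = "taxa" ] || return 1

    seen_higher=0
    for part in "$@"; do
        case "$part" in
            "higher taxa")
                seen_higher=1
                ;;
            "lower taxa")
                # "lower taxa" is only allowed once "higher taxa" has been seen.
                [ "$seen_higher" -eq 1 ] || return 1
                ;;
        esac
    done
    return 0
}

# Usage: one valid and one invalid sequence.
valid_order "taxa" "extensions" "higher taxa" "lower taxa" "synonymy" && echo "order ok"
valid_order "taxa" "lower taxa" "higher taxa" || echo "order rejected"
```

The helper only checks the order of the parts it is given. It does not check that all five parts are present, and it does not start the import itself.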