Andreas Müller, 08/03/2017 09:43 PM

CoL Import Documentation


Prepare database

As the import takes very long (more than 2 days), it is highly recommended not to run it directly against production. Instead, use a local database or one of the 2 col instances on edit-test (note: edit-test is relatively slow).


  • The import is launched by ColDwcaImportActivator in cdmlib-apps.
  • Before launch, adapt:
    • the filename (URI) in ColDwcaImportActivator.dwca_col_All()
    • the path to the mapping file, databaseMappingFile. The mapping file stores the mapping of CoL DwC-A data to CDM. This database-based mapping is required for running the import in parts (next step). The path points to a temporary folder, which can be removed once all data is imported.
  • The import is split into multiple parts for performance and memory reasons. The parts that create the classification (higher taxa and lower taxa) are especially memory sensitive, so it is recommended to run them separately. Run taxa first; lower taxa must run after higher taxa; everything else is order independent. The parts are:
    • taxa
    • extensions
    • higher taxa
    • lower taxa
    • synonymy
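
The ordering rules above can be checked mechanically before starting a run that takes days. The following is a minimal sketch and not part of cdmlib-apps: the class ImportPartOrder, the enum Part and the method isValidOrder are hypothetical names introduced for illustration. Only the two constraints (taxa first, lower taxa after higher taxa) come from the text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of cdmlib-apps: checks a planned run order
// against the two ordering rules stated above.
class ImportPartOrder {

    // The five import parts named in this document.
    enum Part { TAXA, EXTENSIONS, HIGHER_TAXA, LOWER_TAXA, SYNONYMY }

    // Returns true if the plan runs every part exactly once, starts with TAXA
    // and runs LOWER_TAXA after HIGHER_TAXA.
    static boolean isValidOrder(List<Part> plan) {
        if (plan.size() != Part.values().length
                || !plan.containsAll(Arrays.asList(Part.values()))) {
            return false; // a part is missing or duplicated
        }
        if (plan.get(0) != Part.TAXA) {
            return false; // taxa must run first
        }
        return plan.indexOf(Part.HIGHER_TAXA) < plan.indexOf(Part.LOWER_TAXA);
    }

    public static void main(String[] args) {
        List<Part> plan = Arrays.asList(Part.TAXA, Part.EXTENSIONS,
                Part.HIGHER_TAXA, Part.LOWER_TAXA, Part.SYNONYMY);
        System.out.println("plan valid: " + isValidOrder(plan));
    }
}
```

Any order that keeps taxa first and higher taxa before lower taxa passes the check, so extensions and synonymy can be scheduled wherever it is convenient.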


  • Give the JVM enough memory, e.g. -Xmx9000M
  • Consider defining your own log file and log properties, e.g. with -Dlog4j.configuration=file:///C:/Users/a.mueller/.cdmLibrary/log/properties/
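
Since a run takes days, it may be worth confirming that the JVM actually received these settings before the import starts. The following is a minimal sketch and not part of cdmlib-apps: the class JvmPreflight, its method names and the 8000 MB threshold are assumptions made for illustration. Runtime.maxMemory() and System.getProperty() are standard Java.

```java
// Hypothetical preflight check, not part of cdmlib-apps.
class JvmPreflight {

    // Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // True if the given maximum heap size is at least requiredMb megabytes.
    static boolean hasEnoughHeap(long maxHeapBytes, long requiredMb) {
        return toMegabytes(maxHeapBytes) >= requiredMb;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        // Assumed threshold: the reported maximum can be somewhat below the
        // -Xmx value, so it is set lower than the 9000M from the example.
        long requiredMb = 8000;
        System.out.println("max heap (MB): " + toMegabytes(maxHeap));
        if (!hasEnoughHeap(maxHeap, requiredMb)) {
            System.out.println("warning: heap below " + requiredMb + " MB, check -Xmx");
        }
        // Prints null when -Dlog4j.configuration was not passed to the JVM.
        System.out.println("log4j.configuration: "
                + System.getProperty("log4j.configuration"));
    }
}
```

Running this class with the same JVM arguments as the import shows quickly whether -Xmx and -Dlog4j.configuration were picked up.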


  • When ready, move the DB to edit-database (production) and install it with: mysql -h localhost -u edit -p cdm_production_col < filename
