task #7808

Further name duplicates

Added by Andreas Kohlbecker almost 2 years ago. Updated about 1 year ago.

Status: New
Priority: Highest
Category: Datacleaning
Target version:
Start date: 09/11/2018
Due date:
% Done: 0%
Tags:

Description

copied from #7420#note-19

Existing IAPT data sometimes have different ranks than the imported data, so the affected names/taxa are not deduplicated. Example: Cryptophyceae is a Division in IAPT but a Phylum in Frey + Worms. This needs to be sorted out before the final import.

SELECT tn2.uuid, tn2.titleCache, r.titleCache
FROM TaxonName tn2 JOIN DefinedTermBase r ON tn2.rank_id = r.id
WHERE tn2.titleCache IN (
    SELECT titleCache FROM (
        SELECT tn.titleCache AS titleCache, COUNT(*) AS n
        FROM TaxonName tn
        GROUP BY tn.titleCache
        HAVING n > 1) AS TMP_TBL)
ORDER BY tn2.titleCache

----> duplicate-names.ods

-- Same check on nameCache; the outer filter compares nameCache (not titleCache) with the duplicated values.
SELECT tn2.uuid, tn2.nameCache, r.titleCache
FROM TaxonName tn2 JOIN DefinedTermBase r ON tn2.rank_id = r.id
WHERE tn2.nameCache IN (
    SELECT nameCache FROM (
        SELECT tn.nameCache AS nameCache, COUNT(*) AS n
        FROM TaxonName tn
        GROUP BY tn.nameCache
        HAVING n > 1) AS TMP_TBL)
ORDER BY tn2.nameCache

----> duplicate-names-2.ods
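
The queries above list every duplicated name. To narrow the list to duplicates whose copies disagree on rank (the Cryptophyceae case), one option might be to group by name and count distinct ranks. The following is only a minimal sketch, not the project's actual tooling: it runs against an in-memory SQLite database with a simplified stand-in for the TaxonName and DefinedTermBase tables, and all rows except the Cryptophyceae Division/Phylum pair, as well as the uuid values, are made-up toy data.

```python
import sqlite3

# Simplified, assumed schema: only the columns used by the queries in this ticket.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DefinedTermBase (id INTEGER PRIMARY KEY, titleCache TEXT);
CREATE TABLE TaxonName (uuid TEXT, nameCache TEXT, rank_id INTEGER);
INSERT INTO DefinedTermBase VALUES (1, 'Division'), (2, 'Phylum'), (3, 'Genus');
-- Toy rows; uuids are placeholders.
INSERT INTO TaxonName VALUES
  ('u1', 'Cryptophyceae', 1),  -- Division, as in IAPT
  ('u2', 'Cryptophyceae', 2),  -- Phylum, as in the imported data
  ('u3', 'Navicula', 3),       -- duplicate with identical rank
  ('u4', 'Navicula', 3),
  ('u5', 'Fucus', 3);          -- not duplicated
""")

# Names that occur with more than one distinct rank.
RANK_CONFLICTS = """
SELECT tn.nameCache, COUNT(DISTINCT tn.rank_id) AS n_ranks
FROM TaxonName tn
GROUP BY tn.nameCache
HAVING COUNT(DISTINCT tn.rank_id) > 1
ORDER BY tn.nameCache
"""

rank_conflicts = conn.execute(RANK_CONFLICTS).fetchall()
for name, n_ranks in rank_conflicts:
    print(name, n_ranks)
```

Duplicates that share a single rank (Navicula in the toy data) are excluded here; they would still appear in the full duplicate lists produced by the queries above.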

duplicate-names-2.ods (18.6 KB) Andreas Kohlbecker, 10/08/2018 08:54 AM

duplicate-names.ods (32.7 KB) Andreas Kohlbecker, 10/08/2018 08:54 AM


Related issues

Related to PhycoBank - task #7748: Genus name duplicates, Genus without reference Resolved 09/11/2018
Copied from Edit - task #7420: Import for higher taxon graph for phycobank Closed 05/15/2018

History

#1 Updated by Andreas Kohlbecker almost 2 years ago

  • Copied from task #7748: Genus name duplicates, Genus without reference added

#2 Updated by Andreas Kohlbecker almost 2 years ago

  • Description updated (diff)

#3 Updated by Andreas Kohlbecker almost 2 years ago

#4 Updated by Andreas Kohlbecker almost 2 years ago

  • Copied from task #7420: Import for higher taxon graph for phycobank added

#5 Updated by Andreas Kohlbecker almost 2 years ago

  • Copied from deleted (task #7748: Genus name duplicates, Genus without reference)

#6 Updated by Andreas Kohlbecker almost 2 years ago

  • Related to task #7748: Genus name duplicates, Genus without reference added

#7 Updated by Andreas Kohlbecker over 1 year ago

  • Category changed from Import to Datacleaning

#8 Updated by Andreas Kohlbecker about 1 year ago

  • Target version changed from Registry released to Data cleaning phase 2
