task #9364

closed

Cleanup authors with multiple & and protected names and teams

Added by Andreas Müller about 1 year ago. Updated 8 months ago.

Status: Closed
Priority: Highest
Category: data
Target version: -
Start date: 12/22/2020
Due date:
% Done: 100%
Estimated time:
Severity: normal
Tags:

Description

(High priority applies only to the first issue, as it is related to #2200.)

In the context of #2200 I already checked names with more than one "&" in nameCache. These have been cleaned up.
But there are still open issues with TeamOrPerson.nomenclaturalTitle:

Campanula: 43
Diptera: 77
FdAC: 92
Palmae: 266 !

Guianas: 3(?) => unparsed collectors with 2 collectors having the same family name
FM: 59(?) => remaining issues are all MAN with some second author team with unclear semantics; needs to be fixed during the next import

SELECT ab.DTYPE, ab.id, ab.titleCache, ab.nomenclaturaltitle, ab.protectedtitlecache, ab.protectednomenclaturaltitlecache
FROM AgentBase ab
WHERE ab.nomenclaturaltitle LIKE '%&%&%' OR ab.titleCache LIKE '%&%&%';

In some databases there are many names with protected caches, which may lead to problems when searching for names.

SELECT tn.id, tn.titleCache, tn.nameCache, tn.protectedTitleCache, tn.protectedNameCache, tn.protectedFullTitleCache, tn.protectedAuthorshipCache , tn.*
FROM TaxonName tn
WHERE tn.protectednamecache = 1 OR tn.protectedtitlecache = 1 OR tn.protectedFullTitleCache = 1 OR tn.protectedAuthorshipCache = 1;

Especially interesting might be those having an "&" in the protected nameCache, which indicates that the nameCache contains authorship, which should not happen.
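A minimal sketch of how those names could be listed, assuming the same TaxonName columns as in the query above and that the protected flags are stored as 0/1. It only narrows the previous query to the protected nameCache case and is not necessarily the query that was actually used for the cleanup:

```sql
-- Names whose nameCache is protected and contains an "&",
-- i.e. names that probably carry authorship in the nameCache.
SELECT tn.id, tn.nameCache, tn.titleCache
FROM TaxonName tn
WHERE tn.protectedNameCache = 1
  AND tn.nameCache LIKE '%&%'
ORDER BY tn.nameCache;
```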

Also, many Teams have a protected nomenclatural title, which in most cases is not necessary.

SELECT ab.id, ab.DTYPE, ab.titleCache, ab.nomenclaturaltitle, ab.protectedtitlecache, ab.protectednomenclaturaltitlecache, ab.*
FROM AgentBase ab
WHERE
-- ab.protectedtitlecache = 1 OR
 ab.protectednomenclaturaltitlecache = 1
ORDER BY ab.nomenclaturaltitle;

However, in most cases this is not critical.
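To get a quick per-database figure rather than the full list, the same condition can be turned into a count. This is only a sketch under the assumption that the columns are as in the query above; grouping by DTYPE separates Teams from the other agent types without hard-coding a DTYPE value:

```sql
-- Number of agents with a protected nomenclatural title, per agent type.
SELECT ab.DTYPE, COUNT(*) AS protected_count
FROM AgentBase ab
WHERE ab.protectednomenclaturaltitlecache = 1
GROUP BY ab.DTYPE
ORDER BY protected_count DESC;
```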


Related issues

Related to Edit - task #9648: Remove duplicated extensions (New, Andreas Müller, 06/03/2021)

Related to Edit - task #9650: Update all titleCaches after upgrade of cache strategies and formatters (In Progress, Andreas Müller, 06/03/2021)

Copied to Edit - task #9658: Cleanup authors with protected names and authors (New, Andreas Müller, 06/08/2021)

#1

Updated by Andreas Müller about 1 year ago

Protected Names (still without fulltitlecache and authorshipcache):

Bromeliaceae: 584
Cichorieae: 476
Cuba: 567
Campanula: 28
Caryo_amaranth: 18
Nepenthes: 8
caryo_genrea: 1
caryo_spp: 268
Greece: 3
Palmae: 66
casearia: 7
COL: >500000
Corvidae: 87
Cyprus: 97 (mostly hybrids)
Diptera: 3
E+M: 801 (mostly hybrids)
E+M cauc: 4
Fl dAfrique Central: 5190
myristicacea: 1
Phycobank: 245 => WHK
Salvador: 3 (ok, highest taxa, but ask Walter)

Guianas: 1487 !!
Flora Malesiana: 535
FM-clean: 2
FM propective: 84
Gabon: 90

RL: All, but many in Animalia, GermanSL, Moose, Standardliste + Plantae(?)

Globis: 7483
Edapho: 4
PiB: All, but most in Ants
Vibrant: 24165

#2

Updated by Andreas Müller about 1 year ago

Protected Teams(nomenclaturalTitle):

AlgaTerra: >14000
Cichorieae: 8487
Cuba: 14
Campanula: 1171
Caryo_amaranth: 113
Nepenthes: 65
caryo_genrea: 128
caryo_spp: 126
Palmae: 12592
casearia: 1
COL: >1900000
Corvidae: 487
Cyprus: 1
Diptera: 365
E+M: 16
E+M cauc: 2
Fl dAfrique Central: 25925
Phycobank: 270
Salvador: 19
Mexiko: 101

Guianas: 14465
Flora Malesiana: >121000
FM-clean: 16
FM propective: 55
Gabon: 12940

RL: All, but many in Armeria, Moose, Standardliste + Plantae(?)

Globis: 7483
PiB: few, mostly in chenopodium and spiders
Vibrant: 41202

#3

Updated by Andreas Müller about 1 year ago

  • Subject changed from Cleanup protected names and teams to Cleanup authors with multiple & and protected names and teams
  • Description updated (diff)
  • Target version changed from Unassigned CDM tickets to Release 5.19
#4

Updated by Andreas Müller about 1 year ago

  • Description updated (diff)
#5

Updated by Andreas Müller about 1 year ago

  • Status changed from New to In Progress
  • Priority changed from New to Highest
  • % Done changed from 0 to 10
#6

Updated by Andreas Müller about 1 year ago

  • Description updated (diff)
#7

Updated by Andreas Müller 12 months ago

  • Target version changed from Release 5.19 to Release 5.21
#8

Updated by Andreas Müller 11 months ago

  • Target version changed from Release 5.21 to Release 5.22
#9

Updated by Andreas Müller 9 months ago

  • Target version changed from Release 5.22 to Release 5.25
#10

Updated by Andreas Müller 9 months ago

  • Tags set to formatting
#11

Updated by Andreas Müller 8 months ago

  • Related to task #9648: Remove duplicated extensions added
#12

Updated by Andreas Müller 8 months ago

  • Description updated (diff)
#13

Updated by Andreas Müller 8 months ago

  • Description updated (diff)
  • % Done changed from 10 to 20
#14

Updated by Andreas Müller 8 months ago

  • Description updated (diff)
#15

Updated by Andreas Müller 8 months ago

  • Description updated (diff)
#16

Updated by Andreas Müller 8 months ago

  • Copied to task #9658: Cleanup authors with protected names and authors added
#17

Updated by Andreas Müller 8 months ago

  • Related to task #9650: Update all titleCaches after upgrade of cache strategies and formatters added
#18

Updated by Andreas Müller 8 months ago

  • Status changed from In Progress to Closed
  • Target version deleted (Release 5.25)
  • % Done changed from 20 to 100

Protected (title) caches have been moved to the new ticket #9658.

All "multiple &" cases are solved.

Potentially dangerous protected caches will be handled in #9650.
