feature request #4383

multiple issues with de-duplication of references in bibliography

Added by Andreas Kohlbecker about 6 years ago. Updated 3 days ago.

Status: New
Priority: New
Category: cdm-dataportal
Target version:
Start date: 09/22/2014
Due date:
% Done: 0%
Severity: normal

Description

The bibliography is assembled using the cdm dataportal footnote system. The FootnoteManager does not fully de-duplicate original sources that differ in some details:

1)

They have a different name in source information. This can be seen in the example of Calamus aruensis in the Palmweb portal.

The bibliography of Calamus aruensis contains the following entries:

 W.J. Baker & R.P Bayton & J. Dransfield & R.A Maturbongs, A revision of the Calamus aruensis (Arecaceae) complex in New Guinea and the Pacific. 2003
 in - Govaerts, R. & Dransfield, J., World Checklist of Palms (as Calamus aruensis Becc.)
 in - Govaerts, R. & Dransfield, J., World Checklist of Palms (as Calamus latisectus Burret)
 in - Govaerts, R., World Checklist of Seed Plants (as Calamus latisectus Burret)
 in - Govaerts, R. & Dransfield, J., World Checklist of Palms (as Palmijuncus aruensis (Becc.) Kuntze)
 in - Govaerts, R. & Dransfield, J., World Checklist of Palms (as Calamus hollrungii Becc.)
 in - Govaerts, R., World Checklist of Seed Plants (as Calamus hollrungii Becc.)
 in - Govaerts, R., World Checklist of Seed Plants (as Calamus aruensis Becc.)

These should be de-duplicated into a form in which the names used in source information are collected per citation, like:

 W.J. Baker & R.P Bayton & J. Dransfield & R.A Maturbongs, A revision of the Calamus aruensis (Arecaceae) complex in New Guinea and the Pacific. 2003
 in - Govaerts, R. & Dransfield, J., World Checklist of Palms (as Calamus aruensis Becc., Calamus latisectus Burret, Palmijuncus aruensis (Becc.) Kuntze, Calamus hollrungii Becc.)
 in - Govaerts, R., World Checklist of Seed Plants (as Calamus latisectus Burret, Calamus hollrungii Becc., Calamus aruensis Becc.)
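The grouping requested above can be sketched as follows. This is only a minimal illustration in Python, not the FootnoteManager code of the dataportal; the function name `merge_names_in_source` and the representation of each entry as a `(citation, name_in_source)` pair are assumptions made for this example.

```python
def merge_names_in_source(entries):
    """Group (citation, name_in_source) pairs by citation.

    Hypothetical helper: keeps citations and names in first-seen order
    and drops repeated names for the same citation.
    """
    grouped = {}  # citation -> list of names; dicts preserve insertion order
    for citation, name in entries:
        names = grouped.setdefault(citation, [])
        if name and name not in names:
            names.append(name)
    # Render one line per citation, appending the collected names if any.
    return [
        f"{citation} (as {', '.join(names)})" if names else citation
        for citation, names in grouped.items()
    ]
```

Fed with the Calamus aruensis entries listed above, one pair per bibliography line, this yields one line per distinct citation, with the names in the order in which they first appear.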

2)

Duplicate references with the same title that differ in details such as Reference.uri (see #4383#note-11, #4383#note-12)
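One possible way to handle this case is to compare references by a key that leaves out the differing detail. The sketch below is in Python and is not the dataportal implementation; the `Reference` class, its fields, and the choice of authors plus title as the key are assumptions for illustration only. Whether two references with equal authors and title really are the same reference would need to be decided for the CDM data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reference:
    # Hypothetical, simplified stand-in for a reference in the bibliography.
    title: str
    authors: str
    uri: Optional[str] = None


def dedup_key(ref):
    """Assumed key: authors and title, ignoring case, extra whitespace and the uri."""
    def normalise(text):
        return " ".join(text.split()).casefold()
    return (normalise(ref.authors), normalise(ref.title))


def deduplicate(refs):
    """Keep one reference per key, preferring an entry that carries a uri."""
    kept = {}
    for ref in refs:
        key = dedup_key(ref)
        current = kept.get(key)
        if current is None or (current.uri is None and ref.uri is not None):
            kept[key] = ref  # reassigning keeps the original position in the dict
    return list(kept.values())
```

Preferring the entry with a uri means the merged bibliography line can still carry a source link when at least one of the duplicates had one.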

TODO:

  • adapt tests, see r21741

picture748-1.png View (30.6 KB) Andreas Kohlbecker, 08/13/2018 01:17 PM

picture260-1.png View (148 KB) Andreas Kohlbecker, 04/03/2020 02:24 PM

picture831-1.png View (168 KB) Andreas Müller, 09/15/2020 10:46 AM

History

#1 Updated by Andreas Kohlbecker almost 6 years ago

tests adapted accordingly and added a new section to test for duplicates: r21741

#2 Updated by Andreas Kohlbecker over 5 years ago

  • Target version changed from cdm_dataportal RELEASE 3.5.0 to cdm_dataportal RELEASE 3.5.1

moving tickets to next milestone

#3 Updated by Andreas Müller over 5 years ago

  • Target version deleted (cdm_dataportal RELEASE 3.5.1)

move open 3.5.1 tickets to next milestone after release

#4 Updated by Andreas Müller over 5 years ago

  • Target version deleted ()

#5 Updated by Andreas Müller about 5 years ago

  • Target version changed from cdm_dataportal RELEASE 3.8 to Reviewed Next Major Release
  • Priority changed from New to Priority12

#6 Updated by Andreas Kohlbecker almost 5 years ago

detaching from parent ticket #4314

#7 Updated by Andreas Kohlbecker about 2 years ago

  • Description updated (diff)
  • Private changed from Yes to No

#8 Updated by Andreas Kohlbecker about 2 years ago

Same or similar problem reported by Walter:

at http://caryophyllales.org/nepenthaceae/cdm_dataportal/taxon/46325539-e6b9-4db6-949e-ab222b25b775 the same reference is output 4 times; 2 times would be enough in this case (I have now also given C the name-in-source, and now there are 4 identical entries).

#9 Updated by Andreas Kohlbecker about 2 years ago

  • Tags set to caryophyllales

#10 Updated by Andreas Müller over 1 year ago

  • Priority changed from Priority12 to Priority09

#11 Updated by Andreas Kohlbecker 6 months ago

another case of failing deduplication:

http://portal.cybertaxonomy.org/flora-de-la-republica-de-cuba/cdm_dataportal/taxon/4ee3e6f6-9bc8-441b-849e-95887b6e7a85

this is not reproducible locally even with identical data.

Obviously, in the case of Micromorfología the source link is not added to the reference markup. This is the case for the failing deduplication, but the reason for this failure cannot be explained.

#12 Updated by Andreas Müller 4 days ago

Same problem reported by Sophia:

I just noticed that in the Flora Cuba portal the references entered for the Factual Data entries for "Conservation", "Chromosome Numbers" and "Cultivation" are not merged in the bibliography at the bottom, but all appear individually.
Walter had just newly enabled these categories so that they appear in the portal. We had this problem before with "Micromorphology", when Walter had newly created it. I have attached the history; I think it is the same problem?
One example would be Vaccinium leonis (http://portal.cybertaxonomy.org/flora-de-la-republica-de-cuba/cdm_dataportal/taxon/9b7a6c9a-f6a5-4292-969d-ea92ae1af934):

#13 Updated by Andreas Müller 4 days ago

  • Priority changed from Priority09 to New
  • Target version changed from Reviewed Next Major Release to Unassigned CDM tickets

I think we should reassess the priority of this ticket, as it is reported from time to time.

#14 Updated by Andreas Kohlbecker 3 days ago

  • Subject changed from references in bibliography need de-duplication to multiple issues with de-duplication of references in bibliography
  • Description updated (diff)
