bug #7829
closed
Improve deduplication of parsed names and references
100%
Description
The default matching strategies are too strict for name data and nom. ref. data created by a parser. There are also some mistakes in the defined matching.
Generally we need to use the EQUAL_OR_SECOND_NULL match mode more often, because parsed data is usually incomplete while existing data might be more complete. E.g. the authors of a reference might be stored with their full name, while the full name is usually not available in parsed data. The place published might also be unknown in the parsed data, but this is open to discussion as it is sometimes part of the parsed string.
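To make the difference between the two modes concrete, here is a minimal sketch of the comparison semantics on plain strings. It is not the library's implementation: the class MatchModeSketch and its methods are hypothetical, and it assumes that the "second" value is the one coming from the parser.

```java
import java.util.Objects;

/** Hypothetical illustration of the two match modes; not the library's code. */
final class MatchModeSketch {

    private MatchModeSketch() {
    }

    /** Strict mode: both values must be equal (two nulls count as equal). */
    static boolean equal(String first, String second) {
        return Objects.equals(first, second);
    }

    /**
     * Lenient mode: a missing second value matches anything,
     * otherwise both values must be equal.
     * Assumption: "second" is the parsed, possibly incomplete value.
     */
    static boolean equalOrSecondNull(String first, String second) {
        return second == null || Objects.equals(first, second);
    }
}
```

With these definitions a persisted author full name still matches a parsed author without one under equalOrSecondNull, while equal rejects the pair. Two differing non-null values fail in both modes.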
We probably need a MatchingStrategyFactory that offers matching strategies specific for parsed data compared to persisted richer data.
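One possible shape for such a factory is sketched below. Everything in it is an assumption made for illustration: the class names, the field names (title, authorFullName, placePublished) and the representation of records as maps are hypothetical and do not reflect the existing matching API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical factory sketch; names and fields are illustrative only. */
final class MatchStrategyFactorySketch {

    enum Mode { EQUAL, EQUAL_OR_SECOND_NULL, IGNORE }

    /** A strategy maps field names to the mode used to compare them. */
    static final class Strategy {
        private final Map<String, Mode> modes = new LinkedHashMap<>();

        Strategy with(String field, Mode mode) {
            modes.put(field, mode);
            return this;
        }

        /** Assumption: first is the persisted record, second the parsed one. */
        boolean matches(Map<String, String> first, Map<String, String> second) {
            for (Map.Entry<String, Mode> entry : modes.entrySet()) {
                String a = first.get(entry.getKey());
                String b = second.get(entry.getKey());
                switch (entry.getValue()) {
                    case EQUAL:
                        if (!Objects.equals(a, b)) {
                            return false;
                        }
                        break;
                    case EQUAL_OR_SECOND_NULL:
                        if (b != null && !Objects.equals(a, b)) {
                            return false;
                        }
                        break;
                    case IGNORE:
                        break;
                }
            }
            return true;
        }
    }

    private MatchStrategyFactorySketch() {
    }

    /** Strict strategy: every listed field must be equal. */
    static Strategy defaultReferenceStrategy() {
        return new Strategy()
                .with("title", Mode.EQUAL)
                .with("authorFullName", Mode.EQUAL)
                .with("placePublished", Mode.EQUAL);
    }

    /** Lenient strategy for parsed data: fields the parser rarely fills may be null. */
    static Strategy parsedReferenceStrategy() {
        return new Strategy()
                .with("title", Mode.EQUAL)
                .with("authorFullName", Mode.EQUAL_OR_SECOND_NULL)
                .with("placePublished", Mode.EQUAL_OR_SECOND_NULL);
    }
}
```

A parsed record that only carries a title would match a richer persisted record under parsedReferenceStrategy(), but not under defaultReferenceStrategy(). Whether placePublished belongs in the lenient group is the open question raised in the description.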
===
There were also incorrect matching rules, such as not matching a null datePublished against an empty one, and taking nomenclaturallyRelevant into account.
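The null versus empty case can be illustrated with a small sketch. The Period class below is a hypothetical stand-in for the datePublished value, reduced to two optional years; it only shows the idea of treating "no object" and "object without content" as the same unknown date.

```java
import java.util.Objects;

/** Hypothetical illustration; Period is a simplified stand-in for datePublished. */
final class DatePublishedSketch {

    static final class Period {
        final Integer startYear;
        final Integer endYear;

        Period(Integer startYear, Integer endYear) {
            this.startYear = startYear;
            this.endYear = endYear;
        }

        /** A period without any bound carries no information. */
        boolean isEmpty() {
            return startYear == null && endYear == null;
        }
    }

    private DatePublishedSketch() {
    }

    /** Treats a null period and an empty period as the same "unknown" value. */
    static boolean sameDatePublished(Period a, Period b) {
        boolean aUnknown = a == null || a.isEmpty();
        boolean bUnknown = b == null || b.isEmpty();
        if (aUnknown || bUnknown) {
            return aUnknown && bUnknown;
        }
        return Objects.equals(a.startYear, b.startYear)
                && Objects.equals(a.endYear, b.endYear);
    }
}
```

This sketch is strict about known versus unknown dates: an unknown date only matches another unknown date. Combining it with a "second may be null" rule would be a separate decision.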
Related issues
Updated by Andreas Müller about 5 years ago
- Related to feature request #7800: Parse preliminary RefDetails added
Updated by Andreas Müller about 5 years ago
- Status changed from New to In Progress
Updated by Andreas Müller about 5 years ago
- Target version changed from Release 5.4 to Release 5.5
- % Done changed from 0 to 10
Updated by Andreas Müller almost 5 years ago
- Target version changed from Release 5.5 to Release 5.6
Updated by Andreas Müller almost 5 years ago
- Priority changed from New to Highest
- Target version changed from Release 5.6 to Release 5.7
Updated by Andreas Müller over 4 years ago
- Target version changed from Release 5.7 to Release 5.8
Updated by Andreas Müller almost 3 years ago
- Related to feature request #9085: Improve deduplication of parsed names added
Updated by Andreas Müller almost 3 years ago
- Status changed from In Progress to Closed
- % Done changed from 10 to 100
This has been finished (with a few exceptions) in #9085.
Updated by Andreas Müller almost 3 years ago
- Related to bug #1119: [PARSER] Duplicate inreferences created during parsing are NOT merged when data gets saved added
Updated by Andreas Müller almost 3 years ago
- Related to bug #9157: Further improve deduplication of names added