Project

General

Profile

Actions

task #4232

closed

Check all Comparators for fulfilling the Comparator contract

Added by Andreas Müller almost 10 years ago. Updated about 7 years ago.

Status:
Closed
Priority:
Highest
Assignee:
Category:
cdmlib
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
Severity:
critical

Description

Hi Everybody,

    The issue raised with http://dev.e-taxonomy.eu/trac/ticket/4230 <http://dev.e-taxonomy.eu/trac/ticket/4230>  and some other tickets  (#4226, #4114, #3333) is highly critical with Java 7+ and everybody should be aware of it when responsible for writing Comparators.

The Comparator contract requires that

1) compare(o1,o2) == 0 must be true and ONLY be true if o1.equals(o2).

2) sgn(compare(o1,o2)) = -sgn(compare(o2,o1)) (symmetrie)

3) compare function must be transitive

Checking our comparators in cdmlib-model (and some others) I realized that almost none of our comparators does fulfill all of these conditions. So I would like to remind us all to be aware of this. Java 7 is more sensitive on these conditions as the new TimSort sort algorithm throws exceptions otherwise.

Condition 1) for CdmBase objects usually means that compare(o1,o2) MUST return 0 whenever 2 objects do have the same uuid. This can easily be managed by checking if this is the case as a very first step and returning 0 if yes.

However, the second implication is more tricky. You must guarantee that in no other cases 0 is returned.

So having something like

 Compare(o1,02){

    x1 = o1.getX();
    x2 = o1.getX();
    return x1.compare(x2);
 }

which I have seen in many comparators is generally WRONG because x1.compare(x2) will usually not guarantee that o1.getUuid.equals(o2.getUuid()).

A concrete example are taxon comparators which send the comparison to the attached names. In case both taxa do use the exact same name this will definetly return 0 even if name.compare(name) is correctly implemented. However, the taxa do not have the same uuid and therefore they should not return 0.

A possible correct implementation could do

    ...

    int result = name1.compare(name2);
    if (result != 0){
        return result
   }else{
        return taxon1.getUuid().compare(taxon2.getUuid());
   }

Condition 2) and 3) require more individual handling. But be aware that your ordering must never be arbitrary. E.g.

compare(name1, name2){
    rank1 = name1.getRank()
    rank2 = name2.getRank()
    if (rank1 == null) return 1;
    if (rank2 == null) return -1;
    ...
}

will always be INVALID as compare(name1, name2) == compare(name2, name1) == 1 is possible for both ranks == null, but is against the symmetrie contract (2).

So EVERYONE who will write comparators or did so in the past please carefully check your code for correct implementations.

@Katja: we both will need to check this systematically in CdmLib for old code. Also a very critical method is e.g. IdentifiablyEntity.compare()

Cheers,

Andreas M.

P.S:

Be aware that 2) and 3) and partly 1) also hold for the Comparable interface (comparetTo() method).


Files

comparatorlist.xlsx (11.2 KB) comparatorlist.xlsx Katja Luther, 02/09/2017 01:13 PM

Related issues

Copied to EDIT - bug #6311: Refactor IdentifiableEntity.compareTo methodRejectedAndreas Müller

Actions
Actions

Also available in: Atom PDF