task #4232
closedCheck all Comparators for fulfilling the Comparator contract
100%
Description
Hi Everybody,
The issue raised with http://dev.e-taxonomy.eu/trac/ticket/4230 <http://dev.e-taxonomy.eu/trac/ticket/4230> and some other tickets (#4226, #4114, #3333) is highly critical with Java 7+ and everybody should be aware of it when responsible for writing Comparators.
The Comparator contract requires that
1) compare(o1,o2) == 0 must be true and ONLY be true if o1.equals(o2).
2) sgn(compare(o1,o2)) = -sgn(compare(o2,o1)) (symmetrie)
3) compare function must be transitive
Checking our comparators in cdmlib-model (and some others) I realized that almost none of our comparators does fulfill all of these conditions. So I would like to remind us all to be aware of this. Java 7 is more sensitive on these conditions as the new TimSort sort algorithm throws exceptions otherwise.
Condition 1) for CdmBase objects usually means that compare(o1,o2) MUST return 0 whenever 2 objects do have the same uuid. This can easily be managed by checking if this is the case as a very first step and returning 0 if yes.
However, the second implication is more tricky. You must guarantee that in no other cases 0 is returned.
So having something like
Compare(o1,02){ x1 = o1.getX(); x2 = o1.getX(); return x1.compare(x2); }
which I have seen in many comparators is generally WRONG because x1.compare(x2) will usually not guarantee that o1.getUuid.equals(o2.getUuid()).
A concrete example are taxon comparators which send the comparison to the attached names. In case both taxa do use the exact same name this will definetly return 0 even if name.compare(name) is correctly implemented. However, the taxa do not have the same uuid and therefore they should not return 0.
A possible correct implementation could do
... int result = name1.compare(name2); if (result != 0){ return result }else{ return taxon1.getUuid().compare(taxon2.getUuid()); }
Condition 2) and 3) require more individual handling. But be aware that your ordering must never be arbitrary. E.g.
compare(name1, name2){ rank1 = name1.getRank() rank2 = name2.getRank() if (rank1 == null) return 1; if (rank2 == null) return -1; ... }
will always be INVALID as compare(name1, name2) == compare(name2, name1) == 1 is possible for both ranks == null, but is against the symmetrie contract (2).
So EVERYONE who will write comparators or did so in the past please carefully check your code for correct implementations.
@Katja: we both will need to check this systematically in CdmLib for old code. Also a very critical method is e.g. IdentifiablyEntity.compare()
Cheers,
Andreas M.
P.S:
Be aware that 2) and 3) and partly 1) also hold for the Comparable interface (comparetTo() method).
Files
Related issues