Project

General

Profile

task #4232

Updated by Andreas Müller almost 4 years ago

Hi Everybody,



The issue raised with http://dev.e-taxonomy.eu/trac/ticket/4230 <http://dev.e-taxonomy.eu/trac/ticket/4230> and some other tickets (#4226, #4114, #3333) is highly critical with Java 7+ and everybody should be aware of it when responsible for writing Comparators.







The Comparator contract requires that



1) compare(o1,o2) == 0 must be true and ONLY be true if o1.equals(o2).



2) sgn(compare(o1,o2)) = -sgn(compare(o2,o1)) (symmetrie)



3) compare function must be transitive



Checking our comparators in cdmlib-model (and some others) I realized that almost none of our comparators does fulfill all of these conditions. So I would like to remind us all to be aware of this. Java 7 is more sensitive on these conditions as the new TimSort sort algorithm throws exceptions otherwise.



Condition 1) for CdmBase objects usually means that compare(o1,o2) MUST return 0 whenever 2 objects do have the same uuid. This can easily be managed by checking if this is the case as a very first step and returning 0 if yes.



However, the second implication is more tricky. You must guarantee that in no other cases 0 is returned.



So having something like



~~~

Compare(o1,02){



x1 = o1.getX();

x2 = o1.getX();

return x1.compare(x2);

}

~~~



which I have seen in many comparators is generally WRONG because x1.compare(x2) will usually not guarantee that o1.getUuid.equals(o2.getUuid()).





A concrete example are taxon comparators which send the comparison to the attached names. In case both taxa do use the exact same name this will definetly return 0 even if name.compare(name) is correctly implemented. However, the taxa do not have the same uuid and therefore they should not return 0.



A possible correct implementation could do



~~~

...



int result = name1.compare(name2);

if (result != 0){

return result

}else{

return taxon1.getUuid().compare(taxon2.getUuid());

}

~~~



Condition 2) and 3) require more individual handling. But be aware that your ordering must never be arbitrary. E.g.



~~~

compare(name1, name2){

rank1 = name1.getRank()

rank2 = name2.getRank()

if (rank1 == null) return 1;

if (rank2 == null) return -1;

...

}

~~~



will always be INVALID as compare(name1, name2) == compare(name2, name1) == 1 is possible for both ranks == null, but is against the symmetrie contract (2).



So EVERYONE who will write comparators or did so in the past please carefully check your code for correct implementations.



@Katja: we both will need to check this systematically in CdmLib for old code. Also a very critical method is e.g. IdentifiablyEntity.compare()







Cheers,



Andreas M.



P.S:



Be aware that 2) and 3) and partly 1) also hold for the Comparable interface (comparetTo() method).

Back

Add picture from clipboard (Maximum size: 40 MB)