task #4232

Updated by Andreas Müller over 7 years ago

Hi Everybody, 


		 The issue raised with <>    and some other tickets    (#4226, #4114, #3333) is highly critical with Java 7+ and everybody should be aware of it when responsible for writing Comparators. 




 The Comparator contract requires that  


   1) compare(o1,o2) == 0 must be true and ONLY be true if o1.equals(o2).  


   2)    sgn(compare(o1,o2)) = -sgn(compare(o2,o1))     (symmetrie) 


   3) compare function must be transitive 


 Checking our comparators in cdmlib-model (and some others) I realized that almost none of our comparators does fulfill all of these conditions. So I would like to remind us all to be aware of this. Java 7 is more sensitive on these conditions as the new TimSort sort algorithm throws exceptions otherwise. 


 Condition 1) for CdmBase objects usually means that compare(o1,o2) MUST return 0 whenever 2 objects do have the same uuid. This can easily be managed by checking if this is the case as a very first step and returning 0 if yes. 


 However, the second implication is more tricky. You must guarantee that in no other cases 0 is returned. 


 So having something like 




     x1 = o1.getX(); 
     x2 = o1.getX(); 


 which I have seen in many comparators is generally WRONG because will usually not guarantee that o1.getUuid.equals(o2.getUuid()). 


 A concrete example are taxon comparators which send the comparison to the attached names. In case both taxa do use the exact same name this will definetly return 0 even if is correctly implemented. However, the taxa do not have the same uuid and therefore they should not return 0. 


 A possible correct implementation could do  




     int result =; 
     if (result != 0){ 
         return result 
         return taxon1.getUuid().compare(taxon2.getUuid()); 
 Condition 2) and 3) require more individual handling. But be aware that your ordering must never be arbitrary. E.g. 


 compare(name1, name2){ 
     rank1 = name1.getRank() 
     rank2 = name2.getRank() 
     if (rank1 == null) return 1; 
     if (rank2 == null) return -1; 


 will always be INVALID as compare(name1, name2) == compare(name2, name1) == 1 is possible for both ranks == null, but is against the symmetrie contract (2). 


 So EVERYONE who will write comparators or did so in the past please carefully check your code for correct implementations. 


 @Katja: we both will need to check this systematically in CdmLib for old code. Also a very critical method is e.g. 






 Andreas M.  




 Be aware that 2) and 3) and partly 1) also hold for the Comparable interface (comparetTo() method).