A New and Consistent Way of Implementing Order for Comparable Java Objects | Part 1


Introduction

In Java it is often important to implement natural order for classes. While it isn’t necessary to define natural order for all of one’s own classes, it definitely makes sense for so-called value classes, i.e., classes whose characteristics are more related to data than to behavior.

Defining a specific natural order becomes relevant as soon as such classes are used in algorithms or data structures (esp. collections) that perform some kind of sorting. This might obviously be the case for methods like Arrays.sort, Collections.sort, or List#sort (since Java 8), but it might be less obvious for collections like TreeSet or TreeMap because these data structures perform their sorting (and even weight balancing) internally and usually hidden to the API user.

Fortunately, a programmer cannot accidentally use a method or collection that requires such a comparable object without the object actually being comparable. Either one’s own class implements the generic Comparable<T> interface, which in turn forces the programmer to implement its single method compareTo(T), or—if Comparable can’t or hasn’t been implemented—the programmer can implement an external generic Comparator<T> that gets handed over to the corresponding method or data structure together with the actual class objects. In other words, implementing Comparable provides a “built-in”, “internal” definition of a class’s order, whereas implementing a Comparator provides an “add-on”, “external” definition of it.

The Old Way up to Java 6

Having pointed out the main differences between Comparable and Comparator, the actual implementation in order to define a class’s natural order remains similar for both. Here, one has to distinguish between the way Comparable’s compareTo(T) and Comparator’s compare(T, T) methods were implemented in the past, up to and including Java 6, then in Java 7, and then how they can and should be implemented today, since Java 8.

In order to demonstrate the different ways, let me introduce the following class. It represents an artificial value class that contains all possible data types for its fields, i.e., there is a field for each of Java’s primitive types, like short or double (having the names shortField and doubleField, respectively), and a field for a reference type (having the name comparableClassField, in this case referencing another comparable class called OtherComparableClass). I’ve refrained from also referencing arrays, since implementing order on arrays usually requires looping through all of its elements, blows up the whole code, usually isn’t very common in practice, and therefore doesn’t contribute much to the main message of this blog post. So here is the class:

public final class ComparableJava6DemoClass implements Comparable<ComparableJava6DemoClass> {

    private boolean booleanField;

    private byte byteField;

    private char charField;

    private short shortField;

    private int intField;

    private long longField;

    private float floatField;

    private double doubleField;

    private OtherComparableClass comparableClassField;

    /* [...] Constructors omitted. */

    @Override
    public int compareTo(ComparableJava6DemoClass otherDemoClass) {
        int difference;

        /* 1. booleanField */
        if (!this.booleanField && otherDemoClass.booleanField) return -1;
        if (this.booleanField && !otherDemoClass.booleanField) return +1;

        /* 2. byteField */
        if (this.byteField < otherDemoClass.byteField) return -1;
        if (this.byteField > otherDemoClass.byteField) return +1;

        /* 3. charField */
        if (this.charField < otherDemoClass.charField) return -1;
        if (this.charField > otherDemoClass.charField) return +1;

        /* 4. shortField */
        if (this.shortField < otherDemoClass.shortField) return -1;
        if (this.shortField > otherDemoClass.shortField) return +1;

        /* 5. intField */
        if (this.intField < otherDemoClass.intField) return -1;
        if (this.intField > otherDemoClass.intField) return +1;

        /* 6. longField */
        if (this.longField < otherDemoClass.longField) return -1;
        if (this.longField > otherDemoClass.longField) return +1;

        /* 7. floatField */
        difference = Float.compare(this.floatField, otherDemoClass.floatField);
        if (difference != 0) return difference;

        /* 8. doubleField */
        difference = Double.compare(this.doubleField, otherDemoClass.doubleField);
        if (difference != 0) return difference;

        /* 9. comparableClassField */
        if (this.comparableClassField == null) {
            difference = (otherDemoClass.comparableClassField == null ? 0 : +1);
        } else {
            difference = (otherDemoClass.comparableClassField == null ? -1
                          : this.comparableClassField
                                  .compareTo(otherDemoClass.comparableClassField));
        }
        if (difference != 0) return difference;

        return 0;
    }

    /* [...] Other methods omitted. */
}
public abstract class OtherComparableClass implements Comparable<OtherComparableClass> {

    /* [...] Body omitted. */
}

A few things bear noting:

  • The order in which the fields are compared definitely matters. In this artificial demo class, the class members are compared in the same order as they have been declared as fields. There is no reason to do so. In real-world classes, it’ll be much more likely to use a different order than the “declaration order”. The first comparison (commented here with /* 1. booleanField */) corresponds to the “most significant field”. Only if the two boolean values prove to be equal will the second most significant member (in this case byteField, commented with /* 2. byteField */) be examined and compared. You see that in this demo class there is a total of nine comparison steps. If all nine fields prove to be equal, then, and only then, will the compareTo method return 0, indicating equality.

  • Implementing an external Comparator is similar, but not exactly the same. Its compare(T, T) method requires two parameters: the two classes to be compared. Comparable’s compareTo(T) method does only need one parameter, since it already “knows itself” and thus only needs another reference to the “other” object. The following code excerpt thus uses demoClass1 and demoClass2 instead of this and otherDemoClass:

    import java.util.*;
    
    public final class DemoComparator implements Comparator<ComparableJava6DemoClass> {
    
        @Override
        public int compare(ComparableJava6DemoClass demoClass1, ComparableJava6DemoClass demoClass2) {
            int difference;
    
            /* 1. booleanField */
            if (!demoClass1.isBooleanField() && demoClass2.isBooleanField()) return -1;
            if (demoClass1.isBooleanField() && !demoClass2.isBooleanField()) return +1;
    
            /* 2. byteField */
            if (demoClass1.getByteField() < demoClass2.getByteField()) return -1;
            if (demoClass1.getByteField() > demoClass2.getByteField()) return +1;
    
            /* [...] Other comparisons omitted. */
    
            /* 9. comparableClassField */
            if (demoClass1.getComparableClassField() == null) {
                difference = (demoClass2.getComparableClassField() == null ? 0 : +1);
            } else {
                difference = (demoClass2.getComparableClassField() == null ? -1
                              : demoClass1.getComparableClassField()
                                      .compareTo(demoClass2.getComparableClassField()));
            }
            if (difference != 0) return difference;
    
            return 0;
        }
    }

    Another difference is the use of getter methods, since an external Comparator usually cannot directly access fields of the classes that need to be compared (this is called encapsulation). Whenever possible, one should try to implement the “internal” compareTo method rather than writing the “external” compare method.

  • There are different ways of comparing the fields, depending on their type:

    • Primitive integral types (i.e., byte, char, short, int, and long, see comparisons 2 to 6) seem to be the simplest, but should be changed starting with Java 8 nevertheless (see below). Understanding their functionality is easy. In case of the “internal” compareTo method, put yourself in perspective of the class that implements it. Then say:

      “If my (i.e., this) value is less than the value of the other (i.e., the parameter), return a negative number. If my value is greater than the value of the other, return a positive number.”

      You can do similarly for the “external” compare method:

      “If the value of the first parameter is less than the value of the second parameter, return a negative number. If the value of the first parameter is greater than the value of the second parameter, return a positive number.”

    • Comparing booleans (see comparison 1) is less obvious, but still easy to follow if you read the source code. In this case, false “comes before” true, so an instance of ComparableDemoClass that has booleanField set to false is “less than” an instance of the same class that has booleanField set to true. Note that this is just a matter of definition and not carved in stone.

    • Floating-point numbers (i.e., float and double, see comparisons 7 and 8) should use the static compare methods of the their wrapper classes Float and Double, respectively. floats and doubles are much more complicated than integral types because they can have special values: positive and negative zeros, positive and negative infinities, and NaN (“not a number”) as the result of certain invalid operations such as dividing zero by zero. Since you probably don’t want to deal with all those specialties, simply use the compare methods mentioned above, which do all the hard work for you.

    • The old way of comparing reference fields (see comparison 9) is verbose and error-prone. I won’t go through the details of the code block, because it isn’t relevant anymore. The main idea is that reference types get recursively called by their compareTo methods. Since “my” reference field, the other reference field (i.e., the reference field of the provided parameter), or both may be null, a complicated case differentiation has to take place, which blows up the code. In the code provided above, null is “greater than” any other (valid) value. In other words, null comes last in order.

The Intermediate Way Since Java 7

Starting with Java 7, the wrapper classes for Boolean values (Boolean for boolean) and all integral primitive types (Byte for byte, Character for char, Short for short, Integer for int, and Long for long) have been equipped with static compare methods. Their usage is consistent with Float’s and Double’s usage. Joshua Bloch, the author of the great book “Effective Java”, states that the use of “the relational operators < and > in compareTo methods is verbose and error-prone and no longer recommended.”

I don’t fully agree with his statement. As you can see in the code example below, code verbosity—if one could ever call it “verbosity” at all—hasn’t really changed. Regarding the alleged error-proneness, I had a look at the JDK implementations of the wrapper classes’ static compare methods in order to find out whether I’ve missed some pitfalls with respect to the relational operators < and > for integral data types, but no, these methods don’t perform any magic. So from this perspective, I don’t see any advantage of using the static compare methods of the integral and Boolean wrapper classes, but from a perspective of consistency, it totally makes sense. Using the static compare methods is now the same for all primitive types in Java, which is reason enough to stick to them. Now here is the code, adapted for Java 7:

public final class ComparableJava7DemoClass implements Comparable<ComparableJava7DemoClass> {

    private boolean booleanField;

    private byte byteField;

    private char charField;

    private short shortField;

    private int intField;

    private long longField;

    private float floatField;

    private double doubleField;

    private OtherComparableClass comparableClassField;

    /* [...] Constructors omitted. */

    @Override
    public int compareTo(ComparableJava7DemoClass otherDemoClass) {
        int difference;

        /* 1. booleanField */
        difference = Boolean.compare(this.booleanField, otherDemoClass.booleanField);
        if (difference != 0) return difference;

        /* 2. byteField */
        difference = Byte.compare(this.byteField, otherDemoClass.byteField);
        if (difference != 0) return difference;

        /* 3. charField */
        difference = Character.compare(this.charField, otherDemoClass.charField);
        if (difference != 0) return difference;

        /* 4. shortField */
        difference = Short.compare(this.shortField, otherDemoClass.shortField);
        if (difference != 0) return difference;

        /* 5. intField */
        difference = Integer.compare(this.intField, otherDemoClass.intField);
        if (difference != 0) return difference;

        /* 6. longField */
        difference = Long.compare(this.longField, otherDemoClass.longField);
        if (difference != 0) return difference;

        /* 7. floatField */
        difference = Float.compare(this.floatField, otherDemoClass.floatField);
        if (difference != 0) return difference;

        /* 8. doubleField */
        difference = Double.compare(this.doubleField, otherDemoClass.doubleField);
        if (difference != 0) return difference;

        /* 9. comparableClassField */
        if (this.comparableClassField == null) {
            difference = (otherDemoClass.comparableClassField == null ? 0 : +1);
        } else {
            difference = (otherDemoClass.comparableClassField == null ? -1
                          : this.comparableClassField
                                  .compareTo(otherDemoClass.comparableClassField));
        }
        if (difference != 0) return difference;

        return 0;
    }

    /* [...] Other methods omitted. */
}

The New Way Since Java 8

As you can see, things still haven’t changed for the comparison of the reference type (see comparison 9). Its code block is still really verbose and error-prone. Java 8 introduced Optionals, which can be understood as little containers for single objects that are either empty or contain the object. Their advantage is based on the fact that their content can never be null. This concept could probably help simplifying the case differentiation a little bit. I’m thinking of using Optional’s ifPresent or ifPresentOrElse (since Java 9) methods, which could reduce the amount of if-else statements. However, since I’m presenting a completely different approach instead, we no longer have to follow this idea.

Starting with Java 8, interfaces can now have static and default implementations. This is exactly what Comparator does: it has been equipped with numerous methods that can roughly be seen as “factory methods”. They allow the step-by-step construction of Comparators, which themselves can be called either directly or within a class’s compareTo method. Let me show you the artificial demo class, this time implemented the new way since Java 8:

import java.util.*;

public final class ComparableJava8DemoClass implements Comparable<ComparableJava8DemoClass> {

    private static final Comparator<ComparableJava8DemoClass> DEMO_CLASS_COMPARATOR
            = Comparator.comparing((ComparableJava8DemoClass dc)
                                   -> Boolean.valueOf(dc.booleanField))               /* 1. */
            .thenComparingInt(dc -> dc.byteField)                                     /* 2. */
            .thenComparingInt(dc -> dc.charField)                                     /* 3. */
            .thenComparingInt(dc -> dc.shortField)                                    /* 4. */
            .thenComparingInt(dc -> dc.intField)                                      /* 5. */
            .thenComparingLong(dc -> dc.longField)                                    /* 6. */
            .thenComparingDouble(dc -> dc.floatField)                                 /* 7. */
            .thenComparingDouble(dc -> dc.doubleField)                                /* 8. */
            .thenComparing(dc -> dc.comparableClassField,
                           Comparator.nullsLast(OtherComparableClass::compareTo));    /* 9. */

    private boolean booleanField;

    private byte byteField;

    private char charField;

    private short shortField;

    private int intField;

    private long longField;

    private float floatField;

    private double doubleField;

    private OtherComparableClass comparableClassField;

    /* [...] Constructors omitted. */

    @Override
    public int compareTo(ComparableJava8DemoClass otherDemoClass) {
        return DEMO_CLASS_COMPARATOR.compare(this, otherDemoClass);
    }

    /* [...] Other methods omitted. */
}

Again, a few things bear noting:

  • The class now contains a static and final Comparator field (having the name DEMO_CLASS_COMPARATOR). It’s called inside the class’s compareTo method, which reduces the compareTo method to one single line. If the Comparator is declared as a nested class (declared best as a static member class), it can access all (usual private) fields of the enclosing class. If the Comparator is declared somewhere outside, then it’s usually dependend on getter methods in order to retrieve the field values. However, whether such a Comparator is now used “internally” or “externally” doesn’t make a big difference anymore—its use became much more consistent.

  • The Comparator is constructed step by step. Start with one of the static Comparator.comparing[XXX] methods for the “most significant field”, and then concatenate the other fields in the desired order using the thenComparing[XXX] methods. For booleans and all reference fields, use comparing and thenComparing. For bytes, chars, shorts, and ints, use [then]comparingInt. For longs, use [then]comparingLong, and for floats and doubles, use [then]comparingDouble. If you don’t use the primitive type versions of theses methods wherever possible, Java will perform autoboxing, which can be error-prone and affect performance significantly. Unfortunately, Java doesn’t provide [then]comparingBoolean methods, so you have to stick with the autoboxing feature (or do the boxing manually, as I did in the example above).

  • Each [then]comparing[XXX] method requires a so-called key extractor function as a parameter, usually implemented using lambdas. The input to such a key extractor function is the class to be compared with (here the demo class referenced by dc), and the output is the “extracted field”. If the lambda expression has direct access to the private fields, this is straightforward; if not, one needs to call the corresponding getter methods, as mentioned above.

  • The Java compiler is usually very good at determining the correct types of the lambda expressions. However, it shows to have some problems here. Thus, specify the type of the input explicitly at least in the first statement, as you can see in the example: (ComparableJava8DemoClass dc) -> Boolean.valueOf(dc.booleanField)). The types of the following lambdas should then be determined automatically.

  • The [then]comparing methods (not for int, long, or double) allow the specification of a different Comparator as the second method parameter. If you look at comparison 9 in the code example, you see the specification Comparator.nullsLast(OtherComparableClass::compareTo). Comparator.nullsLast is another static method that equips any specified Comparator with the functionality of dealing with nulls—instead of throwing NullPointerExceptions and without verbose case differentiation as seen in the code examples before. When using Comparator.nullsLast, nulls “come last” in order. If you want nulls to “come first”, simple use Comparator.nullsFirst.

Summary and Outlook

That’s it! The Java 8 way of implementing natural order looks modern and concise. Its use is consistent, save, and—most importantly—simple.

In Part 2 of this topic, I’d like to show you a real life example of implementing order for and performing sorting on more realistic value classes. I don’t assume that this blog already has many visitors or followers yet, but if you are one of those few, feel free to leave a comment or ask any questions.

Shortlink to this blog post: link.simplexacode.ch/vq292019.01

Leave a Reply