Why is there no default equals method?

Hello,

I would like to know what the rationale behind equals is, specifically why we are allowed to implement it freely instead of it being generated by the compiler. (I realize that the “default” equals I ask for technically exists in the form of the default behavior of equals for any class that does not override it. However, that notion is not what most people want when they call equals.)

Now if in IntelliJ I create a class Main with fields:

private int var1;
private byte var2;
private String var3;
private Test var4;

and generate equals and hashCode using Generate... -> equals() and hashCode(), choose the template java.util.Objects.equals and hashCode (java7+), check none of the boxes in the first window, select all fields in the second and third windows, select none in the fourth window (select all non-null fields), and press Create, I receive this:

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        Main main = (Main) o;
        return var1 == main.var1 && var2 == main.var2 && Objects.equals(var3, main.var3)
            && Objects.equals(var4, main.var4);
    }

This seems like boilerplate to me; you always do the following:

  1. Check if they refer to the same object in memory.
  2. Enforce that even if both are null they are not considered equal (weird but I guess that is convention and you can always use Objects.equals to wrap this equals method)
  3. Check that both objects are of the same dynamic type.
  4. Cast.
  5. Compare fields by using == on primitives and equals on proper classes.

However, none of this is enforced by the documentation of equals. It only dictates (among some technicalities) that equals should define an equivalence relation on instances of the class.

Basically I have to trust (or verify) that every class I use actually tests equality the way I assume, i.e. by comparing all primitives via == and all proper classes via their equals methods (which in turn need to act as I expect them to).

Even worse I need to not only manage that but also make sure that their hashCode methods are compliant with their equals methods or I risk elements being treated as unequal even if all of their fields are the same.
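To make the compliance requirement concrete, here is a minimal sketch of a class whose hashCode hashes exactly the fields that its equals compares. The class Point and its fields are made up for illustration; the only library pieces used are Objects.equals, Objects.hash and HashSet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashContractDemo {
    // Hypothetical value class, for illustration only.
    static final class Point {
        private final int x;
        private final String label;

        Point(int x, String label) {
            this.x = x;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (o == null || getClass() != o.getClass()) {
                return false;
            }
            Point other = (Point) o;
            return x == other.x && Objects.equals(label, other.label);
        }

        @Override
        public int hashCode() {
            // Hash exactly the fields that equals() compares.
            return Objects.hash(x, label);
        }
    }

    // Looks up a second instance with the same field values in a HashSet.
    static boolean foundInSet() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, "a"));
        return set.contains(new Point(1, "a"));
    }

    public static void main(String[] args) {
        System.out.println(foundInSet()); // true
    }
}
```

If hashCode were left at its default here, the lookup would almost certainly fail even though equals returns true, because the two instances would very likely land in different hash buckets.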

Why does Java not simply offer the above equality by default (and ship it with a compliant hashCode implementation while we are at it) and let the developer opt into changing them if they think it is sensible? This should come with a warning that a class defines its equals method differently from the default (and of course this warning should propagate through all classes that touch this equals method), which you would have to suppress manually.

Also it seems weird that you can pass a way of ordering objects as a parameter to a sorting algorithm in Java but you are locked into one notion of equals (i.e. an equivalence relation that the ordering function has to comply with) and hashCode per class.
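One nuance, as far as I can tell: sorted collections such as TreeSet decide membership via the comparator rather than via equals, so there a notion of equivalence can indeed be passed in as a parameter. A small sketch using the standard String.CASE_INSENSITIVE_ORDER comparator (the class name is made up):

```java
import java.util.Set;
import java.util.TreeSet;

class ComparatorEquivalenceDemo {
    // Number of elements after adding two strings that differ only in case.
    static int sizeWithComparator() {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.add("hello");
        set.add("HELLO"); // compares as 0 to "hello", so it is not added
        return set.size();
    }

    // The class-wide equals() still distinguishes the two strings.
    static boolean plainEquals() {
        return "hello".equals("HELLO");
    }

    public static void main(String[] args) {
        System.out.println(sizeWithComparator()); // 1
        System.out.println(plainEquals());        // false
    }
}
```

Hash-based collections such as HashSet have no such parameter, so for them the single equals and hashCode per class is what counts.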

You are asking a great question: What is equality?

Leibniz came up with an answer that is still good enough for modern mathematics. Equality a = b can be axiomatized in second-order logic as

  • \forall x.\ x = x
  • \forall x y P.\ x = y \to P\, x \to P\, y

In other words, equality is the smallest reflexive binary relation. More on equality in ICL.

Basically, whenever two objects are equal, we can not find a distinguishing context. Or conversely, if you can find a distinguishing context, your objects must be different.

Now, objects in Java have identity. You can tell that two objects are the same using a == b. Conversely, if two objects are not the same, you will have a != b. Crucially, this (and the dynamic type) are the only things you can not change by mutating the fields of the object. Now, if you mutate a field, does the instance become a different instance? Java defaults to “no”. For example, a User could change their gender and still be the same person.

Thus the default equals() uses identity, especially since any coarser relation would make equals() return true for distinguishable objects.

When you override equals(), you basically add an invariant to your program that you never use objects of the type you implemented equals() for in such a way that equal objects (under the equals() equality) can be distinguished. This requirement is kind of imposed onto you by other classes. For example, comparing Lists using == would be considered very bad style, even if you actually need identity. For some classes it even is explicitly unsupported.
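A small sketch of what “distinguishable” means here, using two ArrayLists (the class and method names are made up for illustration): the lists are equal under equals(), yet identity and mutation can both tell them apart.

```java
import java.util.ArrayList;
import java.util.List;

class ListEqualityDemo {
    // Returns { equal before mutation, same instance, equal after mutation }.
    static boolean[] run() {
        List<String> a = new ArrayList<>();
        List<String> b = new ArrayList<>();
        a.add("x");
        b.add("x");

        boolean equalBefore = a.equals(b); // same contents
        boolean sameInstance = (a == b);   // two separate objects

        a.add("y");                        // mutating a does not affect b
        boolean equalAfter = a.equals(b);

        return new boolean[] { equalBefore, sameInstance, equalAfter };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // false
        System.out.println(r[2]); // false
    }
}
```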

However, Java can not simply assume that you write your code such that you do not care about identity. Thus, the default equals() does not assume this.

If you do not care about identity, use a record and say that you have a value-based class. However, you then (should) give up mutability.
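As a rough sketch of what a record gives you (requires Java 16 or newer; the record Point is a made-up example): equals and hashCode are generated from the components, while == still compares identity.

```java
class RecordEqualityDemo {
    // Hypothetical record; equals() and hashCode() are derived from x and y.
    record Point(int x, int y) {}

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        System.out.println(p.equals(q));                  // true
        System.out.println(p.hashCode() == q.hashCode()); // true
        System.out.println(p == q);                       // false
    }
}
```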


Regarding hashCode:
The hashcode is tightly linked to equality.
\forall a~b.~a.equals(b) \Rightarrow a.hashCode()==b.hashCode()
(The converse does not necessarily hold.)
It also can be hard to find a good hashing function in general. (more in [Gr]AlgoDat)

But you can actually tell the JVM to use a different default hashCode.
You can see the flags with

# java 8
java -XX:+PrintFlagsFinal 
# java 17 (more flags)
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version -XX:+JVMCIPrintProperties

The hashCode property is 5 by default meaning the Xorshift by Marsaglia is used.

The possible values for hashCode are:

  0. Park-Miller RNG (also called Lehmer random number generator)
  1. global memory address
  2. the constant 1
  3. sequential incrementing counter
  4. heap address
  5. thread-seeded Xorshift

If a class overrides hashCode, you can use System.identityHashCode to get the result of the default hashing function.

Note: Memory-based hash codes need special care (from the JVM) so as to not interfere with garbage collection.
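A short sketch of that call (the class Constant is made up for illustration and deliberately returns a fixed hash code):

```java
class IdentityHashDemo {
    // Hypothetical class that overrides hashCode() with a constant.
    // This is legal (equal objects still have equal hashes) but a poor hash.
    static final class Constant {
        @Override
        public int hashCode() {
            return 42;
        }
    }

    public static void main(String[] args) {
        Constant c = new Constant();
        System.out.println(c.hashCode()); // 42

        // The identity hash ignores the override and stays stable per object.
        int first = System.identityHashCode(c);
        int second = System.identityHashCode(c);
        System.out.println(first == second); // true

        // Without an override, hashCode() is the identity hash.
        Object o = new Object();
        System.out.println(o.hashCode() == System.identityHashCode(o)); // true
    }
}
```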


Thanks for the great answers to both of you!

Now, objects in Java have identity. You can tell that two objects are the same using a == b. Conversely, if two objects are not the same, you will have a != b. Crucially, this (and the dynamic type) are the only things you can not change by mutating the fields of the object. Now, if you mutate a field, does the instance become a different instance? Java defaults to “no”. For example, a User could change their gender and still be the same person.

When you override equals(), you basically add an invariant to your program that you never use objects of the type you implemented equals() for in such a way that equal objects (under the equals() equality) can be distinguished. This requirement is kind of imposed onto you by other classes. For example, comparing Lists using == would be considered very bad style, even if you actually need identity. For some classes it even is explicitly unsupported.

This is exactly the thing/terminology I was looking for. Basically this type of class is defined by “value” and you can not even surmise anything about the way it handles instantiation (short of using one of those deprecated constructors).

But this also means that any class going down this road is a hazard when it comes to using == (or maybe == is the actual hazard) because it may be using an instance pool internally to save memory. I only just learned this by chance today and it is not in the lecture notes from what I can tell.
So basically, what some classes such as String do is this: when two variables are initialized with the same string literal, Java does not create a second instance. Instead it stores the first instance in an internal “pool” of strings, and once the second variable comes along it is simply pointed at the pooled instance. This means that the two are not just equals-equal but also ==-equal, whereas new String(...) always creates a fresh instance. I.e. the code:

        String s1 = "Hello World";
        String s2 = "Hello World";
        String s3 = new String("Hello World");
        String s4 = new String("Hello World");
        System.out.println(s1 == s2);
        System.out.println(s1 == s3);
        System.out.println(s1 == s4);
        System.out.println(s2 == s3);
        System.out.println(s2 == s4);
        System.out.println(s3 == s4);

will output:

true
false
false
false
false
false
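As a side note, String.intern() returns the pooled instance for a given string value, so the result of interning a freshly created string is == to the literal again. A minimal sketch (the class and method names are made up):

```java
class InternDemo {
    // A fresh instance is not == to the literal.
    static boolean freshIsSame() {
        String s1 = "Hello World";
        String s3 = new String("Hello World");
        return s1 == s3;
    }

    // After intern(), we get the pooled instance, which is the literal.
    static boolean internedIsSame() {
        String s1 = "Hello World";
        String s3 = new String("Hello World");
        return s1 == s3.intern();
    }

    public static void main(String[] args) {
        System.out.println(freshIsSame());    // false
        System.out.println(internedIsSame()); // true
    }
}
```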

As a corollary it is impossible to retain “ownership” of a String and track its path through a program because two people might declare the very same string as “theirs”, leading to conflicts.

So I guess this is a trade-off Java makes to make an immutable class more efficient as hinted by this short teaser at the end of section 8.6 in the lecture notes:

Remark 8.6.2. Many important classes in Java are immutable and use tricky implementations to reduce the inefficiency introduced by copying objects. A prominent example is String.


The hashCode property is 5 by default meaning the Xorshift by Marsaglia is used.

Actually I can not find the hashCode property if I execute either the first or the second command locally (and they work, they produce a long list of flags). I have the latest JDK (openjdk version "18-ea" 2022-03-22) so I can’t tell what went wrong exactly.

When you say that memory-based hashing needs extra care because of GC I assume that you mean that a garbage collector can relocate my objects without warning and hence even their memory address isn’t an invariant?

As Marcel mentioned, the hashCode needs to remain consistent when the object is moved in memory by the garbage collector. Thus HotSpot basically reserves space in each object's header, which is then used to store the identity hashCode once it has been computed. More information here: Java Objects Inside Out


Here is a web explorer for the flags:
https://chriswhocodes.com/hotspot_options_openjdk17.html

The hashCode flag is marked as unstable and experimental.
Therefore, one also has to unlock experimental options when printing the flags:

java -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version