Effective Java: Equals and HashCode

“No Class is an Island” —Joshua Bloch

Have you ever found duplicate values in a Set? Or added an object to a List, only to have the list’s contains method report that the object is not there? How would you solve this, and where would you start tracing?

Have you considered that your class may not implement the equals and hashCode methods? And if you did override them, did you comply with their general contracts?

General Contract. When you override the equals method, you must adhere to its general contract as follows.

  1. It is reflexive: For any reference value x, x.equals(x) must return true.
  2. It is symmetric: For any reference values x and y, x.equals(y) must return true if and only if y.equals(x) returns true.
  3. It is transitive: For any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
  4. It is consistent: For any reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified.
  5. It rejects null: For any non-null reference value x, x.equals(null) must return false.
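
Symmetry is the easiest of these rules to break by accident. The sketch below uses a hypothetical Tag class (an assumption made purely for illustration, not taken from the book) whose equals tries to interoperate with plain String. Tag considers itself equal to a matching String, but String knows nothing about Tag, so the comparison gives different answers depending on the direction.

```java
import java.util.Objects;

// Hypothetical demo; Tag and SymmetryDemo are illustrative names only.
class SymmetryDemo {

    static final class Tag {
        private final String value;

        Tag(String value) {
            this.value = Objects.requireNonNull(value);
        }

        @Override
        public boolean equals(Object other) {
            if (other instanceof Tag) {
                return value.equals(((Tag) other).value);
            }
            if (other instanceof String) {
                // One-way interoperability: this is what breaks symmetry.
                return value.equals(other);
            }
            return false;
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }
    }

    public static void main(String[] args) {
        Tag tag = new Tag("java");
        String plain = "java";

        System.out.println(tag.equals(plain));   // true
        System.out.println(plain.equals(tag));   // false, so symmetry is violated
    }
}
```

The usual fix is to drop the String branch, so that a Tag is only ever equal to another Tag.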

For more information on this general contract, see Chapter 3 of Effective Java Programming Language Guide by Joshua Bloch.

Overriding the equals and hashCode methods correctly is a common practice that some programmers fail to observe. Many classes, including all collections classes, depend on the objects passed to them obeying the equals contract.

Always override hashCode when you override equals. A common source of bugs is the failure to override the hashCode method. Failing to do so prevents your class from functioning properly with all hash-based collections, including HashMap, HashSet, and Hashtable.
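
As a minimal sketch of that failure, the two hypothetical classes below (BrokenPhone and Phone are illustrative names, not from the book) differ only in whether hashCode is overridden. With the inherited Object.hashCode, two equal BrokenPhone instances will almost always hash differently, so a HashSet lookup misses the element even though equals would return true.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo classes; names are assumptions for illustration.
class MissingHashCodeDemo {

    // Overrides equals only; hashCode is inherited from Object.
    static class BrokenPhone {
        final String number;

        BrokenPhone(String number) { this.number = number; }

        @Override
        public boolean equals(Object other) {
            return other instanceof BrokenPhone
                    && number.equals(((BrokenPhone) other).number);
        }
    }

    // Overrides both methods, based on the same field.
    static class Phone {
        final String number;

        Phone(String number) { this.number = number; }

        @Override
        public boolean equals(Object other) {
            return other instanceof Phone
                    && number.equals(((Phone) other).number);
        }

        @Override
        public int hashCode() {
            return number.hashCode();
        }
    }

    public static void main(String[] args) {
        Set<BrokenPhone> broken = new HashSet<>();
        broken.add(new BrokenPhone("555-0100"));
        // Almost certainly false: equal objects, but different inherited hash codes.
        System.out.println(broken.contains(new BrokenPhone("555-0100")));

        Set<Phone> fixed = new HashSet<>();
        fixed.add(new Phone("555-0100"));
        System.out.println(fixed.contains(new Phone("555-0100"))); // true
    }
}
```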

When you override the hashCode method, you must also adhere to its general contract as follows.

  1. Whenever it is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  2. If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  3. It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
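
The three rules can be checked on a small value class. The sketch below assumes a hypothetical Version class whose hashCode is built with java.util.Objects.hash from the same fields that equals compares. It also uses the strings "Aa" and "BB", which are unequal but share a hash code, to show that rule 3 permits collisions.

```java
import java.util.Objects;

// Hypothetical demo; Version is an illustrative class, not from the book.
class HashContractDemo {

    static final class Version {
        private final int major;
        private final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Version)) return false;
            Version that = (Version) other;
            return major == that.major && minor == that.minor;
        }

        @Override
        public int hashCode() {
            // Uses exactly the fields that equals compares.
            return Objects.hash(major, minor);
        }
    }

    public static void main(String[] args) {
        Version a = new Version(1, 2);
        Version b = new Version(1, 2);

        // Rule 1: repeated calls on an unmodified object agree.
        System.out.println(a.hashCode() == a.hashCode());               // true
        // Rule 2: equal objects have equal hash codes.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true
        // Rule 3: unequal objects may still collide.
        System.out.println("Aa".equals("BB"));                           // false
        System.out.println("Aa".hashCode() == "BB".hashCode());          // true
    }
}
```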

Working with Hibernate Detached Objects.

Detached objects in a Hibernate application expose you to object identity and equality problems when your classes do not comply with the general contracts of the equals and hashCode methods. For example, consider the code snippet below:

//acquire session1
Object a = session1.get(Item.class, Long.valueOf(1234L));
Object b = session1.get(Item.class, Long.valueOf(1234L));

//close session1
//some code
//acquire session2

Object c = session2.get(Item.class, Long.valueOf(1234L));
//close session2

Object references a and b have not only the same database identity, but also the same Java identity, because they’re obtained in the same Session. Reference c is obtained in a second Session and thus refers to a different instance on the heap. It’s very important to understand the difference between Java identity, which is a == b, and database identity, which is a.getId().equals(b.getId()). Consider the following extension of the code, after session2 has ended:

Set<Object> allObjects = new HashSet<>();
allObjects.add(a);
allObjects.add(b);
allObjects.add(c);

First, no duplicate elements are allowed in a Set. Whenever you add an object, the HashSet uses the object's hashCode method to find a bucket and its equals method to check whether an equal element is already there. If you have not implemented the equals method for the Item class, it inherits the equals() method of java.lang.Object, which uses a double-equals (==) comparison. With that default, the Set ends up with two elements, because you have three references to two instances. However, we want the Set to have exactly one element, because a, b and c represent the same database row.
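
The effect can be reproduced without Hibernate. The sketch below is a simulation under stated assumptions: Item is a plain stand-in class that does not override equals or hashCode, and the two "sessions" are imitated by reusing one instance and creating a second one with the same id.

```java
import java.util.HashSet;
import java.util.Set;

// Simulation only: no Hibernate involved, Item is a hypothetical stand-in.
class IdentityDemo {

    // Does NOT override equals or hashCode.
    static class Item {
        private final Long id;

        Item(Long id) { this.id = id; }

        Long getId() { return id; }
    }

    static int distinctCount() {
        Item a = new Item(1234L);  // as if loaded in session1
        Item b = a;                // same session returns the same instance
        Item c = new Item(1234L);  // as if loaded in session2: new instance, same row

        Set<Item> allObjects = new HashSet<>();
        allObjects.add(a);
        allObjects.add(b);
        allObjects.add(c);
        return allObjects.size();
    }

    public static void main(String[] args) {
        Item a = new Item(1234L);
        Item c = new Item(1234L);

        System.out.println(a == c);                        // false: different Java identity
        System.out.println(a.getId().equals(c.getId()));  // true: same database identity
        System.out.println(distinctCount());               // 2
    }
}
```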

Now, the question is, how should we implement equals (and hashCode) for a persistent object? Here are some important tips to consider when implementing your equals method:

  1. Do NOT use the database identifier property (surrogate primary key). Identifier values aren’t assigned by Hibernate until an object becomes persistent. If a transient object is added to a Set before being saved, its hash value may change while it’s contained by the Set.
  2. Include all persistent properties of the persistent class, apart from any database identifier property.
  3. Do NOT include collections. Collection state is associated with a different table, so it is wrong to include it. More important, you don’t want to force the entire object graph to be retrieved just to perform equals.
  4. Identify a business key in your classes. A business key is a property, or some combination of properties, that is unique for each instance with the same database identity. Every attribute that has a UNIQUE database constraint is a good candidate for the business key. Please refer to Java Persistence with Hibernate by Christian Bauer and Gavin King for more tips that will help you identify a business key in your classes.
  5. Use getter methods instead of direct access. This is because the object instance passed as other may be a proxy object, not the actual instance. To initialize this proxy, you need to access it with a getter method.
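
The hazard described in tip 1 can be sketched without Hibernate. The example below assumes a hypothetical Order class whose equals and hashCode are based on the identifier, and imitates the save by calling a setter by hand. Once the id changes, the HashSet looks for the object under its new hash code and no longer finds it.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Simulation only: Order is a hypothetical entity, setId stands in for a save.
class IdHashDemo {

    static class Order {
        private Long id; // null until the object is "saved"

        void setId(Long id) { this.id = id; }

        @Override
        public boolean equals(Object other) {
            return other instanceof Order && Objects.equals(id, ((Order) other).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id); // 0 while id is null
        }
    }

    static boolean foundAfterSave() {
        Set<Order> orders = new HashSet<>();
        Order order = new Order();
        orders.add(order);     // stored under the hash code for id == null
        order.setId(42L);      // imitates the identifier being assigned on save
        return orders.contains(order);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterSave()); // false: the set has lost track of its own element
    }
}
```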

Below are the equals and hashCode methods of a User class.

@Override
public boolean equals(Object other){
     if (this == other) return true;
     if (!(other instanceof User)) return false;
     final User that = (User) other;
     return this.getUsername().equals(that.getUsername());
}

@Override
public int hashCode(){
     return this.getUsername().hashCode();
}

For the User class, username is a great candidate business key. It’s never null, it’s unique with a database constraint, and it changes rarely.
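
To see the business key at work, the sketch below wraps the two methods in a plain stand-in User class. The id field, the constructor, and the "jdoe" values are assumptions added for illustration, and no Hibernate session is involved. Instances that represent the same row collapse to a single Set element, including one that has no identifier yet.

```java
import java.util.HashSet;
import java.util.Set;

// Simulation only: a plain stand-in for the User entity.
class BusinessKeyDemo {

    static class User {
        private final Long id;          // surrogate key, ignored by equals and hashCode
        private final String username;  // business key: unique and never null

        User(Long id, String username) {
            this.id = id;
            this.username = username;
        }

        Long getId() { return id; }

        String getUsername() { return username; }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof User)) return false;
            final User that = (User) other;
            return this.getUsername().equals(that.getUsername());
        }

        @Override
        public int hashCode() {
            return this.getUsername().hashCode();
        }
    }

    static int distinctCount() {
        User a = new User(1L, "jdoe");    // as if loaded in one session
        User b = a;                       // same instance again
        User c = new User(1L, "jdoe");    // as if loaded in another session
        User d = new User(null, "jdoe");  // transient: no identifier assigned yet

        Set<User> all = new HashSet<>();
        all.add(a);
        all.add(b);
        all.add(c);
        all.add(d);
        return all.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // 1
    }
}
```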

Some programmers do not override the equals and hashCode methods until they need instances of their class to serve as map keys or set elements. As a good practice, always override the equals and hashCode methods right after writing or designing your class. Always remember, “No Class is an Island”.

References:
- Java Persistence with Hibernate by Christian Bauer and Gavin King
- Effective Java Programming Language Guide (Second Edition) by Joshua Bloch


7 thoughts on “Effective Java: Equals and HashCode”

  1. Nice article, I would just like to share my way (which is not far from yours):

    I skip `(this == other)`. In most cases it saves only a negligible number of cycles, and something about having several `return` statements just doesn’t seem right to me (even for #equals(..)) …just a personal quirk of mine. 🙂

    I also prefer the ‘that’ local variable name 🙂 Makes the code more readable.

    And lastly, I use Apache Commons’ EqualsBuilder & HashCodeBuilder whenever possible to avoid silly things like NPEs. This one I highly recommend 🙂

    Cheers,
    Franz

  2. Hi Franz,

    Thank you very much for sharing your idea here 🙂

    The (this == other) is used to optimize the equals operation if it happens that the two objects you are comparing refer to the same instance on the heap.

    There are performance overheads when using the EqualsBuilder. First, it does not support short-circuit evaluation. Second, in every call of the equals method, it creates a new instance.

    For me, I am using the Eclipse equals & hashCode generator :). Again, the issues of the several ‘return’ statements, readability & maintainability arise.

    I even did some performance testing of the Eclipse-generated methods against Commons’ EqualsBuilder & HashCodeBuilder. The test used a simple class with 3 String fields, 1 float field and 1 Date field. I found that the Eclipse-generated methods executed about 50% faster than EqualsBuilder & HashCodeBuilder. How much more if you execute a recursive equals & hashCode using those Commons builders?

    Bottom line: you have to choose between performance and readability & maintainability. If you can live with these performance overheads, use the Commons builders for readability & maintainability :).

    Finally, you may want to check out Google Guava, which addresses both performance and readability & maintainability.

    Regards!
    -Mohammad

  3. I agree with Mohammad. Commons EqualsBuilder and HashCodeBuilder do have potential for slower performance, since these builders allocate objects every time you call the equals() and hashCode() methods, which adds work for the garbage collector.

    However, as JVMs get better, the performance hit may soon become negligible.

    Also one more note about using instanceof inside equals method.

    If you want to allow equality checks with subclasses, use instanceof. If you want to check against the exact class itself, don’t use instanceof. For objects not managed by Hibernate, compare the values returned by obj.getClass(). For Hibernate entities, use Hibernate.getClass(), since obj.getClass() may return a proxy class instead of the real class. Hibernate.getClass() ensures strict class-level equality checking.

    To clarify:

    Non-Hibernate Managed objects:

    this.getClass() == that.getClass()

    For Hibernate Managed objects:

    Hibernate.getClass(that).equals(Hibernate.getClass(this))

  4. Thank you Jeff!

    The issue of instanceof versus getClass brings up the fundamental problem of equivalence relations in object-oriented languages, which I should have addressed in this blog. 🙂

    Using instanceof may violate transitivity in the equals contract if you allow equality checks with subclasses. Using getClass solves this problem, but the consequence is unacceptable. For example, suppose we have a Square class that extends Rectangle. This is a good candidate for inheritance because every Square “is a” Rectangle. Consider the code snippet below:

    public class Rectangle {
        private double length;
        private double width;

        public Rectangle(double length, double width) {
            this.length = length;
            this.width = width;
        }
        //getters and other functions
    }

    public class Square extends Rectangle {
        public Square(double side) {
            super(side, side);
        }
    }

    //
    Rectangle r = new Rectangle(2, 2);
    Square s = new Square(2);

    Using getClass, r.equals(s) returns false. However, we naturally consider the two equal: a Rectangle with length = 2 and width = 2 is equal to a Square with side = 2. Another example is the classes Employee, Manager and Supervisor, where Manager and Supervisor extend Employee. Suppose the equals method is overridden in the Employee class using the ID (or a combination of first name and last name) as the business key, which is unique and not null. The Manager and Supervisor classes don’t need to override equals, because the method inherited from Employee is appropriate for them. Again, if you compare an Employee with ID = 123 and a Manager with ID = 123 using getClass, the result is false because they have different classes.
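
    The Rectangle and Square comparison can be sketched as follows. The equals and hashCode bodies are assumptions, since the snippet above leaves them out, and the Strict prefix is a hypothetical name used only to keep the two variants apart in one file.

```java
import java.util.Objects;

// Hypothetical sketch: two variants of the same hierarchy, differing only in the class check.
class ClassCheckDemo {

    // Variant 1: equals compares runtime classes with getClass().
    static class StrictRectangle {
        final double length;
        final double width;

        StrictRectangle(double length, double width) {
            this.length = length;
            this.width = width;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (other == null || getClass() != other.getClass()) return false;
            StrictRectangle that = (StrictRectangle) other;
            return Double.compare(length, that.length) == 0
                    && Double.compare(width, that.width) == 0;
        }

        @Override
        public int hashCode() { return Objects.hash(length, width); }
    }

    static class StrictSquare extends StrictRectangle {
        StrictSquare(double side) { super(side, side); }
    }

    // Variant 2: equals accepts any subclass via instanceof.
    static class Rectangle {
        final double length;
        final double width;

        Rectangle(double length, double width) {
            this.length = length;
            this.width = width;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Rectangle)) return false;
            Rectangle that = (Rectangle) other;
            return Double.compare(length, that.length) == 0
                    && Double.compare(width, that.width) == 0;
        }

        @Override
        public int hashCode() { return Objects.hash(length, width); }
    }

    static class Square extends Rectangle {
        Square(double side) { super(side, side); }
    }

    public static void main(String[] args) {
        System.out.println(new StrictRectangle(2, 2).equals(new StrictSquare(2))); // false
        System.out.println(new Rectangle(2, 2).equals(new Square(2)));             // true
        System.out.println(new Square(2).equals(new Rectangle(2, 2)));             // true
    }
}
```

    The instanceof variant stays symmetric here only because Square adds no fields and does not override equals.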

    As Josh Bloch said “There is simply no way to extend an instantiable class and add a value component while preserving the equals contract.” But how do we solve this problem? Follow the advice “Favor composition over inheritance”.

    I hope this helps!

    Regards!
    -Mohammad

  5. Yes, but exactly when does one decide to do it & what are cases when you MUST override equals & hashcode? And SHOULD they both always be done just for “sake” & safety of it. Does this ever need to happen outside the context of using collections ? thx

  6. Hi slade,

    You should always override it when you are sure that your class’s objects will interact with other objects, especially if you export the class in an API that other programmers may use.

    But if your class “is an island”, that is no other classes interact with it then you may not override the equals & hashCode methods. 🙂

    Does this happen outside the context of using collections?
    – Yes of course, whenever you need equality check of your objects.

  7. Slade, to be more specific, these are the conditions where you don’t need to override your equals method as stated in Effective Java of Josh Bloch:
    1. Each instance of the class is inherently unique. This is true for classes that represent active entities rather than values, such as Thread.
    2. You don’t care whether the class provides a “logical equality” test. For example, java.util.Random could have overridden equals to check whether two Random instances would produce the same sequence of random numbers going forward, but the designers didn’t think that clients would need or want this functionality.
    3. A superclass has already overridden equals, and the behavior inherited from the superclass is appropriate for this class.
    4. The class is private or package-private, and you are certain that its equals method will never be invoked.
