Sunday, June 17, 2007

SCJP Prep - surprising autoboxing behaviour

One of the features introduced in Java 5 is autoboxing. Boxing means taking a primitive such as a char, byte or int and wrapping it in its corresponding wrapper class (Character, Byte, Integer, etc.) so that we can use it as an object. Unboxing means retrieving the primitive value from a wrapper. In the code fragment below, the Integer object is first unboxed, 10 is added to its value, and the sum is boxed into a new Integer object.

[java, Y, 1]
public Integer addTen(Integer num)
{
return new Integer(10+num.intValue());
}


With autoboxing, all this boxing and unboxing is transparent to you. Therefore you can write simpler code like this:

[java, Y, 1]
public Integer autoboxAddTen(Integer num)
{
return 10+num;
}
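
To make the "transparent" part visible, here is a rough hand-written equivalent of what the compiler does with the autoboxed version. This is only a sketch: the class name DesugaredAddTen and the method explicitAddTen are made up for illustration. Note that the compiler boxes through Integer.valueOf rather than new Integer.

```java
class DesugaredAddTen {

    // What we write: the compiler inserts the conversions for us.
    static Integer autoboxAddTen(Integer num) {
        return 10 + num;
    }

    // Approximately what the compiler generates: unbox with intValue(),
    // add, then box the result with Integer.valueOf().
    static Integer explicitAddTen(Integer num) {
        return Integer.valueOf(10 + num.intValue());
    }

    public static void main(String[] args) {
        System.out.println(autoboxAddTen(5));
        System.out.println(explicitAddTen(5));
    }
}
```

Both methods return an Integer holding the same value for the same input.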


While it allows developers to write succinct code, it can cause problems if we are not careful. For example, unboxing a null reference throws a NullPointerException, which is what we will hit in the code below:

[java, Y, 1]
public void doSomething()
{
Integer num=null;
System.out.println(autoboxAddTen(num));
}
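
One way to guard against this is to check for null before the value gets unboxed. The sketch below is just one possible approach. The class name NullSafeAddTen and the method safeAddTen are hypothetical, and returning null for a null input is an assumption. You may prefer to throw an exception or use a default value instead.

```java
class NullSafeAddTen {

    // Same as the autoboxed method above: unboxing happens inside "10 + num".
    static Integer autoboxAddTen(Integer num) {
        return 10 + num; // throws NullPointerException when num is null
    }

    // Checks for null before any unboxing takes place.
    // Returning null here is an assumption made for this sketch.
    static Integer safeAddTen(Integer num) {
        if (num == null) {
            return null;
        }
        return 10 + num;
    }

    public static void main(String[] args) {
        System.out.println(safeAddTen(null));
        System.out.println(safeAddTen(5));
        try {
            autoboxAddTen(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException while unboxing null");
        }
    }
}
```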


Another snag we might hit is trying to autobox a primitive into a non-corresponding wrapper type. In the code below, we can assign b to i according to the widening conversion rules. But we have to take note that autoboxing only occurs between a primitive and its corresponding wrapper type, e.g. int and Integer, byte and Byte, double and Double, etc. The compiler will not widen a byte to an int and then box it in a single step.

[java, Y, 1]
public void doSomethingElse()
{
int i=1;
byte b=7;
i=b;
Integer iObj=i;
iObj=b; //error. Type mismatch: cannot convert from byte to Integer
}
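
If we really do want the byte in an Integer, we can do the widening ourselves with a cast and let the compiler box the resulting int. A minimal sketch (the class and method names are made up for illustration):

```java
class BoxingConversions {

    // The explicit cast widens the byte to an int; the int is then
    // autoboxed to Integer, its corresponding wrapper type.
    static Integer boxByteAsInteger(byte b) {
        return (int) b;
    }

    // A byte boxes directly to its own wrapper type, Byte.
    static Byte boxByte(byte b) {
        return b;
    }

    public static void main(String[] args) {
        byte b = 7;
        System.out.println(boxByteAsInteger(b));
        System.out.println(boxByte(b));
    }
}
```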


The most surprising behaviour, though, is this. I saw that some bloggers had posted this problem, but I couldn't see what the problem was until someone explained it. Can you see where the problem is?

[java, Y, 1]
public class SurpriseAutoboxingTest1 {

public static void main(final String[] args) {

Integer i1 = new Integer(2);
Integer i2 = new Integer(2);
System.out.println(i1 == i2); // false

Integer j1 = 2;
Integer j2 = 2;
System.out.println(j1 == j2); // true

Integer k1 = 150;
Integer k2 = 150;
System.out.println(k1 == k2); // false
}
}


It might be clearer if we test some more values.

[java, Y, 1]
public class SurpriseAutoboxingTest2 {
public static void main(final String[] args) {
Integer l1 = 127;
Integer l2 = 127;
System.out.println(l1 == l2); // true

Integer m1 = 128;
Integer m2 = 128;
System.out.println(m1 == m2); // false

Integer n1 = -128;
Integer n2 = -128;
System.out.println(n1 == n2); // true

Integer o1 = -129;
Integer o2 = -129;
System.out.println(o1 == o2); // false
}
}


The second set of values shows the edge cases of something... but what? Basically, they mark the range of integral values for which boxed objects are cached.

Here's what Sun has to say on this:

The primitives are equal and the values of the boxed ints are equal. But this time the ints point to different objects. What you have discovered is that for small integral values, the objects are cached in a pool much like Strings. When i and j are 2, a single object is referenced from two different locations. When i and j are 2000, two separate objects are referenced. Autoboxing is guaranteed to return the same object for integral values in the range [-128, 127], but an implementation may, at its discretion, cache values outside of that range. It would be bad style to rely on this caching in your code.

In fact, testing for object equality using == is, of course, not what you normally intend to do. This cautionary example is included in this tip because it is easy to lose track of whether you are dealing with objects or primitives when the compiler makes it so easy for you to move back and forth between them.
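
So if what we care about is the value, we should compare with equals() or compare the unboxed primitives, and leave == for primitives. Here is a small sketch of both options. The class name BoxedComparison and its helper methods are hypothetical, and both helpers assume non-null arguments.

```java
class BoxedComparison {

    // Compares the wrapped values, regardless of whether the two
    // references point to the same cached object.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Unboxes explicitly, so == compares two int primitives.
    static boolean sameValueUnboxed(Integer a, Integer b) {
        return a.intValue() == b.intValue();
    }

    public static void main(String[] args) {
        Integer k1 = 150;
        Integer k2 = 150;
        // Reference comparison: the result depends on whether the JVM
        // happens to cache this value, so do not rely on it.
        System.out.println(k1 == k2);
        // Value comparisons: these do not depend on caching.
        System.out.println(sameValue(k1, k2));
        System.out.println(sameValueUnboxed(k1, k2));
    }
}
```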


It's good to remember what is happening beneath all this autoboxing: we're not getting the convenience for free. For a more in-depth look at the possible repercussions of autoboxing, take a look at this post on murphee's Rant.


# Newbies will write slow code because of it: Well, one of the problems that might occur is newbies writing horribly slow code because they never really grasped the difference between primitives and their reference types. If the wrapper types are overused, a lot of useless autoboxing and unboxing will happen. Experienced Java developers will know when that is the case and can avoid this. One of the reasons why Sun introduced this feature is their initiative to make Java easier to use to attract more and more developers (at the last JavaOne the number 10 million was mentioned as a goal). These would mostly be taken from the MS languages like VB. So... we are talking about a lot of newbies coming along... and a lot of newbies writing code that is slow, because of the above-mentioned issues.
# Memory usage will increase tremendously: Another issue concerning inexperienced developers. A newbie might think "Why should I use some weird int[] numbers = new int[x] when I could use a more flexible List numbers = new ArrayList(x);?". Well... one of the main reasons is that the latter version will use up way more memory than the former one. In the worst case, it could mean a 400 % increase over the array solution (why 400 %? An Integer object will use 16 bytes of memory, which is four times the 4 bytes that the int value would use. It is a 400 % increase because the int[] only uses an array of 32 bit values to hold its contents. The Integer version would need an array of references (32 bits or 64 bits, depending on your CPU) plus the memory for each object).
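
To make the two alternatives from the list above concrete, here is a small sketch that sums the same numbers from an int[] and from a List of Integer. The class name BoxedVersusPrimitive and its methods are made up for illustration, and the code does not measure time or memory. It only marks where the boxing and unboxing happen.

```java
import java.util.ArrayList;
import java.util.List;

class BoxedVersusPrimitive {

    // Works on primitives only: no wrapper objects are involved.
    static long sumArray(int[] numbers) {
        long total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    // Every element is an Integer object that is unboxed on each iteration.
    static long sumList(List<Integer> numbers) {
        long total = 0;
        for (Integer n : numbers) {
            total += n; // n is unboxed here
        }
        return total;
    }

    public static void main(String[] args) {
        int size = 1000;
        int[] array = new int[size];
        List<Integer> list = new ArrayList<Integer>(size);
        for (int i = 0; i < size; i++) {
            array[i] = i;
            list.add(i); // i is boxed to an Integer here
        }
        System.out.println(sumArray(array));
        System.out.println(sumList(list));
    }
}
```

Both methods return the same sum. The difference is in how many objects get created and dereferenced along the way.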
