
Unsafe Harbor

One of the most controversial topics in the JVM community is the sun.misc.Unsafe API, or simply Unsafe. Originally, only the core platform classes were supposed to use that private API, but it swiftly became a widespread way to bypass the JVM’s safety limitations, usually for the sake of improving application performance. Officially it still remains an internal, undocumented API that can be changed or even removed at any moment. (The removal is actually happening right now, albeit at a slower pace than was originally proposed.) At the same time, too many libraries and applications rely heavily on Unsafe, making it a de facto indispensable part of the JVM world, at least until a better solution is presented and widely adopted. That is why JVMs other than HotSpot commonly support sun.misc.Unsafe, despite the name of its package.

Excelsior JET is no exception: we also support Unsafe, but things don’t always go smoothly. That is actually not too surprising, as that API was originally designed as a backdoor to HotSpot VM internals and is therefore obviously HotSpot-specific. Here is a short story that illustrates how nasty supporting sun.misc.Unsafe can get for third-party JVM vendors.

Symptoms

There is a really cool and popular Java library named Kryo, intended for fast and efficient object graph serialization. Recently, several clients of ours reported problems with their applications that used the Kryo library. The applications worked fine on HotSpot, but the native binaries produced by Excelsior JET started throwing unexpected exceptions, breaking the applications completely.

The stack traces led to this method from the Kryo library, which threw a KryoException instead of returning the field value (the comment in the code is mine – IU):

public Object getField (Object object)
                 throws IllegalArgumentException, IllegalAccessException {
    // offset is a field in the base class that contains a cached value
    // returned by the sun.misc.Unsafe.objectFieldOffset() method
    if (offset >= 0) {
        return unsafe().getObject(object, offset);
    } else
        throw new KryoException("Unknown offset");
}

In an application compiled with Excelsior JET, sun.misc.Unsafe.objectFieldOffset() indeed returns strange negative values for certain fields. At first sight, it would seem logical for field offsets within objects to be positive, meaning that we have a critical bug in our implementation of Unsafe methods that should be fixed ASAP, right? Not so fast.

Unsafe Optimization

Actually, there can be numerous reasons for the offset of a field to be negative. The most obvious one is the bidirectional object layout, in which some fields are placed before the object’s header, i.e. they actually do have negative offsets. But in our case the reason is a little bit trickier: the latest versions of Excelsior JET can “borrow” several bits from reference-type variables for internal needs in certain special cases.

The motivation and details of this aspect of Excelsior JET inner workings are beyond the scope of this post, but the two important things worth noting here are that:

  • In our JVM, a value of a reference type may contain more information than just an address of an object, and, what’s more,
  • Such meta-information in references can be changed at run time when certain conditions are met.

And here is where things get really interesting. It turns out that the underlying raw representations of two references to the same object do not necessarily match bit-by-bit anymore!

Of course, we had to change many different parts of our JVM to handle this new property correctly, from reference comparison to, you guessed it right, sun.misc.Unsafe. One Unsafe method that required particular attention in that regard is compareAndSwapObject(). It must atomically compare the given reference value with the value stored in the field. Having to account for the fact that values with different raw bit patterns can point to the same object may degrade performance significantly.

Fortunately, it is guaranteed that such additional meta-information is either the same or absent in all values of some reference types. For a field of such a type, compareAndSwapObject() can be implemented as efficiently as all other atomic primitives. And we can actually tell whether the field is of such a “good” type (and we are lucky) or not. The only problem here is that in this method we are dealing not with the field itself, but with its offset. Yes, we could have done it in a straightforward way: get the j.l.Class of the host object, find the field with the given offset, and check its type. But such run-time checks would have dramatically worsened the performance instead of improving it, whereas we are trying to optimize this unsafe method.
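To make the trade-off more tangible, here is a toy model in Java. It is only a sketch under loud assumptions: raw reference values are modeled as plain longs, the field is modeled as an AtomicLong, and the choice of the top four bits for meta-information is invented for illustration. None of the names (MaskedCas, META_MASK, casFast, casSlow) come from Excelsior JET.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model only, not Excelsior JET code: a "reference" is a long,
// and the field that stores it is an AtomicLong.
final class MaskedCas {
    // Assumption for illustration: the top 4 bits carry meta-information.
    static final long META_MASK = 0xF000_0000_0000_0000L;
    static final long ADDRESS_MASK = ~META_MASK;

    /** Two raw values denote the same object if their address bits match. */
    static boolean sameObject(long a, long b) {
        return (a & ADDRESS_MASK) == (b & ADDRESS_MASK);
    }

    /** Fast path: correct only if meta bits can never differ for this field's type. */
    static boolean casFast(AtomicLong field, long expected, long update) {
        return field.compareAndSet(expected, update);
    }

    /** Slow path: compare by address, then CAS against the exact bits just read. */
    static boolean casSlow(AtomicLong field, long expected, long update) {
        while (true) {
            long current = field.get();
            if (!sameObject(current, expected)) {
                return false; // a different object is stored in the field
            }
            if (field.compareAndSet(current, update)) {
                return true;
            }
            // The field changed between the read and the CAS; retry.
        }
    }

    public static void main(String[] args) {
        // Address 0x1000 stored with a meta bit set; expected value has no meta bits.
        AtomicLong field = new AtomicLong(0x1000_0000_0000_1000L);
        long expected = 0x1000L;
        System.out.println(casFast(field, expected, 0x2000L)); // prints false
        System.out.println(casSlow(field, expected, 0x2000L)); // prints true
    }
}
```

In this model the fast path is a single atomic instruction, while the slow path needs an extra read, a masked comparison, and a retry loop. That difference is why knowing the field's type up front is valuable.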

And here is where another optimization appears naturally. We have the field offset at our complete disposal, right? Let’s borrow a bit from the offset as well and use it to encode whether the references stored in that field can contain variable meta-info or not!

These offsets are gathered by other methods of sun.misc.Unsafe, such as objectFieldOffset(), so we should change our implementation of those methods to return special “tagged” offsets. We should also modify several other Unsafe methods to make them work with offsets that have that special bit set. But that’s all! Since checking a bit value only takes a couple of fast instructions, we now have well-optimized implementations of all sun.misc.Unsafe methods, including compareAndSwapObject().

The only side effect here is that such offsets do not actually look like offsets anymore. We use the most significant bit as the mark bit, so the marked offsets look like very large negative values. But that should not be a problem as long as such offsets are only used inside the methods of the Unsafe class (and why would anyone use their values anywhere else for anything?).
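The mark-bit trick can be sketched in a few lines of Java. This is a hypothetical illustration, not our actual implementation: the class and method names are invented, and the sketch assumes the mark bit is bit 31 of a 32-bit offset value.

```java
// Hypothetical sketch of a "tagged" offset; names and bit position are assumptions.
final class TaggedOffset {
    // Assumption: the mark bit is the sign bit of a 32-bit offset.
    static final int MARK_BIT = 0x80000000;

    /** Sets the mark bit on a raw byte offset. */
    static int tag(int rawOffset) {
        return rawOffset | MARK_BIT;
    }

    /** Returns true if the mark bit is set. */
    static boolean isTagged(int offset) {
        return (offset & MARK_BIT) != 0;
    }

    /** Clears the mark bit, recovering the raw byte offset. */
    static int untag(int offset) {
        return offset & ~MARK_BIT;
    }

    public static void main(String[] args) {
        int tagged = tag(16);
        System.out.println(tagged);           // prints -2147483632
        System.out.println(isTagged(tagged)); // prints true
        System.out.println(untag(tagged));    // prints 16
    }
}
```

Testing or clearing a single bit is cheap, which is what makes this encoding attractive. The flip side is visible in the first line of output: any caller that assumes "valid offsets are non-negative" will reject a tagged offset.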

Let’s consider the following code snippet. Here we have exactly the same situation that provokes problems with Kryo: we gather the offset of a field with the objectFieldOffset() method and then use it to write some data into that field.

public static interface Bar {
    public void print();
}
public static class Foo {
    Bar barField;
}

public static class Baz implements Bar {
    String place;
    public Baz(String place) { this.place = place; }
    public void print() { System.out.println("Hello from " + place + "!"); }  
}
//...

Field f = Foo.class.getDeclaredField("barField");
long offset = unsafe.objectFieldOffset(f);
Foo host = new Foo();        
unsafe.putObject(host, offset, new Baz("unsafe"));

System.out.println("offset = " + offset);
host.barField.print();
//...

The offsets are indeed very different when running this sample on HotSpot and Excelsior JET:

> javac *.java
> java UnsafeTest
offset = 12
Hello from unsafe!

> jc UnsafeTest.class
   .  .  .
> ./UnsafeTest
offset = -2147483632
Hello from unsafe!

But the putObject() method in Excelsior JET handles such special offsets correctly, so the reference to the given instance of Baz does get written into barField and the end result is the same.

Great! In the end we have an absolutely correct and well-optimized implementation of sun.misc.Unsafe methods, and that is the thing that everyone needs from Unsafe, right?

Not quite.

The Power of Open Source

Unfortunately, rather often the internal JVM information obtained via sun.misc.Unsafe gets misused and treated incorrectly. That becomes especially noticeable upon attempts to run such code on a third-party JVM, but is actually a latent problem with HotSpot as well, because the format of that internal information can be changed in any future version. So, the best solution here is for everyone to stop peeking into the data returned from sun.misc.Unsafe, or at the very least be very careful when left with no other options.
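One way to follow that advice is to treat an offset as an opaque token and track its validity separately, rather than inferring validity from the sign of the value. The sketch below is hypothetical: CachedOffset and its methods are invented names, and it is not necessarily how any particular library, Kryo included, solves the problem.

```java
// Hypothetical sketch: the offset is stored but never interpreted.
final class CachedOffset {
    private final long offset;   // opaque token obtained from Unsafe; any bit pattern is allowed
    private final boolean known; // validity is tracked explicitly, not derived from the value

    private CachedOffset(long offset, boolean known) {
        this.offset = offset;
        this.known = known;
    }

    /** Wraps an offset exactly as returned by the JVM. */
    static CachedOffset of(long offset) {
        return new CachedOffset(offset, true);
    }

    /** Represents a field whose offset could not be obtained. */
    static CachedOffset unknown() {
        return new CachedOffset(0L, false);
    }

    boolean isKnown() {
        return known;
    }

    /** Returns the token unchanged, for passing back to Unsafe methods only. */
    long get() {
        if (!known) {
            throw new IllegalStateException("Unknown offset");
        }
        return offset;
    }

    public static void main(String[] args) {
        CachedOffset tagged = CachedOffset.of(-2147483632L);
        System.out.println(tagged.isKnown());                 // prints true
        System.out.println(CachedOffset.unknown().isKnown()); // prints false
    }
}
```

With this shape, a negative value is accepted like any other, and the "unknown" case no longer depends on what a given JVM happens to return.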

The story about the Kryo library has a happy ending. As the problem looked like a random mistake local to a single line in Kryo source code, we simply submitted a pull request with a small fix and detailed reasoning. It was accepted and successfully merged, so the Kryo library now works correctly when compiled with Excelsior JET and that is great! Thanks again to the authors of Kryo, you rock!

However, the above is just one example of unsafe usage of Unsafe. There are still many other libraries out there that may exhibit similar symptoms.

Of course, we understand that sun.misc.Unsafe is an incredibly important part of the Java world and using it can be the most efficient way, or even the only viable way to solve many practical problems. But we also hope that over time the developers targeting the JVM will switch over to safer, officially supported APIs and instruments to achieve the same goals. May this small story be yet another argument for getting rid of sun.misc.Unsafe uses in your projects.
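As one example of such an officially supported API, java.lang.invoke.VarHandle (available since Java 9) covers plain and atomic field access without exposing any numeric offset. The sketch below mirrors the Foo/Bar/Baz shapes used earlier in this post; the class name VarHandleDemo and the run() helper are made up for illustration, and whether VarHandle is fast enough for a given use case is something to measure rather than assume.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Illustrative sketch: field access and CAS via VarHandle instead of Unsafe offsets.
final class VarHandleDemo {
    interface Bar { String greeting(); }

    static final class Foo { Bar barField; }

    static final class Baz implements Bar {
        private final String place;
        Baz(String place) { this.place = place; }
        public String greeting() { return "Hello from " + place + "!"; }
    }

    // Resolved once by field name and type; no offset value is ever visible.
    private static final VarHandle BAR_FIELD;
    static {
        try {
            BAR_FIELD = MethodHandles.lookup()
                    .findVarHandle(Foo.class, "barField", Bar.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static String run() {
        Foo host = new Foo();
        Bar first = new Baz("nowhere");
        Bar second = new Baz("VarHandle");
        BAR_FIELD.set(host, first); // plain write to host.barField
        // Atomically replace 'first' with 'second' if the field still holds 'first'.
        boolean swapped = BAR_FIELD.compareAndSet(host, first, second);
        if (!swapped) {
            throw new IllegalStateException("unexpected concurrent update");
        }
        return ((Bar) BAR_FIELD.get(host)).greeting();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints Hello from VarHandle!
    }
}
```

Because the handle is looked up by field name and type, there is nothing JVM-specific for application code to misinterpret.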

That’s all for today, thanks for reading and stay tuned for more posts!

Links

  1. The issue about the problem in the Kryo repository on GitHub.
  2. UnsafeTest example and running scripts.