by Carlos Raphael
A faster alternative to Java Reflection
In the article Specification Pattern, for the sake of sanity, I didn’t mention about an underlying component to nicely make that thing happen. Now, I’ll elaborate a little bit more around the JavaBeanUtil class, that I put in place to read the value for a given
fieldName from a particular
javaBeanObject, which in that occasion turned out to be FxTransaction.
You can easily argue I could’ve basically used Apache Commons BeanUtils or one of its alternatives to achieve the same result. But I was interested in getting my own hands dirty with something different that I knew would be way faster than any library built on top of the widely known Java Reflection.
The enabler of the technique used to avoid the very slow reflection is the
invokedynamic bytecode instruction. Briefly,
invokedynamic (or “indy”) was the greatest thing introduced in Java 7 in order to pave the way for implementing dynamic languages on top of the JVM through dynamic method invocation. It also later allowed lambda expression and method reference in Java 8 as well as string concatenation in Java 9 to benefit from it.
In a nutshell, the technique I’m about to better describe below leverages LambdaMetafactory and MethodHandle in order to dynamically create an implementation of Function. Its single method delegates a call to the actual target method with a code defined inside of lambda body.
The target method in question here is the actual getter method that has direct access to the field we want to read. Also, I should say if you are quite familiar with the nice things that came up within Java 8, you will find the below code snippets fairly easy to understand. Otherwise, it may be tricky at a glance.
A peek at the homemade JavaBeanUtil
The following method is the utility used to read a value from a JavaBean field. It takes the JavaBean object and a single
fieldA or even nested field separated by periods, for example,
For optimal performance, I’m caching the dynamically created function that is the actual way of reading the content of a given
fieldName. So inside the
getCachedFunction method, as you can see above, there’s a fast path leveraging the ClassValue for caching and there’s the slow
createAndCacheFunction path executed only if nothing has been cached so far.
The slow path will basically delegate to the
createFunctions method that is returning a list of functions to be reduced by chaining them using
Function::andThen. When functions are chained, you can imagine some sort of nested calls like
getNestedJavaBean().getNestedJavaBean().getFieldA(). Finally, after chaining we simply put the reduced function in the cache calling
Drilling a bit more into the slow path of function creation, we need to individually navigate through the field
path variable by splitting it as per below:
createFunctions method delegates the individual
fieldName and its class holder type to
createFunction method, which will locate the needed getter based upon
javaBeanClass.getDeclaredMethods(). Once it’s located, it maps to a Tuple object (facility from Vavr library), that contains the return type of the getter method and the dynamically created function in which will act as if it was the actual getter method itself.
This tuple mapping is done by
createTupleWithReturnTypeAndGetter in conjunction with
createCallSite method as follows:
In the above two methods, I make use of a constant called
LOOKUP, which is simply a reference to MethodHandles.Lookup. With that, I can create a direct method handle based on the previously located getter method. And finally, the created MethodHandle is passed to
createCallSite method whereby the lambda body for the function is produced using the LambdaMetafactory. From there, ultimately, we can obtain the CallSite instance, which is the function holder.
In order to measure the gains of performance, I used the ever-awesome JMH (Java Microbenchmark Harness), which will likely be part of the JDK 12. As you may know, results are bound to the platform, so for reference I’ll be utilizing a single
1x6 i5-8600K 3.6GHz and
Linux x86_64 as well as
Oracle JDK 8u191 and
GraalVM EE 1.0.0-rc9.
Benchmark scenario is set as follows:
The benchmark is driven by how deep we are going to retrieve some value as per the four different levels specified above. For each
fieldName, JMH will perform 5 iterations of 3 seconds each to warm things up and then 5 iterations of 1 second each to actually measure. Each scenario will then repeat 3 times to reasonably gather the metrics.
Let’s start with the results gathered from the
JDK 8u191 run:
The worst scenario using
invokedynamic approach is much faster than the fastest scenario from the other two libraries. That’s a huge difference, and if you’re doubting the results, you can always download the source code and play around as you like.
Now, let’s see how the same benchmark performs with
GraalVM EE 1.0.0-rc9
Full results can be viewed here with the nice JMH Visualizer.
The huge difference is because JIT-compiler knows
MethodHandle very well and knows how to inline them quite well as opposed to the reflection approach. Also, you can see how promising GraalVM is. Its compiler does a truly awesome job being capable of a great performance enhancement for the reflection approach.
If you’re curious and want to play further, I encourage you to pull the source code from my Github. Bear in mind I’m not encouraging you to do your own homemade
JavaBeanUtil and use in production. Rather my aim here is to simply showcase my experiment and the possibilities we can get from