Profile-Guided Optimization

Debunking yet another JIT myth.

It's no secret that Oracle HotSpot and other advanced JIT compilers use live application execution profile data to optimize code more effectively. A lesser-known fact is that all popular C++ compilers, free and commercial, implement profile-guided optimization (PGO) as well.

One major difference is that a JIT compiler profiles the application at run time and optimizes its code gradually. That process has to be repeated every time the application gets launched on each end-user system. A PGO-enabled AOT compiler completely removes those run time overheads, so the end users only enjoy the benefits of PGO.

Starting from version 12, PGO is available in all 64-bit versions of Excelsior JET and Excelsior JET Embedded and in the Linux/ARM version of the latter.

An AOT compiler designed to run on a developer's system can implement resource-heavy advanced optimizations that JIT compilers simply cannot afford. The problem is that it often does not know exactly where to apply that power, and there is a good chance of finding the missing pieces of information in an actual application execution profile. Let's consider one situation in which PGO can make a big difference.

Most calls in Java applications are de facto monomorphic, i.e. they have a single target method. The compiler can effectively optimize such call sites by inlining the respective methods. But what about the truly polymorphic calls?

One typical example is a hot loop iterating over a Java collection and invoking methods on the retrieved objects. If the collection contains objects of multiple types, one call site in the loop body can dispatch to different instance methods on different iterations. Profile-guided optimization can improve performance in this case: if the profile data shows that one of the target methods is called much more frequently than others, the compiler can optimize its invocation.
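The situation described above can be sketched in a few lines of Java. This is a hypothetical illustration, not the microbenchmark used for the measurements in this article: the class names, the 90/5/5 type mix, and the collection size are all assumptions. The method sumGuarded shows, written by hand, one common way a compiler could exploit such a profile (a type check on the dominant class followed by the inlined method body, with a regular virtual call as the fallback). It is not necessarily the code that Excelsior JET generates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class PolymorphicLoopSketch {
    interface Shape { double area(); }

    static final class Circle implements Shape {
        final double r;
        Circle(double r) { this.r = r; }
        @Override public double area() { return Math.PI * r * r; }
    }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    static final class Triangle implements Shape {
        final double base, height;
        Triangle(double base, double height) { this.base = base; this.height = height; }
        @Override public double area() { return 0.5 * base * height; }
    }

    // Assumed mix: roughly 90% circles, so one target dominates the call site.
    static List<Shape> buildShapes(int n, long seed) {
        Random rnd = new Random(seed);
        List<Shape> shapes = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            int pick = rnd.nextInt(100);
            double d = 1.0 + rnd.nextDouble();
            if (pick < 90) shapes.add(new Circle(d));
            else if (pick < 95) shapes.add(new Square(d));
            else shapes.add(new Triangle(d, d));
        }
        return shapes;
    }

    // One call site, three possible targets: a polymorphic call.
    static double sumVirtual(List<Shape> shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    // Hand-written equivalent of a profile-guided transformation:
    // test for the dominant type, inline its body, otherwise dispatch as usual.
    static double sumGuarded(List<Shape> shapes) {
        double total = 0;
        for (Shape s : shapes) {
            if (s.getClass() == Circle.class) {
                Circle c = (Circle) s;
                total += Math.PI * c.r * c.r;   // inlined Circle.area()
            } else {
                total += s.area();              // rare types: virtual call
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Shape> shapes = buildShapes(1_000_000, 42L);
        System.out.println("virtual: " + sumVirtual(shapes));
        System.out.println("guarded: " + sumGuarded(shapes));
    }
}
```

Without a profile, a static compiler has no way of knowing that Circle is the class worth guarding for; that is the piece of information the execution profile supplies.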

To illustrate the above, we have written a microbenchmark that models a polymorphic collection. Compiled with PGO, it shows a roughly 8x performance boost compared to the default build.

OpenJDK timing included for reference.

Configuration    Time
JET 12           6.56s
JET 12 +pgo      0.84s
OpenJDK 8        1.27s

Of course, useful Java programs do not spend CPU time exclusively in polymorphic calls of tiny methods, so the effect of PGO on their overall performance characteristics is usually less significant. Still, in our tests PGO has improved the performance/throughput of a handful of real-world applications provided by our customers by factors ranging from 1.1x to 2.6x.

As of version 12, profile-guided optimization is supported by the following products:

  • 64-bit Excelsior JET, Enterprise Edition for Windows, OS X, and Linux
  • 64-bit Excelsior JET Embedded for Windows and Linux
  • Excelsior JET Embedded for Linux/ARM
The detailed instructions below cover the bundled tools, the Maven plugin, and the Gradle plugin.

To find out whether PGO may considerably improve the performance of your application, do the following:

Step 1. Collect an execution profile

Bundled tools:

Launch your natively compiled application with the jet.profiler property set, and run it through some performance-challenging scenarios, either manually or by supplying the necessary input data.

Setting System Properties - A Quick Refresher

The system property setting mechanism varies depending on application type and build settings.

Example 1 (single-app executable):

On Windows, use a .bat file:

rem profileMyApp.bat
set JETVMPROP=-Djet.profiler
MyApp.exe TonsOfInputData.txt

On macOS/Linux, use a shell script:

#!/bin/bash
export JETVMPROP=-Djet.profiler
./MyApp TonsOfInputData.txt

Example 2 (multi-app executable)

MyMultiApp -Djet.profiler MainClass42 TonsOfInputData.txt

For details, refer to the Excelsior JET User's Guide, Chapter "Runtime Considerations", section "Java System Properties".

Note: Applications run approximately 10% slower than usual when profiling is enabled.

On application exit, an AppName.jprof file will be created in the current directory.

Maven plugin:

First of all, make sure that you are using at least version 1.1.0 of the plugin:

    .  .  .
    <plugin>
        <groupId>com.excelsiorjet</groupId>
        <artifactId>excelsior-jet-maven-plugin</artifactId>
        <version>1.1.0</version>
    .  .  .

Invoke the jet:profile Maven goal:

mvn jet:profile

Upon application exit, the gathered profile (.jprof file) will be placed into the ${project.basedir}/src/main/jetresources directory.

Gradle plugin:

First of all, make sure that you are using at least version 1.1.0 of the plugin:

    .  .  .
buildscript {
    ext.jetPluginVersion = '1.1.0'
    .  .  .

Invoke the jetProfile Gradle task:

gradlew jetProfile

Upon application exit, the gathered profile (.jprof file) will be placed into the project.projectDir/src/main/jetresources directory.

Important: For PGO to be effective, it is paramount that you collect a profile representative of the future behaviour of your application in a production environment.

Step 2. Enable PGO and re-build the project

Bundled tools:

For projects created with a prior version of Excelsior JET:

Add the following two options to your project file, or to the command line of the Excelsior JET AOT compiler (jc):

-pgo+
-jprofile=AppName.jprof

and re-build your project.

For new projects:

By default, the JET Control Panel adds the above settings to all newly created projects and creates an empty AppName.jprof file in the build directory. All you need to do to enable PGO is replace that empty file with the one created in Step 1 and re-build your project.

In most cases, this step takes much less time than a full build, as the compiler only needs to process a few classes selected by the PGO engine.

Maven plugin:

Build the project as usual, for instance:

mvn jet:build

The plugin will enable PGO automatically upon detecting the presence of the .jprof file.

Gradle plugin:

Build the project as usual, for instance:

gradlew jetBuild

The plugin will enable PGO automatically upon detecting the presence of the .jprof file.

Step 3 (optional). Try applying PGO to platform classes as well

Your application may be spending a lot of CPU time in the code of Java Platform API classes, such as standard collections. Adding the platform classes to the scope of PGO could therefore improve the performance of your application further, and that improvement can be quite significant. Enabling the Global Optimizer will do exactly that.

Bundled tools:

You can enable the Global Optimizer on the Classes page of the JET Control Panel, or by adding the option -global+ to the project file or the jc command line.

Maven plugin:

To enable the Global Optimizer, add the following configuration parameter:

<globalOptimizer>true</globalOptimizer>

and rebuild your project.

Gradle plugin:

To enable the Global Optimizer, add the following configuration parameter:

globalOptimizer = true

and rebuild your project.

Note: There is no need to re-collect the execution profile as it already contains the necessary information for the platform classes.

From version 12, the Global Optimizer is available for all target platforms supported by Excelsior JET.

  1. If the file AppName.jprof already exists, it gets overwritten without a warning, and profile merging is not implemented yet. This means that an execution profile must be collected in a single session.
  2. You cannot supply more than one .jprof file to the compiler, because yes, profile merging is not implemented yet.
  3. The executables that were built with PGO enabled cannot themselves be run in the profiling mode.

The above limitations will be removed in the next major release of Excelsior JET.
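Because an existing AppName.jprof is overwritten without a warning, it may be worth saving a copy before starting another profiling session. The following is a minimal sketch of such a helper, assuming the profile sits in the current directory under the name AppName.jprof. The class name JprofBackup and the ".bak" naming scheme are hypothetical and not part of Excelsior JET.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class JprofBackup {
    // Copies the profile to "<name>.<stamp>.bak" next to the original.
    // Returns the path of the copy, or null if there is no profile to save.
    static Path backup(Path profile, long stamp) throws IOException {
        if (!Files.isRegularFile(profile)) {
            return null;
        }
        Path target = profile.resolveSibling(profile.getFileName() + "." + stamp + ".bak");
        return Files.copy(profile, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // The default file name is an assumption; pass the real one as an argument.
        Path profile = Paths.get(args.length > 0 ? args[0] : "AppName.jprof");
        Path saved = backup(profile, System.currentTimeMillis());
        System.out.println(saved == null
                ? "No profile found at " + profile
                : "Saved a copy as " + saved);
    }
}
```

Run it before each profiling session. The saved copies cannot be combined, since profile merging is not implemented, but any one of them can be restored and passed to the compiler.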

  1. The 32-bit versions of Excelsior JET for Windows and Linux support profiling, but do not support PGO. More precisely, they recognize the -pgo+ option, but it has no effect. This enables you to use the same project files for 32- and 64-bit builds, but only the latter would benefit from PGO.

This limitation will not be removed in the next major release, as that requires more than one development cycle.