Truffle OptimizedCallTarget profile counters do not decay #11045

@JohnTortugo

Description

Describe GraalVM and your environment

  • GraalVM version: mainline
  • CE or EE: CE, possibly EE as well
  • JDK version: JDK21 or Tip
  • OS and OS Version: macOS Catalina, Linux
  • Architecture: amd64, aarch64
  • The output of java -Xinternalversion:
OpenJDK 64-Bit Server VM (25-internal-adhoc.dcsl.labs-openjdk) for bsd-aarch64 JRE (25-internal-adhoc.dcsl.labs-openjdk), built on 2025-04-14T21:24:30Z with clang Apple LLVM 16.0.0 (clang-1600.0.26.3)

Describe the issue

  • Problem 1: The profile for a call target does not decay. Specifically, the CallCount and CallAndLoopCount fields are never decremented, so once a method is considered hot it will be considered hot forever.

    • Consequence: In an application that lives long enough, most methods will eventually become "Hot", which defeats the definition of what "Hot" is.
  • Problem 2: If a call target reached the hotness threshold at any point in the past, every time the call target is executed and there is no compiled version of it present in the code cache, Truffle will recompile the call target.

    • Consequence 1: HotSpot flushes an nmethod associated with a Truffle compilation from the code cache. The very next time the call target is invoked, Graal compiles the method and inserts it into the code cache again. In some cases this repeats in a cycle because the code cache is under pressure, though not necessarily “Full”.
    • Consequence 2: If the call target is considered Tier2 hot at the moment of its invocation, Truffle will compile the method at Tier 1 on its next invocation and then at Tier 2 on the subsequent invocation.
  • Problem 3: A concern that I have regarding the MaximumCompilations option is that, if the application runs long enough, many methods will reach this threshold and will never be compiled again until the application restarts. The counter is incremented on every compilation of a method, regardless of why the previous compiled version of the method became obsolete. In other words, we should have some policy to reset this counter.

    • Consequence 1: Application performance will degrade drastically over time because, as more and more methods reach the maximum compilation threshold, fewer and fewer methods are compiled.
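
To make problem 1 concrete, here is a toy model of a counter that is only ever incremented. It is not Truffle code: the class name `CumulativeHotness`, the method names, and the `HOT_THRESHOLD` value are all made up for illustration. It only shows that, without decay, call frequency stops mattering and any target that keeps being called eventually crosses the threshold.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical, not Truffle code) of a call counter that never decays. */
final class CumulativeHotness {
    // Hypothetical threshold, chosen only for the example.
    static final int HOT_THRESHOLD = 1_000;

    private final Map<String, Integer> callCount = new HashMap<>();

    /** Records one call and reports whether the target now counts as hot. */
    boolean recordCall(String target) {
        int count = callCount.merge(target, 1, Integer::sum);
        return count >= HOT_THRESHOLD;
    }

    /** A target is hot once its lifetime call count has reached the threshold. */
    boolean isHot(String target) {
        return callCount.getOrDefault(target, 0) >= HOT_THRESHOLD;
    }
}
```

In this model a target called once per minute becomes "hot" after 1,000 minutes and stays hot, exactly like a target called 1,000 times in its first second.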

Steps to reproduce the issue

  • There is an attached example that uses GraalJS to illustrate the issues mentioned above.
  • Just build and run the attached code with the latest version of Graal/GraalJS.
  • Apply this patch to the JDK's nmethod::purge method so that you can see the names of the methods being flushed from the code cache. It may not apply cleanly on different JDK versions but should be easy to adjust.
  • Run the program with a small enough code cache and heap configuration: -XX:+UseG1GC -Xms2g -Xmx2g -XX:ReservedCodeCacheSize=10M "-Xlog:codecache=debug:stdout:time"
    • HotSpot code cache flushing heuristics are dependent on GC cycles.

Expected behavior

  • Method profile information should decay over time - the hotness of a method is a temporal property, not a definite quality. A method may be hot now but not hot later on, etc.
  • A method shouldn't be blocked from compilation because it was evicted from the code cache a given number of times.
  • Methods shouldn't be recompiled on the next invocation after they are evicted from the HotSpot code cache.
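
One way to read "decay over time" in the first bullet is a counter that is periodically scaled down, so that only recent activity keeps a method above the threshold. The sketch below assumes that reading. `DecayingCounter`, the halving policy, and the idea of an external periodic `decay()` call are illustrative assumptions, not something Truffle provides.

```java
/** Hypothetical counter whose value is halved once per decay period. */
final class DecayingCounter {
    private int count;

    /** Called on every invocation of the profiled method. */
    void increment() {
        count++;
    }

    /** Called once per decay period by some external timer (an assumption). */
    void decay() {
        count >>>= 1; // halve, rounding down
    }

    int value() {
        return count;
    }

    /** Hot only while recent activity keeps the counter at or above the threshold. */
    boolean isHot(int threshold) {
        return count >= threshold;
    }
}
```

With this policy a method that stops being called drops below any positive threshold after a bounded number of decay periods, which matches the idea that hotness is temporal.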

Proposed Solution

I'd like to propose the following solutions to address these issues.

  • For problem 1 I propose that we reset the method profile counters when the method is evicted from the code cache by HotSpot. I know that Truffle currently doesn't know when an nmethod is evicted from the code cache. To address that, I'm proposing a change to JVMCI that makes it possible to identify why an InstalledCode was changed. With the proposed change we'll be able to tell whether an nmethod was evicted from the cache because it was "cold" or because it was "deoptimized".

  • For problem 2 I propose that we limit the number of compilations within a configurable time window, i.e., a maximum number of compilations per hour. I understand that this limit was added as a temporary measure but, to be honest, I think we should keep it in place in at least some form to make sure that the system uses a bounded amount of resources, i.e., that it doesn't enter a recompilation cycle at some random moment for whatever reason. Note that HotSpot has something similar - see here and here.

  • Problem 3 will be indirectly fixed by using the proposed solution to problem 1.
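
The two mechanisms proposed above could look roughly like the sketch below. This is a toy model under stated assumptions, not Truffle or JVMCI code: `InvalidationReason`, `CallTargetProfile`, `CompilationRateLimiter`, the threshold value, and every method name are hypothetical, and the real JVMCI change may expose the invalidation reason in a different form.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical reasons for which installed code stopped being valid. */
enum InvalidationReason { COLD_EVICTION, DEOPTIMIZATION }

/** Toy profile that resets its counters only when the code was evicted as cold. */
final class CallTargetProfile {
    static final int COMPILE_THRESHOLD = 1_000; // hypothetical value

    private int callCount;
    private int callAndLoopCount;
    private boolean compiled;

    /** Records a call; returns true if this call should trigger a compilation. */
    boolean onCall() {
        callCount++;
        callAndLoopCount++;
        if (!compiled && callAndLoopCount >= COMPILE_THRESHOLD) {
            compiled = true; // stands in for submitting a compilation
            return true;
        }
        return false;
    }

    /** Reacts to the runtime reporting that the compiled code is gone. */
    void onInvalidated(InvalidationReason reason) {
        compiled = false;
        if (reason == InvalidationReason.COLD_EVICTION) {
            // The method was cold: forget its history so it must become hot again.
            callCount = 0;
            callAndLoopCount = 0;
        }
    }
}

/** Sliding-window limit: at most maxCompilations per windowMillis. */
final class CompilationRateLimiter {
    private final int maxCompilations;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    CompilationRateLimiter(int maxCompilations, long windowMillis) {
        this.maxCompilations = maxCompilations;
        this.windowMillis = windowMillis;
    }

    /** Returns true and records the compilation if the budget allows it. */
    boolean tryAcquire(long nowMillis) {
        // Forget compilations that have fallen out of the window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= maxCompilations) {
            return false;
        }
        timestamps.addLast(nowMillis);
        return true;
    }
}
```

In this model a call target evicted as cold does not recompile on its very next invocation, while one invalidated by deoptimization keeps its counters and does. The limiter takes the current time as a parameter so that it can be exercised without a real clock; one instance per call target would bound that target's compilations per window.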

I'm looking forward to your comments on these issues and on the proposed changes to JVMCI!

Code snippet or code repository that reproduces the issue

import java.util.ArrayList;
import java.io.IOException;

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.PolyglotException;
import org.graalvm.polyglot.Source;
import org.graalvm.polyglot.Value;

public class SimpleCompilation {
  private static int method_idx_counter = 0;
  private static ArrayList<char []> global_list = new ArrayList<char []>();
  private static ArrayList<Value> keep_alive = new ArrayList<Value>();

  public static void main(String[] args) throws Exception {
    int wait_duration_ms = 30_000;
    int execution_id = 99;
    Engine engine = Engine.newBuilder("js")
      .allowExperimentalOptions(true)
      .option("engine.TraceCompilationDetails", "true")
      .option("engine.BackgroundCompilation", "false")
      .option("engine.DynamicCompilationThresholds", "false")
      .build();

    try (Context context = Context.newBuilder().engine(engine).build()) {
        System.out.println("\n\n\nWill trigger compilation of target method.");
        System.in.read();
        Value target = force_compile(11000, context);

        while (true) {
          System.out.println("\n\n\nWill trigger compilations & cache flushing of 10k methods.");
          System.in.read();
          for (int i=0; i<10000; i++) {
            force_compile(500, context);
          }
          allocate();

          System.out.println("\n\n\nSleeping for " + wait_duration_ms + "ms");
          Thread.sleep(wait_duration_ms);


          System.out.println("\n\n\nWill trigger compilation of target method.");
          System.in.read();
          target.execute(execution_id++);

          System.out.println("\n\n\nStarting next iteration.");
        }
    }
  }

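  // Defines a fresh JS function, calls it `iterations` times to trigger its compilation,
  // and keeps a reference to it in keep_alive so that it stays reachable.
  private static Value force_compile(int iterations, Context context) throws IOException {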
      String js = "function target" + method_idx_counter + " (param) { return 1; } ";

      Source source = Source.newBuilder("js", js, "test.js").build();

      context.eval(source);
      Value jsBindings = context.getBindings("js");
      Value method = jsBindings.getMember("target" + method_idx_counter);

      for (int j = 0; j < iterations; j++) {
        method.execute(0);
      }

      method_idx_counter++;
      keep_alive.add(method);
      return method;
  }

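  // Allocates a 256M-element char array (about 512 MB) and keeps it reachable to create
  // heap pressure; HotSpot's code cache flushing heuristics depend on GC cycles.
  private static void allocate() {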
    global_list.add(new char[1024 * 1024 * 256]);
  }
}
