8344332: (bf) Migrate DirectByteBuffer away from jdk.internal.ref.Cleaner #25289
👋 Welcome back kbarrett! A progress list of the required criteria for merging this PR into
❗ This change is not yet ready to be integrated.
@kimbarrett The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.
/reviewers 2 reviewer
@AlanBateman
 * questions.
 */

package jdk.internal.nio;
The implementation/internal classes for this area are in sun.nio (for historical reasons). Probably best to keep them together.
Good point. Done.
BTW, I'm fine with a suggestion to wait until JDK 26 before integrating this.
This change makes java.nio no longer use jdk.internal.ref.Cleaner to manage
native memory for Direct-X-Buffers. Instead it uses bespoke PhantomReferences
and a dedicated ReferenceQueue. This differs from PR 22165, which proposed to
use java.lang.ref.Cleaner.
This change is algorithmically similar to the two previous versions:
JDK-6857566 and JDK-8156500 (current mainline). The critical function is
Bits::reserveMemory(). For both of those versions and this change, a thread
calls that function and tries to reserve some space. If it fails, then it
keeps trying until all cleaners deactivated (cleared) by prior GCs have been
cleaned. If reservation still fails, then it invokes the GC to try to
deactivate more cleaners for cleaning. After that GC it keeps trying the
reservation and waiting for cleaning, with sleeps to avoid a spin loop,
eventually either succeeding or giving up and throwing OOME.
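The reservation loop described above can be sketched roughly as follows. This is an illustration only, not the code in Bits: the class ReserveSketch, the cleanOne and requestGc hooks, and the maxSleeps bound are all hypothetical names introduced for the example, and the doubling sleep interval is an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

/** Illustrative only: a reserve-with-retry loop in the spirit of the description above. */
final class ReserveSketch {
    private final long limit;                 // stands in for the direct memory limit
    private final AtomicLong reserved = new AtomicLong();
    private final BooleanSupplier cleanOne;   // hypothetical hook: run one pending cleaner, false if none
    private final Runnable requestGc;         // hypothetical hook: stands in for System.gc()
    private final int maxSleeps;              // hypothetical bound on backoff rounds

    ReserveSketch(long limit, BooleanSupplier cleanOne, Runnable requestGc, int maxSleeps) {
        this.limit = limit;
        this.cleanOne = cleanOne;
        this.requestGc = requestGc;
        this.maxSleeps = maxSleeps;
    }

    long reservedBytes() { return reserved.get(); }

    /** Called by a cleaner when it frees native memory. */
    void unreserve(long size) { reserved.addAndGet(-size); }

    private boolean tryReserve(long size) {
        while (true) {
            long used = reserved.get();
            if (size > limit - used) return false;
            if (reserved.compareAndSet(used, used + size)) return true;
        }
    }

    void reserve(long size) {
        if (tryReserve(size)) return;
        // Step 1: retry while cleaners deactivated by prior GCs are still being cleaned.
        while (cleanOne.getAsBoolean()) {
            if (tryReserve(size)) return;
        }
        // Step 2: a single explicit GC to deactivate more cleaners.
        requestGc.run();
        // Step 3: keep retrying and cleaning, sleeping to avoid a spin loop.
        long sleepMs = 1;
        int sleeps = 0;
        try {
            while (true) {
                if (tryReserve(size)) return;
                if (cleanOne.getAsBoolean()) continue;  // progress was made, retry at once
                if (sleeps++ >= maxSleeps) break;
                Thread.sleep(sleepMs);
                sleepMs *= 2;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();         // give up, but keep the interrupt status
        }
        throw new OutOfMemoryError("Cannot reserve " + size + " bytes");
    }
}
```

In this sketch the GC is requested exactly once per failing reserve call, which matches the "1 full GC before OOME" behavior claimed for this change.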
Retaining that algorithmic approach is one of the goals of this change, since
it has been successfully in use since JDK 9 (and was originally developed and
extensively tested in JDK 8).
The key to this approach is having a way to determine that deactivated
cleaners have been cleaned. JDK-6857566 accomplished this by having waiting
threads help the reference processor until there was no available work.
JDK-8156500 waits for the reference processor to quiesce, relying on its
immediate processing of cleaners. java.lang.ref.Cleaner doesn't provide a way
to do this, which is why this change rolls its own Cleaner-like mechanism from
the underlying primitives. Like JDK-6857566, this change has waiting threads
help with cleaning references. This was a potentially undesirable feature of
JDK-6857566, as arbitrary allocating threads were invoking arbitrary cleaners.
(Though by the time of JDK-6857566 the cleaners were only used by DBB, and
became internal-only somewhere around that time as well.) That's not a concern
here, as the cleaners involved are only from DBB, and we know what they look
like.
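As a rough illustration of a Cleaner-like mechanism built from the underlying primitives, the sketch below pairs a PhantomReference subclass with a dedicated ReferenceQueue and lets any caller help drain that queue. The names BufferCleanerSketch, register, and helpClean are hypothetical, and the concurrent set used to keep entries reachable is chosen only for brevity; it is not the CleanableList-style structure this change uses.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: a minimal Cleaner-like registry on PhantomReference + ReferenceQueue. */
final class BufferCleanerSketch {
    /** One registered cleanup. The action must not hold a reference to the referent. */
    static final class Entry extends PhantomReference<Object> {
        final Runnable action;
        Entry(Object referent, ReferenceQueue<Object> queue, Runnable action) {
            super(referent, queue);
            this.action = action;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps Entry objects strongly reachable until they have been cleaned.
    private final Set<Entry> live = ConcurrentHashMap.newKeySet();

    PhantomReference<Object> register(Object buffer, Runnable freeAction) {
        Entry e = new Entry(buffer, queue, freeAction);
        live.add(e);
        return e;
    }

    /**
     * Cleans at most one deactivated entry on the calling thread.
     * Returns false when no deactivated entry was available, which is the
     * signal a waiting allocator would use to stop helping.
     */
    boolean helpClean() {
        Reference<?> r = queue.poll();   // non-blocking
        if (r == null) return false;
        Entry e = (Entry) r;
        if (live.remove(e)) {            // guard against running an action twice
            e.action.run();
        }
        return true;
    }
}
```

Because only this registry's own entries ever reach its queue, a thread calling helpClean runs only cleanup actions of a known shape, which is the property the paragraph above relies on.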
As noted in the discussion of JDK-6857566, it's good to have DBB cleaning
being done off the reference processing thread, as it may be expensive and
slow down enqueuing other pending references. JDK-6857566 only did some of
that, and JDK-8156500 lost that feature. This change moves all of the DBB
cleaning off of the reference processing thread. (So does PR 22165.)
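To illustrate what cleaning off the reference processing thread can look like, here is a minimal sketch of a dedicated daemon thread that blocks on its own ReferenceQueue and runs cleanup actions there. The class name, the thread name "dbb-cleaner-sketch", and the Entry type are assumptions made for the example; the actual threading arrangement in this change may differ.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

/** Illustrative only: cleanup actions run on a dedicated thread, not the reference handler. */
final class DedicatedCleanerThreadSketch {
    static final class Entry extends PhantomReference<Object> {
        final Runnable action;
        Entry(Object referent, ReferenceQueue<Object> queue, Runnable action) {
            super(referent, queue);
            this.action = action;
        }
    }

    final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    /** Starts a daemon thread that drains the queue until interrupted. */
    Thread start() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Reference<?> r = queue.remove();   // blocks until an entry is enqueued
                    ((Entry) r).action.run();          // possibly slow work happens here
                }
            } catch (InterruptedException e) {
                // Interrupted: stop draining and let the thread exit.
            }
        }, "dbb-cleaner-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

The reference processing thread only has to enqueue the entry; the potentially expensive freeing then happens on the dedicated thread and does not delay enqueuing of other pending references.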
Neither JDK-6857566 nor this change is completely precise. For both, a thread
may find there is no available work while other threads have work in progress.
Making this change more precise seems to cost complexity and performance.
JDK-8156500 is precise in this respect, so we're losing that. But this
imprecision wasn't known to cause problems for JDK-6857566, and there hasn't
been any evidence of problems with this change either.
During the development of JDK-6857566 it was noticed that parallel cleaning
didn't seem to have much (if any) performance benefit. That seems to be true
for this change as well.
PR 22165 uses java.lang.ref.Cleaner to manage cleaning. That class doesn't
provide a good way to detect progress toward or completion of cleaning of
deactivated cleaners from prior GCs. So PR 22165 uses a somewhat clumsy and
unreliable mechanism (the canaries) to try to do that. A proposal for such
functionality was discussed (in PR 22165) but deemed (probably rightly so) too
intrusive. An unpublished alternative was less intrusive, but still might
raise questions. The change being proposed here avoids changing or using that
class, and performs at least as well.
Another issue with PR 22165 is that if we are indeed out of memory and on our
way to OOME, each allocating thread may come up against the slow path lock in
Bits::reserveMemory, and in turn perform 9 full GCs and then OOME. That seems
kind of pathological. For JDK-6857566, JDK-8156500, and this change, an
allocating thread only performs 1 full GC before OOME.
One issue with this change is that it incorporates a near-copy of the
CleanableList class from java.lang.ref.Cleaner. Possible future work would
merge the two into a common utility. There's another potential client for
this: java.desktop/share/classes/sun/java2d/Disposer.java. I tried using a
hashtable for this change (as with Disposer), but the CleanableList performed
significantly better.
A well-known issue with all of these approaches is -XX:+DisableExplicitGC. If
used, then the GCs to request reference processing don't happen. That will
likely lead to OOME, though the sleeps might provide an opportunity for
automatic GCs to occur, maybe sometimes dodging OOME that way.
https://mail.openjdk.org/pipermail/core-libs-dev/2013-October/021547.html
Thread for discussion of development of JDK-6857566
Testing: mach5 tier1-6
Many runs of new tests micro/org/openjdk/bench/java/nio/DirectByteBuffer{GC,Churn}
(thanks for those @shipilev), and jdk/java/nio/Buffer/DirectByteBufferAlloc
for various versions of this change.
The test java/nio/Buffer/DirectByteBufferAlloc.java can be run explicitly as a
benchmark. But the arguments suggested in that file cause the measurements to
be dominated by full GC times, swamping any other differences. Increasing the
value of -XX:MaxDirectMemorySize from 128m to 1024m provides a more useful
comparison.
Result of running that test with -XX:MaxDirectMemorySize=1024m, with other
options as suggested in the file, and comparing the periodic per thread
ms/allocation outputs, produces results like this:
Progress
Issue
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25289/head:pull/25289
$ git checkout pull/25289
Update a local copy of the PR:
$ git checkout pull/25289
$ git pull https://git.openjdk.org/jdk.git pull/25289/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25289
View PR using the GUI difftool:
$ git pr show -t 25289
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25289.diff
Using Webrev
Link to Webrev Comment