Skip to content

Introduce Runtime object allowng to detect outer runtime context #543

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

tnull
Copy link
Collaborator

@tnull tnull commented May 19, 2025

Closes #491

Instead of holding an Arc<RwLock<Option<Arc<tokio::runtime::Runtime>>>
and dealing with stuff like tokio::task::block_in_place at all
callsites, we introduce a Runtime object that takes care of the state
transitions, and allows to detect and reuse an outer runtime context.

We also adjust the with_runtime API to take a tokio::runtime::Handle
rather than an Arc<Runtime>.

We then also go ahead an reuse said Runtime object for VssStore.

(cc @andrei-21)

@ldk-reviews-bot
Copy link

ldk-reviews-bot commented May 19, 2025

👋 Thanks for assigning @andrei-21 as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@tnull tnull marked this pull request as draft May 19, 2025 15:10
@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch from d42f762 to 2f43096 Compare May 22, 2025 09:27
@tnull tnull changed the title Introduce Runtime object and allow to take a tokio::runtime::Handle Introduce Runtime object allowng to detect outer runtime context May 22, 2025
@tnull tnull marked this pull request as ready for review May 22, 2025 09:29
Copy link
Contributor

@andrei-21 andrei-21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested with my prototype, everything works ok.

src/runtime.rs Outdated

pub fn block_on<F: Future>(&self, future: F) -> Result<F::Output, RuntimeError> {
let handle = self.handle()?;
Ok(tokio::task::block_in_place(move || handle.block_on(future)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After our offline chat, I was looking up the code for block_in_place: https://github.com/tokio-rs/tokio/blob/17d8c2b29d94550f504d8fd76d8d8aaf66095864/tokio/src/runtime/scheduler/multi_thread/worker.rs#L358

It indeed uses a thread local context inside.

If you use this anyway, isn't the only correct way to always also use tokio::runtime::Handle::try_current() ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We spoke about it, but seeing this again it feels a bit undefined to mix things in this non-transparent way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned the case where there is no current runtime though. But that is detectable too, and in that case the call doesn't need to be wrapped in block_on?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code comments exactly explaining why this block_in_place -> block_on chaining is needed would be helpful too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, please take a look at the approach I just pushed: now went with preferring the Handle/spawned runtime everywhere, except in block_on where the outer context will take precedence.

@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch from 2f43096 to 2f32e15 Compare May 23, 2025 11:42
@tnull
Copy link
Collaborator Author

tnull commented May 23, 2025

Now pushed a new approach that initializes the Runtime in the builder, which allows to clean up a lot of the error cases. Kinda makes sense to go this way, as starting/stopping the runtime is out of our control anyways, if the user gives us a handle to use.

@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch 3 times, most recently from 3d7c8a6 to 48a42f1 Compare May 23, 2025 12:00
Copy link
Contributor

@joostjager joostjager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean up of the error cases indeed.

// during `block_on`, as this is the context `block_in_place` would operate on. So we try
// to detect the outer context here, and otherwise use whatever was set during
// initialization.
let handle = tokio::runtime::Handle::try_current().unwrap_or(self.handle());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is used here, shouldn't it be used everywhere (spawn, spawn_blocking)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very confused: below you argue against invisible capture, here you say we should invisibly capture the context from any entry point? No, I think generally using what was set on startup and only making an exception for block_on makes sense.

Copy link
Contributor

@joostjager joostjager May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not the same. In new the handle is saved and there's the implicit requirement for it to stay alive.

Here it isn't saved. It's just used if present, and otherwise we fall back to what the user configured and knows they need to keep alive.

Why make an exception for block_on? I'd think it is better to be consistent, and if it can't be avoid in block_on it should be done everywhere?

impl Runtime {
pub fn new() -> Result<Self, std::io::Error> {
let mode = match tokio::runtime::Handle::try_current() {
Ok(handle) => RuntimeMode::Handle(handle),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still makes me feel a bit uncomfortable that that handle is capture here invisibly, and that there's the implicit assumption that the user will keep this alive.

Removing it here, and letting the user pass it in themselves via the builder seems to be a more transparent way to signal that it is used and needs to remain available.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see your concern, but I'm not sure. I think the current behavior is the expected default behavior that just makes it work transparently for any upstream users. And we need to auto-detect for block_on at the very least anyways. Not sure if @TheBlueMatt has an opinion here, since he (as a user) requested auto-detection in #491?

log_error!(logger, "Failed to setup tokio runtime: {}", e);
BuildError::RuntimeSetupFailed
})?)
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code reuse opportunity

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NACK, IMO it's much more readable to keep short blocks like this inlined, instead of having the reader jump all over the file, losing context.

Copy link
Contributor

@joostjager joostjager May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree. I think eliminating the risk of future code changes not being applied in both places is more important than having the reader jump to a function. The function can have a descriptive name too such as build_runtime. I don't think it is bad for readability at all.

Your argument of jumping all over the file would also apply to code that isn't necessarily duplicated. Because also in that case, the reader would have to jump. I think that would lead to long function bodies, of which there are already too many in ldk, and those absolutely do not help with readability.

) -> Self {
Self { runtime, logger }
pub(crate) fn new(runtime: Arc<Runtime>) -> Self {
Self { runtime }
}
}

impl FutureSpawner for RuntimeSpawner {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be implemented directly onto the new Runtime?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thought, unfortunately no as GossipVerifier::new takes FutureSpawner by value, and of course we can't impl FutureSpawner for Arc<Runtime> as both sides of it would include non-local types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there really no way to do this? Implement an interface on a local type and then pass it to the other crate? Or would it require a different type on the LDK side for FutureSpawner?

Perhaps longer term moving Runtime to rust-lightning could be a direction too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there really no way to do this? Implement an interface on a local type and then pass it to the other crate?

There is a way to do this, which is the newtype pattern, which is essentially what we have.

Copy link
Collaborator Author

@tnull tnull May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or would it require a different type on the LDK side for FutureSpawner?

The easiest way to avoid this would be to have GossipVerifier::new take a Deref<Target = FutureSpawner> or Borrow<FutureSpawner>, but honestly a newtype isn't too bad in this case, IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I might, for the async kv store pr, use Deref<Target=FS> then already as a preparation. But ofc the wrapper is no big deal.

@tnull tnull requested a review from andrei-21 May 23, 2025 12:34
@tnull
Copy link
Collaborator Author

tnull commented May 23, 2025

I have tested with my prototype, everything works ok.

@andrei-21 Thanks! Mind reconfirming this works as expected with the new approach?

Copy link
Collaborator Author

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, seems switching VssStore over to use the same runtime has the integration test hang. Will need some more debugging.

@@ -1046,8 +1033,7 @@ impl Node {
/// Will also remove the peer from the peer store, i.e., after this has been called we won't
/// try to reconnect on restart.
pub fn disconnect(&self, counterparty_node_id: PublicKey) -> Result<(), Error> {
let rt_lock = self.runtime.read().unwrap();
if rt_lock.is_none() {
if !*self.is_running.read().unwrap() {
Copy link
Contributor

@joostjager joostjager May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope for this PR, but curious: we talked about the type-state pattern for Runtime and that it may not be ideal. Could the pattern work for Node though? So start returning a type that exposes the method, and that type being consumed by stop.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we had considered that previously, too. Maybe, although I'm not sure if we want to force this pattern on our users, i.e., they'd need to create something akin to an enum wrapper that could hold variants StoppedNode/StartedNode, or would pipe through different versions of the node in their app, depending on the node state. Also, in the future we'd like to handle (re-)starting on persistence failure for the users, and if they hold a StartedNode object, there is really no way to force them to drop it.

Copy link
Contributor

@joostjager joostjager May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I have no experience with the pattern. It looks pretty powerful in combination with Rust's type system. Don't know if the pipe through can be done with a wrapper around the state types. Restart could be a method on the StartedNode perhaps. Either way, this was just a side remark.

@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch 3 times, most recently from f65ad68 to 838c1bc Compare May 23, 2025 13:17
@andrei-21
Copy link
Contributor

I have tested with my prototype, everything works ok.

@andrei-21 Thanks! Mind reconfirming this works as expected with the new approach?

Tried again, seems to work as expected.

@tnull
Copy link
Collaborator Author

tnull commented May 23, 2025

Tried again, seems to work as expected.

Thanks again!

@ldk-reviews-bot
Copy link

🔔 1st Reminder

Hey @andrei-21! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch 6 times, most recently from 936982f to 01d3818 Compare May 26, 2025 12:34
tnull added 7 commits May 26, 2025 15:52
Previously, we'd reuse the `log` facade macros which would always log
the line number in `logger.rs`, not at the line number given at the
record.

Here, we rectify this by utilizing `log`'s `RecordBuilder` to craft a
record and directly giving it to the configured logger.
Instead of holding an `Arc<RwLock<Option<Arc<tokio::runtime::Runtime>>>`
and dealing with stuff like `tokio::task::block_in_place` at all
callsites, we introduce a `Runtime` object that takes care of the state
transitions, and allows to detect and reuse an outer runtime context.

We also adjust the `with_runtime` API to take a `tokio::runtime::Handle`
rather than an `Arc<Runtime>`.
.. which now gives us cleaner reuse/handling of outer runtime contexts,
cleanup on `Drop`, etc.
@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch 2 times, most recently from 59868b9 to 92d81c2 Compare May 26, 2025 14:02
@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch 7 times, most recently from 556f3cf to e8522d5 Compare May 27, 2025 08:33
@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch 2 times, most recently from 0cf45af to 15f300b Compare May 27, 2025 11:08
@tnull tnull force-pushed the 2025-05-allow-to-use-runtime-handle branch from 15f300b to 47207f5 Compare May 27, 2025 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto-detect existing tokio runtime?
4 participants