Execution timeouts without running in a separate thread #32
Hey, thanks for reading the paper. That paragraph is wrong in saying that you're forced to run in a separate thread, though in practice people do. Using fuel or instruction counting is not free, and it has a fixed overhead.

I am the author of libriscv, which is like a low-latency version of WebAssembly, although I never intended to compete with anything. In it, I implement execution timeouts through instruction counting, and the overhead can be made fairly low, but it is definitely measurable. Because of this there is a separate dispatch mode where instruction counting is not enabled, including for binary translation. I use that a lot for programs that are already behaving well and where there is no distrust (e.g. I wrote the program). I would say disabling it gives around a ~15% performance boost on real-world programs.

So, you would only use a fuel- or instruction-counting-based execution timeout if the run-time of your function call was less than the 15% overhead caused by having it enabled. With the thread-based solution you can interrupt the thread and get the sandbox, which is in fact running in the same address space as you, to stop running. If a really slow thread required 10us to receive your function-call task, you'd only have to run for 67us before the head start has been recovered (10us / 0.15 ≈ 67us). In a game engine I would struggle to find any scripting function that runs for 67us or more, so there it makes a lot of sense to target the lowest latency. But for data processing, nobody will enable something that incurs a 15% performance overhead.

Might as well also add that fuel- or instruction-counting-based timeouts cannot interrupt blocking calls made from system-call handlers / host functions. So there's a drawback in that if you're stuck blocking on something, you're really stuck. With a timer-based solution, the timer can keep interrupting until everything eventually unblocks and unravels.
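To make the fuel/instruction-counting trade-off concrete, here is a minimal sketch of the two dispatch modes described above, using a toy bytecode VM. The names (`Instr`, `run_counted`, `TimeoutError`, etc.) are illustrative and not libriscv's real API; the point is only that the counted loop pays a per-instruction check that the uncounted loop does not.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class Op { Add, Jump, Halt };
struct Instr { Op op; int64_t arg; };

struct TimeoutError : std::runtime_error {
    TimeoutError() : std::runtime_error("out of fuel") {}
};

// Counting dispatch: every instruction decrements a fuel budget.
// The decrement-and-branch on each iteration is the fixed overhead.
int64_t run_counted(const std::vector<Instr>& prog, uint64_t fuel) {
    int64_t acc = 0;
    size_t pc = 0;
    while (pc < prog.size()) {
        if (fuel-- == 0) throw TimeoutError{};
        const Instr& in = prog[pc];
        switch (in.op) {
            case Op::Add:  acc += in.arg; ++pc; break;
            case Op::Jump: pc = static_cast<size_t>(in.arg); break;
            case Op::Halt: return acc;
        }
    }
    return acc;
}

// Uncounted dispatch: the same loop without the fuel check,
// usable only for trusted, well-behaved programs.
int64_t run_uncounted(const std::vector<Instr>& prog) {
    int64_t acc = 0;
    size_t pc = 0;
    while (pc < prog.size()) {
        const Instr& in = prog[pc];
        switch (in.op) {
            case Op::Add:  acc += in.arg; ++pc; break;
            case Op::Jump: pc = static_cast<size_t>(in.arg); break;
            case Op::Halt: return acc;
        }
    }
    return acc;
}
```

An infinite loop such as `{{Op::Jump, 0}}` throws `TimeoutError` under `run_counted` but would hang `run_uncounted` forever, which is exactly why the counted mode exists for untrusted code.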
As for TinyKVM, it doesn't have to run in another thread, but in practice you want to, because KVM has to migrate its own kernel data structures if you start using it from another thread. In the Drogon implementation I try to avoid threads, but in the end, the pipelining from having more threads than load-balanced epoll cores means that I would probably have to use cooperative fiber-threads and some kind of scheduling logic if I wanted to avoid thread-task overheads. And I like the simplicity of the thread-task solution. So far, anyhow. If you don't have particular scaling needs, you can always just create every VM, or particular forks, on every epoll thread, and then call into the fork as needed. I believe that's a mode in the Drogon implementation, but it is impossible in Varnish due to the thread architecture.

The execution timeout in TinyKVM is quite interesting, actually. It uses the fact that Linux has a hidden feature in signals that lets you direct a signal to a specific thread, and then sets a timer which triggers any signal you like, really. Because that signal will cause a KVM_EXIT, you only have to set a thread-local boolean "oh no, we timed out" to true:

```cpp
thread_local bool timer_was_triggered = false;
```

Using that, you have a perfectly working timeout mechanism that has yet to fail, and whose overhead is only measured in time spent handling the signal. In any case, the execution timeout mechanism in TinyKVM can be improved, which will help reduce the fixed overhead of calling functions and resuming. But it can go no lower than the overhead of entering the VM itself, of course.

EDIT: The TCP streams in VMOD TinyKVM don't access a VM in another thread. They are calling directly from the epoll thread. It's only for Varnish frontend/backend requests that it makes sense to use a thread-task.
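The "hidden feature" being described is plausibly `timer_create()` with `SIGEV_THREAD_ID`, which delivers the timer's signal to one chosen thread rather than to the process. The sketch below is a hedged reading of that mechanism, not TinyKVM's actual code; it assumes Linux with glibc, and the signal number, function names, and 10 ms deadline are all arbitrary choices for illustration.

```cpp
#include <cassert>
#include <csignal>
#include <ctime>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid  // older glibc keeps this field internal
#endif

static thread_local volatile sig_atomic_t timer_was_triggered = 0;

extern "C" void on_timeout(int) {
    // In a VMM, delivery of this signal to the vCPU thread also forces a KVM_EXIT.
    timer_was_triggered = 1;
}

// Arms a 10 ms one-shot timer aimed at the calling thread, then spins
// (standing in for the guest running) until the signal lands.
bool run_with_timeout_demo() {
    struct sigaction sa {};
    sa.sa_handler = on_timeout;
    sigaction(SIGURG, &sa, nullptr);  // "any signal you like, really"

    struct sigevent sev {};
    sev.sigev_notify = SIGEV_THREAD_ID;
    sev.sigev_signo = SIGURG;
    sev.sigev_notify_thread_id = static_cast<pid_t>(syscall(SYS_gettid));

    timer_t timer;
    timer_create(CLOCK_MONOTONIC, &sev, &timer);

    struct itimerspec its {};
    its.it_value.tv_nsec = 10'000'000;  // 10 ms deadline
    timer_settime(timer, 0, &its, nullptr);

    while (!timer_was_triggered)
        ;  // busy-wait: the "guest" is still executing

    timer_delete(timer);
    return timer_was_triggered != 0;
}
```

The appeal of this design is that the hot path pays nothing: the cost only appears when a timeout actually fires.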
Just a thought after reading your excellent paper:
This isn't strictly accurate:
Wasmtime has a timeout mechanism that makes running on the same thread safe: https://docs.rs/wasmtime/latest/wasmtime/struct.Config.html#method.epoch_interruption
While with V8 one could avoid the thread-communication overhead by running on the main thread and calling `v8::Isolate::TerminateExecution()` with a timer from a watchdog thread: https://v8.github.io/api/head/classv8_1_1Isolate.html#ad212b2e0b66ff5d586cd79cfa0b555fb

So I wonder if TinyKVM implements a similar mechanism. I see you have already explored setting a timer in the virtual hardware here, but found it to be slow: https://stackoverflow.com/questions/68590696/timeout-for-kvm-userspace-guest
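The watchdog pattern suggested above can be sketched generically. Since this sketch does not embed V8, the terminate action is an arbitrary callback where a real embedder would call `isolate->TerminateExecution()`; the `Watchdog` class and its interface are inventions for illustration only.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Generic watchdog: runs `terminate` on its own thread if disarm() is
// not called within `timeout`. With V8, `terminate` would invoke
// isolate->TerminateExecution(); here it is any callable.
class Watchdog {
public:
    Watchdog(std::chrono::milliseconds timeout, std::function<void()> terminate)
        : thread_([this, timeout, terminate] {
              std::unique_lock<std::mutex> lk(m_);
              // Wait until disarmed, or the deadline passes.
              if (!cv_.wait_for(lk, timeout, [this] { return disarmed_; }))
                  terminate();  // deadline expired before disarm()
          }) {}

    // Call when the guarded work finishes in time.
    void disarm() {
        { std::lock_guard<std::mutex> lk(m_); disarmed_ = true; }
        cv_.notify_one();
    }

    ~Watchdog() { disarm(); thread_.join(); }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool disarmed_ = false;
    std::thread thread_;  // declared last so the lock state is ready first
};
```

Note that, as the earlier comment points out, this only pays off when the guarded call usually outlives the cost of dispatching to and signalling another thread.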
As I understand it, an interrupt will cause a VMExit, which will return control back to the VMM. This suggests that installing a no-op signal handler on the thread running the KVM guest will give back control: https://gist.github.com/mcastelino/df7e65ade874f6890f618dc51778d83a
Perhaps this could even just be set with `setitimer` or `SIGALRM`, so you wouldn't need a watchdog thread. Of course, for most cases using a thread pool will still be desirable, but the lower latency of running in-thread like this could be useful for sandboxing routing logic, where the thread-switching overhead might be noticeable.