-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Description
Is your feature request related to a problem?
There are situations in which uncaught exceptions (mainly Error
such as NoSuchMethodError
) can occur within interceptors and those can lead to hanging RPCs. Typically, applications catch Exception
rather than Error
or Throwable
. When the latter occurs, it results in hanging RPCs. Those particular errors are obviously bad and the application is unhealthy. However, these paths may only trigger in certain situations and resulting in a hanging RPC fully exasperates the issue worse. Allowing gRPC to properly close RPCs when these occur would at least avoid making things worse for the app.
Describe the solution you'd like
Ideally, gRPC would catch any Throwable
within the call chain including interceptors and result in a failed RPC and properly release any resources attached.
Describe alternatives you've considered
The alternative here is to have each app, each interceptor, etc catch Throwable
themselves to protect against these cases and abide by the ClientCall
and Listener
specs. This can easily be missed or forgotten.
Additional context
The following is sample blocking code and interceptor which produces the hanging effect.
int port = 12000;
NettyServerBuilder.forPort(port)
.addService(ServerInterceptors.intercept(new TestServiceGrpcImpl("testservice")))
.build()
.start();
ManagedChannel channel = NettyChannelBuilder.forAddress("localhost", port)
.usePlaintext()
.intercept(new ClientInterceptor() {
@Override
public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
return new ForwardingClientCall.SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
private ReqT request;
@Override
public void start(Listener<RespT> responseListener, Metadata headers) {
super.start(new ForwardingClientCallListener.SimpleForwardingClientCallListener<RespT>(responseListener) {
@Override
public void onClose(Status status, Metadata trailers) {
//normal flow is doing some work and calling super.onClose
//super.onClose(status, trailers);
// bad flow is an Error happens such as NoSuchMethodError
throw new NoSuchMethodError();
}
}, headers);
}
};
}
})
.build();
TestResponse response = TestServiceGrpc.newBlockingStub(channel).echoMessage(TestRequest.getDefaultInstance());
Assert.assertEquals("testservice", response.getName());
The successful path using super.onClose
will invoke UnaryStreamToFuture.onClose
as the parent delegate and blocking stub. This will mark the GrpcFuture
as completed with a value or exception. That closes the future and results in ClientCalls.blockingUnaryCall
finishing.
The failing path skips that and results in waitAndDrain
on the ThreadlessExecutor
. That calls that onClose
listener as expected, but runQuietly
catches Throwable
, logs it, swallows it, and returns. This results in the future never being closed and results in the blocking call hanging and never completing.
This is by design as onClose
states nothing should throw. However, Throwable
seems excessive to avoid and protecting against hanging RPCs would be beneficial. I'm not sure precisely where that would even occur to ensure the future is canceled or failed appropriately, so marking this as a feature request more than a bug.
For now, we are protecting by updating various interceptors to catch Throwable
in specific cases.