Description
Description of the bug:
We recently set up a Bazel remote cache on our local network for faster download speeds (previously it ran on a remote server).
We're facing a strange bug where, after some time, all requests to the remote cache server fail with a No route to host error. It appears that the gRPC connection is in a broken state, but Bazel keeps trying to use it, resulting in a failure every time. Restarting the Bazel daemon always solves the issue.
The root cause of this error is probably not directly related to Bazel (it may be a misconfiguration of the local network itself, a macOS bug, or a bug in Java networking, Netty, or gRPC). But since restarting the Bazel process solves the issue, it seems that the Bazel process could detect this problem and re-establish a "clean" connection.
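To check whether the cache host is still reachable from a fresh process while the running Bazel server keeps reporting the error, a small TCP probe can be run alongside it. This is only a diagnostic sketch and not part of Bazel: `probe` is a hypothetical helper name, and the address is the cache endpoint 192.168.21.190:1985 used in this report.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Try to open a TCP connection and describe the outcome as a short string."""
    try:
        # create_connection performs the connect() call that fails inside Bazel.
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except OSError as exc:
        # Covers refused connections, timeouts, and "No route to host".
        return f"unreachable: {exc.__class__.__name__}: {exc}"


if __name__ == "__main__":
    # Address of the remote cache from this report.
    print(probe("192.168.21.190", 1985))
```

If this prints "reachable" while Bazel still fails, that would point at state held inside the long-running Bazel process rather than at the network as a whole.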
Here is how the remote cache is configured (we're using Bazel 8.2.1):
build --experimental_remote_scrubbing_config=remote_scrub_config.txt
build --experimental_remote_cache_eviction_retries=10
build --experimental_remote_cache_lease_extension
build --experimental_remote_cache_ttl=1h
build --bes_results_url=http://192.168.21.190:8080/invocation/
build --bes_backend=grpc://192.168.21.190:1985
build --remote_cache=grpc://192.168.21.190:1985
build --noremote_upload_local_results
build --remote_timeout=5
build --remote_cache_compression
build --remote_download_toplevel
Here are some examples of the errors and stack traces that are logged when this happens:
ERROR: Failed to query remote execution capabilities: UNAVAILABLE: io exception
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.Status.asRuntimeException(Status.java:535)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:542)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985
Caused by: java.net.NoRouteToHostException: No route to host
at java.base/sun.nio.ch.Net.connect0(Native Method)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:91)
at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:88)
at java.base/java.security.AccessController.doPrivileged(Unknown Source)
at io.netty.util.internal.SocketUtils.connect(SocketUtils.java:88)
at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:322)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:248)
at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1342)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:653)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:632)
at io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:54)
at io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:157)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:655)
at io.netty.channel.AbstractChannelHandlerContext.access$1000(AbstractChannelHandlerContext.java:61)
at io.netty.channel.AbstractChannelHandlerContext$9.run(AbstractChannelHandlerContext.java:637)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
com.google.devtools.build.lib.remote.common.RemoteExecutionCapabilitiesException: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985
at com.google.devtools.build.lib.remote.GoogleChannelConnectionFactory$1.onFailure(GoogleChannelConnectionFactory.java:163)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1117)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:516)
at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:108)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:560)
at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:218)
at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:194)
at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:146)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:516)
at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:108)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:516)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:568)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:538)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:564)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:729)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:710)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985
Caused by: java.net.NoRouteToHostException: No route to host
at java.base/sun.nio.ch.Net.connect0(Native Method)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.Net.connect(Unknown Source)
at java.base/sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:91)
at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:88)
at java.base/java.security.AccessController.doPrivileged(Unknown Source)
at io.netty.util.internal.SocketUtils.connect(SocketUtils.java:88)
at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:322)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:248)
at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1342)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:653)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:632)
at io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:54)
at io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:157)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:655)
at io.netty.channel.AbstractChannelHandlerContext.access$1000(AbstractChannelHandlerContext.java:61)
at io.netty.channel.AbstractChannelHandlerContext$9.run(AbstractChannelHandlerContext.java:637)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
Here are the additional logs I get when using --remote_grpc_log:
{
  "metadata": {
    "toolDetails": {
      "toolName": "bazel",
      "toolVersion": "8.2.1"
    },
    "actionId": "capabilities",
    "toolInvocationId": "77987a72-ca5b-4aff-9455-f50b55c045eb",
    "correlatedInvocationsId": "a3491516-678b-499f-8b71-a466bd6ed0a7"
  },
  "status": {
    "code": 14,
    "message": "io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985"
  },
  "methodName": "build.bazel.remote.execution.v2.Capabilities/GetCapabilities",
  "details": {
    "getCapabilities": {
      "request": {}
    }
  },
  "startTime": "2025-05-02T22:49:15.248Z",
  "endTime": "2025-05-02T22:49:15.250Z"
}
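The timestamps in the entry above show that the call failed within a couple of milliseconds, far below the 5-second --remote_timeout, which suggests the connect attempt is rejected immediately rather than timing out. A minimal sketch of that arithmetic, assuming the log entry is available as JSON in the shape shown above (`parse_ts` and `elapsed_ms` are hypothetical helper names):

```python
import json
from datetime import datetime, timedelta

# Only the fields needed for the calculation, copied from the entry above.
ENTRY_JSON = """
{
  "status": {"code": 14},
  "startTime": "2025-05-02T22:49:15.248Z",
  "endTime": "2025-05-02T22:49:15.250Z"
}
"""


def parse_ts(value: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp such as 2025-05-02T22:49:15.248Z."""
    # Older Python versions do not accept the trailing "Z", so spell out the offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def elapsed_ms(entry: dict) -> int:
    """Return the whole milliseconds between startTime and endTime."""
    delta = parse_ts(entry["endTime"]) - parse_ts(entry["startTime"])
    return delta // timedelta(milliseconds=1)


if __name__ == "__main__":
    entry = json.loads(ENTRY_JSON)
    # Status code 14 is gRPC UNAVAILABLE.
    print(entry["status"]["code"], elapsed_ms(entry))  # prints "14 2"
```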
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
No response
Which operating system are you running Bazel on?
macOS 15.3.2 (24D81)
What is the output of bazel info release?
release 8.2.1
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
No response
Have you found anything relevant by searching the web?
This issue looks very similar to #20868, which was solved by #23150.
Any other information, logs, or outputs that you want to share?
No response