
gRPC connection stuck in "No route to host" state #25989

@ghugues

Description of the bug:

We recently set up a Bazel remote cache on our local network for faster download speeds (we previously used a remote server).
We're facing a strange bug where, after some time, all requests to the remote cache server fail with a No route to host error. It appears that the gRPC connection is in a broken state, but Bazel keeps trying to use it, so every request fails. Restarting the Bazel daemon always solves the issue.

The root cause of this error is probably not directly related to Bazel (it may be an incorrect configuration of the local network, a macOS bug, or a bug in Java networking, Netty, or gRPC). But since restarting the Bazel process solves the issue, it seems the Bazel process could detect this problem and re-establish a "clean" connection.
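To help tell a network-level failure apart from stale state inside the long-running Bazel server, a fresh process can attempt the same TCP connection while the error is occurring. This is a minimal sketch, assuming Python 3 is available on the affected machine; the `probe` helper is hypothetical and not part of Bazel:

```python
import socket
from typing import Tuple


def probe(host: str, port: int, timeout: float = 5.0) -> Tuple[bool, str]:
    """Try a plain TCP connect from a fresh process and report the outcome."""
    try:
        # A plain TCP connect, i.e. the step that fails in the stack traces.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers "No route to host", "Connection refused", timeouts, etc.
        return False, f"{type(exc).__name__}: {exc}"


# Example (address taken from the configuration in this report):
#   print(probe("192.168.21.190", 1985))
```

If the probe connects while the running Bazel server still reports No route to host, that would suggest the problem lies in state held by the long-lived process. If the probe fails too, the cause is more likely outside Bazel.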

Here is how the remote cache is configured (we're using Bazel 8.2.1):

build --experimental_remote_scrubbing_config=remote_scrub_config.txt
build --experimental_remote_cache_eviction_retries=10
build --experimental_remote_cache_lease_extension
build --experimental_remote_cache_ttl=1h

build --bes_results_url=http://192.168.21.190:8080/invocation/
build --bes_backend=grpc://192.168.21.190:1985
build --remote_cache=grpc://192.168.21.190:1985
build --noremote_upload_local_results
build --remote_timeout=5
build --remote_cache_compression
build --remote_download_toplevel

Here are some examples of the errors and stack traces that are logged when this happens:

ERROR: Failed to query remote execution capabilities: UNAVAILABLE: io exception
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
	at io.grpc.Status.asRuntimeException(Status.java:535)
	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:542)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985
Caused by: java.net.NoRouteToHostException: No route to host
	at java.base/sun.nio.ch.Net.connect0(Native Method)
	at java.base/sun.nio.ch.Net.connect(Unknown Source)
	at java.base/sun.nio.ch.Net.connect(Unknown Source)
	at java.base/sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
	at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:91)
	at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:88)
	at java.base/java.security.AccessController.doPrivileged(Unknown Source)
	at io.netty.util.internal.SocketUtils.connect(SocketUtils.java:88)
	at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:322)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:248)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1342)
	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:653)
	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:632)
	at io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:54)
	at io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:157)
	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:655)
	at io.netty.channel.AbstractChannelHandlerContext.access$1000(AbstractChannelHandlerContext.java:61)
	at io.netty.channel.AbstractChannelHandlerContext$9.run(AbstractChannelHandlerContext.java:637)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)
com.google.devtools.build.lib.remote.common.RemoteExecutionCapabilitiesException: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985
	at com.google.devtools.build.lib.remote.GoogleChannelConnectionFactory$1.onFailure(GoogleChannelConnectionFactory.java:163)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1117)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:516)
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:108)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
	at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:560)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:218)
	at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:194)
	at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:146)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:516)
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:108)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
	at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1004)
	at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:767)
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:516)
	at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:568)
	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:538)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:564)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:729)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:710)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985
Caused by: java.net.NoRouteToHostException: No route to host
	at java.base/sun.nio.ch.Net.connect0(Native Method)
	at java.base/sun.nio.ch.Net.connect(Unknown Source)
	at java.base/sun.nio.ch.Net.connect(Unknown Source)
	at java.base/sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
	at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:91)
	at io.netty.util.internal.SocketUtils$3.run(SocketUtils.java:88)
	at java.base/java.security.AccessController.doPrivileged(Unknown Source)
	at io.netty.util.internal.SocketUtils.connect(SocketUtils.java:88)
	at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:322)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:248)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1342)
	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:653)
	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:632)
	at io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:54)
	at io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:157)
	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:655)
	at io.netty.channel.AbstractChannelHandlerContext.access$1000(AbstractChannelHandlerContext.java:61)
	at io.netty.channel.AbstractChannelHandlerContext$9.run(AbstractChannelHandlerContext.java:637)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Unknown Source)

Here are the additional logs I get when using --remote_grpc_log:

{
  "metadata": {
    "toolDetails": {
      "toolName": "bazel",
      "toolVersion": "8.2.1"
    },
    "actionId": "capabilities",
    "toolInvocationId": "77987a72-ca5b-4aff-9455-f50b55c045eb",
    "correlatedInvocationsId": "a3491516-678b-499f-8b71-a466bd6ed0a7"
  },
  "status": {
    "code": 14,
    "message": "io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /192.168.21.190:1985"
  },
  "methodName": "build.bazel.remote.execution.v2.Capabilities/GetCapabilities",
  "details": {
    "getCapabilities": {
      "request": {}
    }
  },
  "startTime": "2025-05-02T22:49:15.248Z",
  "endTime": "2025-05-02T22:49:15.250Z"
}
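The log entry can be read mechanically to see what kind of failure it records. The sketch below uses a trimmed copy of the entry above (only the fields it needs) and a small subset of the gRPC status codes; the variable and helper names are illustrative assumptions:

```python
import json
from datetime import datetime, timedelta

# Trimmed copy of the log entry above (only the fields used here).
entry = json.loads("""
{
  "status": {"code": 14},
  "methodName": "build.bazel.remote.execution.v2.Capabilities/GetCapabilities",
  "startTime": "2025-05-02T22:49:15.248Z",
  "endTime": "2025-05-02T22:49:15.250Z"
}
""")

# Subset of the canonical gRPC status codes.
GRPC_STATUS_NAMES = {0: "OK", 4: "DEADLINE_EXCEEDED", 14: "UNAVAILABLE"}


def parse_ts(value: str) -> datetime:
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Dividing two timedeltas yields a float number of milliseconds.
elapsed_ms = (parse_ts(entry["endTime"]) - parse_ts(entry["startTime"])) / timedelta(milliseconds=1)
status_name = GRPC_STATUS_NAMES.get(entry["status"]["code"], "NOT_IN_THIS_SUBSET")
print(f'{entry["methodName"]}: {status_name} after {elapsed_ms:.0f} ms')
```

The call ends with UNAVAILABLE after about 2 ms, far below the 5-second --remote_timeout. That is consistent with the connect attempt being rejected immediately rather than timing out.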

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

macOS 15.3.2 (24D81)

What is the output of bazel info release?

release 8.2.1

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

This issue looks very similar to #20868, which was solved by #23150.

Any other information, logs, or outputs that you want to share?

No response

Metadata

Labels: P2, team-Remote-Exec, type: bug
