
TCP doesn't terminate gracefully if node is down #1865

@anupamdialpad
What happened?

Before a pod terminates we make it unready so that new connections don't get routed to it. From that point on, only the nodes that already NATed the Service ExternalIP to the pod IP still have the pod IP entry in their IPVS tables. If, during this window, the node that did the NAT of the ExternalIP to the pod goes down, there is no way left to reach the terminating pod.
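
The "unready before termination" window described above doesn't need anything exotic to reproduce; a readiness probe plus a preStop hook is enough. The Deployment below is only an illustrative sketch (image, labels, probe path and sleep duration are assumptions, not our actual manifest):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: debian-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: debian-server            # assumed label, for illustration only
  template:
    metadata:
      labels:
        app: debian-server
    spec:
      terminationGracePeriodSeconds: 120
      containers:
        - name: server
          image: debian:bookworm     # assumed; any image with netcat works
          command: ["/bin/sh", "-c", "touch /tmp/ready; while true; do nc -lv 0.0.0.0 8099; done"]
          ports:
            - containerPort: 8099
          readinessProbe:            # pod is "ready" only while /tmp/ready exists
            exec:
              command: ["cat", "/tmp/ready"]
            periodSeconds: 5
          lifecycle:
            preStop:                 # drop readiness first, keep serving existing connections
              exec:
                command: ["/bin/sh", "-c", "rm -f /tmp/ready && sleep 90"]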

What did you expect to happen?

Even if other nodes go down, as long as the pod has not actually terminated there should still be a way to reach it.

How can we reproduce the behavior you experienced?

  1. Create a cluster with 2 nodes that sit in two different regions.
  2. The Service has DSR and Maglev (mh) enabled (annotation excerpt below; a fuller example manifest follows it):
apiVersion: v1
kind: Service
metadata:
  annotations:
    kube-router.io/service.dsr: "tunnel"
    kube-router.io/service.scheduler: "mh"
    kube-router.io/service.schedflags: "flag-1,flag-2"
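
For completeness, a full Service manifest consistent with the kubectl output in the next step would look roughly like this (the selector is an assumption; name, external IP and port are taken from the outputs in this report):

apiVersion: v1
kind: Service
metadata:
  name: debian-server-lb
  annotations:
    kube-router.io/service.dsr: "tunnel"
    kube-router.io/service.scheduler: "mh"
    kube-router.io/service.schedflags: "flag-1,flag-2"   # shown by IPVS as mh-fallback,mh-port
spec:
  type: ClusterIP
  externalIPs:
    - 199.27.151.10
  selector:
    app: debian-server        # assumed label, matching the sketch above
  ports:
    - port: 8099
      targetPort: 8099
      protocol: TCP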
  3. There are 3 pods behind this service. All the pods are running on eqx-sjc-kubenode1-staging:
root@gce-del-km-staging-anupam:~/anupam/manifests $ kubectl get svc,endpoints
NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP     PORT(S)    AGE
service/debian-server-lb   ClusterIP   192.168.97.188   199.27.151.10   8099/TCP   6d7h

NAME                         ENDPOINTS                                      AGE
endpoints/debian-server-lb   10.36.0.3:8099,10.36.0.5:8099,10.36.0.6:8099   6d7h

root@gce-del-km-staging-anupam:~/anupam/manifests $ kubectl get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE    IP              NODE
debian-server-8b5467777-cbwt2   1/1     Running   0          18m    10.36.0.6       eqx-sjc-kubenode1-staging 
debian-server-8b5467777-vts6l   1/1     Running   0          2d5h   10.36.0.3       eqx-sjc-kubenode1-staging
debian-server-8b5467777-wxfrv   1/1     Running   0          19m    10.36.0.5       eqx-sjc-kubenode1-staging 
  4. IPVS entries are successfully applied by kube-router on both nodes (see the note after the two outputs):
root@eqx-sjc-kubenode1-staging:~ $ ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn   
TCP  192.168.97.188:8099 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Masq    1      0          0         
  -> 10.36.0.5:8099               Masq    1      0          0         
  -> 10.36.0.6:8099               Masq    1      0          0         
FWM  3754 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Tunnel  1      0          0         
  -> 10.36.0.5:8099               Tunnel  1      0          0         
  -> 10.36.0.6:8099               Tunnel  1      0          0 

root@tlx-dal-kubenode1-staging:~ $ ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn       
TCP  192.168.97.188:8099 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Masq    1      0          0         
  -> 10.36.0.5:8099               Masq    1      0          0         
  -> 10.36.0.6:8099               Masq    1      0          0         
FWM  3754 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Tunnel  1      0          0         
  -> 10.36.0.5:8099               Tunnel  1      0          0         
  -> 10.36.0.6:8099               Tunnel  1      1          0   
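
The FWM 3754 entry above is the fwmark-based virtual service kube-router creates for the DSR external IP. To double-check that mapping on a node, something along these lines can be used; the exact rule layout depends on the kube-router version, so treat this as a rough pointer rather than the authoritative place to look:

# Which mangle rule marks traffic for the external IP (rule layout varies by version)
iptables -t mangle -S PREROUTING | grep 199.27.151.10

# List just the fwmark-based IPVS service and its tunnel destinations
ipvsadm -L -n -f 3754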
  5. In all 3 pods, start a TCP server on port 8099 using nc -lv 0.0.0.0 8099.
  6. From a client that is closer to tlx-dal-kubenode1-staging, open a session using nc <service-ip> 8099.
  7. Make one pod unready. This leaves the pod IP entry in IPVS only on tlx-dal-kubenode1-staging (at weight 0, kept for the existing connection); one way to flip a pod to unready is sketched after the outputs below:
NAME                            READY   STATUS    RESTARTS   AGE    IP              NODE
debian-server-8b5467777-cbwt2   0/1     Running   0          18m    10.36.0.6       eqx-sjc-kubenode1-staging 
debian-server-8b5467777-vts6l   1/1     Running   0          2d5h   10.36.0.3       eqx-sjc-kubenode1-staging
debian-server-8b5467777-wxfrv   1/1     Running   0          19m    10.36.0.5       eqx-sjc-kubenode1-staging 

root@tlx-dal-kubenode1-staging:~ $ ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn       
TCP  192.168.97.188:8099 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Masq    1      0          0         
  -> 10.36.0.5:8099               Masq    1      0          0         
  -> 10.36.0.6:8099               Masq    1      0          0         
FWM  3754 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Tunnel  1      0          0         
  -> 10.36.0.5:8099               Tunnel  1      0          0         
  -> 10.36.0.6:8099               Tunnel  0      1          0   

root@tlx-dal-kubenode1-staging:~/anupam/kr-ecv $ ipvsadm -Lcn 
IPVS connection entries
pro expire state       source             virtual            destination
TCP 14:58  ESTABLISHED 103.35.125.24:41876 199.27.151.10:8099 10.36.0.6:8099

root@eqx-sjc-kubenode1-staging:~/anupam/kr-ecv $ ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn  
TCP  192.168.97.188:8099 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Masq    1      0          0         
  -> 10.36.0.5:8099               Masq    1      0          0         
FWM  3754 mh (mh-fallback,mh-port)
  -> 10.36.0.3:8099               Tunnel  1      0          0         
  -> 10.36.0.5:8099               Tunnel  1      0          0   
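
One way to flip the pod to unready and watch the drain (the probe path follows the illustrative Deployment sketch near the top of this report, so it is an assumption):

# Break the readiness probe inside the pod (assumed /tmp/ready path from the sketch above)
kubectl exec debian-server-8b5467777-cbwt2 -- rm -f /tmp/ready

# Watch the endpoint drop out of the Service
kubectl get endpoints debian-server-lb -w

# On each node, watch the real-server list; the node holding the connection keeps 10.36.0.6 at weight 0
watch -n1 'ipvsadm -L -n'

# On the node that NATed the client, the tracked connection entry for the terminating pod
ipvsadm -Lcn | grep 10.36.0.6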
  8. Shut down tlx-dal-kubenode1-staging. Now the connection is completely broken.
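
To confirm where the client's packets end up after the shutdown, a quick capture on the surviving node helps (illustrative only; the client and external IPs are the ones from the outputs above):

# On eqx-sjc-kubenode1-staging: do packets for the old flow still arrive, and does anything answer?
tcpdump -ni any 'host 103.35.125.24 and port 8099'

# Does the surviving node have a connection entry for this flow at all?
ipvsadm -Lcn | grep 103.35.125.24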

System Information (please complete the following information)

  • Kube-Router Version (kube-router --version): 2.5.0, built on 2025-02-14T20:20:43Z, go1.23.6
  • Kube-Router Parameters:
--kubeconfig=/usr/local/kube-router/kube-router.kubeconfig 
--run-router=true 
--run-firewall=true 
--run-service-proxy=true 
--v=3 
--peer-router-ips=103.35.124.1 
--peer-router-asns=65322 
--cluster-asn=65321 
--enable-ibgp=false 
--enable-overlay=false 
--bgp-graceful-restart=true 
--bgp-graceful-restart-deferral-time=30s 
--bgp-graceful-restart-time=5m 
--advertise-external-ip=true 
--ipvs-graceful-termination 
--runtime-endpoint=unix:///run/containerd/containerd.sock 
--enable-ipv6=true 
--routes-sync-period=1m0s 
--iptables-sync-period=1m0s 
--ipvs-sync-period=1m0s 
--hairpin-mode=true 
--advertise-pod-cidr=true
  • Kubernetes Version (kubectl version): 1.29.14
  • Cloud Type: on premise
  • Kubernetes Deployment Type: manual
  • Kube-Router Deployment Type: on host
  • Cluster Size: 2 nodes
  • kernel version: 5.10.0-34-amd64
