Skip to content

feat: convert execs to ip to netlink calls #1697

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

aauren
Copy link
Collaborator

@aauren aauren commented Jul 5, 2024

Not making direct exec calls to user binary interfaces has long been a principle of kube-router. When kube-router was first coded, the netlink library was missing significant features that forced us to exec out. However, now netlink seems to have most of the functionality that we need.

This converts all of the places where we can use netlink to use the netlink functionality.

The current state of this PR is untested and still needs to undergo significant testing:

  • Ensure IPv4 routes are getting populated correctly
  • Ensure IPv4 source routing is being added to custom table
  • Ensure IPv6 routes are getting populated correctly
  • Ensure IPv6 source routing is being added to custom table
  • Ensure IPv4 Service VIPs get added to the dummy interface
  • Ensure IPv6 Service VIPs get added to the dummy interface
  • Ensure DSR works
  • Ensure ipip encapsulation works
  • Ensure fou encapsulation works

@aauren aauren force-pushed the migrate_to_netlink_library branch from 38733d4 to 7a58331 Compare July 5, 2024 20:36
Copy link

github-actions bot commented Sep 4, 2024

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Sep 4, 2024
@aauren aauren added override-stale Don't allow automatic management of stale issues / PRs and removed Stale labels Sep 8, 2024
@aauren aauren force-pushed the migrate_to_netlink_library branch 2 times, most recently from 4657546 to cc734e5 Compare May 6, 2025 22:14
@aauren aauren force-pushed the migrate_to_netlink_library branch from 3a4526c to 08f07c7 Compare May 10, 2025 21:00
aauren added 11 commits June 20, 2025 16:41
Not making direct exec calls to user binary interfaces has long been a
principle of kube-router. When kube-router was first coded, the netlink
library was missing significant features that forced us to exec out.
However, now netlink seems to have most of the functionality that we
need.

This converts all of the places where we can use netlink to use the
netlink functionality.
The rt_tables list is already in an ordered form in terms of priority.
Once one is found, it should be considered the optimal one and stop
looking for additional tables.
Previously we were accidentally deleting all routes that were found,
this mimics the previous functionality better by only deleting external
IPs that were found in the externalIPRouteTable that are no longer in
the activeExternalIPs map.

Also improves logging around any routes that are deleted as this is
likely of interest to all kube-router administrators.
In order for a local route to be valid it needs to have the scope set to
host. When we were executing ip commands iproute2 just did this for us
to make the command accurate. Now that we're communicating with the
netlink socket, we need to do this conversion for ourselves.

Without this we get an error that says "invalid argument" from the
netlink subsystem. But if the route isn't local, then most of the
routing logic for services doesn't work correctly because it acts upon
external traffic as well as local traffic which isn't correct.
It has proven to be tricky to insert new rules without calling the
designated NewRule() function from the netlink library. Usually attempts
will fail with an operation not supported message.

This improves the reliability of rule insertion.
Consolidate IP utility functions into a new file and add proper unit
testing. Additionally consolidate logic and references to default route
subnets.
When ip rules are evaluated in the netlink library, default routes for
src and dst are equated to nil. This makes it difficult to evaluate
them and requires additional handling in order for them.

I filed an issue upstream so that this could potentially get fixed:
vishvananda/netlink#1080 however if it doesn't
get resolved, this should allow us to move forward.
Removes repeated logic of calculating IP address subnets for single
subnet hosts and consolidates it in one place.
It used to be when we were using iproute2's CLI we needed to have the
fwmark as a hex number so we were passing it as a string in that format.

However, now that we use the netlink library directly, we already have
the fwmark in the condition that we need it. So instead of doing all of
these string <-> int conversions, lets just keep this simpler.
Instead of deleting and just hoping for the best, this change makes it
so that we check first whether or not a route exists. This helps to
reduce needless warnings that the user receives and is just all around
more accurate.
@aauren aauren force-pushed the migrate_to_netlink_library branch from 08f07c7 to b0246ef Compare June 20, 2025 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
override-stale Don't allow automatic management of stale issues / PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant