IPIP-0476: Delegated Routing DHT Closest Peers API #476
Adds a new HTTP endpoint that can be used to request records for the closest peers to a given key that the routing implementation knows about. The use-case for this is browser nodes performing random walks to find peers that they can make a circuit relay reservation on, without having to be DHT clients to perform the walk which can be undesirable given all the connection/processing overhead that entails.
src/routing/http-routing-v1.md (outdated)
## Peer Routing API
### `GET /routing/v1/closest-peers/{peer-id}?[closerThan]&[count]`
@achingbrain I think we could prototype this in https://github.com/ipfs-shipyard/someguy and deploy to https://delegated-ipfs.dev.
Priority-wise, and for the purpose of the IPIP we will write for this at some point: does the utility here stop at making it viable for browsers to find Circuit Relay servers without being a DHT client?
Would it be useful to have an API where the entire hash space could be queried for the closest value, not just PeerIDs?
I wonder if this could be generalized into a delegated /routing/v1/closest/{cid}?[than]&[count] where behind the scenes we prefix and turn PeerIDs and CIDs into Amino DHT hashes.
> does the utility here stop at finding Circuit Relay servers viable in browsers without being a DHT client?
No, there are other uses - you may wish to use this to populate your routing tables, or to find peers that can host provider records. It would be useful to perform these operations from environments that have limited resources or cannot open many concurrent connections, for example.
> Would it be useful to have API where the entire hash space could be queried for closest value, not just PeerIDs?
Yes, plus here the PeerIDs are represented as CIDs, so the above is something that can be performed.
To make this clearer it might be better specced as {cid} rather than {peer-id} I guess.
> The use-case for this is browser nodes performing random walks to find peers that they can make a circuit relay reservation on, without having to be DHT clients
> does the utility here stop at finding Circuit Relay servers
Note: In many cases where you'd want relay servers you'd also want peer routing (i.e. finding your address based on your peerID), which in the case of Amino means having recently pinged the closest peers to your peerID in XOR space. Ideally one of those (20) nodes is also acting as your relay, which means the number of connections and queries you'd need is even smaller. So we don't really want a random walk; we want to talk to specific peers when possible and only broaden the search if we have to.
In this case yes, they can query for their own peer id as a CID to get better results.
src/routing/http-routing-v1.md (outdated)
- `closerThan` is an optional [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md) represented as a CIDv1 encoded with `libp2p-key` codec.
  - Returned peer records must be closer to `peer-id` than `closerThan`.
  - If omitted the routing implementation should use its own [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md).
If the query is expected to return the absolute closest peers to the target CID, the only use case I can see for closerThan is to find peers that are under eclipse attacks, but there are no current plans to implement it. Otherwise it will just limit the number of returned peers.
If there are other reasons to keep this parameter, I would recommend against using the receiver's (routing implementation's) own peer ID when no value is provided. The default should be no filter at all: return the count closest peers. Otherwise, if a peer is looking for the 20 closest peers to some CID, and it reaches the 5th closest peer, that peer will respond with a list of only 4 peers (because all the others are further away than the 5th closest peer). So in order to get the 20 closest peers, the requester would have to come up with a closerThan value that is far enough away that at least 20 peers are closer to the target CID.
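To make the truncation effect concrete, here is a minimal Python sketch. It assumes keys live in a 256-bit space obtained by SHA-256 hashing and that closeness is XOR distance, as in Kademlia. The helper names (`kad_key`, `closest`) and the toy peer set are made up for illustration and are not taken from the spec or from any implementation.

```python
import hashlib
from typing import List, Optional


def kad_key(raw: bytes) -> int:
    """Map arbitrary bytes to a 256-bit integer key (assumption: SHA-256 keyspace)."""
    return int.from_bytes(hashlib.sha256(raw).digest(), "big")


def closest(target: int, peers: List[int], count: int,
            closer_than: Optional[int] = None) -> List[int]:
    """Return up to `count` peers sorted by XOR distance to `target`.

    If `closer_than` is given, only peers strictly closer to `target`
    than `closer_than` are kept.
    """
    ranked = sorted(peers, key=lambda p: p ^ target)
    if closer_than is not None:
        limit = closer_than ^ target
        ranked = [p for p in ranked if (p ^ target) < limit]
    return ranked[:count]


# Toy network of 50 peers and one target key (all hypothetical).
target = kad_key(b"some-cid")
peers = [kad_key(f"peer-{i}".encode()) for i in range(50)]
ranked = sorted(peers, key=lambda p: p ^ target)

# The responder happens to be the 5th closest peer and uses its own key
# as the default `closerThan`.
fifth_closest = ranked[4]
with_default = closest(target, peers, 20, closer_than=fifth_closest)
without_filter = closest(target, peers, 20)

print(len(with_default), len(without_filter))
```

With the responder's own key as the default filter, only the four peers that are closer than the responder survive, even though 20 were requested; with no filter the full 20 are returned.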
src/routing/http-routing-v1.md (outdated)
  - Returned peer records must be closer to `peer-id` than `closerThan`.
  - If omitted the routing implementation should use its own [Peer ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md).
- `count` is an optional number that specifies how many peer records the requester desires.
  - Minimum 1, maximum 100, default 20.
If the closerThan parameter is set, and count is omitted, the default value should be the maximum, otherwise some peers closer than closerThan may be filtered out. Both parameters could however be combined if both are set.
I think my assumption here is that the peers would be sorted by closeness value before the limit was applied.
Would this not be the case?
> I think my assumption here is that the peers would be sorted by closeness value before the limit was applied.
Yes, that is correct. My point is that both closerThan and count can limit the number of peers that are returned, so the parameters may conflict with each other.
E.g. peer-id=key0&closerThan=key1 should return all the peers between key0 and key1, but if there are more than the default 20 peers between key0 and key1, this query will only return the 20 closest, and not all of them up to the maximum.
If neither count nor closerThan is set, the count should default to 20. However, if closerThan is set and count is not, the count should default to the maximum of 100. This allows returning as many peers as possible between key0 and key1, and if the caller wants to set a limit, they should combine closerThan and count.
Also, when count is set and closerThan is not, using the node's peer ID as the default closerThan can be problematic. E.g. I want to provide a CID to the 20 closest nodes, and I am the 12th closest peer to this CID. I will receive only 11 closer nodes, and will be unable to find the 20 closest nodes. Hence, if count is set and closerThan is not, the closerThan default should be the opposite of the target peer-id (peer-id XOR 111...11). Maybe the opposite of the target peer-id is a better default for closerThan than the requester's peer ID, allowing the default response to contain a constant number of peers.
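The defaulting rules proposed in this comment could be sketched as follows. This is only an illustration in Python, assuming a 256-bit keyspace with keys handled as integers. `resolve_params` and `opposite` are hypothetical names, and clamping an out-of-range `count` (rather than rejecting the request) is an assumption of this sketch.

```python
from typing import Optional, Tuple

KEY_BITS = 256                      # assumption: 256-bit Kademlia keyspace
ALL_ONES = (1 << KEY_BITS) - 1      # the "111...11" mask from the comment
DEFAULT_COUNT = 20
MAX_COUNT = 100


def opposite(key: int) -> int:
    """Return the key at maximum XOR distance from `key`."""
    return key ^ ALL_ONES


def resolve_params(target: int,
                   count: Optional[int] = None,
                   closer_than: Optional[int] = None) -> Tuple[int, int]:
    """Apply the proposed defaults and return (count, closer_than)."""
    if count is None:
        # With a closerThan filter but no count, return as many peers as allowed.
        count = MAX_COUNT if closer_than is not None else DEFAULT_COUNT
    # Assumption: out-of-range values are clamped to [1, MAX_COUNT].
    count = max(1, min(count, MAX_COUNT))
    if closer_than is None:
        # The opposite key is further from the target than any other key,
        # so it behaves like "no filter".
        closer_than = opposite(target)
    return count, closer_than


target = 0x1234
print(resolve_params(target))                     # neither parameter set
print(resolve_params(target, closer_than=0x9999)) # only closerThan set
```

Because `opposite(target) XOR target` is all ones, every other key is strictly closer to the target, which is why this default keeps the response size constant.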
src/routing/http-routing-v1.md (outdated)
## Peer Routing API
### `GET /routing/v1/closest-peers/{peer-id}?[closerThan]&[count]`
I'm not sure this makes sense at the /routing/v1 namespace since it's fairly specific to systems like the Amino DHT.
Having an HTTP API for this kind of thing makes sense though, perhaps as /amino or /kad or something. It could be nested under /routing/v1/<amino, etc.> as well if that makes folks happier.
- Note: Given that this is quite close to being the HTTP equivalent of the FIND_NODE RPC, something like this could potentially end up being worked into that protocol itself (e.g. to enable http-based request/response).
My hope was that we could have it be routing-agnostic but maybe that's not possible, since the actual results would very much depend on what was being queried and how.
I'm happy with:
GET /routing/v1/amino/closest-peers/{peer-id}?[closerThan]&[count]
though if it's all amino, perhaps we should just have done with it and use the RPC names?
GET /routing/v1/amino/find-node/{multibase}
...then maybe later:
GET /routing/v1/amino/get-value/{multibase}
GET /routing/v1/amino/get-providers/{cid}
though the second is redundant given the existing HTTP endpoints and the first is only partially redundant since we could use it to search for more than just IPNS records.
...and even later PUT_VALUE and ADD_PROVIDER?
@aschmahmann what do you think? If we're going to expose "amino" endpoints, should they reflect the actual KAD-DHT RPC methods?
Agree with @aschmahmann, /routing/v1/amino/ would make more sense, since the distance metric to define the closeness is specific to kademlia.
IMO we shouldn't expose the actual KAD-DHT RPC methods, but the HTTP client should expect the HTTP server to do the DHT client's work.
e.g. the ADD_PROVIDER RPC asks 1 DHT server to add the provided entry as a provider. The API should expose a function where the HTTP server will find the 20 closest peers to the key (FIND_NODE), and then send them all an ADD_PROVIDER RPC.
The interface exposed should be the same as what is exposed by a kademlia implementation to its user, but not directly kademlia RPCs.
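As a rough illustration of that split of responsibilities, the sketch below shows a server-side `provide` operation that first looks up the closest peers and then sends each of them an ADD_PROVIDER. It is written in Python against a toy in-memory network; `provide`, `find_closest` and `add_provider` are hypothetical stand-ins for this example, not functions of any real DHT library.

```python
from typing import Callable, Dict, List, Set

K = 20  # number of closest peers that should receive the record


def provide(key: int,
            find_closest: Callable[[int, int], List[int]],
            add_provider: Callable[[int, int], bool]) -> int:
    """Do the DHT client's work on behalf of an HTTP caller.

    Looks up the K closest peers to `key` (the FIND_NODE part) and sends
    each one an ADD_PROVIDER. Returns how many peers accepted the record.
    """
    accepted = 0
    for peer in find_closest(key, K):
        if add_provider(peer, key):
            accepted += 1
    return accepted


# Toy in-memory "network": peers are small integers, closeness is XOR distance.
peers = list(range(1, 101))
stored: Dict[int, Set[int]] = {}


def find_closest(key: int, count: int) -> List[int]:
    return sorted(peers, key=lambda p: p ^ key)[:count]


def add_provider(peer: int, key: int) -> bool:
    stored.setdefault(peer, set()).add(key)
    return True


accepted = provide(42, find_closest, add_provider)
print(accepted, len(stored))
```

The HTTP caller would only see the single `provide` operation; the lookup and the fan-out of RPCs stay on the server.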
I'm thinking about this through the lens of /routing/v1 and what is mandatory and what is optional. We already have the /routing/v1/ipns namespace, which is somewhat optional (IPNI actually only implements /routing/v1/providers). With that in mind, the proposal here is not that controversial.
Next step here is a Someguy PR that exposes /routing/v1/kad-dht/closest-peers/{peer-id}?[closerThan]&[count] and then if we are happy with API we could deploy it to delegated-ipfs.dev – added to weekly planning to triage priorities:
(not feeling strongly about name, but leaning towards kad-dht because "dht" makes it self-explanatory + people may want to deploy Someguy in non-Mainnet context, and that instance would no longer be "Amino" – following distinction suggested in #497)
On supporting other RPC methods it's hard to see a short / medium term reason why they'd be interesting to someone given:
- ADD_PROVIDER doesn't really add much unless delegated providing is enabled
- The other RPC methods have equivalents within routing/v1 that are more generic (PUT/GET is not supported for non-IPNS records in Amino and nobody else is here asking for this)
> GET /routing/v1/amino/find-node/{multibase}
We can still call this get-closest-peers or whatever we want, but taking a multibase (or even a specific multibase) encoded input here seems like it'd be better since it's more helpful than requiring the parameter to be a peerID.
Hopefully we can also get a version of FIND_NODE added to the DHT spec that queries based on key space rather than the data before it gets transformed into the key space and then that would probably result in an additional function being added here (with probably another name).
> (not feeling strongly about name, but leaning towards kad-dht because "dht" makes it self-explanatory + people may want to deploy Someguy in non-Mainnet context, and that instance would no longer be "Amino" – following distinction suggested in #497)
If someguy is hooked up to multiple dht swarms (e.g Amino and private swarms), then it doesn't make sense to return a global list of closest peers, the caller would be interested in the closest peers for a specific dht swarm (if specific swarm is supported by that someguy instance).
We could use /routing/v1/dht/{swarm}/closest-peers/{peer-id}?[closerThan]&[count]. swarm needs to be a unique identifier for a swarm of DHT peers, e.g amino. Which means that we probably need to add /routing/v1/dht/list-swarms (or similar) to get the list of supported swarm identifiers (and descriptions?).
Triage notes from the public Boxo/GO triage today (cc @hsanjuan):
- Add IPFS Kademlia DHT Specification #497 exists now
  - it defines AminoDHT as an instance of kad-dht with a specific libp2p protocol (`/ipfs/kad/1.0.0`)
  - people may use this API for delegating to a DHT instance other than the AminoDHT, so we may want to go with generic `kad-dht` and not `amino`
- do not overengineer the MVP, go with `/routing/v1/dht/closest/{peerid}`, rationale:
  - less is more
  - peerid already conveys that we mean closest peers
  - we won't have more than one public DHT any time soon, and if people run private ones, it is unlikely they need the public one as well, so it is ok to just go with `/routing/v1/dht/`
  - if we ever need to support more than one, we can always have different routing servers for each swarm: `dht1.example.com/routing/v1/dht` and `other-dht.example.com/routing/v1/dht`
    - or use an explicit query parameter that requires an explicit swarm using its libp2p protocol name (we don't invent any extra dictionaries/mappings): `?swarm=/custom/ipfs/kad/1.0.0`
  - if we have something that does not follow the above KAD-DHT spec, that would be a different endpoint anyway
- while `/closest/{peerid}` is nice, the fact that peerid can be a CID makes it tricky if we want to allow querying the entire keyspace. Let's go with explicit `/closest/peers/{peerid}` to have space for generic queries in the future
@hsanjuan if you want to prototype this in ipfs/boxo#1004, let's go with simple MVP:
/routing/v1/dht/closest/peers/{peer-id}
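For illustration, a client could build the request URL for the MVP path roughly as below. This is a Python sketch only: `closest_peers_url` is a hypothetical helper, `EXAMPLE_PEER_ID` is a placeholder rather than a valid peer ID, and the `swarm` query parameter is just the option floated in the triage notes, not something the MVP defines.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def closest_peers_url(base: str, peer_id: str,
                      swarm: Optional[str] = None) -> str:
    """Build a URL for the MVP endpoint /routing/v1/dht/closest/peers/{peer-id}.

    `swarm` is hypothetical: it mirrors the `?swarm=` idea from the triage
    notes and is only appended when explicitly given.
    """
    url = f"{base.rstrip('/')}/routing/v1/dht/closest/peers/{quote(peer_id, safe='')}"
    if swarm is not None:
        url += "?" + urlencode({"swarm": swarm})
    return url


print(closest_peers_url("https://delegated-ipfs.dev", "EXAMPLE_PEER_ID"))
print(closest_peers_url("https://delegated-ipfs.dev", "EXAMPLE_PEER_ID",
                        swarm="/custom/ipfs/kad/1.0.0"))
```

Percent-encoding the swarm value keeps the slashes of the libp2p protocol name from being confused with path separators.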
Co-authored-by: Guillaume Michel <guillaumemichel@users.noreply.github.com>
Just connecting some dots: the problem this solves seems somewhat related to libp2p/specs#222.
update endpoint path to /routing/v1/dht/closest/{peer-id}
as agreed in PR review comments
- create new "DHT Routing API" section for DHT-specific operations
- move /routing/v1/dht/closest/{peer-id} to the new DHT section
- keep general peer lookup in "Peer Routing API" section
- update cache value to 172800 (48h) for consistency
- fix typo: "Query Paramters" -> "Query Parameters"
rename /routing/v1/dht/closest/{peer-id} to
/routing/v1/dht/closest/peers/{peer-id} for future-proofing,
as we may add API for querying entire keyspace in the future
also update document date to 2025-08-19
documents the new /routing/v1/dht/closest/peers/{peer-id} endpoint
that enables lightweight peer discovery for browser nodes and other
resource-constrained clients without requiring full DHT participation
keep 2025-08-19 date from PR branch
reorganize sections for better logical flow:
- Content Routing API
- Peer Routing API
- IPNS API
- DHT Routing API (moved here)
@achingbrain FYSA: to make this a bit more formal, I generated IPIP-0476 documenting why we need this. It includes the motivation from the PR (browser nodes needing lightweight peer discovery), the specification, rationale, and future-proofing considerations discussed in the PR comments. Feel free to add any context to the document you find useful.
This adds the SERVER-side for GetClosestPeers. Since FindPeers also returns PeerRecords, it is essentially a copy-paste, minus things like addrFilters which don't apply here, plus `count` and `closerThan` parsing from the query URL. The tests as well. We leave all logic regarding count/closerThan to the ContentRouter (the DHT, or the Kubo wrapper around it). Spec: ipfs/specs#476
  - If omitted, the routing implementation should use its own peer ID
- `count` (optional): Number of peer records to return
  - Minimum 1, maximum 100, default 20
Unlike the FindPeers endpoint, the GetClosestPeers endpoint does not support protocol or address filters.
I understand this is a DHT-specific method and everything-DHT is unknown, but if that might change in the future, we may introduce support for the filters now from the beginning.
> and everything-DHT is unknown
Not strictly true since the endpoint may know more information. For example https://github.com/ipfs/someguy learns more about peers on the network which helps it limit sending unhelpful peers to web browsers.
The implementation of ipfs/specs#476 suggests that content routers should support a DHT-specific operation (GetClosestPeers). Content routers depend on routing interfaces defined in the `routing` package and some decorator interfaces defined here. So it seems like the natural place to add yet another interface for this type of router.
Currently, to find peers close to a particular key in the DHT keyspace, a node must:
1. Be a full DHT client with all the associated overhead
2. Maintain connections to many peers
In theory this isn't necessary.
It is true that many connections are necessary to perform a lookup, but a client technically doesn't need to maintain any connection when not performing a lookup actively. It should remember at least some addresses of DHT servers (e.g bootstrappers are enough).
I think we just miss light DHT client implementations.
- `peer-id`: The target peer ID to find closest peers for, represented as a CIDv1 encoded with `libp2p-key` codec
#### Query Parameters
Do we have any specific use cases in mind that would require query parameters?
IIUC Go and JS DHT implementations only support returning the 20 closest peers. Since the FIND_NODE RPC returns at most 20 peers, it is quite complex to look up more than 20 peers.
The query parameters could be used to filter the results and return fewer than 20 peers, but not to get more peers.
@achingbrain @hsanjuan would it be ok to remove closer-than and count?
The only implementation (kad-dht) doesn't support these parameters:
- go-libp2p-kad-dht
  - GetClosestPeers only takes (ctx, key string)
  - Always returns K (20) closest peers
  - No filtering by closer-than
Imo there is no point in listing them in spec and implementing filtering in userland.
This allows Kubo to respond to the GetClosestPeers() http routing v1 endpoint as spec'ed here: ipfs/specs#476 It is based on work from ipfs/boxo#1021 We let IpfsNode implement the contentRouter.Client interface with the new method. We use our DHTs to get the closest peers. We try to respect the count/closerThan options here. We then trigger FindPeers lookups to fill in information about the peers (addresses) and return the result. Tests are missing and will come once discussions around the spec and the boxo PR have settled.
ref: #476 (comment) Co-authored-by: Hector Sanjuan <code@hector.link>
It would be useful to extend the scope of the API to allow getting the closest DHT servers to CIDs and IPNS keys in addition to Peer IDs.
changed path parameter from {peer-id} to {key} to accept both CIDs and
Peer IDs, matching actual DHT usage where closest peers can be queried
for arbitrary keys
removed count and closer-than query parameters that were adding
complexity without clear use cases in practice
clarified response size should match DHT bucket size (20 for Amino DHT)
added note that this optional endpoint helps light clients lower the
cost of DHT walks in browser contexts
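A client consuming this endpoint might handle the response roughly as sketched below. This is a Python illustration under stated assumptions: the response is taken to be a JSON object with a `Peers` array of records carrying `Schema`, `ID` and `Addrs` fields (the peer record shape used elsewhere in the Routing V1 API), the sample body is invented, and `parse_closest_peers` is a hypothetical helper. The cap of 20 reflects the Amino DHT bucket size mentioned in the commit message.

```python
import json
from typing import Dict, List

BUCKET_SIZE = 20  # Amino DHT bucket size, per the commit message above


def parse_closest_peers(body: str) -> Dict[str, List[str]]:
    """Map peer ID -> list of multiaddrs from a closest-peers response.

    Assumes a JSON body of the form {"Peers": [{"Schema": "peer", ...}]}.
    Records with another schema or without an ID are skipped, and at most
    BUCKET_SIZE records are considered.
    """
    doc = json.loads(body)
    result: Dict[str, List[str]] = {}
    for record in (doc.get("Peers") or [])[:BUCKET_SIZE]:
        if record.get("Schema") != "peer" or "ID" not in record:
            continue
        result[record["ID"]] = list(record.get("Addrs") or [])
    return result


# Invented sample body; the IDs are placeholders, not real peer IDs.
sample = json.dumps({
    "Peers": [
        {"Schema": "peer", "ID": "peer-a", "Addrs": ["/ip4/192.0.2.1/tcp/4001"]},
        {"Schema": "peer", "ID": "peer-b"},
        {"Schema": "unknown", "ID": "peer-c"},
    ]
})
parsed = parse_closest_peers(sample)
print(parsed)
```

Skipping unknown schemas instead of failing is a design choice of this sketch, so that a client keeps working if a server returns record types it does not understand.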
I've tried to avoid defining what "closest" means to leave it up to the routing implementation, only that the responses should be "closer" than the optional `closerThan` parameter, which should mean the caller gets useful results for other scenarios.