IPIP-0421: HTTP Delegated Routing Reader Privacy Upgrade #421
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,111 @@ | ||
| --- | ||
| title: "IPIP-0421: HTTP Delegated Routing Reader Privacy Upgrade" | ||
| date: 2023-05-31 | ||
| ipip: proposal | ||
| editors: | ||
| - name: Andrew Gillis | ||
| github: gammazero | ||
| - name: Ivan Schasny | ||
| github: ischasny | ||
| - name: Masih Derkani | ||
| github: masih | ||
| - name: Will Scott | ||
| github: willscott | ||
| order: 421 | ||
| tags: ['ipips', 'routing', 'privacy', 'double hashing'] | ||
| --- | ||
|
|
||
| ## Summary | ||
|
|
||
| This IPIP specifies a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. | ||
|
|
||
| ## Motivation | ||
|
|
||
| IPFS currently lacks many privacy protections. One of its main weak points is the lack | ||
| of privacy protections in the Content Routing subsystem. Currently neither Readers (clients accessing files) | ||
| nor Writers (hosts storing and distributing content) have much privacy with regard to content they publish or | ||
| consume. It is very easy for a Content Router or a Passive Observer to learn which file is requested by | ||
| which client during the routing process, as a potential adversary easily learns the requested `CID`. | ||
| A curious actor could request the same `CID` and download the associated file to monitor the user’s behavior. | ||
| This is undesirable, and better protection has been a strong request from the community for some time. | ||
|
|
||
| The latest upgrades to the DHT and IPNI have introduced Double Hashing - a technique that aims to better preserve Reader Privacy. | ||
| With Double Hashing in place Provider Records are encrypted and opaque to Content Routers. If presented with the original `CID` a | ||
| Content Router can decrypt the relevant Provider Records and serve them via the existing Delegated Routing API. | ||
| However, to benefit from the privacy enhancement, users need to change the way they interact with Content Routers. In particular: | ||
| - A second hash over the original `Multihash` must be used when looking up the content; | ||
| - Returned Provider Records are encrypted and must be decrypted by the client before using them; | ||
| - The client might choose to fetch additional encrypted Metadata from the Content Router. | ||
|
|
||
| This new way of interaction cannot be fulfilled by the existing API. This IPIP is an incremental improvement to the HTTP Delegated Routing API that adds | ||
| new endpoints for serving encrypted content. The original API can still be used for lookups that are not Privacy Preserving. | ||
|
|
||
| Writer Privacy is out of scope of this IPIP and is going to be addressed separately. | ||
|
|
||
| ## Detailed design | ||
|
|
||
| See the Delegated Routing Reader Privacy Upgrade spec (:cite[http-routing-reader-privacy-v1]) included with this IPIP. | ||
|
|
||
| ## Design rationale | ||
|
|
||
| This API proposal makes the following changes: | ||
| - Adds new methods for looking up encrypted Provider Records and encrypted Metadata; | ||
| - Defines Hashing and Encryption functions and response payloads structure. | ||
|
|
||
| There are no idiomatic changes to the API: all data formats, design rationale and principles outlined in the original :cite[ipip-0337] apply here. | ||
|
|
||
| ### User benefit | ||
|
|
||
| With the new APIs: | ||
| - users can protect themselves from a malicious actor who spies on them by observing the traffic between the user and the Content Router and then downloading the same data; | ||
| - the ecosystem takes a first step towards a fully private HTTP Delegated Routing protocol that will eliminate IPNI as a centralised observer. | ||
|
|
||
| There are no other functional improvements. | ||
|
|
||
| ### Compatibility | ||
|
|
||
| #### Backwards Compatibility | ||
|
||
|
|
||
| Users will need to explicitly turn on Reader Privacy on their nodes. A new flag can be introduced to Kubo's HTTP Delegated Content Router configuration to facilitate that functionality. | ||
| Users on older nodes can continue using the old API and turn on Reader Privacy at a later point. | ||
|
|
||
| Content Routers should provide the same QoS for both Privacy Preserving and regular APIs. This is because both can be served over the same encrypted data. If presented with a regular CID, a Content Router | ||
| can perform decryption operations on behalf of the user (i.e. mimic the client logic) and return results in clear text. If presented with a second hash, the Content Router can return encrypted results and let the | ||
| user perform the decryption themselves. | ||
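The idea that a single encrypted data set can back both APIs can be sketched as follows. This is a minimal illustration, not the behaviour of any existing Content Router: the `router` type, its in-memory map and the `put` helper are hypothetical, records are keyed by the raw salted SHA-256 digest rather than a full Multihash, and a random 12-byte AES-GCM nonce is assumed.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// paddedSalt right-pads label with zero bytes to the 64-byte salt length.
func paddedSalt(label string) []byte {
	s := make([]byte, 64)
	copy(s, label)
	return s
}

// saltedHash computes SHA-256 over salt || value.
func saltedHash(salt, value []byte) []byte {
	sum := sha256.Sum256(append(append([]byte{}, salt...), value...))
	return sum[:]
}

// gcmFor builds an AES-GCM cipher keyed with hash(SALT_ENCRYPTIONKEY || mh).
func gcmFor(mh []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(saltedHash(paddedSalt("CR_ENCRYPTIONKEY"), mh))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// router is a hypothetical in-memory Content Router that stores only
// encrypted records, keyed by the second hash.
type router struct{ records map[string][][]byte }

// put encrypts providerKey under the multihash and files it under the second hash.
func (r *router) put(mh, providerKey []byte) error {
	aead, err := gcmFor(mh)
	if err != nil {
		return err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return err
	}
	h2 := string(saltedHash(paddedSalt("CR_DOUBLEHASH"), mh))
	r.records[h2] = append(r.records[h2], aead.Seal(nonce, nonce, providerKey, nil))
	return nil
}

// lookupEncrypted serves the Privacy Preserving API: opaque blobs only.
func (r *router) lookupEncrypted(hash2 []byte) [][]byte { return r.records[string(hash2)] }

// lookupClear serves the regular API by mimicking the client: it derives the
// second hash and the key from the multihash, then decrypts on the user's behalf.
func (r *router) lookupClear(mh []byte) ([][]byte, error) {
	aead, err := gcmFor(mh)
	if err != nil {
		return nil, err
	}
	var out [][]byte
	for _, sealed := range r.lookupEncrypted(saltedHash(paddedSalt("CR_DOUBLEHASH"), mh)) {
		n := aead.NonceSize()
		if len(sealed) < n {
			return nil, errors.New("record shorter than nonce")
		}
		plain, err := aead.Open(nil, sealed[:n], sealed[n:], nil)
		if err != nil {
			return nil, err
		}
		out = append(out, plain)
	}
	return out, nil
}

func main() {
	r := &router{records: map[string][][]byte{}}
	mh := []byte("placeholder multihash") // stand-in, not a real multihash
	if err := r.put(mh, []byte("peerID||contextID")); err != nil {
		panic(err)
	}
	plain, err := r.lookupClear(mh)
	fmt.Printf("clear-text lookup: %q, err: %v\n", plain, err)
}
```

Both lookup paths read the same stored blobs, which is why the two APIs can offer the same QoS in this sketch.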
|
|
||
| It's possible that not all Content Routers will adopt Reader Privacy. Default HTTP Delegated Routers such as `cid.contact` should have Reader Privacy enabled by default in newer versions of Kubo / Helia. | ||
| Users should verify themselves whether a custom router of their choice supports Reader Privacy or not when configuring it. | ||
|
|
||
| The `/routing/v1/encrypted/` API will be implemented in existing libraries like [`boxo/routing/http`](https://github.com/ipfs/boxo/tree/main/routing/http) and will not introduce any breaking changes to existing clear text endpoints. | ||
| The API will be released in a new minor version. | ||
|
|
||
| #### Forwards Compatibility | ||
|
|
||
| Reader Privacy relies on usage of specific hashing and encryption functions. Function rotation will require a network-wide migration. Content Routers might not be able to migrate "under the hood" as they | ||
| don't possess the original values. Function rotation should be a very infrequent event and will require network-wide efforts. When function rotation is needed, the API version will be incremented. | ||
|
|
||
|
||
| ### Security | ||
|
|
||
| See the "Threat Modelling" section of :cite[http-routing-reader-privacy-v1]. | ||
|
|
||
| ### Alternatives | ||
|
|
||
| TODO: Describe alternate designs that were considered and related work. | ||
|
|
||
| - TODO: Oblivious HTTP ([IETF](https://www.ietf.org/archive/id/draft-thomson-http-oblivious-01.html), [Cloudflare](https://blog.cloudflare.com/stronger-than-a-promise-proving-oblivious-http-privacy-properties/)) | ||
|
||
|
|
||
| ## Test fixtures | ||
|
|
||
| TODO: List relevant CIDs or JSON payloads. Describe how implementations can use them to determine | ||
| specification compliance. This section can be skipped if IPIP does not deal | ||
| with the way IPFS handles content-addressed data, or the modified specification | ||
| file already includes this information. | ||
|
|
||
| ### Resources | ||
|
|
||
| - [IPIP-272 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) | ||
| - [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5) | ||
|
|
||
| ### Copyright | ||
|
|
||
| Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,111 @@ | ||||||
| --- | ||||||
| title: Routing V1 HTTP Delegated Routing Reader Privacy Upgrade | ||||||
| description: > | ||||||
| This specification describes Delegated Routing Reader Privacy Upgrade. It's an | ||||||
| incremental improvement to HTTP Delegated Routing API and inherits all of its | ||||||
| formats and design rationale. | ||||||
| date: 2023-05-31 | ||||||
| maturity: reliable | ||||||
| editors: | ||||||
| - name: Andrew Gillis | ||||||
| github: gammazero | ||||||
| - name: Ivan Schasny | ||||||
| github: ischasny | ||||||
| - name: Masih Derkani | ||||||
| github: masih | ||||||
| - name: Will Scott | ||||||
| github: willscott | ||||||
| order: 0 | ||||||
| tags: ['routing', 'double hashing', 'privacy'] | ||||||
| --- | ||||||
|
|
||||||
| This specification describes a new HTTP API for Privacy Preserving Delegated Content Routing provider lookups. It's an extension to the HTTP Delegated Routing API and inherits all of its formats and design rationale. | ||||||
|
|
||||||
| ## API Specification | ||||||
|
|
||||||
| ### Magic Values | ||||||
|
|
||||||
| All salts below are 64 bytes long, and represent a string padded with `\x00`. | ||||||
|
|
||||||
| - `SALT_DOUBLEHASH = bytes("CR_DOUBLEHASH\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` | ||||||
| - `SALT_ENCRYPTIONKEY = bytes("CR_ENCRYPTIONKEY\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00")` | ||||||
|
||||||
|
|
||||||
| Magic values are needed to calculate different digests from the same value for different purposes. For example, a hash of a Multihash that is used for lookups should be different from the one that is used for | ||||||
| key derivation, even though both are calculated from the same original value. To achieve that, the Multihash is concatenated with different magic values before applying the hash function: `SALT_DOUBLEHASH` | ||||||
| for lookups and `SALT_ENCRYPTIONKEY` for key derivation, as described in the `Glossary`. | ||||||
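As an illustration of the salting described above, the following Go sketch builds the two 64-byte salts by zero-padding their labels and shows that the same input yields two unrelated digests. The helper names `paddedSalt` and `saltedHash` are hypothetical, and the input is a placeholder byte string rather than a real Multihash.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// saltSize is the fixed salt length stated in the Magic Values section.
const saltSize = 64

// paddedSalt returns label right-padded with \x00 bytes to 64 bytes.
func paddedSalt(label string) []byte {
	salt := make([]byte, saltSize) // zero-initialised, so the padding is implicit
	copy(salt, label)
	return salt
}

var (
	saltDoubleHash    = paddedSalt("CR_DOUBLEHASH")
	saltEncryptionKey = paddedSalt("CR_ENCRYPTIONKEY")
)

// saltedHash computes SHA-256 over salt || value.
func saltedHash(salt, value []byte) [32]byte {
	buf := make([]byte, 0, len(salt)+len(value))
	buf = append(buf, salt...)
	buf = append(buf, value...)
	return sha256.Sum256(buf)
}

func main() {
	mh := []byte("placeholder multihash bytes") // stand-in, not a real multihash
	lookup := saltedHash(saltDoubleHash, mh)
	key := saltedHash(saltEncryptionKey, mh)
	fmt.Printf("lookup digest: %x\n", lookup)
	fmt.Printf("key material:  %x\n", key)
}
```

Because the salts differ, knowing the lookup digest does not reveal the key material derived from the same Multihash.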
|
|
||||||
| ### Glossary | ||||||
|
|
||||||
| - **`enc`** is [AES-GCM](https://en.wikipedia.org/wiki/Galois/Counter_Mode) encryption. The notation `enc(passphrase, nonce, payload)` is used for the rest of the specification. | ||||||
| - **`hash`** is [SHA256](https://en.wikipedia.org/wiki/SHA-2) hashing. | ||||||
| - **`||`** is concatenation of two values. | ||||||
| - **`deriveKey`** is deriving a 32-byte encryption key from a passphrase that is done as `hash(SALT_ENCRYPTIONKEY || passphrase)`. | ||||||
| - **`CID`** is the [Content IDentifier](https://github.com/multiformats/cid). | ||||||
| - **`MH`** is the [Multihash](https://github.com/multiformats/multihash) contained in a `CID`. It corresponds to the | ||||||
| digest of a hash function over some content. | ||||||
| - **`HASH2`** is a second hash over the multihash. Second Hashes must be of `Multihash` format with `DBL_SHA_256` codec. | ||||||
| The digest must be calculated as `hash(SALT_DOUBLEHASH || MH)`. | ||||||
| - **`ProviderRecord`** is a JSON object containing a Provider Record as described in the [HTTP Delegated Routing Specification](http-routing-v1.md). | ||||||
|
Is it? The routing-v1 spec allows for opaque blobs in the provider record. Where's the line between "metadata" and "provider record" here?
||||||
| - **`ProviderRecordKey`** is a concatenation of `peerID || contextID`. There is no need to encode lengths explicitly as they are | ||||||
| already encoded as a part of the multihash format. Max `contextID` length is 64 bytes. | ||||||
| - **`EncProviderRecordKey`** is `Nonce || enc(deriveKey(MH), Nonce, ProviderRecordKey)`. Max `EncProviderRecordKey` length is 200 bytes. | ||||||
| - **`HashProviderRecordKey`** is a hash over `ProviderRecordKey` that must be calculated as `hash(SALT_DOUBLEHASH || ProviderRecordKey)`. | ||||||
| - **`Metadata`** is free-form bytes that can represent information such as IPNI metadata. Max `Metadata` length is 1024 bytes. | ||||||
| - **`EncMetadata`** is `Nonce || enc(deriveKey(ProviderRecordKey), Nonce, Metadata)`. Max `EncMetadata` length is 2000 bytes. | ||||||
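The glossary definitions of `HASH2`, `deriveKey` and `EncProviderRecordKey` can be sketched in Go as below. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the nonce is assumed to be a random 12-byte value (the default for Go's AES-GCM), the Multihash prefix `0x56 0x20` is the `dbl-sha2-256` code from the multicodec table followed by the 32-byte digest length, and the inputs are placeholder bytes.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// paddedSalt right-pads label with zero bytes to the 64-byte salt length.
func paddedSalt(label string) []byte {
	s := make([]byte, 64)
	copy(s, label)
	return s
}

// saltedHash computes SHA-256 over salt || value.
func saltedHash(salt, value []byte) []byte {
	h := sha256.New()
	h.Write(salt)
	h.Write(value)
	return h.Sum(nil)
}

// hash2 wraps hash(SALT_DOUBLEHASH || MH) in Multihash framing:
// 0x56 (dbl-sha2-256 code), 0x20 (digest length 32), then the digest.
func hash2(mh []byte) []byte {
	return append([]byte{0x56, 0x20}, saltedHash(paddedSalt("CR_DOUBLEHASH"), mh)...)
}

// deriveKey is hash(SALT_ENCRYPTIONKEY || passphrase), a 32-byte AES-256 key.
func deriveKey(passphrase []byte) []byte {
	return saltedHash(paddedSalt("CR_ENCRYPTIONKEY"), passphrase)
}

func newGCM(passphrase []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(deriveKey(passphrase))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns Nonce || enc(deriveKey(passphrase), Nonce, payload).
func encrypt(passphrase, payload []byte) ([]byte, error) {
	aead, err := newGCM(passphrase)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize()) // assumption: random 12-byte nonce
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, payload, nil), nil
}

// decrypt splits off the leading nonce and authenticates and decrypts the rest.
func decrypt(passphrase, sealed []byte) ([]byte, error) {
	aead, err := newGCM(passphrase)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("ciphertext shorter than nonce")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	mh := []byte("placeholder multihash")           // stand-in for MH
	providerRecordKey := []byte("peerID||contextID") // stand-in for ProviderRecordKey
	encKey, err := encrypt(mh, providerRecordKey)
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(mh, encKey)
	if err != nil {
		panic(err)
	}
	fmt.Printf("HASH2: %x\n", hash2(mh))
	fmt.Printf("EncProviderRecordKey is %d bytes, decrypts to %q\n", len(encKey), plain)
}
```

`EncMetadata` follows the same pattern with `ProviderRecordKey` as the passphrase and `Metadata` as the payload. Only a party that knows the original `MH` can derive the key, so a router that has only seen `HASH2` cannot open the record.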
|
|
||||||
| :::note | ||||||
|
|
||||||
| Maximum allowed lengths might change without incrementing the API version. Such fields as `contextID` or `Metadata` are free-form bytes and | ||||||
| their maximum lengths can be changed in the underlying protocols. | ||||||
|
|
||||||
| ::: | ||||||
|
|
||||||
| ### API | ||||||
| #### `GET /routing/v1/encrypted/providers/{HASH2}` | ||||||
How is
|
||||||
|
|
||||||
| ##### Response codes | ||||||
|
|
||||||
| - `200` (OK): the response body contains 1 or more records | ||||||
| - `404` (Not Found): must be returned if no matching records are found | ||||||
| - `422` (Unprocessable Entity): request does not conform to schema or semantic constraints | ||||||
|
|
||||||
| ##### Response Body | ||||||
|
|
||||||
| ```json | ||||||
| { | ||||||
| "EncProviderRecordKeys": [ | ||||||
| "EBxdYDhd.....", | ||||||
| "IOknr9DK....." | ||||||
| ] | ||||||
| } | ||||||
| ``` | ||||||
|
|
||||||
| Where: | ||||||
|
|
||||||
| - `EncProviderRecordKeys` is a list of base64 encoded `EncProviderRecordKey` values. | ||||||
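A client might decode this response body along the following lines. The sketch assumes standard (padded) base64, which the specification does not state explicitly, and enforces the 200-byte limit from the glossary even though that limit may change. The type and function names are hypothetical.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// maxEncProviderRecordKeyLen mirrors the current glossary limit; it may change.
const maxEncProviderRecordKeyLen = 200

// providersResponse mirrors the JSON body of the encrypted providers endpoint.
type providersResponse struct {
	EncProviderRecordKeys []string `json:"EncProviderRecordKeys"`
}

// decodeProvidersResponse parses the body and base64-decodes every entry.
func decodeProvidersResponse(body []byte) ([][]byte, error) {
	var resp providersResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decoding response: %w", err)
	}
	keys := make([][]byte, 0, len(resp.EncProviderRecordKeys))
	for i, s := range resp.EncProviderRecordKeys {
		raw, err := base64.StdEncoding.DecodeString(s) // assumption: standard base64
		if err != nil {
			return nil, fmt.Errorf("entry %d: %w", i, err)
		}
		if len(raw) > maxEncProviderRecordKeyLen {
			return nil, fmt.Errorf("entry %d: %d bytes exceeds the limit", i, len(raw))
		}
		keys = append(keys, raw)
	}
	return keys, nil
}

func main() {
	// Illustrative payload; these are not real encrypted keys.
	body := []byte(`{"EncProviderRecordKeys": ["AAEC", "AwQF"]}`)
	keys, err := decodeProvidersResponse(body)
	if err != nil {
		panic(err)
	}
	for i, k := range keys {
		fmt.Printf("key %d: %x\n", i, k)
	}
}
```

Each decoded entry is a `Nonce || ciphertext` blob that still has to be decrypted with the key derived from the original Multihash.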
|
|
||||||
| #### `GET /routing/v1/encrypted/metadata/{HashProviderRecordKey}` | ||||||
|
Same question about encoding as for HASH2
||||||
|
|
||||||
| ##### Response codes | ||||||
|
|
||||||
| - `200` (OK): the response body contains 1 record | ||||||
| - `404` (Not Found): must be returned if no matching records are found | ||||||
| - `422` (Unprocessable Entity): request does not conform to schema or semantic constraints | ||||||
|
|
||||||
| ##### Response Body | ||||||
|
|
||||||
| ```json | ||||||
| { | ||||||
| "EncMetadata": "EBxdYDhd....." | ||||||
| } | ||||||
| ``` | ||||||
|
|
||||||
| Where: | ||||||
|
|
||||||
| - `EncMetadata` is the base64 encoded `EncMetadata` value. | ||||||
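The metadata lookup can be sketched end to end as follows: compute `HashProviderRecordKey` for the request, then decode and decrypt the `EncMetadata` field of the response. As before this is a sketch under assumptions: standard base64, a 12-byte nonce prefix, hypothetical function names, and a `demoResponse` helper that exists only to fabricate a well-formed body for the example.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// paddedSalt right-pads label with zero bytes to the 64-byte salt length.
func paddedSalt(label string) []byte {
	s := make([]byte, 64)
	copy(s, label)
	return s
}

// saltedHash computes SHA-256 over salt || value.
func saltedHash(salt, value []byte) []byte {
	h := sha256.New()
	h.Write(salt)
	h.Write(value)
	return h.Sum(nil)
}

// hashProviderRecordKey is hash(SALT_DOUBLEHASH || ProviderRecordKey).
func hashProviderRecordKey(prk []byte) []byte {
	return saltedHash(paddedSalt("CR_DOUBLEHASH"), prk)
}

// gcmFor builds AES-GCM keyed with deriveKey(ProviderRecordKey).
func gcmFor(prk []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(saltedHash(paddedSalt("CR_ENCRYPTIONKEY"), prk))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

type metadataResponse struct {
	EncMetadata string `json:"EncMetadata"`
}

// demoResponse fabricates a response body for the example (not part of the API).
func demoResponse(prk, metadata []byte) ([]byte, error) {
	aead, err := gcmFor(prk)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	sealed := aead.Seal(nonce, nonce, metadata, nil)
	return json.Marshal(metadataResponse{EncMetadata: base64.StdEncoding.EncodeToString(sealed)})
}

// decryptMetadataResponse parses the body and recovers the clear-text Metadata.
func decryptMetadataResponse(body, prk []byte) ([]byte, error) {
	var resp metadataResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	sealed, err := base64.StdEncoding.DecodeString(resp.EncMetadata)
	if err != nil {
		return nil, err
	}
	aead, err := gcmFor(prk)
	if err != nil {
		return nil, err
	}
	n := aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("EncMetadata shorter than nonce")
	}
	return aead.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	prk := []byte("peerID||contextID placeholder") // stand-in for ProviderRecordKey
	body, err := demoResponse(prk, []byte("example metadata"))
	if err != nil {
		panic(err)
	}
	md, err := decryptMetadataResponse(body, prk)
	if err != nil {
		panic(err)
	}
	fmt.Printf("lookup value: %x\nmetadata: %s\n", hashProviderRecordKey(prk), md)
}
```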
|
|
||||||
| ### Notes | ||||||
|
|
||||||
| Assembling a full `ProviderRecord` from the encrypted data requires multiple roundtrips to the server: one to fetch the list of `EncProviderRecordKey`s, and then one per | ||||||
| `EncProviderRecordKey` to fetch its `EncMetadata`. To reduce the number of roundtrips to one, the client implementation should use the local libp2p peerstore for multiaddress discovery | ||||||
| and [libp2p multistream select](https://github.com/multiformats/multistream-select) for protocol negotiation. | ||||||
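To look a provider up in the local peerstore, a client first needs the peer ID from the decrypted `ProviderRecordKey`. A minimal sketch of that split is shown below, assuming the peer ID is in its binary Multihash form (`<varint code><varint digest length><digest>`), so that its own header tells where it ends. The function name and the sample bytes are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// maxContextIDLen mirrors the current glossary limit; it may change.
const maxContextIDLen = 64

// splitProviderRecordKey splits peerID || contextID by reading the peer ID's
// multihash header to find where the peer ID ends.
func splitProviderRecordKey(key []byte) (peerID, contextID []byte, err error) {
	_, n1 := binary.Uvarint(key) // hash function code
	if n1 <= 0 {
		return nil, nil, errors.New("invalid multihash code")
	}
	size, n2 := binary.Uvarint(key[n1:]) // digest length
	if n2 <= 0 {
		return nil, nil, errors.New("invalid multihash length")
	}
	if size > uint64(len(key)-n1-n2) {
		return nil, nil, errors.New("key shorter than declared peer ID")
	}
	end := n1 + n2 + int(size)
	if len(key)-end > maxContextIDLen {
		return nil, nil, errors.New("contextID longer than allowed")
	}
	return key[:end], key[end:], nil
}

func main() {
	// Placeholder peer ID: sha2-256 multihash header (0x12, 0x20) and a zero digest.
	peerID := append([]byte{0x12, 0x20}, make([]byte, 32)...)
	key := append(append([]byte{}, peerID...), "example-context"...)
	p, c, err := splitProviderRecordKey(key)
	if err != nil {
		panic(err)
	}
	fmt.Printf("peerID: %d bytes, contextID: %q\n", len(p), c)
}
```

With the peer ID in hand, a client that already knows the peer's addresses can skip the per-record metadata request entirely.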