Background
Consider the straightforward case where a block is unfetchable by the block source. This is a somewhat common occurrence for Boost, where there may be a disconnect between advertised CIDs and those that are available unsealed, so requests may come in for CIDs that are supposed to be there, but the block fetcher returns an error saying the block is unavailable.
The Boxo gateway code will return a 200, a set of valid headers, and a CARv1 header with the requested CID in the roots array, but nothing else, and the stream terminates cleanly. There is no indication of a problem without parsing the CAR. Of course, in the Trustless Gateway paradigm it's up to the user to validate that the CAR contains the expected blocks, so from that perspective we have what we need to determine whether there's a problem or not. But this does complicate debugging. In particular, when debugging Rhea retrieval problems I need access to Boost logs on the server side to see what the problem might be; from the outside I have no indication that Boost even thinks there's a problem.
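To illustrate what "parsing the CAR" means at a minimum for a client, here is a rough sketch that checks whether a CARv1 stream has anything after its header. It relies only on the CARv1 framing (a varint length prefix followed by the header bytes, then zero or more sections). The function names `carHasBlocks` and `carBytesHaveBlocks` are hypothetical, and the sketch does not decode the header or verify any block.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// carHasBlocks reports whether a CARv1 stream contains any data after its
// header. It skips the varint-length-prefixed header and then tries to read
// one more byte. It does NOT decode the header or validate blocks.
// (Hypothetical helper, not part of Boxo or Lassie.)
func carHasBlocks(r io.Reader) (bool, error) {
	br := bufio.NewReader(r)

	// The stream starts with an unsigned varint giving the header length.
	headerLen, err := binary.ReadUvarint(br)
	if err != nil {
		return false, fmt.Errorf("reading header length: %w", err)
	}

	// Skip over the header bytes without interpreting them.
	if _, err := io.CopyN(io.Discard, br, int64(headerLen)); err != nil {
		return false, fmt.Errorf("skipping header: %w", err)
	}

	// A clean EOF here means the response was a header and nothing else.
	_, err = br.ReadByte()
	if errors.Is(err, io.EOF) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

// carBytesHaveBlocks is a convenience wrapper for in-memory responses.
func carBytesHaveBlocks(b []byte) (bool, error) {
	return carHasBlocks(bytes.NewReader(b))
}
```

A client could run a check like this on the response body to tell a header-only CAR from one with data, but it still cannot learn why the blocks are missing, which is the debugging gap described above.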
Desired behaviour
While the spec doesn't cover this, here's what I think should happen and how we built the Lassie HTTP handler:
- Wait until there is at least one block to return before setting any headers or writing a CAR header; only when there is at least one block should we start sending data.
- If an error occurs before we get a single block, we return an error response:
  a. If we can't get any candidates from the indexer, then we can return a `404` with the body `no candidates found`.
  b. Other failures are treated as a "gateway timeout", `504`, with the body `failed to fetch CID: <error message>`.
Code exploration
- `handler#serveCar` starts setting headers immediately, with no opportunity to change course if there's a problem (Line 56 in 1356946):

  ```go
  setContentDispositionHeader(w, name, "attachment")
  ```
- It defers to `BlocksBackend#GetCAR` to return a `Reader` for the CAR and simply does an `io.Copy` of that data to the output body.
- `BlocksBackend#GetCAR` immediately sets up a pipe and starts writing a CAR (boxo/gateway/blocks_backend.go, Lines 282 to 289 in 1356946):

  ```go
  r, w := io.Pipe()
  go func() {
  	cw, err := storage.NewWritable(
  		w,
  		[]cid.Cid{pathMetadata.LastSegment.Cid()},
  		car.WriteAsCarV1(true),
  		car.AllowDuplicatePuts(params.Duplicates.Bool()),
  	)
  ```
- The use of `storage.NewWritable` with a root will immediately write a CARv1 header to the `Writer` it's given.
Hence, with a valid trustless gateway `/ipfs` request, we will always get a valid CARv1 header with the root we request, regardless of whether the requested root CID is even fetchable.
In Lassie, we deal with this in two ways, both encapsulated in `DeferredWriter`, which is compatible with `github.com/ipld/go-ipld-prime/storage/WritableStorage`, like the `github.com/ipld/go-car/v2/storage/CarWriter` is: https://github.com/filecoin-project/lassie/blob/main/pkg/storage/deferredcarwriter.go
- Don't set up the CAR writer until we have our first `Put` operation.
- Provide an `OnPut` event listener that lets us watch for the first put and set the headers in expectation of a CAR with at least one block.
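The two ideas above can be illustrated with a much-simplified analogue. This is not Lassie's `DeferredWriter` and does not implement `WritableStorage`; the type `deferredWriter`, its `Put` and `OnPut` signatures, and the `memSink` helper are all assumptions made for this sketch, and a plain byte slice stands in for the CARv1 header.

```go
package main

import (
	"io"
	"sync"
)

// memSink is a tiny in-memory io.Writer used only to exercise the sketch.
type memSink struct{ data []byte }

func (m *memSink) Write(p []byte) (int, error) {
	m.data = append(m.data, p...)
	return len(p), nil
}

// deferredWriter writes nothing to out until the first Put. The header
// bytes stand in for the CARv1 header. (Hypothetical, simplified type.)
type deferredWriter struct {
	mu      sync.Mutex
	out     io.Writer
	header  []byte
	onPut   []func(n int)
	started bool
}

func newDeferredWriter(out io.Writer, header []byte) *deferredWriter {
	return &deferredWriter{out: out, header: header}
}

// OnPut registers a listener that is called with the block size on every
// Put, before anything is written, so a caller can set HTTP headers on the
// first call.
func (d *deferredWriter) OnPut(cb func(n int)) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.onPut = append(d.onPut, cb)
}

// Put notifies listeners, lazily emits the header on first use, and then
// writes the block.
func (d *deferredWriter) Put(block []byte) error {
	d.mu.Lock()
	defer d.mu.Unlock()

	for _, cb := range d.onPut {
		cb(len(block))
	}
	if !d.started {
		d.started = true
		if _, err := d.out.Write(d.header); err != nil {
			return err
		}
	}
	_, err := d.out.Write(block)
	return err
}
```

Because listeners run before the first write, an HTTP handler can use the first `OnPut` call to commit to a 200 and CAR headers, and can still return an error status if no `Put` ever happens.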