Gateway: CAR handler shouldn't return 200 & a CAR header if data is unavailable #458

@rvagg

Background

Consider the straightforward case where the block source cannot fetch a block. This is fairly common for Boost, where the advertised CIDs can differ from those that are available unsealed. Requests can therefore arrive for CIDs that are supposed to be present, and the block fetcher returns an error saying the block is unavailable.

The Boxo gateway code will return a 200, a set of valid headers, and a CARv1 header with the requested CID in the roots array, but nothing else, and the stream terminates cleanly. There is no indication of a problem without parsing the CAR. Of course, in the Trustless Gateway paradigm it's up to the user to validate that the CAR contains the expected blocks, so from that perspective we have what we need to determine whether there's a problem. But this complicates debugging. In particular, when debugging Rhea retrieval problems I need access to Boost logs on the server side to see what the problem might be; from the outside I have no indication that Boost even thinks there is a problem.

Desired behaviour

While the spec doesn't cover this, here's what I think should happen and how we built the Lassie HTTP handler:

  1. Wait until there is at least one block to return before setting any response headers or writing a CAR header; only start sending data once there is at least one block.
  2. If an error occurs before we get a single block, return an error response:
    a. If we can't get any candidates from the indexer, return a 404 with the body "no candidates found".
    b. Treat other failures as a "gateway timeout", 504, with the body "failed to fetch CID: <error message>".

Code exploration

  • handler#serveCar starts setting headers immediately, with no opportunity to change course if there's a problem:
    setContentDispositionHeader(w, name, "attachment")
  • It defers to BlocksBackend#GetCAR to return a Reader for the CAR and simply does an io.Copy of that data to the output body.
  • BlocksBackend#GetCAR immediately sets up a pipe and starts writing a CAR:
    r, w := io.Pipe()
    go func() {
        cw, err := storage.NewWritable(
            w,
            []cid.Cid{pathMetadata.LastSegment.Cid()},
            car.WriteAsCarV1(true),
            car.AllowDuplicatePuts(params.Duplicates.Bool()),
        )
  • The use of storage.NewWritable with a root will immediately write a CARv1 header to the Writer it's given.

Hence with a valid trustless gateway /ipfs request, we will always get a valid CARv1 header with the root we request, regardless of whether the requested root CID is even fetchable.

In Lassie, we deal with this in two ways, both encapsulated in DeferredWriter, which is compatible with github.com/ipld/go-ipld-prime/storage/WritableStorage in the same way that github.com/ipld/go-car/v2/storage/CarWriter is: https://github.com/filecoin-project/lassie/blob/main/pkg/storage/deferredcarwriter.go

  1. Don't set up the CAR writer until we have our first Put operation.
  2. Provide an OnPut event listener that lets us watch for the first Put and set the headers only once we expect a CAR with at least one block.

Labels

  • P2: Medium: Good to have, but can wait until someone steps up
  • effort/days: Estimated to take multiple days, but less than a week
  • help wanted: Seeking public contribution on this issue
  • kind/enhancement: A net-new feature or improvement to an existing feature
  • topic/gateway: Issues related to HTTP Gateway
