Skip to content

Conversation

bajtos
Copy link
Member

@bajtos bajtos commented Sep 24, 2024

While running the indexer in production, I noticed that many index providers return advertisemnt but then they return "404 cid not found" when we ask for the (payload) entries.

Let's recover from that error by ignoring the advertisement and continuing the walk.

Links:

While running the indexer in production, I noticed that many index
providers return advertisemnt but then they return "404 cid not found"
when we ask for the (payload) entries.

Let's recover from that error by ignoring the advertisement and
continuing the walk.

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos bajtos requested a review from juliangruber September 24, 2024 12:35
/** @type {any} */(err).serverMessage ?? '<not found>'
)
return {
entriesFetchError: true,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the best property name, can you suggest a better one? Or may I should set the value to a string instead of an error?

Suggested change
entriesFetchError: true,
entriesFetchError: String(err.statusCode),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just looking up this property, as was assuming it's one of these

  • an Error
  • an array of entries for which there was a fetch error
  • the count of entries for which there was a fetch error

What about entriesFetchError: err? Or otherwise hasEntriesFetchError: true

/** @type {any} */(err).serverMessage ?? '<not found>'
)
return {
entriesFetchError: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just looking up this property, as was assuming it's one of these

  • an Error
  • an array of entries for which there was a fetch error
  • the count of entries for which there was a fetch error

What about entriesFetchError: err? Or otherwise hasEntriesFetchError: true


try {
const { previousAdvertisementCid, ...entry } = await fetchAdvertisedPayload(
const { previousAdvertisementCid, entriesFetchError, ...entry } = await fetchAdvertisedPayload(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is entriesFetchError plural and entry singular?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me explain how this works:

  • First, we fetch the advertisement object. This object contains a CID linking to another object containing (payload block) entries.
  • Second, we fetch the object with the entries.
  • Finally, we take the first entry and discard the rest.

So, the error happens when we are fetching entries, but then we return only a single entry.

err.statusCode,
/** @type {any} */(err).serverMessage ?? '<not found>'
)
return {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we returning an object with { error } instead of throwing it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this can be confusing, and I am thinking about redesigning my implementation.

What I wanted to accomplish in this PR:

  • If the request for entries returns 404, we want to skip the advertisement and walk to the next advertisement in the chain. This is handled by returning { error }
  • If the request for entries fails with a different error, we want to retry it. This is handled by (re)throwing the error.

After thinking about this some more, I am leaning towards the following design:

  • If the request for entries returns 404, we skip the advertisement. (No change from the current proposal.)
  • If the request for entries fails differently, we retry it several times. If all attempts fail, then we skip the advertisement, too.

That way, we optimise for finishing the walk of all advertisements so that we can process new advertisements and avoid repeating the same failing request forever.

bajtos and others added 7 commits September 30, 2024 15:25
Co-authored-by: Julian Gruber <julian@juliangruber.com>
Co-authored-by: Julian Gruber <julian@juliangruber.com>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos
Copy link
Member Author

bajtos commented Oct 1, 2024

@juliangruber I reworked my PR as follows:

Inside fetchAdvertisedPayload:

  • I added pRetry to retry when the server returns a 5xx error. (It does not make sense to retry after an error like 404 because the next request is almost certainly going to fail too.)
  • I changed the return value to either contains entry object or an error string code.
  • I recognise two errors - CANNOT_FETCH_ENTRIES and CANNOT_DETERMINE_PIECE_CID
  • In particular, this function no longer throws when we cannot fetch the content for the Entries CID link.

Inside processNextAdvertisement, I added code to distinguish those two error codes and added a new property state.adsMissingPieceCID.

@bajtos bajtos requested a review from juliangruber October 1, 2024 12:10
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Copy link
Member

@juliangruber juliangruber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions but nothing blocking the PR from being merged

const indexEntry = entry.pieceCid ? entry : undefined
if (error === 'CANNOT_FETCH_ENTRIES') {
state.entriesNotRetrievable = (state.entriesNotRetrievable ?? 0) + 1
} else if (error === 'CANNOT_DETERMINE_PIECE_CID') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else if (error === 'CANNOT_DETERMINE_PIECE_CID') {
} else if (error === 'MISSING_PIECE_CID') {

To match the state property

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what state property do you refer to. Do you mean the text in the debug logs?

I don't have a strong opinion about the error codes, I will apply your suggestions.

}

const indexEntry = entry.pieceCid ? entry : undefined
if (error === 'CANNOT_FETCH_ENTRIES') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (error === 'CANNOT_FETCH_ENTRIES') {
if (error === 'ENTRY_NOT_RETRIEVABLE') {

To match the state property

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll use ENTRIES_NOT_RETRIEVABLE. We cannot retrieve the list of entries. We don't try to retrieve the entry itself.

}} AdvertisedPayload */
/**
* @param {unknown} err
* @param {string} providerAddress
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param {string} providerAddress
* @param {string} providerAddress
* @returns {string}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeScript infers the return type.

Screenshot 2024-10-02 at 13 34 08

But I can see how it can be helpful to see the return type in the JSDoc comment. I'll add @returns {string | undefined}

if (!pieceCid) {
debug('advertisement %s has no PieceCID in metadata: %j', advertisementCid, meta.deal)
return {
error: /** @type {const} */('CANNOT_DETERMINE_PIECE_CID'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this type cast for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this cast, the error would have the type string and auto-completion in VS Code won't work.

With the cast, the error has an enum type 'CANNOT_DETERMINE_PIECE_CID' | 'CANNOT_FETCH_ENTRIES'.

Then, when you start writing a condition like if (error === ', VSCode can auto-complete the valid values.

bajtos added 2 commits October 2, 2024 13:33
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos bajtos merged commit d99d54e into main Oct 2, 2024
10 checks passed
@bajtos bajtos deleted the handle-ad-errors branch October 2, 2024 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants