Skip to content

Conversation

bajtos
Copy link
Member

@bajtos bajtos commented Sep 4, 2024

While working on the initial implementation, I realized that the originally proposed design was a bit difficult to implement. I updated the design document to match the actual implementation.

Also, I decided to use Redis for persistence, in line with what @juliangruber did in spark-rewards.

Links:

Out of scope:

  • Sentry integration
  • REST API component

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos bajtos requested a review from juliangruber September 4, 2024 10:22
@bajtos bajtos removed the request for review from juliangruber September 4, 2024 12:50
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos bajtos changed the title feat: update our state from IPNI state feat: initial implementation of the indexer Sep 4, 2024
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos bajtos marked this pull request as ready for review September 16, 2024 08:22
@bajtos bajtos requested a review from juliangruber September 16, 2024 08:22
@bajtos
Copy link
Member Author

bajtos commented Sep 16, 2024

The CI is failing, I need to setup Redis in GitHub Actions.

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
providerId,
getProviderInfo,
minStepIntervalInMs: 100
}).finally(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you worried about the maximum concurrency this can run in?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question! My current thinking is that there are several thousand storage providers now, plus maybe a few hundred non-Filecoin index providers. We should not be running more than 10k concurrent walkers. Walkers spend most of their time waiting for I/O (Redis, HTTP requests to index providers) or sleeping between steps.

I think that Node.js can easily handle such load. WDYT?

But! Maybe this is a sign that we need more visibility into this aspect. What I can do - but preferably in a follow-up pull request, is add an InfluxDB client and periodically write a data point with the number of concurrent walkers.

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to let it fail, debug and then see that maximum concurrency is the issue. It would be something else if we knew yes 100% it is, but like this I think it's fine.

What we need monitoring for is whether the indexer is delayed, I believe as long as it's not, no other metric is important.

bajtos and others added 5 commits September 18, 2024 14:42
Co-authored-by: Julian Gruber <julian@juliangruber.com>
Co-authored-by: Julian Gruber <julian@juliangruber.com>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
@bajtos bajtos requested a review from juliangruber September 18, 2024 13:56
Copy link
Member

@juliangruber juliangruber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For later: We should add monitoring for whether the indexer is behind or in sync with the chain. Eg track the age of the latest advertisement minus the age of the latest ingested advertisement

Co-authored-by: Julian Gruber <julian@juliangruber.com>
@bajtos
Copy link
Member Author

bajtos commented Sep 19, 2024

For later: We should add monitoring for whether the indexer is behind or in sync with the chain. Eg track the age of the latest advertisement minus the age of the latest ingested advertisement

Great idea!

I added a work item to CheckerNetwork/roadmap#143

@bajtos bajtos merged commit ba4d4f1 into main Sep 19, 2024
7 checks passed
@bajtos bajtos deleted the indexer-impl branch September 19, 2024 06:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants