
Conversation


@bajtos bajtos commented Aug 6, 2024

Links:

Out of the scope of this pull request:

  • querying deals by allocator id

Example request:

GET /miner/f0814049/deals/eligible/summary

Example response:

```json
{
  "minerId": "f0814049",
  "dealCount": 10006,
  "clients": [
    { "clientId": "f02516933", "dealCount": 6880 },
    { "clientId": "f02833886", "dealCount": 3126 }
  ]
}
```

Example request:

GET /clients/f0215074/deals/eligible/summary

Example response:

```json
{
  "clientId": "f0215074",
  "dealCount": 5350,
  "providers": [
    { "minerId": "f0406478", "dealCount": 4592 },
    { "minerId": "f0814049", "dealCount": 758 }
  ]
}
```
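For illustration, the two endpoints above could be called like this. This is a minimal client sketch, not part of the PR; the base URL and helper names are assumptions.

```javascript
// Hypothetical helpers for the new eligible-deals summary endpoints.
// The base URL is an assumption based on the stats.filspark.com deployment.
const STATS_API = 'https://stats.filspark.com'

// Note the path asymmetry in this PR: `/miner/` (singular) vs `/clients/` (plural).
const minerSummaryUrl = (minerId) =>
  `${STATS_API}/miner/${encodeURIComponent(minerId)}/deals/eligible/summary`

const clientSummaryUrl = (clientId) =>
  `${STATS_API}/clients/${encodeURIComponent(clientId)}/deals/eligible/summary`

// Fetch the per-miner summary, e.g. { minerId, dealCount, clients: [...] }
const fetchMinerSummary = async (minerId) => {
  const res = await fetch(minerSummaryUrl(minerId))
  if (!res.ok) throw new Error(`HTTP ${res.status}`)
  return res.json()
}
```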

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
bajtos added 6 commits August 6, 2024 10:40
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Git-based dependency fails in `NPM_CONFIG_WORKSPACE=stats npm ci`:

```
npm error code 1
npm error git dep preparation failed
npm error command /opt/hostedtoolcache/node/20.16.0/x64/bin/node /opt/hostedtoolcache/node/20.16.0/x64/lib/node_modules/npm/bin/npm-cli.js install --force --cache=/home/runner/.npm --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
npm error npm warn using --force Recommended protections disabled.
npm error npm error No workspaces found:
npm error npm error   --workspace=stats
```

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Comment on lines +17 to 18
"spark-api": "https://github.com/filecoin-station/spark-api/archive/3de5c3325ef6cc40df02769b96957d33279d38c1.tar.gz",
"spark-evaluate": "filecoin-station/spark-evaluate#main"
Member Author

FYI: I cannot use the git/GitHub shorthand for spark-api because spark-api has workspaces, and npm refuses to install the entire project when running npm ci limited to a single spark-stats workspace. I think it's a bug in npm, but 🤷🏻

```
$ NPM_CONFIG_WORKSPACE=stats npm ci
(...)

npm error code 1
npm error git dep preparation failed
npm error command /opt/hostedtoolcache/node/20.16.0/x64/bin/node /opt/hostedtoolcache/node/20.16.0/x64/lib/node_modules/npm/bin/npm-cli.js install --force --cache=/home/runner/.npm --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
npm error npm warn using --force Recommended protections disabled.
npm error npm error No workspaces found:
npm error npm error   --workspace=stats
```

@bajtos bajtos enabled auto-merge (squash) August 7, 2024 06:09
@bajtos bajtos disabled auto-merge August 7, 2024 06:09
@bajtos bajtos merged commit c117b69 into main Aug 7, 2024
@bajtos bajtos deleted the feat-deals-tracked branch August 7, 2024 06:09

bajtos commented Aug 7, 2024

@juliangruber hindsight please 🙏🏻


sentry-io bot commented Aug 7, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ Error: connect ECONNREFUSED 127.0.0.1:5432 getApiPgPool(index) View Issue


Member

@juliangruber juliangruber left a comment


I have concerns with this PR


- `GET /miners/retrieval-success-rate/summary?from=<day>&to=<day>`

http://stats.filspark.com/miners/retrieval-success-rate/summary
Member


why was this removed?

Member Author


Hmm, this looks like a mistake; thanks for flagging it!

Member Author


Found it. This was a duplicate entry; the endpoint is already documented on lines 19-21 above.

[Screenshot: 2024-09-02 at 12:12:33]

env:
DATABASE_URL: postgres://postgres:postgres@localhost:5432/spark_stats
EVALUATE_DB_URL: postgres://postgres:postgres@localhost:5432/spark_evaluate
API_DB_URL: postgres://postgres:postgres@localhost:5432/spark
Member


I'm strongly against this design. I thought we had settled on an architecture where services don't directly query databases they don't own. Now we're introducing one more instance of what we agreed is an antipattern.

I think the way to go instead is for spark-api to expose the data that spark-stats needs via HTTP.

Member Author


I see your concern! I actually started to implement the spark-api HTTP endpoint here: CheckerNetwork/spark-api#388, but then I realised we want to put this behind Cloudflare and protect it using the same access token that we will use for other spark-stats endpoints.

@juliangruber
How would you propose implementing your suggestion? Should spark-stats simply proxy requests for the new endpoints to spark-api?

```js
const getRetrievableDealsForMiner = async (_req, res, minerId) => {
  const apiRes = await fetch(`${sparkApiUrl}/miner/${minerId}/deals/eligible/summary`)
  assert(apiRes.ok, 502)
  json(res, await apiRes.json())
}
```

I'd like to prevent people from calling spark-api's endpoint directly, so I would need to implement some form of authentication. I can implement a fixed access token configured via Fly secrets in both spark-api and spark-stats, let spark-stats send that token in the request, and let spark-api reject requests with a missing or incorrect token.
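A minimal sketch of that shared-token check, assuming a Fly secret named `SPARK_STATS_API_TOKEN` and a standard `Authorization` header (both names are assumptions, not part of this PR):

```javascript
// Hypothetical helper for spark-api: accept the request only when it carries
// the shared token that spark-stats sends. Header and secret names are assumed.
const isAuthorized = (req, expectedToken) => {
  const header = req.headers['authorization'] ?? ''
  return Boolean(expectedToken) && header === `Bearer ${expectedToken}`
}

// In spark-api's request handler (sketch):
// if (!isAuthorized(req, process.env.SPARK_STATS_API_TOKEN)) {
//   res.statusCode = 403
//   res.end('Forbidden')
//   return
// }
```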

Member Author

@bajtos bajtos Aug 29, 2024


I can also YOLO it and create a tech debt that we will need to pay once we start working on the Spark Data API product:

  • Option A: implement these spark-stats endpoints as HTTP redirects to spark-api
  • Option B: implement these endpoints as a proxy, but don't authenticate them; allow anybody to request the data from spark-api directly if they know about this option.

Option A:

```js
const getRetrievableDealsForMiner = async (_req, res, minerId) => {
  res.setHeader(
    'location',
    `${sparkApiUrl}/miner/${minerId}/deals/eligible/summary`
  )
  res.statusCode = 302
  res.end()
}
```

Member Author


Depending on what direction we choose, I'll need to rework the API for deals tracked per allocator (#196) as well.

Member


These are the options I see and think could work:

  1. spark-api exposes an HTTP API and spark-stats proxies to it. For authentication, we can use a hard-coded access token or the keccak256 method used in spark-rewards.
  2. spark-api publishes this data to spark-stats (via HTTP), which persists it in its own database
  3. spark-stats periodically pulls this data from spark-api (via HTTP) and persists it in its own database

I think 1. is the easiest to implement. 2. and 3. are the safest, as spark-api won't be affected by spark-stats load.
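Option 3 could look roughly like this. A sketch under assumed names: `fetchFn` and `saveSummary` are hypothetical injection points, and the endpoint path is illustrative.

```javascript
// Hypothetical periodic pull: fetch the eligible-deals summary from spark-api
// and persist it via a caller-supplied function (e.g. an INSERT into
// spark-stats' own database). Dependencies are injected to keep this testable.
const pullEligibleDeals = async ({ fetchFn, sparkApiUrl, saveSummary }) => {
  const res = await fetchFn(`${sparkApiUrl}/deals/eligible/summary`)
  if (!res.ok) throw new Error(`spark-api responded with HTTP ${res.status}`)
  await saveSummary(await res.json())
}

// Run it on a timer, e.g. once per hour:
// setInterval(
//   () => pullEligibleDeals({ fetchFn: fetch, sparkApiUrl, saveSummary })
//     .catch(console.error),
//   60 * 60 * 1000
// )
```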

Member


I was thinking about this a bit more.

To track DDO deals, I will need to build a new service that listens to Filecoin actor events and maintains its own deal database. Then both spark-api and spark-stats will need to get data about eligible deals from that service, most likely over HTTP.

We may want to make that data about deals public to allow 3rd parties to verify that Spark is sampling the deals fairly.

👍

In that light, here is my plan:

  • Copy the implementation of the APIs introduced here to #196.
  • Rework spark-stats endpoints to redirect to spark-api, drop the dev-dependency on the spark-api repository.

Rationale:

  • spark-api is already behind Cloudflare and it accepts only requests made via Cloudflare. If spark-stats implements a reverse proxy, we will create unnecessary network traffic we need to pay for (Cloudflare -> spark-stats -> Cloudflare -> spark-api).
  • Since we don't know yet whether we want/need to make the data about eligible deals private, let's keep it public for now as it requires less work.

Redirect also works for me, if it's a temporary one (302). We don't want anyone to remember spark-api's URL if it's not an API we want to commit to hosting.

Member Author


> Copy the implementation of the APIs introduced here to #196
> I don't understand this, how are you going to copy APIs to a closed PR?

Sorry, I should have re-read the comment before submitting.

I want to copy the API endpoints from this PR and #196 to spark-api.

Member Author


> Redirect also works for me, if it's a temporary one (302). We don't want anyone to remember spark-api's URL if it's not an API we want to commit to hosting

💯

Member


Sounds good!


