feat: API for deals tracked per miner/client #194
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Git-based dependency fails in `NPM_CONFIG_WORKSPACE=stats npm ci`:

```
npm error code 1
npm error git dep preparation failed
npm error command /opt/hostedtoolcache/node/20.16.0/x64/bin/node /opt/hostedtoolcache/node/20.16.0/x64/lib/node_modules/npm/bin/npm-cli.js install --force --cache=/home/runner/.npm --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
npm error npm warn using --force Recommended protections disabled.
npm error npm error No workspaces found:
npm error npm error --workspace=stats
```

Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
```
"spark-api": "https://github.com/filecoin-station/spark-api/archive/3de5c3325ef6cc40df02769b96957d33279d38c1.tar.gz",
"spark-evaluate": "filecoin-station/spark-evaluate#main"
```
FYI: I cannot use the git/GitHub shorthand for spark-api
because spark-api has workspaces, and npm refuses to install the entire project when running npm ci
limited to a single spark-stats workspace. I think it's a bug in npm, but 🤷🏻
```
$ NPM_CONFIG_WORKSPACE=stats npm ci
(...)
npm error code 1
npm error git dep preparation failed
npm error command /opt/hostedtoolcache/node/20.16.0/x64/bin/node /opt/hostedtoolcache/node/20.16.0/x64/lib/node_modules/npm/bin/npm-cli.js install --force --cache=/home/runner/.npm --prefer-offline=false --prefer-online=false --offline=false --no-progress --no-save --no-audit --include=dev --include=peer --include=optional --no-package-lock-only --no-dry-run
npm error npm warn using --force Recommended protections disabled.
npm error npm error No workspaces found:
npm error npm error --workspace=stats
```
@juliangruber hindsight please 🙏🏻
I have concerns with this PR
- `GET /miners/retrieval-success-rate/summary?from=<day>&to=<day>`

http://stats.filspark.com/miners/retrieval-success-rate/summary
why was this removed?
Hmm, this looks like a mistake; thanks for flagging it!
```yaml
env:
  DATABASE_URL: postgres://postgres:postgres@localhost:5432/spark_stats
  EVALUATE_DB_URL: postgres://postgres:postgres@localhost:5432/spark_evaluate
  API_DB_URL: postgres://postgres:postgres@localhost:5432/spark
```
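The CI workflow wires three connection strings into the environment. As a minimal sketch (the helper names here are illustrative, not from this PR), spark-stats could read and validate them at startup so a missing variable fails fast instead of surfacing as a confusing connection error later:

```javascript
// Hypothetical startup helper: read a required environment variable or fail loudly.
const requireEnv = (name, env = process.env) => {
  const value = env[name]
  if (!value) throw new Error(`Missing required environment variable: ${name}`)
  return value
}

// Collect the three database URLs used by the CI workflow above.
const loadDbConfig = (env = process.env) => ({
  statsDatabaseUrl: requireEnv('DATABASE_URL', env),
  evaluateDbUrl: requireEnv('EVALUATE_DB_URL', env),
  apiDbUrl: requireEnv('API_DB_URL', env)
})
```

These URLs would then be passed to the respective `pg` connection pools.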
I'm strongly against this design. I thought we had settled on an architecture where services don't directly query databases they don't own. Now we're introducing one more instance of what we agreed is an antipattern.
I think the way to go instead is for spark-api to expose the data that spark-stats needs via HTTP.
I see your concern! I actually started to implement the spark-api HTTP endpoint here: CheckerNetwork/spark-api#388, but then I realised we want to put this behind Cloudflare and protect it using the same access token that we will use for other spark-stats endpoints.
@juliangruber
How would you propose implementing your suggestion? Should spark-stats simply proxy requests for the new endpoints to spark-api?
```javascript
const getRetrievableDealsForMiner = async (_req, res, minerId) => {
  const apiRes = await fetch(`${sparkApiUrl}/miner/${minerId}/deals/eligible/summary`)
  assert(apiRes.ok, 502)
  json(res, await apiRes.json())
}
```
I'd like to prevent people from calling spark-api's endpoint directly, so I would need to implement some sort of authentication. I can implement a fixed access token configured via Fly secrets in both spark-api and spark-stats, let spark-stats send that token in the request, and let spark-api reject requests with a missing or incorrect token.
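A minimal sketch of that shared-token check on the spark-api side, assuming the token travels in an `Authorization: Bearer` header (the helper name and header scheme are illustrative, not decided in this thread):

```javascript
// Hypothetical guard for spark-api: accept only requests carrying the shared
// access token configured via Fly secrets.
const isAuthorized = (req, expectedToken) => {
  const header = req.headers.authorization ?? ''
  const [scheme, token] = header.split(' ')
  if (scheme !== 'Bearer' || !token || token.length !== expectedToken.length) return false
  // Compare all characters to reduce timing side-channels; production code
  // should use crypto.timingSafeEqual on Buffers instead.
  let diff = 0
  for (let i = 0; i < token.length; i++) {
    diff |= token.charCodeAt(i) ^ expectedToken.charCodeAt(i)
  }
  return diff === 0
}
```

spark-stats would then attach the same secret to its proxy call, e.g. `fetch(url, { headers: { authorization: `Bearer ${token}` } })`.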
I can also YOLO it and create tech debt that we will need to pay down once we start working on the Spark Data API product:
- Option A: implement these spark-stats endpoints as HTTP redirects to spark-api
- Option B: implement these endpoints as a proxy, but don't authenticate them - allow anybody to request the data from spark-api directly if they know about this option.
Option A:
```javascript
const getRetrievableDealsForMiner = async (_req, res, minerId) => {
  res.setHeader(
    'location',
    `${sparkApiUrl}/miner/${minerId}/deals/eligible/summary`
  )
  res.statusCode = 302
  res.end()
}
```
Depending on what direction we choose, I'll need to rework the API for deals tracked per allocator (#196) as well.
These are the options I see and think could work:

1. `spark-api` exposes an HTTP API and `spark-stats` proxies to it. For authentication, we can use a hard-coded access token, or the keccak256 method used in `spark-rewards`.
2. `spark-api` publishes this data to `spark-stats` (via HTTP), which persists it in its own database.
3. `spark-stats` periodically pulls this data from `spark-api` (via HTTP) and persists it in its own database.

I think 1. is the easiest to implement. 2. and 3. are the safest, as `spark-api` won't be affected by `spark-stats` load.
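Option 3 could look roughly like the sketch below. The endpoint path, response shape, and in-memory store are placeholders; a real implementation would upsert into the spark-stats database on a schedule:

```javascript
// Hypothetical periodic pull: spark-stats fetches eligible-deal summaries
// from spark-api and persists them locally (a Map stands in for the DB).
const store = new Map()

const persistSummaries = (summaries) => {
  // Upsert each per-miner summary, keyed by minerId.
  for (const summary of summaries) store.set(summary.minerId, summary)
}

const pullEligibleDeals = async (sparkApiUrl, fetchImpl = fetch) => {
  const res = await fetchImpl(`${sparkApiUrl}/deals/eligible/summary`)
  if (!res.ok) throw new Error(`spark-api responded with ${res.status}`)
  persistSummaries(await res.json())
}

// Run e.g. hourly; errors are logged rather than crashing the service.
// setInterval(() => pullEligibleDeals(sparkApiUrl).catch(console.error), 60 * 60 * 1000)
```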
I was thinking about this a bit more.
To track DDO deals, I will need to build a new service that listens to Filecoin actor events and maintains its own deal database. Then both spark-api and spark-stats will need to get data about eligible deals from that service, most likely over HTTP.
We may want to make that data about deals public to allow 3rd parties to verify that Spark is sampling the deals fairly.
👍
In that light, here is my plan:
- Copy the implementation of the APIs introduced here to feat: API for deals tracked per allocator #196
I don't understand this, how are you going to copy APIs to a closed PR?
- Rework spark-stats endpoints to redirect to spark-api, drop the dev-dependency on spark-api repository.
Rationale:
- spark-api is already behind Cloudflare and it accepts only requests made via Cloudflare. If spark-stats implements a reverse proxy, we will create unnecessary network traffic we need to pay for (Cloudflare -> spark-stats -> Cloudflare -> spark-api).
- Since we don't know yet whether we want/need to make the data about eligible deals private, let's keep it public for now as it requires less work.
Redirect also works for me, if it's a temporary one (302). We don't want anyone to remember spark-api's URL if it's not an API we want to commit to hosting
> Redirect also works for me, if it's a temporary one (302). We don't want anyone to remember spark-api's URL if it's not an API we want to commit to hosting
💯
Sounds good!
spark-api PR:
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Links:
Out of the scope of this pull request:
Example request:
Example response:
Example request:
Example response: