-
Notifications
You must be signed in to change notification settings - Fork 248
A98: Channelz v2 #493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
A98: Channelz v2 #493
Conversation
Co-authored-by: Yuri Golobokov <yuri.golobokov@icloud.com>
``` | ||
message Entity { | ||
// The identifier for this entity. | ||
int64 id = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use unsigned types for ids?
It looks like channelz uses signed types but documents the fact that they need to be positive values, which seems a little silly. I'd argue that the protocol shouldn't allow values that we consider invalid.
I guess the counter-argument would be to stick to the same types used in channelz for compatibility. But if we're taking the opportunity to define a new protocol anyway, maybe we should clean this up at the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd like to keep compatibility here. Having the underlying type have some invalid values has been useful for implementation too.
} | ||
``` | ||
|
||
Entities have state and configuration, and tend be be relatively long lived - such that querying them makes sense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that in the C-core impl you've been working on, it's possible for debug entities to outlive the actual gRPC object that they're associated with, which is useful for after-the-fact debugging. Do we want to say something about that here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added some bits
These ids are allocated sequentially per the rationale in A14, and implementations should use the same id space for debug entities and channelz objects. | ||
|
||
**kind**: An entity has a kind. This is a string descriptor of the general category of object that this entity describes. | ||
We use strings here rather than enums so that implementations are free to extend the kind space with their own objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API doesn't seem to have any way to discover what types are supported or what the relationships between those types are. Would it make sense to add something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably does, but I think I need concrete use cases on how they'd be used... and right now a .md
file would cover all known use-cases.
We use strings here rather than enums so that implementations are free to extend the kind space with their own objects. | ||
Common entity kinds will see some level of standardization across stacks, but we expect many kinds to be specific per gRPC implementation. | ||
Initially implementations are expected to match kinds with channelz object types: | ||
kind `"channel"` -> `channelz.v1.Channel`, `"subchannel"` -> `channelz.v1.Subchannel`, `"server"` -> `channelz.v1.Server`, `"socket"` -> `channelz.v1.Socket`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we could define some well-known data types for these kinds that we know will exist in all implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... this spec or a sister spec?
A98-debug-protocol.md
Outdated
// The name of the trace to query. | ||
string name = 2; | ||
// The amount of time to query. Implementations may arbitrarily cap this. | ||
google.protobuf.Duration duration = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a default if this is unset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this and switched the protocol to streaming.
A98-debug-protocol.md
Outdated
int64 id = 1; | ||
// The name of the trace to query. | ||
string name = 2; | ||
// The amount of time to query. Implementations may arbitrarily cap this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably note that the QueryTrace RPC won't return a response until after this much time, so the caller must set its deadline to be larger than this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to with the switch to streaming!
**data**: This is a list of protobuf Any objects describing the current state of this entity. | ||
Implementations may define their own protobufs to be exposed here, and common sets will be standardized separately. | ||
|
||
**trace**: Finally, an entity may supply a small summary of its history as a trace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we note here that traces can be either proactively recorded, in which case they are returned here, or recorded on-demand via the QueryTrace RPC method?
Should we impose any cap on the number of proactively recorded traces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some commentary
The actual collection machinery remains synchronous - I don't have bandwidth to update the entire ecosystem yet, and would want to wait for grpc/proposal#493 before making major changes here anyway - when we'll do a deeper cleanup of the services exported for this API. But this allows us to have a single timeout even if there are multiple collection sources, which is going to be valuable for my next change (which will add collection from promise parties) ---- DO NOT SUBMIT. This PR is for testing purposes only. [cl/761643330](http://cl/761643330) PiperOrigin-RevId: 761643330
The actual collection machinery remains synchronous - I don't have bandwidth to update the entire ecosystem yet, and would want to wait for grpc/proposal#493 before making major changes here anyway - when we'll do a deeper cleanup of the services exported for this API. But this allows us to have a single timeout even if there are multiple collection sources, which is going to be valuable for my next change (which will add collection from promise parties) PiperOrigin-RevId: 762261877
Per grpc/proposal#493 (sans service just yet) - just getting the basic protobufs in place to start working on implementation Closes #39871 PiperOrigin-RevId: 772476936
Per grpc/proposal#493 (sans service just yet) - just getting the basic protobufs in place to start working on implementation Closes grpc#39871 PiperOrigin-RevId: 772476936
Per grpc/proposal#493 (sans service just yet) - just getting the basic protobufs in place to start working on implementation Closes grpc#39871 PiperOrigin-RevId: 772476936
No description provided.