-
Notifications
You must be signed in to change notification settings - Fork 159
Add OpenTracing module #418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ba3d259
to
7b2ecd5
Compare
There will be a couple more things later today:
But this is ready for the main review and feedback. |
temporal-opentracing/src/main/java/io/temporal/opentracing/OpenTracingConfig.java
Outdated
Show resolved
Hide resolved
...acing/src/main/java/io/temporal/opentracing/internal/OpenTracingContextPropagationUtils.java
Outdated
Show resolved
Hide resolved
...acing/src/main/java/io/temporal/opentracing/internal/OpenTracingContextPropagationUtils.java
Outdated
Show resolved
Hide resolved
...acing/src/main/java/io/temporal/opentracing/internal/OpenTracingContextPropagationUtils.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work Dmitry, I left some comments here, please ping me on slack if you have questions.
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
openTracingConfig.getWorkflowRunOperationNamePrefix() | ||
+ "-" | ||
+ Workflow.getInfo().getWorkflowType(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to add workflow type at the end? Should we add it into tags instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just WorkflowStart or WorkflowRun as an operation name is very undescriptive and pretty meaningless.
A definition/best practice from OpenTracing Specification:
An operation name, a human-readable string which concisely represents the work done by the Span (for example, an RPC method name, a function name, or the name of a subtask or stage within a larger computation). The operation name should be the most general string that identifies a (statistically) interesting class of Span instances. That is, "get_user" is better than "get_user/314159".
For example, here are potential operation names for a Span that gets hypothetical account information:
Operation Name Guidance get Too general get_account/792 Too specific get_account Good, and account_id=792 would make a nice Span tag
Also from Open Tracing Guide:
The reason for choosing general operation names is to allow the tracing systems to do aggregations. For example, Jaeger tracer has an option of emitting metrics for all the traffic going through the application. Having a unique operation name for each span would make the metrics useless. For application or library developers who wish to capture the program arguments in the traces to distinguish them, the recommended solution is to annotate spans with tags or logs. These are discussed in the section below.
Each operation name should have some semantic meaning to it and should be chosen such that its cardinality is neither too high nor too low. The cardinality of an operation name is defined as follows:
card(operation X) = total number of spans of operation X
For example, when using http.url (as operation name) the cardinality would be too high, on the other hand if http.method is chosen as the operation name then the cardinality would be too low.
I think that my current solution is most in line with these guidelines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually goes against what you've quoted above, specifically "operation name should be the most general string" as we concatenate specific workflow type to operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you just took a part of the phrase out of the context. The full phrase is "most general string that identifies a (statistically) interesting class of Span instances". And there are more phrases around that explain what does it mean.
You don't suppose to just select the most general string. You select the most general string that makes sense semantically for the process that we trace and which cardinality is not too small or too large (because it would be useless).
-
human-readable string which concisely represents the work done by the Span (for example, an RPC method name, a function name, or the name of a subtask or stage within a larger computation)
WorkflowStart, WorkflowRun, ActivityStart, ActivityRun don't "concisely represents the work done by the Span"
-
For example, when using http.url (as operation name) the cardinality would be too high, on the other hand if http.method is chosen as the operation name then the cardinality would be too low.
Your offer is close to using http.method
from this example.
- The first link gives us specific good and bad examples:
Operation Name Guidance get Too general get_account/792 Too specific get_account Good, and account_id=792 would make a nice Span tag
Your offer will lead to making an operation name EVEN MORE generic than this get
example. Your offer is comparable to "Just Some HTTP Call" level. Which doesn't even presented in the examples, because it's clearly too generic.
- Even temporal go sdk uses workflow/activity types as open tracing span operation names: https://github.com/temporalio/sdk-go/blob/cec615e1aaf798e1b153ba3918b4d3027edde808/internal/propagation.go#L50
https://github.com/temporalio/sdk-go/blob/cec615e1aaf798e1b153ba3918b4d3027edde808/internal/propagation.go#L64
I stand on the fact that my current solution exactly implements the guidelines or close to them and it should be implemented this way.
I'm ready to compromise on removing the prefixes and putting them in tags (which will make Start and Run spans indistinguishable for the reader from the first sight). But I'm not ready to compromse on taking workflow/activity types out from span operation names. It will be an incorrect implementation of the spec. It will also be an implementation that is not in agreement with how other opentracing implementations (and even Temporal own Go SDK) are done.
I'm also a future user of this solution and I'm very motivated to keep it this way. Your proposal will make the trace look like a chain of indistinguishable "ActivityRun" spans without digging into tags and it will be a disaster for a user who actually will read the trace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how OpenTelemetry UIs look like:
To even get to tags you need to open the details of a specific span. How comfortable it will work with a sequence of indistinguishable ActivityRun spans?
Also, see OpenTracing implementation for JaxRS2 (as the closest framework to Temporal because it's aware about endpoint class and method names):
https://github.com/opentracing-contrib/java-jaxrs/blob/master/opentracing-jaxrs2/src/main/java/io/opentracing/contrib/jaxrs2/server/OperationNameProvider.java#L46
Default Microprofile operation name
<HTTP method>:<package name>.<Class name>.<method name>
<HTTP method>
is our StartWorkflow/RunActivity, etc.
<package name>.<Class name>.<method name>
is our ActivityType/WorkflowType
They do provide an alternative strategy
/**
* Returns HTTP method as operation name
*/
class HTTPMethodOperationName implements OperationNameProvider {
but it's not a default.
.../main/java/io/temporal/opentracing/internal/OpenTracingWorkflowOutboundCallsInterceptor.java
Outdated
Show resolved
Hide resolved
@vitarb Thanks for your review!
This PR is ready to be revisited. |
1be097d
to
6c01e14
Compare
6d21943
to
a8587b9
Compare
...l-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingContextAccessor.java
Outdated
Show resolved
Hide resolved
4550a09
to
74a7941
Compare
temporal-opentracing/src/main/java/io/temporal/opentracing/SpanOperationType.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/SpanOperationType.java
Show resolved
Hide resolved
...c/main/java/io/temporal/opentracing/internal/OpenTracingActivityInboundCallsInterceptor.java
Outdated
Show resolved
Hide resolved
...l-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingContextAccessor.java
Outdated
Show resolved
Hide resolved
temporal-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingSpanUtils.java
Outdated
Show resolved
Hide resolved
...l-opentracing/src/main/java/io/temporal/opentracing/internal/OpenTracingContextAccessor.java
Show resolved
Hide resolved
...rc/main/java/io/temporal/opentracing/internal/OpenTracingWorkflowClientCallsInterceptor.java
Outdated
Show resolved
Hide resolved
...rc/main/java/io/temporal/opentracing/internal/OpenTracingWorkflowClientCallsInterceptor.java
Outdated
Show resolved
Hide resolved
...c/main/java/io/temporal/opentracing/internal/OpenTracingWorkflowInboundCallsInterceptor.java
Show resolved
Hide resolved
.../main/java/io/temporal/opentracing/internal/OpenTracingWorkflowOutboundCallsInterceptor.java
Outdated
Show resolved
Hide resolved
188de6d
to
a2df830
Compare
What was changed:
This PR adds a new temporal-opentracing module with support for OpenTracing context propagation from the client to workflow and to activity execution.
Why?
We want to be able to tie all the parts of the workflow execution under the same client parent OpenTracing span for tracing and debugging.
How it can be used?
See README provided with this module: https://github.com/temporalio/sdk-java/pull/418/files#diff-a87181e0e99e6147518982b133e66674da61be50e9898fe9d1eab9dc91ab2cc6R5
Closes
#258