-
Notifications
You must be signed in to change notification settings - Fork 631
slo: include all request cancellations within SLO #4355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
slo: include all request cancellations within SLO #4355
Conversation
|
for case 2, I think the Handlers should pass correct error to let me update the handlers to handle these late cancellation. |
df58c34 to
f8de330
Compare
|
@electron0zero Adding some more thoughts about the default behavior of counting canceled requests within SLO: I think we could also support an operator that wants to exclude them from SLO by adding an additional label that indicates if the request was completed or canceled like An additional benefit is that this would give us metrics on canceled requests in general, which I don't think we have today. Could be useful. I think this should be feasible and localized to the postSLOHook method where it is checking for cancelations already. But let me know if there are considerations I missed. |
f8de330 to
0ba43c1
Compare
48b2d02 to
055e6af
Compare

What this PR does:
We count all request cancellations outside SLO, but that's not right because request cancellations are done by users and should fall within SLO.
Request cancellations are now exposed under
resultlabel intempo_query_frontend_queries_totalandtempo_query_frontend_queries_within_slo_totalwithcompletedorcanceledvalues to differentiate between completed and canceled requests.operators can filter
tempo_query_frontend_queries_within_slo_totalwithresult=completedto get the old behaviour of the metric where cancellations where excluded from the metric.with the result label, we also expose the information about the cancelled requests, and you can filter on
result=canceledto see cancelled requests.I also added a span and span events in the http and gRPC collector to expose more details of the pipeline.
Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]