Only publish desired balance gauges on master #115383
Closes ES-9834
@@ -168,6 +171,7 @@ public String toString() {
         if (event.localNodeMaster() == false) {
             onNoLongerMaster();
         }
+        nodeIsMaster.set(event.localNodeMaster());
Perhaps there is an alternative pattern for doing this kind of thing?
Nit: we can reduce the number of atomic set calls by first checking event.localNodeMaster() != event.previousState().nodes().isLocalNodeElectedMaster().
Alternatively, we can register the metrics in this class, since we are already accessing fields like desiredBalanceReconciler.unassignedShards directly in onNoLongerMaster. That way we can keep a volatile field for the master check, with no need to pass it down. Not necessarily better than what you have. Happy for you to decide.
I've done a fairly significant refactor to move all the metrics, along with their state and logic, into a dedicated class. That means we can unit test them, get rid of all the Atomics, and offer a nicer interface for setting whether we're master and for getting and setting the metric values.
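For readers following along, the shape of such a dedicated class might look roughly like the sketch below. It is not the code from this PR: only setNodeIsMaster and getUndesiredAllocations are names that appear in the diff, while the class name, updateMetrics, and the plain List<Long> return type are assumptions made to keep the example self-contained.

```java
import java.util.List;

// Hypothetical stand-in for the dedicated metrics class described above.
// Only setNodeIsMaster and getUndesiredAllocations are names taken from the PR;
// the class name, updateMetrics and the List<Long> return type are illustrative.
final class DesiredBalanceMetricsSketch {
    // Plain assignments plus visibility are all this sketch needs, so volatile
    // fields take the place of atomics.
    private volatile boolean nodeIsMaster;
    private volatile long undesiredAllocations;

    void setNodeIsMaster(boolean nodeIsMaster) {
        this.nodeIsMaster = nodeIsMaster;
    }

    void updateMetrics(long undesiredAllocations) {
        this.undesiredAllocations = undesiredAllocations;
    }

    // Gauge callback: a single value while this node is master, nothing otherwise.
    List<Long> getUndesiredAllocations() {
        return nodeIsMaster ? List.of(undesiredAllocations) : List.of();
    }
}
```

Because the state lives in one small class with no dependency on a metrics backend, a unit test can flip the master flag and assert on the callback's return value directly.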
Hi @nicktindall, I've created a changelog YAML for you.
Pinging @elastic/es-distributed (Team:Distributed)
             "es.allocator.desired_balance.allocations.undesired.current",
             "Total number of shards allocated on undesired nodes excluding shutting down nodes",
-            "{shard}"
+            "{shard}",
+            this::getUndesiredAllocations
         );
These return a Closeable which (in the APM implementation at least) deregisters the metric when closed. We were ignoring it previously when we used the synchronous API, and these metrics live for the life of the node, so I assume we don't need to worry about it here.
It's fine, since a node may become master again. We don't want to close it because there is no way to re-open it. Muting with an empty result is what we want here.
         );
-        undesiredAllocations = LongGaugeMetric.create(
-            meterRegistry,
+        meterRegistry.registerLongsGauge(
I am new to this one.
Does it emit a single document with an array of metrics, or a number of separate documents?
In other words, if the node is not the elected master, would we get an empty array or no documents?
The places I've seen it used are where you have a bunch of values for a metric that have different labels, so a number of separate documents. In this case we're just using it so that we can publish nothing when we are not the master (i.e. no documents).
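To make the "no documents" point concrete, here is a toy model of a collector polling a multi-valued gauge. It is only an illustration of the behaviour described in this thread, not the Elasticsearch or APM implementation: GaugePollSketch, poll, and the name=value document format are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical collector, not the Elasticsearch or APM implementation.
final class GaugePollSketch {
    // Polls a multi-valued gauge and renders one "document" per observed value.
    // An empty observation therefore produces no documents at all.
    static List<String> poll(String name, Supplier<List<Long>> observer) {
        List<String> documents = new ArrayList<>();
        for (long value : observer.get()) {
            documents.add(name + "=" + value);
        }
        return documents;
    }
}
```

Under this model, an observer that returns an empty list yields no documents, whereas one that returns a single 0 still yields one document carrying a zero. That is the difference between a non-master node staying silent and a non-master node reporting zeros.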
docs/changelog/115383.yaml
Outdated
pr: 115383
summary: Only publish desired balance gauges on master
area: Allocation
type: bug
I am not entirely convinced this is a bug. It sounds more like an enhancement: we only publish the metrics from the elected master and no longer publish zero metrics from everywhere else.
Hi @nicktindall, I've updated the changelog YAML for you.
…desired_balance_metrics_on_current_master
LGTM
Thanks for the extra refinement.
@@ -168,6 +170,7 @@ public String toString() {
         if (event.localNodeMaster() == false) {
             onNoLongerMaster();
         }
+        desiredBalanceMetrics.setNodeIsMaster(event.localNodeMaster());
Nit: I still think checking event.localNodeMaster() != event.previousState().nodes().isLocalNodeElectedMaster() before calling this method is helpful, since otherwise most of the time it ends up setting the same value.
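A rough sketch of what that guard amounts to, with the cluster event reduced to two booleans. Everything here is hypothetical (MasterFlagGuardSketch, onClusterChanged, and the writes counter exist only for illustration); in the real code the two values would come from the previous and current cluster state as in the expression quoted above.

```java
// Hypothetical illustration of the suggested guard, not code from the PR.
final class MasterFlagGuardSketch {
    private volatile boolean nodeIsMaster;
    private int writes; // counts actual flag writes, for illustration only

    // wasMaster / isMaster stand in for the previous and current cluster state.
    void onClusterChanged(boolean wasMaster, boolean isMaster) {
        // Only touch the flag when mastership actually changed.
        if (isMaster != wasMaster) {
            nodeIsMaster = isMaster;
            writes++;
        }
    }

    boolean nodeIsMaster() {
        return nodeIsMaster;
    }

    int writes() {
        return writes;
    }
}
```

The guard relies on the flag starting out consistent with the first previous state it sees (false here), which is an assumption of the sketch. The trade-off is one extra comparison per cluster state update in exchange for skipping redundant volatile writes.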