Skip to content

Infrastructure for assuming cluster features in the next major version #118143

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Dec 17, 2024

Conversation

thecoop
Copy link
Member

@thecoop thecoop commented Dec 6, 2024

Add support for node features to be labelled as 'assumed in the next major version', allowing them to be removed.

This is targetted at v9 and v8.18, as the feature removals need to be done on both versions to maintain continued compatibility. Once the infrastructure is merged and backported, the procedure for removing features is as follows:

  1. Mark the feature as assumed on v9, and backport to v8.18
  2. Wait for that commit to be fully deployed to serverless
  3. Remove the feature from main and elide all checks for that feature

@thecoop thecoop added >feature :Core/Infra/Core Core issues without another label :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. auto-backport Automatically create backport pull requests when merged v9.0.0 v8.18.0 labels Dec 6, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @thecoop, I've created a changelog YAML for you.

@thecoop thecoop force-pushed the cluster-feature-majors branch from 654b6de to 885e8de Compare December 6, 2024 12:01
@thecoop
Copy link
Member Author

thecoop commented Dec 6, 2024

This uses DiscoveryNode.getVersion() extensively to get the major version of a node. As this is due to be removed, I've created #118255 to introduce a 'major version' concept to BuildVersion

@thecoop thecoop force-pushed the cluster-feature-majors branch from cee97fa to b604511 Compare December 6, 2024 14:49
@thecoop thecoop marked this pull request as ready for review December 9, 2024 11:52
@thecoop thecoop requested review from a team as code owners December 9, 2024 11:52
@elasticsearchmachine elasticsearchmachine added Team:Core/Infra Meta label for core/infra team Team:Distributed Coordination Meta label for Distributed Coordination team labels Dec 9, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@thecoop thecoop force-pushed the cluster-feature-majors branch from 3becca3 to 3889d4e Compare December 9, 2024 14:31
}

private static Version nextMajor() {
return Version.fromId((Version.CURRENT.major + 1) * 1_000_000 + 99);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until the assert in VersionInformation constructor is removed, we need to continue to use Version here

@thecoop thecoop requested review from rjernst and a team December 9, 2024 16:06
*/
public record NodeFeature(String id) {
public record NodeFeature(String id, boolean assumedAfterNextCompatibilityBoundary) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming might need a bit of work

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggle coming up with something better... assumePresenceAfterNextCompatibilityBoundary, but that's even harder to read.... assumedOnNextMajor maybe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though next major is not correct in the context of serverless...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, exactly, it was assumedOnNextMajor, but it's always true for serverless, so...

@elasticsearchmachine elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Dec 9, 2024
@thecoop thecoop requested review from prdoyle and mosche December 12, 2024 15:50
Copy link
Contributor

@mosche mosche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more nits / suggestions on naming, but LGTM 👍

* @see org.elasticsearch.env.BuildVersion#canRemoveAssumedFeatures
*/
public static boolean featuresCanBeAssumedForNode(DiscoveryNode node) {
return node.getBuildVersion().canRemoveAssumedFeatures();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stumbled over remove in canRemoveAssumedFeatures, that was confusing to me and I kept looking for where we remove those. Maybe turning it around makes it easier to understand: canAssumeRemovedFeatures

Copy link
Member Author

@thecoop thecoop Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not quite right though, its not that all features that are missing can be assumed, it's that the features that this ES has marked as assumed can be taken as being met in a node with this version, regardless of whether they're published by the node or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right 👍
If we can find something better than remove, it'd be great, but certainly not worth delaying this further.
canInferAssumedFeatures, canExpectAssumedFeatures, canSkipAssumedFeatures
Or more targeting the versioning side of this: hasPassedNextCompatiblityBoundary or similar.

*/
public record NodeFeature(String id) {
public record NodeFeature(String id, boolean assumedAfterNextCompatibilityBoundary) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggle coming up with something better... assumePresenceAfterNextCompatibilityBoundary, but that's even harder to read.... assumedOnNextMajor maybe?

@thecoop
Copy link
Member Author

thecoop commented Dec 12, 2024

@elasticmachine rerun elasticsearch-ci/part-1

@thecoop thecoop merged commit f5712e4 into elastic:main Dec 17, 2024
17 checks passed
@thecoop thecoop deleted the cluster-feature-majors branch December 17, 2024 13:18
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 118143

thecoop added a commit to thecoop/elasticsearch that referenced this pull request Dec 17, 2024
elastic#118143)

Allow features to be marked as 'assumed', allowing them to be removed in the next major version.
thecoop added a commit that referenced this pull request Dec 18, 2024
navarone-feekery pushed a commit to navarone-feekery/elasticsearch that referenced this pull request Dec 26, 2024
elastic#118143)

Allow features to be marked as 'assumed', allowing them to be removed in the next major version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged :Core/Infra/Core Core issues without another label :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >feature serverless-linked Added by automation, don't add manually Team:Core/Infra Meta label for core/infra team Team:Distributed Coordination Meta label for Distributed Coordination team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants