Skip to content

Conversation

martijnvg
Copy link
Member

@martijnvg martijnvg commented Dec 1, 2021

Prohibit restoring a data stream alias if its write data stream conflicts with
the write data stream of an existing data stream alias.

Currently, the write data stream of data stream alias in the snapshot is ignored and
the write data stream of the data stream alias currently active in the cluster
is always picked.

In order to resolve this properly, I've merged the merge(...) and renameDataStreams(...)
methods into a restore(...) method, which merges a data stream and does renaming at
the same time. This should be ok, since renaming can only be done as part of restoring.
More importantly, this allows to validate whether an existing alias and
an alias from a snapshot to not have conflicting write data streams while at the same time
allowing that a write data stream to be renamed (because data streams got renamed).

Closes #80976

…stream

Prohibit restoring a data stream alias if its write data stream conflicts with
the write data stream of an existing data stream alias.

Currently, the write data stream of data stream alias in the snapshot is ignored and
the write data stream of the data stream alias currently active in the cluster
is always picked.

Closes elastic#80976
@martijnvg martijnvg marked this pull request as ready for review December 2, 2021 14:12
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Dec 2, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@arteam arteam added v7.16.2 and removed v7.16.1 labels Dec 13, 2021
@danhermann danhermann self-requested a review December 14, 2021 16:40
Copy link
Contributor

@danhermann danhermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple clarifying questions below.

if (mergedDataStreams.contains(previous.getWriteDataStream())) {
writeDataStream = previous.getWriteDataStream();
}
// else throw error?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the case referenced by this comment need to be handled? I don't know how an alias would get into the state where it didn't have its write data stream, but it does seem like we shouldn't just fall through in that case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do we need to handle the case where this instance has a write data stream and the previous instance does not have a write data stream?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the case referenced by this comment need to be handled? I don't know how an alias would get into the state where it didn't have its write data stream, but it does seem like we shouldn't just fall through in that case.

Previously we were also not handling this, this is why I added the comment here. I think we should handle it and throw an IllegalStateException if this happens.

Also, do we need to handle the case where this instance has a write data stream and the previous instance does not have a write data stream?

We kind of handle this via an assertion in the constructor in DataStreamAlias. If we would handle this via a normal exception then I think it shouldn't be handled here but the mentioned constructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is possible to get into the state whether the write data stream of previous instance doesn't exist in mergedDataStreams set:

  • Defining a data stream alias with a write data stream that does't exist in the list of data streams isn't possible. Mainly because when defining a write data stream then this is done via is_write_index boolean flag from the update aliases api and templates. Also there is the assertion in DataStreamAlias class' constructor.
  • The mergedDataStreams set is based on the data streams of both this and previous instances.

I've turned this into an assertion in the latest commit that I've pushed.

@danhermann
Copy link
Contributor

Thanks, @martijnvg. LGTM.

@martijnvg martijnvg merged commit a4b232a into elastic:master Dec 17, 2021
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Dec 17, 2021
…stream (elastic#81217)

Prohibit restoring a data stream alias if its write data stream conflicts with
the write data stream of an existing data stream alias.

Currently, the write data stream of data stream alias in the snapshot is ignored and
the write data stream of the data stream alias currently active in the cluster
is always picked.

In order to resolve this properly, I've merged the merge(...) and renameDataStreams(...)
methods into a restore(...) method, which merges a data stream and does renaming at
the same time. This should be ok, since renaming can only be done as part of restoring.
More importantly, this allows to validate whether an existing alias and
an alias from a snapshot to not have conflicting write data streams while at the same time
allowing that a write data stream to be renamed (because data streams got renamed).

Closes elastic#80976
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.0
7.17 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 81217

elasticsearchmachine pushed a commit that referenced this pull request Dec 17, 2021
…stream (#81217) (#81843)

Prohibit restoring a data stream alias if its write data stream conflicts with
the write data stream of an existing data stream alias.

Currently, the write data stream of data stream alias in the snapshot is ignored and
the write data stream of the data stream alias currently active in the cluster
is always picked.

In order to resolve this properly, I've merged the merge(...) and renameDataStreams(...)
methods into a restore(...) method, which merges a data stream and does renaming at
the same time. This should be ok, since renaming can only be done as part of restoring.
More importantly, this allows to validate whether an existing alias and
an alias from a snapshot to not have conflicting write data streams while at the same time
allowing that a write data stream to be renamed (because data streams got renamed).

Closes #80976
@martijnvg martijnvg removed the v7.17.0 label Dec 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Data streams Data streams and their lifecycles Team:Data Management Meta label for data/management team v8.0.0-rc1 v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Snapshot restore requests allow you to restore alias with write data stream, even if the alias already has a write data stream
6 participants