HDDS-13062. Upon OM startup, verify ozone.metadata.dirs and ozone.om.db.dirs are in the same mount #8502

Draft
wants to merge 1 commit into
base: master

smengcl
Contributor

@smengcl smengcl commented May 22, 2025

WIP. Review not recommended yet.

What changes were proposed in this pull request?

OM bootstrapping relies on hard links. This implies that the snapshot directory must be on the same filesystem as the OM DB directory. Otherwise, bootstrapping throws an exception and gets stuck in a retry loop, as seen in the HDDS-13062 JIRA description.
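To illustrate why the filesystem matters, here is a minimal, standalone sketch of hard-link behavior using only `java.nio.file`. It is not code from this PR; the class name `HardLinkSketch` and the file names are hypothetical. Both names point at the same underlying data, which is only possible when they live on one filesystem. On Linux, calling `Files.createLink` across two different filesystems is expected to fail with an `IOException`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not part of the Ozone code base.
final class HardLinkSketch {

  /**
   * Creates a file and a hard link to it in the same temporary directory,
   * rewrites the file through its original name, and reports whether the
   * new content is visible through the link.
   */
  static boolean linkSharesData() throws IOException {
    Path dir = Files.createTempDirectory("hardlink-sketch");
    Path original = dir.resolve("checkpoint.sst");
    Path link = dir.resolve("checkpoint-link.sst");
    try {
      Files.write(original, "v1".getBytes(StandardCharsets.UTF_8));
      // Both paths are in the same directory, hence on the same filesystem.
      Files.createLink(link, original);
      // Default options truncate and rewrite the existing file in place.
      Files.write(original, "v2".getBytes(StandardCharsets.UTF_8));
      String seen = new String(Files.readAllBytes(link), StandardCharsets.UTF_8);
      return "v2".equals(seen);
    } finally {
      Files.deleteIfExists(link);
      Files.deleteIfExists(original);
      Files.deleteIfExists(dir);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("link shares data: " + linkSharesData());
  }
}
```

A startup check that rejects a cross-filesystem layout up front avoids discovering this failure only later, inside the bootstrap retry loop.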

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-13062

How was this patch tested?

  • Additional tests are added at each level.

@smengcl smengcl requested a review from Copilot May 22, 2025 21:21
@Copilot Copilot AI left a comment

Pull Request Overview

This PR ensures that the OM bootstrapping process verifies that the directories for ozone.metadata.dirs and ozone.om.db.dirs reside on the same mount point, which is required for hard links to function properly.

  • Updated the directory initialization in OzoneManager.java to check mount point consistency.
  • Added integration tests and unit tests in TestOzoneManagerConfiguration.java and TestOmUtils.java to verify the new logic.
  • Added a TODO comment in ServerUtils.java for potential helper relocation.
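One way such a mount point consistency check could look is sketched below. This is an assumption-laden illustration, not the `OmUtils` helper added by the PR: the class `MountCheckSketch` and the methods `nearestExisting` and `onSameFileStore` are hypothetical names, and the sketch compares `java.nio.file.FileStore` instances rather than parsing mount tables. Because configured directories may not exist yet at startup, it walks up to the nearest existing ancestor before asking for the file store.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not the helper introduced in OmUtils.
final class MountCheckSketch {

  /** Returns the closest ancestor of p (or p itself) that exists, or null. */
  static Path nearestExisting(Path p) {
    Path current = p.toAbsolutePath().normalize();
    while (current != null && !Files.exists(current)) {
      current = current.getParent();
    }
    return current;
  }

  /**
   * Returns true if both paths resolve to the same FileStore, which is the
   * precondition for creating hard links between them.
   */
  static boolean onSameFileStore(Path a, Path b) throws IOException {
    Path existingA = nearestExisting(a);
    Path existingB = nearestExisting(b);
    if (existingA == null || existingB == null) {
      throw new IOException("No existing ancestor for " + a + " or " + b);
    }
    FileStore storeA = Files.getFileStore(existingA);
    FileStore storeB = Files.getFileStore(existingB);
    return storeA.equals(storeB);
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("mount-check");
    System.out.println(onSameFileStore(base.resolve("meta"), base.resolve("db")));
  }
}
```

A caller would run this once at startup with the two configured directories and fail fast with a message naming both configuration keys when it returns false.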

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java | Updated error messaging and added mount point check for metadata and DB directories. |
| hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerConfiguration.java | Added tests to verify same and different mount point conditions during startup. |
| hadoop-ozone/common/src/test/java/org/apache/hadoop/ozone/TestOmUtils.java | Added unit test for the mount point check helper function in OmUtils. |
| hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/OmUtils.java | Introduced helper functions to determine the mount point of files. |
| hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/server/ServerUtils.java | Added a TODO comment regarding helper functions relocation. |

Comment on lines +1591 to +1592
// omMetaDir and omRatisDirectory must be under the same mount point for hard links to work properly
if (!OmUtils.isUnderSameMountPoint(omMetaDir, omRatisSnapshotDir)) {
Copilot AI May 22, 2025

The mount point check compares 'omMetaDir' with 'omRatisSnapshotDir', but the error message mentions ozone.metadata.dirs and ozone.om.db.dirs. Please verify that the correct directories are being compared; if the intent is to ensure the metadata and DB directories are on the same mount point, consider replacing 'omRatisSnapshotDir' with the appropriate variable.

Suggested change
- // omMetaDir and omRatisDirectory must be under the same mount point for hard links to work properly
- if (!OmUtils.isUnderSameMountPoint(omMetaDir, omRatisSnapshotDir)) {
+ // omMetaDir and omDbDir must be under the same mount point for hard links to work properly
+ if (!OmUtils.isUnderSameMountPoint(omMetaDir, omDbDir)) {
