-
Notifications
You must be signed in to change notification settings - Fork 537
HDDS-8436. Support setSafeMode(), isFileClosed() FileSystem API #4825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
Outdated
Show resolved
Hide resolved
be98bf6
to
613acce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jojochuang , thanks a lot for working on this! The change looks good. Please see the comments inlined.
case LEAVE: | ||
case FORCE_EXIT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, we only support FORCE_EXIT
. We should support LEAVE
later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For LEAVE
, it should exit safe mode only if preCheck is true. We should make the change separately.
//SCMSafeModeManager
@@ -237,20 +237,25 @@ public void completePreCheck(EventPublisher eventQueue) {
* 2. Emits START_REPLICATION for ReplicationManager.
* 3. Cleanup resources.
* 4. Emit safe mode status.
- * @param eventQueue
*/
@VisibleForTesting
- public void exitSafeMode(EventPublisher eventQueue) {
- LOG.info("SCM exiting safe mode.");
- // If safemode is exiting, then pre check must also have passed so
- // set it to true.
- setPreCheckComplete(true);
+ public boolean exitSafeMode(boolean force) {
+ LOG.info("SCM exiting safe mode, force? {}", force);
+ if (force) {
+ // If force exiting, set preCheck to true.
+ setPreCheckComplete(true);
+ } else {
+ if (!getPreCheckComplete()) {
+ return false;
+ }
+ }
setInSafeMode(false);
// TODO: Remove handler registration as there is no need to listen to
// register events anymore.
emitSafeModeStatus();
+ return true;
}
if (!ofsPath.isKey()) { | ||
throw new IOException("not a file"); | ||
} | ||
OzoneFileStatus status = bucket.getFileStatus(pathStr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check status.isFile()
.
if (ofsPath.isSnapshotPath()) { | ||
throw new IOException("file is in a snapshot."); | ||
} else { | ||
OzoneFileStatus status = bucket.getFileStatus(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check status.isFile()
.
@Override | ||
public boolean setSafeMode(SafeModeAction action, boolean isChecked) | ||
throws IOException { | ||
statistics.incrementWriteOps(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if the action is not GET
before incrementWriteOps
.
@Override | ||
public boolean setSafeMode(SafeModeAction action, boolean isChecked) | ||
throws IOException { | ||
statistics.incrementWriteOps(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if the action is not GET
before incrementWriteOps
.
@Override | ||
public boolean setSafeMode(SafeModeAction action, boolean isChecked) | ||
throws IOException { | ||
statistics.incrementWriteOps(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if the action is not GET
before incrementWriteOps
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jojochuang , thanks for the update!
We need to check GET
before incrementWriteOp
in two of RootedOzoneFileSystem
s. BTW, is it possible to move the setSafeMode
method to Basic RootedOzoneFileSystem
?
@Override | ||
public boolean setSafeMode(SafeModeAction action, boolean isChecked) | ||
throws IOException { | ||
statistics.incrementWriteOps(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if the action is not GET
before incrementWriteOps
.
We can't, because Hadoop 2 does not have SafeMode interface (introduced in Hadoop 3.3.6). So I can only put it where Ozone is not run with Hadoop 2 runtime. |
But as it turns out, we test MR jobs with Hadoop 3.1 3.2 and 3.3 runtime, and the MR job tests fail with Hadoop 3.1 and 3.2 (due to the lacking of SafeMode and LeaseRecoverable interface). So either we find a workaround or remove the Hadoop 3.1 3.2 tests. |
Change-Id: Iba9c50fac4da3f90acce8168758f4284e47825cb Compiles Change-Id: I53b7c4dbe33735ca3b39e4502be983134b05c0fe Compiles Change-Id: I699803f225460713f8f47371a3c45b4e10cf9cdf Completed safe mode implementation. Change-Id: I57ba72625c98bf7faa5c6e54526a7d61d7731fee Checkstyle. Change-Id: I1ffd5f1b2756fad7b03864cc3725e9d41cd72b10 Add missing @OverRide annotations, increment statistics properly. Change-Id: Idb45f85b8a8f72776f393a8a0f22eb23e72f815b
Change-Id: I1d0260c14c3dac2ed28f6ea86730183c0630174f
Change-Id: I4ebc4360dde010c9906b292e0e32008146759aa3
Change-Id: I5fec0e5e7653a1070b38be012e17d476fd865a7d
Change-Id: I48f884edac8092af910397536754cfda809c399a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 the change looks good.
* master: (36 commits) HDDS-8990. Intermittent timeout waiting on datanode4 9856 to become available (apache#5039) Revert "HDDS-7750. Incorrect WRITE ACL check. (apache#4992)" HDDS-7750. Incorrect WRITE ACL check. (apache#4992) HDDS-8985. Intermittent timeout exiting safe mode in HA secure tests (apache#5033) HDDS-8593. Add RootCARotationPoller to CertClient (apache#5030) HDDS-7645. Kubernetes check should fail fast if cluster cannot start (apache#5028) HDDS-8981. TestRootedOzoneFileSystem runs out of disk space (apache#5029) HDDS-8592. Fetch and save all root certificates during service's certificate rotation. (apache#5025) HDDS-8981. Disable TestRootedOzoneFileSystem#testSafeMode HDDS-8591. Create scheduler to check for new root ca certificates (apache#4961) HDDS-8979. error validating kustomization.yaml (apache#5024) HDDS-8973. Ozone SCM HA should not allocates duplicate IDs when transferring leadership (apache#5018) HDDS-8970. Snapshot Diff should return path relative to bucket root (apache#5015) HDDS-8975. Clarify SCM HA auto-bootstrap doc (apache#5021) HDDS-8689. Rotate Root CA and Sub CA in SCM. (apache#4943) HDDS-8436. Support setSafeMode(), isFileClosed() FileSystem API (apache#4825) HDDS-8880. Intermittent fork timeout in TestOMRatisSnapshots (apache#5022) HDDS-8962. Ensure docker env is stopped (apache#5011) HDDS-7794. [snapshot] SnapshotDiff should throw better error messages for exception handling (apache#5007) HDDS-7922. [FSO] S3G folder support fso layout filestatus s3A compatibility (apache#4448) ...
What changes were proposed in this pull request?
Implement new Hadoop file system APIs to support HBase.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8436 Support setSafeMode(), isFileClosed() FileSystem API
How was this patch tested?
Unit tests