HDDS-11463. Track and display failed DataNode storage locations in SCM. #7266

Open · slfan1989 wants to merge 12 commits into master from the HDDS-11463 branch

Conversation

@slfan1989 (Contributor) commented Oct 3, 2024:

What changes were proposed in this pull request?

Currently, SCM has no tool for tracking failed disks on DataNodes. The DataNodes already report this information; SCM just needs to display it.

This PR displays the failed disks on each DataNode. The information can be shown in the default format, as JSON, or as a table.

Default format

Datanode Volume Failures (5 Volumes)

Node         : localhost-62.238.104.185 (de97aaf3-99ad-449d-ad92-2c4f5a744b49) 
Failed Volume: /data0/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-163.120.165.68 (cf40e987-8952-4f7a-88b7-096e6b285243) 
Failed Volume: /data1/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-253.243.206.120 (0cc77921-489d-4cf0-a036-475faa16d443) 
Failed Volume: /data2/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-136.194.243.81 (5cb6430d-0ce5-4204-b265-179ee38fb30e) 
Failed Volume: /data3/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-48.253.209.226 (f99a8374-edb0-419d-9cba-cfab9d9e8a2e) 
Failed Volume: /data4/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

JSON format

[ {
  "node" : "localhost-161.170.151.131 (155bb574-7ed8-41cd-a868-815f4c2b0d60)",
  "volumeName" : "/data0/ozonedata/hdds",
  "failureDate" : 1727918794694,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-67.218.46.23 (520d29eb-8387-4cda-bcb1-8727fdddd451)",
  "volumeName" : "/data1/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-30.151.88.21 (d66cab50-bbf8-4199-9d7f-82da84a30137)",
  "volumeName" : "/data2/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-78.50.38.217 (a673f50a-6f74-4e62-8c0c-f7337d5f3ce5)",
  "volumeName" : "/data3/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-138.205.52.25 (84b7e49a-9bd4-4115-96fa-69f2d259343c)",
  "volumeName" : "/data4/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
} ]

Table format

+-------------------------------------------------------------------------------------------------------------------------------------------+
|                                                         Datanode Volume Failures                                                          |
+------------------------------------------------------------------+-----------------------+---------------+--------------------------------+
|                               Node                               |      Volume Name      | Capacity Lost |          Failure Date          |
+------------------------------------------------------------------+-----------------------+---------------+--------------------------------+
|  localhost-83.212.219.28 (8b6addb1-759a-49e9-99fb-0d1a6cfb2d7f)  | /data0/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
| localhost-103.199.236.47 (0dbe503a-3382-4753-b95a-447bab5766c4)  | /data1/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
|  localhost-178.123.46.32 (2017076a-e763-4f47-abce-78535b5770a3)  | /data2/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
| localhost-123.112.235.228 (aaebb6a7-6b62-4160-9934-b16b8fdde65e) | /data3/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
| localhost-249.235.216.19 (cbc7c0b5-5ae0-4e40-91b8-1d9c419a007c)  | /data4/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
+------------------------------------------------------------------+-----------------------+---------------+--------------------------------+

What is the link to the Apache JIRA

JIRA: HDDS-11463. Track and display failed DataNode storage locations in SCM.

How was this patch tested?

Added JUnit tests and verified the change in a test environment.

@slfan1989 slfan1989 marked this pull request as ready for review October 5, 2024 11:48
@slfan1989 slfan1989 marked this pull request as draft October 6, 2024 01:46
@slfan1989 slfan1989 closed this Oct 22, 2024
@slfan1989 slfan1989 reopened this Oct 23, 2024
@slfan1989 slfan1989 marked this pull request as ready for review October 23, 2024 04:41
@slfan1989 (Contributor Author):

@errose28 Could you please help review this PR? Thank you very much! We discussed the relevant implementation together in HDDS-11463.

@errose28 (Contributor) left a comment:

Thanks for working on this @slfan1989, this looks like a useful addition. I only had time for a quick high-level look for now.

* Handler of ozone admin scm volumesfailure command.
*/
@Command(
name = "volumesfailure",
Contributor:

For the CLI, we should probably use something like ozone admin datanode volume list. The datanode subcommand is already used to retrieve information about datanodes from SCM. Splitting the commands so that volume has its own subcommand gives us more options in the future.

To distinguish failed and healthy volumes and filter out different nodes, we can either add some kind of filter flag, or leave it up to grep/jq to be applied to the output.
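
For the grep/jq route, a minimal sketch, assuming the suggested `ozone admin datanode volume list` command and the JSON fields shown in the PR description (neither is finalized):

```shell
# Show only the failed-volume entries for one node; the command name,
# --json flag, and "node" field are assumptions from this thread.
ozone admin datanode volume list --json \
  | jq '[.[] | select(.node | test("localhost-62"))]'
```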

Contributor:

This also means we should make the RPC more generic to support pulling all volume information.

Contributor Author:

Thank you for helping to review this PR! I will continue to improve the relevant code based on your suggestions.

@@ -382,6 +383,7 @@ public abstract static class Builder<T extends Builder<T>> {
private boolean failedVolume = false;
private String datanodeUuid;
private String clusterID;
private long failureDate;
Contributor:

Let's use failureTime. I'm assuming this is being stored as millis since epoch, so it will have date and time information.

Contributor Author:

I have improved the relevant code.

```java
// Ensure it is set only once,
// which is the time when the failure was first detected.
if (failureDate == 0L) {
  setFailureDate(Time.now());
}
```
Contributor:

Let's use Instant.now() per HDDS-7911.
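
A minimal sketch of that change, assuming the field stays as epoch millis and is renamed to failureTime per the thread above:

```java
// Use java.time.Instant instead of the Hadoop Time utility, per
// HDDS-7911; record only the first detection time. Field and setter
// names are assumed from the diff context above.
if (failureTime == 0L) {
  setFailureTime(Instant.now().toEpochMilli());
}
```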

Contributor Author:

@errose28 Can you help review this PR again? Thank you very much!

@adoroszlai adoroszlai marked this pull request as draft November 5, 2024 18:30
@adoroszlai (Contributor):

Thanks @slfan1989 for working on this. Converted it to draft because there is a failing test:

[ERROR] org.apache.hadoop.hdds.scm.node.TestSCMNodeManager.tesVolumeInfoFromNodeReport  Time elapsed: 1.105 s  <<< ERROR!
java.lang.UnsupportedOperationException
	at java.base/java.util.AbstractList.add(AbstractList.java:153)
	at java.base/java.util.AbstractList.add(AbstractList.java:111)
	at org.apache.hadoop.hdds.scm.node.DatanodeInfo.updateStorageReports(DatanodeInfo.java:186)
	at org.apache.hadoop.hdds.scm.node.SCMNodeManager.processNodeReport(SCMNodeManager.java:674)
	at org.apache.hadoop.hdds.scm.node.SCMNodeManager.register(SCMNodeManager.java:423)
	at org.apache.hadoop.hdds.scm.node.SCMNodeManager.register(SCMNodeManager.java:360)
	at org.apache.hadoop.hdds.scm.node.TestSCMNodeManager.tesVolumeInfoFromNodeReport(TestSCMNodeManager.java:1591)

https://github.com/slfan1989/ozone/actions/runs/11471452180
https://github.com/slfan1989/ozone/actions/runs/11476535807
https://github.com/slfan1989/ozone/actions/runs/11625983369

@slfan1989 slfan1989 force-pushed the HDDS-11463 branch 2 times, most recently from a83a8f7 to b1df492 Compare November 6, 2024 09:26
@slfan1989 (Contributor Author):

> Thanks @slfan1989 for working on this. Converted it to draft because there is a failing test: [...]

@adoroszlai Thank you for reviewing this PR! I am currently making improvements, and once the changes pass the CI tests in my branch, I will reopen the PR.

cc: @errose28

@slfan1989 slfan1989 force-pushed the HDDS-11463 branch 2 times, most recently from 8bc7ae0 to ff43fac Compare November 7, 2024 08:37
@slfan1989 (Contributor Author):

@adoroszlai Thank you for reviewing this PR! I will also pay closer attention to CI issues in future development. I understand that CI testing resources are valuable.

I have made improvements to the code based on @errose28's suggestions and also fixed the related unit test errors. The CI for my branch has passed (https://github.com/slfan1989/ozone/actions/runs/11719380711), and I have updated the PR status to 'Ready for Review'.

@slfan1989 slfan1989 marked this pull request as ready for review November 7, 2024 23:48
@adoroszlai adoroszlai requested a review from errose28 November 8, 2024 04:59
@slfan1989 (Contributor Author):

@errose28 Could you please help review this PR again? Thank you very much! I’ve made some additional improvements to this PR, as we wanted to print all the disk information. However, since there’s quite a lot of disk data, I’ve added pagination functionality.

@adoroszlai (Contributor):

Temporarily converted to draft and assigned to myself, to resolve conflicts.

@slfan1989 (Contributor Author):

@adoroszlai Thank you for your attention to this PR. I will continue to follow up on it.

@adoroszlai adoroszlai removed their assignment Feb 18, 2025
@adoroszlai (Contributor):

Merged from master. There will be one checkstyle problem:

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/datanode/package-info.java
 18: Missing a Javadoc comment.
 18: Missing javadoc for package-info.java file.

Previously, the license header in this new file was a javadoc comment, so the problem was hidden.

@adoroszlai (Contributor) left a comment:

Thanks @slfan1989 for the patch.

private String uuid;

// The HostName identifier of the DataNode.
@Option(names = { "--hostName" },
Contributor:

Please avoid camelCase for options. (Also for --displayMode.)

Suggested change:
```diff
- @Option(names = { "--hostName" },
+ @Option(names = { "--hostname" },
```

Contributor:

This still applies to the latest patch.

Comment on lines 61 to 72
```java
// PageSize refers to the number of items displayed per page
// in a paginated view.
@Option(names = { "--pageSize" },
    defaultValue = "20",
    description = "The number of volume information items displayed per page.")
private int pageSize;

// The current page.
@Option(names = { "--currentPage" },
    defaultValue = "1",
    description = "The current page.")
private int currentPage;
```
Contributor:

Other Ozone CLI commands allow pagination (via ListOptions reusable mixin) with --start to specify the start item, --length for "page size", and --all (only one of these last two are allowed). Please try to use options consistent with that. Using ListOption directly may not work, since --prefix does not seem to be applicable here.
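
For illustration, usage consistent with ListOptions might look like this (the command name is assumed from the discussion above):

```shell
# Page through volumes 20 at a time, resuming after an anchor item,
# or dump everything at once; flags mirror other Ozone list commands.
ozone admin datanode volume list --length 20 --start <anchor>
ozone admin datanode volume list --all
```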

Contributor Author:

> Other Ozone CLI commands allow pagination (via ListOptions reusable mixin) with --start to specify the start item, --length for "page size", and --all (only one of these last two are allowed). Please try to use options consistent with that.

I have thoroughly reviewed ListOptions, and it is indeed a powerful utility class. However, my requirement differs slightly from its original functionality: I want a paginated display, since there may sometimes be a fair number of damaged disks. For example, with 30 damaged disks and a limit of 10, currentPage can range from 1 to 3, allowing pagination; currentPage represents the current page number. Therefore, I plan to add a currentPage option to ListOptions to meet this need.

> Using ListOption directly may not work, since --prefix does not seem to be applicable here.

The --start and --prefix options have limited relevance for our functionality. Currently, we support filtering by hostName, which I believe should be sufficient. I'm unsure whether we need to filter by disk letter as well. What do you think?

Contributor (@adoroszlai) commented May 8, 2025:

The problem with numeric "current page" is that the list is not fixed. By the time you request the next page, it may be shorter or longer. Then you may get the same item twice (if new item is added in the range of the earlier pages) or not at all (if item is removed from earlier pages).

Using an anchor item eliminates those problems by allowing the size of "previous pages" to change.

Compare GitHub's list of commits, which uses commit SHA as anchor:

https://github.com/apache/ozone/commits/master?after=e6daae4af3f9839be52ea6db636c9ff434acf306+34

to list of pull requests, which has simple numeric page:

https://github.com/apache/ozone/pulls?page=2

Created HDDS-12995 to move out --prefix, to allow reusing ListOptions for lists that are filtered by other parameters.
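
A rough sketch of the anchor approach in Java (the list type, getter, and method names are illustrative assumptions, not the PR's actual API):

```java
// Resume listing after the anchor item; if the anchor is absent, the
// page starts from the beginning. This avoids most of the duplicate/skip
// problems of numeric pages described above, because the size of the
// "previous pages" is free to change between requests.
static List<VolumeInfo> page(List<VolumeInfo> volumes, String anchor, int length) {
  int start = 0;
  if (anchor != null && !anchor.isEmpty()) {
    for (int i = 0; i < volumes.size(); i++) {
      if (volumes.get(i).getHostName().equals(anchor)) {
        start = i + 1;  // first item of the next page
        break;
      }
    }
  }
  return volumes.subList(start, Math.min(start + length, volumes.size()));
}
```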

Contributor Author:

Thank you for your detailed explanation! I now understand the reliability of using anchor points and plan to make the corresponding improvements in the code. Regarding HDDS-12995, I understand that you mean to move the --prefix option from its current position, specifically placing it within the command, rather than having it as a global option. Am I understanding this correctly?

Contributor:

Yes.


Comment on lines 49 to 59
```java
// Display it in JSON format.
@Option(names = { "--json" },
    defaultValue = "false",
    description = "Format output as JSON.")
private boolean json;

// Display it in TABLE format.
@Option(names = { "--table" },
    defaultValue = "false",
    description = "Format output as Table.")
private boolean table;
```
Contributor:

Please make these exclusive, like:

```java
@CommandLine.ArgGroup(exclusive = true)
private ExclusiveLimit exclusiveLimit = new ExclusiveLimit();

static class ExclusiveLimit {
  @CommandLine.Option(names = {"--length", "-l"},
      description = "Maximum number of items to list",
      defaultValue = "100",
      showDefaultValue = CommandLine.Help.Visibility.ALWAYS)
  private int limit;

  @CommandLine.Option(names = {"--all", "-a"},
      description = "List all results",
      defaultValue = "false")
  private boolean all;
}
```

Contributor Author:

That's a good suggestion! I will improve the code based on this recommendation.

Comment on lines 89 to 90
description = "failed is used to display failed disks, " +
"normal is used to display normal disks.")
Contributor:

The description seems to be a duplicate of --displayMode's. (Same for --hostName.)

Contributor Author:

I will improve the help information for displayMode, hostname, and uuid.


Comment on lines 67 to 68
System.setOut(new PrintStream(outContent, false, DEFAULT_ENCODING));
System.setErr(new PrintStream(errContent, false, DEFAULT_ENCODING));
Contributor:

Do we need to manually set out/err when using SystemOutCapturer?

Contributor Author:

This is a good question! After using SystemOutCapturer, we no longer need to manually set out/err. I have already improved this part of the code.

Comment on lines 144 to 160
```java
private HddsProtos.DatanodeDetailsProto createDatanodeDetails() {
  Random random = ThreadLocalRandom.current();
  String ipAddress = random.nextInt(256)
      + "." + random.nextInt(256)
      + "." + random.nextInt(256)
      + "." + random.nextInt(256);

  DatanodeDetails.Builder dn = DatanodeDetails.newBuilder()
      .setUuid(UUID.randomUUID())
      .setHostName("localhost" + "-" + ipAddress)
      .setIpAddress(ipAddress)
      .setPersistedOpState(HddsProtos.NodeOperationalState.IN_SERVICE)
      .setPersistedOpStateExpiry(0);

  for (DatanodeDetails.Port.Name name : ALL_PORTS) {
    dn.addPort(DatanodeDetails.newPort(name, 0));
  }
```
Contributor:

Use MockDatanodeDetails.


@@ -49,7 +51,7 @@ public class DatanodeInfo extends DatanodeDetails {
private volatile long lastHeartbeatTime;
private long lastStatsUpdatedTime;
private int failedVolumeCount;

private List<VolumeInfoProto> volumeInfos;
Contributor:

volumeInfos is never assigned. I think the intention was to update this variable in updateStorageReports in the block holding writeLock.
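
As a sketch of that fix (field names taken from the diff context; the lock is assumed to match updateStorageReports' existing structure):

```java
// Inside DatanodeInfo#updateStorageReports: update both lists while
// holding the write lock, so readers never see a stale volumeInfos
// alongside fresh storageReports.
lock.writeLock().lock();
try {
  this.storageReports = storageReports;
  this.volumeInfos = volumeInfos;  // previously never assigned
} finally {
  lock.writeLock().unlock();
}
```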

Contributor Author:

Thank you for pointing out this issue! I hadn't noticed these details before.

Comment on lines +43 to +47
name = "volumes",
description = "Display the list of volumes on the DataNode.",
mixinStandardHelpOptions = true,
versionProvider = HddsVersionProvider.class)
public class VolumeSubCommand extends ScmSubcommand {
Contributor:

Currently this subcommand is dangling, not added to any parent. If you add it somewhere, it will be available as ozone ... volumes. Suggestion from @errose28 was to add this as ozone admin datanode volume list, which allows adding other subcommands related to datanode volumes later.

Subcommands are added like this:

```java
subcommands = {
    ListInfoSubcommand.class,
    DecommissionSubCommand.class,
    MaintenanceSubCommand.class,
    RecommissionSubCommand.class,
    StatusSubCommand.class,
    UsageInfoSubcommand.class
})
```
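
For concreteness, registering the new command there might look like this (a sketch only; the parent command's actual annotation may differ):

```java
// Hypothetical registration on the existing datanode command group,
// extending the class list above with the new volume subcommand.
@Command(name = "datanode",
    subcommands = {
        ListInfoSubcommand.class,
        DecommissionSubCommand.class,
        MaintenanceSubCommand.class,
        RecommissionSubCommand.class,
        StatusSubCommand.class,
        UsageInfoSubcommand.class,
        VolumeSubCommand.class  // exposes "ozone admin datanode volumes"
    })
```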

@slfan1989 (Contributor Author):

@adoroszlai Thank you very much for reviewing this PR! I will continue to improve it.

@slfan1989 slfan1989 marked this pull request as ready for review May 23, 2025 14:24
@slfan1989 (Contributor Author):

@adoroszlai Could you help review this PR? Thank you very much!

I have conducted tests on my own branch, and it currently passes the key CI tests.

https://github.com/slfan1989/ozone/actions/runs/15209585380

@adoroszlai adoroszlai self-requested a review May 23, 2025 15:03

```java
// If startItem is specified, find its position in the volumeInfos list.
int startIndex = 0;
if (StringUtils.isNotBlank(startItem)) {
```
Contributor Author (@slfan1989) commented May 23, 2025:

@adoroszlai I added logic to skip startItem in this part of the code, but after thinking it through, I realized it’s better to use the server’s hostname or UUID as startItem instead of a disk prefix. That’s because many machines name their disks like data0 to data9, and using a disk name could lead to unexpected filtering behavior.

@slfan1989 (Contributor Author):

@adoroszlai Can we move forward with this PR? I would appreciate your advice.

@adoroszlai (Contributor) left a comment:

Thanks @slfan1989 for updating the patch.

Comment on lines 44 to 52
```java
private static final Codec<VolumeInfo> CODEC = new DelegatedCodec<>(
    Proto2Codec.get(HddsProtos.VolumeInfoProto.getDefaultInstance()),
    VolumeInfo::fromProtobuf,
    VolumeInfo::getProtobuf,
    VolumeInfo.class);

public static Codec<VolumeInfo> getCodec() {
  return CODEC;
}
```
Contributor:

Codec is required only for storing in DB, but VolumeInfo does not seem to be persisted by either datanode or SCM. So I think this can be removed.

Contributor Author:

I have removed the CODEC.

@@ -304,6 +304,15 @@ message RemoveScmResponseProto {
optional string scmId = 2;
}

message VolumeInfoProto {
optional string uuid = 1;
Contributor:

Please use DatanodeIDProto.

*/
public final class VolumeInfo implements Comparable<VolumeInfo> {

private String uuid;
Contributor:

Please use DatanodeID.


@Override
public int compareTo(VolumeInfo that) {
Preconditions.checkNotNull(that);
Contributor:

nit: prefer builtin Objects.requireNonNull

* @throws IOException
* I/O exceptions that may occur during the process of querying the volume.
*/
StorageContainerLocationProtocolProtos.GetVolumeInfosResponseProto getVolumeInfos(
Contributor:

nit: please import GetVolumeInfosResponseProto instead of StorageContainerLocationProtocolProtos.

private String uuid;

// The HostName identifier of the DataNode.
@Option(names = { "--hostName" },
Contributor:

This still applies to the latest patch.

@CommandLine.Mixin
private ListPaginationOptions listOptions;

enum DISPLAYMODE { all, normal, failed }
Contributor:

nit: Enums should be named like other types (classes, interfaces): DisplayMode.

Also, please consider using all-caps for values (ALL, etc.)

Comment on lines 63 to 67
@Option(names = { "--displayMode" },
defaultValue = "all",
description = "Display mode for disks: 'failed' shows failed disks, " +
"'normal' shows healthy disks, 'all' shows all disks.")
private DISPLAYMODE displayMode;
Contributor:

On another look, I think "display mode" is confusing. JSON and Table are display modes. "all/normal/failed" filter the list by volume state.

I suggest renaming to --state, and renaming normal to healthy.

Then description can be simplified to Filter disks by state.
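
A sketch of the suggested shape (picocli; the enum and option names follow the suggestion above, not merged code):

```java
// Filter disks by state rather than a "display mode".
enum State { ALL, HEALTHY, FAILED }

@Option(names = { "--state" },
    defaultValue = "ALL",
    description = "Filter disks by state.")
private State state;
```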


// If displayed in JSON format.
if (json) {
System.out.print(JsonUtils.toJsonStringWithDefaultPrettyPrinter(volumeInfos));
Contributor:

This is still applicable.

Also, please use println, to avoid situation like HDDS-13100.

Comment on lines 57 to 60
@AfterEach
public void tearDown() {
}

Contributor:

nit: unnecessary

@errose28 (Contributor):

Hi @slfan1989, thanks for working on this change. I think there are three things being added here which should be reviewed separately:

  1. Adding an SCM RPC to retrieve volume information
  2. Tracking failure time of the volume
  3. Adding a CLI to view the volume information

Regardless of the other two items, the RPC to retrieve volume information is definitely required going forward in order to create some sort of CLI to query volume state.

Tracking the failure time of the volume seems like a somewhat invasive change since it spans the datanode, heartbeat, and SCM. Is this necessary, or is it enough to depend on a metrics database to track timing of cluster events? Of course we need improvements to our volume metrics as well as mentioned in #8405.

On the CLI front, I do think we need a dedicated ozone admin datanode info command going forward as outlined in HDDS-13097. This would give all volume information per node. With volume counters added to ozone admin datanode list as proposed in HDDS-13096, we could get all failed volumes in a two step process:

  1. jq filter on ozone admin datanode list to find all nodes with failed volumes.
  2. jq filter on ozone admin datanode info to get specific information about the failed volumes, including their capacity.
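
Sketched concretely (all flags and field names below are hypothetical, pending HDDS-13096/HDDS-13097):

```shell
# Step 1: nodes with failed volumes (assumes a failedVolumeCount field).
ozone admin datanode list --json \
  | jq -r '.[] | select(.failedVolumeCount > 0) | .uuid'

# Step 2: failed-volume details for one such node (assumes a volumes array).
ozone admin datanode info <uuid> --json \
  | jq '.volumes[] | select(.failed)'
```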

Do we need a dedicated ozone admin datanode volume list/info command pairing in addition to this? It may be useful to have such cross-cutting commands to get information in one shot, but on the other hand it may result in duplication at the CLI. For example I could see the request to add node filtering to ozone admin datanode volume list/info at which point it becomes much the same as ozone admin datanode list/info.

@slfan1989 (Contributor Author):

@errose28 Thank you for your message! I'd like to share some thoughts from a different perspective. As it stands, this feature does not conflict with the proposal in #8405. #8405 represents a more innovative and forward-looking design, and although it's still under discussion, it will certainly be valuable if implemented as planned.

At the same time, I believe this feature does not impact HDDS-13096 or HDDS-13097. My comment on #8405 was more about expressing expectations for the system’s future capabilities — I hope Ozone can gradually support such features — rather than raising any objections to #8405 itself.

The design of #7266 is inspired by HDFS's disk failure detection mechanism, with the goal of improving the system's ability to identify and locate failed disks. For users migrating from HDFS to Ozone, using the volume command to directly view failed disks can offer a more intuitive and convenient operational experience.

From my perspective, we all play different roles in this project. Your team focuses on evolving and optimizing the system's architecture, while we, as external users, are more focused on refining specific functional details based on real-world use. Ultimately, however, we share the same goal: to make Ozone more robust, more user-friendly, and more widely adopted.

Naturally, it's not easy to fully align these detail-oriented changes with larger, ongoing feature developments — for example, making #7266 fully consistent with #8405. This is mainly because #8405 is broader in scope, with a longer timeline, whereas #7266 focuses on a very specific aspect. While we fully respect the overall direction, we also hope to move forward with some smaller, incremental improvements to address current practical issues.

In addition to this PR, we're also working on several other enhancements. For instance, we've implemented mechanisms to collect DataNode I/O statistics to more precisely manage container replication. We've also introduced time-based peak/off-peak control logic for various DataNode management operations (such as deletion, container replication, and EC container reconstruction). These improvements are driven by real-world production needs, and from our perspective, they've shown positive results.

However, since many of these PRs have some degree of code coupling with our previous contributions, it's difficult for us to combine everything into a single, unified patch for upstream submission.

Therefore, we hope to proceed with #7266 for now. If #8405 later results in a more complete or improved solution, we’d be happy to continue refining things in that direction. In the meantime, this also gives us a valuable opportunity to participate in the community and contribute to Ozone’s development.

@slfan1989 (Contributor Author):

@adoroszlai Thank you very much for reviewing the code. I will make improvements based on your suggestions. @errose28's comments are essentially not in conflict with #7266, and I'm looking forward to seeing #7266 progress so that we can move forward with the subsequent work.

@slfan1989 (Contributor Author):

@adoroszlai Thank you very much for your detailed suggestions! I've made the changes accordingly. Could you review this PR again? Thank you very much! I respect @errose28's perspective. However, I believe this PR does not conflict with #8405, nor with HDDS-13096 or HDDS-13097 — they can coexist. We've already spent considerable time reviewing this PR together, and I'd like to continue moving it forward.

@slfan1989 (Contributor Author):

@adoroszlai @errose28 Can I still continue to follow up on this PR? I feel that I've put in some effort, but right now I've lost a clear direction on how to proceed. According to @errose28's suggestion, this PR only needs to keep the RPC part, but I'm not sure how to continue working on the related functionality from here.

@adoroszlai (Contributor):

@slfan1989 Thanks for all your efforts on this PR. The concerns/suggestions raised by @errose28 make sense though. Please try to reach agreement.

I won't be able to re-review until next week in any case.

@slfan1989 (Contributor Author):

> @slfan1989 Thanks for all your efforts on this PR. The concerns/suggestions raised by @errose28 make sense though. Please try to reach agreement. [...]

@adoroszlai Thank you very much for your message and for your continued support and assistance! Since @errose28 is currently planning some new features, I believe this PR could be considered as part of that effort, especially given the amount of work we've already invested. As for which specific features should be retained, it would be helpful if @errose28 could review and provide guidance.
