
HDDS-13045. Implement Immediate Triggering of Heartbeat when Volume Full #8492


Open · wants to merge 2 commits into master
Conversation

@siddhantsangwan (Contributor) commented May 20, 2025

What changes were proposed in this pull request?

This pull request implements part of the design proposed in HDDS-12929. It covers only detecting a full volume, getting the latest storage report, adding the container action, and then immediately triggering (or throttling) a heartbeat.
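For context, here is a minimal sketch of the trigger-and-throttle flow, stitched together from the diff hunks quoted later on this page; the compareAndSet-based update and the heartbeatIntervalMs name are assumptions for illustration, not the exact implementation:

// Sketch only: a full volume triggers at most one out-of-band heartbeat
// per interval. Assumes the fields quoted in the diff hunks below.
private final AtomicLong fullVolumeLastHeartbeatTriggerMs = new AtomicLong(-1);

private void handleFullVolume(HddsVolume volume) throws StorageContainerException {
  long current = System.currentTimeMillis();
  long last = fullVolumeLastHeartbeatTriggerMs.get();
  // Throttle: skip if a full-volume heartbeat was already sent within the
  // interval; compareAndSet lets concurrent writers race safely.
  if (current - last >= heartbeatIntervalMs
      && fullVolumeLastHeartbeatTriggerMs.compareAndSet(last, current)) {
    NodeReportProto nodeReport = context.getParent().getContainer().getNodeReport();
    context.refreshFullReport(nodeReport);
    context.getParent().triggerHeartbeat();
    LOG.info("Triggering heartbeat for full volume {}, with node report: {}.",
        volume, nodeReport);
  }
}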

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-13045

How was this patch tested?

Modified existing unit tests. Also did some manual testing using the Ozone docker-compose cluster.

a. Simulated a close-to-full volume with a capacity of 2 GB, available space of 150 MB, and min free space of 100 MB. Datanode log:

2025-05-20 09:47:05,899 [main] INFO volume.HddsVolume: HddsVolume: { id=DS-64dd669c-71fe-492f-903c-4fc7dbe4440a dir=/data/hdds/hdds type=DISK capacity=2147268899 used=1990197248 available=157071651 minFree=104857600 committed=0 }

b. Wrote 100 MB of data using freon, with the expectation that an immediate heartbeat would be triggered as soon as the available space dropped to 100 MB. The datanode log shows that this happened at 09:50:52:

2025-05-20 09:50:52,028 [f8714dd7-31fc-4c63-9703-6fdb1a59b5c4-ChunkWriter-7-0] INFO impl.HddsDispatcher: Triggering heartbeat for full volume /data/hdds/hdds, with node report storageReport {
   storageUuid: "DS-bd34474b-8fd4-49be-be78-72e708b543c0"
   storageLocation: "/data/hdds/hdds"
   capacity: 2147268899
   scmUsed: 2042626048
   remaining: 104642851
   storageType: DISK
   failed: false
   committed: 0
   freeSpaceToSpare: 104857600
 }
 metadataStorageReport {
   storageLocation: "/data/metadata/ratis"
   storageType: DISK
   capacity: 2147268899
   scmUsed: 1990197248
   remaining: 157071651
   failed: false
 }

c. In the SCM, the last storage report BEFORE the write operation was received at 09:50:09:

2025-05-20 09:50:09,399 [IPC Server handler 12 on default port 9861] INFO server.SCMDatanodeHeartbeatDispatcher: Dispatching Node Report storageReport {
storageUuid: "DS-27210be2-ee53-4035-a3a3-63ec8a162456"
   storageLocation: "/data/hdds/hdds"
   capacity: 2147268899
   scmUsed: 1990197248
   remaining: 157071651
   storageType: DISK
   failed: false
   committed: 0
   freeSpaceToSpare: 104857600
 }
 metadataStorageReport {
   storageLocation: "/data/metadata/ratis"
   storageType: DISK
   capacity: 2147268899
   scmUsed: 1990197248
   remaining: 157071651
   failed: false
 }

So the next storage report should be received a minute later, at 09:51:09, unless one is triggered immediately because a volume is full. The SCM log shows that the immediately triggered report was received at 09:50:52, matching the DN log:

2025-05-20 09:50:52,033 [IPC Server handler 4 on default port 9861] INFO server.SCMDatanodeHeartbeatDispatcher: Dispatching Node Report storageReport {
   storageUuid: "DS-bd34474b-8fd4-49be-be78-72e708b543c0"
   storageLocation: "/data/hdds/hdds"
   capacity: 2147268899
   scmUsed: 2042626048
   remaining: 104642851
   storageType: DISK
   failed: false
   committed: 0
   freeSpaceToSpare: 104857600
 }
 metadataStorageReport {
   storageLocation: "/data/metadata/ratis"
   storageType: DISK
   capacity: 2147268899
   scmUsed: 1990197248
   remaining: 157071651
   failed: false
 }

The next storage report was then received at the expected time of 09:51:09, showing that throttling also worked.

Green CI in my fork: https://github.com/siddhantsangwan/ozone/actions/runs/15135787944/job/42547140475

@siddhantsangwan siddhantsangwan marked this pull request as ready for review May 21, 2025 05:02
nodeReport = context.getParent().getContainer().getNodeReport();
context.refreshFullReport(nodeReport);
context.getParent().triggerHeartbeat();
LOG.info("Triggering heartbeat for full volume {}, with node report: {}.", volume, nodeReport);
Contributor Author @siddhantsangwan:

This is on the write path, so we must be extra careful about performance. An info log will reduce performance, but I wonder if it's ok in this case because this won't happen often? What do others think?
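One hypothetical mitigation (a sketch, not something in this PR): since the log sits behind the throttle it fires at most once per interval anyway, but the protobuf node report could still be demoted to debug, keeping only a cheap info line on the hot path:

// Sketch: short info line on the write path, full report only at debug.
LOG.info("Triggering heartbeat for full volume {}.", volume);
if (LOG.isDebugEnabled()) {
  LOG.debug("Node report for full volume {}: {}", volume, nodeReport);
}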

Contributor Author @siddhantsangwan:

Moreover, the future plan is to fail the write anyway if the size exceeds the min free and reserved space boundary.

@peterxcli (Member) left a comment:

Thanks @siddhantsangwan for this improvement!

@@ -130,6 +134,10 @@ public HddsDispatcher(ConfigurationSource config, ContainerSet contSet,
    this.tokenVerifier = tokenVerifier != null ? tokenVerifier
        : new NoopTokenVerifier();
    this.slowOpThresholdNs = getSlowOpThresholdMs(conf) * 1000000;
    fullVolumeLastHeartbeatTriggerMs = new AtomicLong(-1);
    long heartbeatInterval =
        config.getTimeDuration("hdds.heartbeat.interval", 30000, TimeUnit.MILLISECONDS);
@ChenSammi (Contributor) commented May 27, 2025:

Can we call HddsServerUtil#getScmHeartbeatInterval instead?

And there is HDDS_NODE_REPORT_INTERVAL for the node report. Shall we use the node report property instead of the heartbeat property?
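A sketch of what the suggestion could look like, assuming HddsServerUtil#getScmHeartbeatInterval returns milliseconds and that HddsConfigKeys defines the node report interval key and default (check the actual signatures):

// Reuse the existing helper instead of a hard-coded key and default:
long heartbeatInterval = HddsServerUtil.getScmHeartbeatInterval(config);

// Or, since the trigger sends a node report, key the throttle off the
// node report interval instead:
long nodeReportInterval = config.getTimeDuration(
    HddsConfigKeys.HDDS_NODE_REPORT_INTERVAL,
    HddsConfigKeys.HDDS_NODE_REPORT_INTERVAL_DEFAULT,
    TimeUnit.MILLISECONDS);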

try {
  handleFullVolume(container.getContainerData().getVolume());
} catch (StorageContainerException e) {
  ContainerUtils.logAndReturnError(LOG, e, msg);
Contributor:

Are we going to return here?

*/
private void handleFullVolume(HddsVolume volume) throws StorageContainerException {
  long current = System.currentTimeMillis();
  long last = fullVolumeLastHeartbeatTriggerMs.get();
Contributor:

Consider the case where different volumes get full: for example, at P0 /data1 gets full, and at P1 /data2 gets full, with (P1 - P0) < interval. Do we expect two emergent container reports, or one report?
