HDDS-8211. S3G QoS #4421

Draft: mohan3d wants to merge 7 commits into master

Conversation

mohan3d (Contributor) commented Mar 18, 2023

What changes were proposed in this pull request?

Added a throttler package implementing a request scheduler and some supporting utilities, plus a new filter in the S3 Gateway that accepts or rejects user requests depending on the gateway's load and each user's consumption.
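
For illustration, a minimal sketch of the kind of filter described above, written against the JAX-RS API the S3 Gateway already uses. The names (QosFilter, RequestThrottler) are hypothetical and not the actual classes added by this patch; the real throttler package makes the accept/reject decision.

    import javax.ws.rs.container.ContainerRequestContext;
    import javax.ws.rs.container.ContainerRequestFilter;
    import javax.ws.rs.container.PreMatching;
    import javax.ws.rs.core.Response;
    import javax.ws.rs.ext.Provider;

    // Illustrative sketch only; not the code in this PR.
    @Provider
    @PreMatching
    public class QosFilter implements ContainerRequestFilter {

      // Hypothetical interface standing in for the throttler package's
      // scheduler: decides whether the gateway can take another request
      // from this user right now.
      public interface RequestThrottler {
        boolean tryAcquire(String user);
      }

      private final RequestThrottler throttler;

      public QosFilter(RequestThrottler throttler) {
        this.throttler = throttler;
      }

      @Override
      public void filter(ContainerRequestContext ctx) {
        // The S3 caller is identified by the access key in the Authorization
        // header; the raw header is used here to keep the sketch short.
        String user = ctx.getHeaderString("Authorization");
        if (!throttler.tryAcquire(user == null ? "anonymous" : user)) {
          // Reject up front instead of processing the request.
          ctx.abortWith(Response
              .status(Response.Status.SERVICE_UNAVAILABLE)
              .entity("Too many requests")
              .build());
        }
      }
    }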

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8211

How was this patch tested?

Unit tests and manual testing.

xBis7 (Contributor) left a comment

@mohan3d Thanks for working on this. Can you add more details about the manual test? Were you able to get it to drop requests manually?

FairCallQueue, if backoff is enabled, rejects requests when the queues are full. If a user has been submitting a lot of requests but the FCQ priority queues are not full, the system still processes these requests, just less frequently.

If I understand your approach correctly, you are rejecting requests based on how many the user has already submitted. If a user has exceeded the maximum number of requests, you start dropping any new ones they submit. This raises the concern that it might drop requests that it shouldn't.
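
To make the two policies being compared concrete, here is a toy sketch (illustrative only, not code from this patch or from Hadoop): a fixed per-user cap rejects a heavy user even when the server is otherwise idle, while a backoff-style check rejects only when no capacity is left.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicInteger;

    // Toy comparison of the two admission policies discussed above.
    public class AdmissionPolicies {

      private final Map<String, AtomicInteger> perUser = new ConcurrentHashMap<>();
      private final AtomicInteger queued = new AtomicInteger();
      private final int maxRequestsPerUser = 10_000;   // illustrative cap
      private final int totalCapacity = 4 * 1_000;     // e.g. 4 queues x 1000 slots

      // Policy A (per-user cap, as described for this PR): a user is rejected
      // once they pass the cap, even if the server is otherwise idle.
      boolean admitPerUserCap(String user) {
        int count = perUser
            .computeIfAbsent(user, k -> new AtomicInteger())
            .incrementAndGet();
        return count <= maxRequestsPerUser;
      }

      // Policy B (backoff-style, as described for FCQ with backoff enabled):
      // reject only when there is no capacity left, regardless of the user.
      boolean admitWhenNotFull() {
        if (queued.incrementAndGet() > totalCapacity) {
          queued.decrementAndGet();
          return false;   // all queues full -> back off
        }
        return true;
      }
    }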

mohan3d (Contributor, Author) commented Mar 20, 2023

@xBis7 Thanks for reviewing this.
Yes, exactly, you got it right: new requests from a user who has already made too many requests will be rejected at some point. If I understand correctly, with FairCallQueue the new requests will also be rejected at some point (if the queues are full)?

xBis7 (Contributor) commented Mar 20, 2023

@mohan3d FairCallQueue has multiple priority queues (4 by default), and requests are placed into those queues based on their priorities. DecayRpcScheduler calculates the priority of a call, and the call then goes into the queue with the corresponding priority (for instance, the highest-priority call goes into the highest-priority queue).

Your approach simulates the behavior of backoff, but with backoff, all queues must be full in order to reject a request. Check here.

Let's say we have only one user and the maximum number of requests is set to 10000. If the user exceeds that number of requests, your filter will start rejecting all new requests coming from that user, even though it shouldn't, since that user is the only one stressing the system.
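
For reference, a minimal sketch of how FairCallQueue with backoff is typically enabled on a Hadoop RPC server (this is the server-side RPC mechanism being described, not the S3 Gateway's HTTP layer). The port number and values are illustrative; the keys are the standard ipc.<port>.* settings.

    import org.apache.hadoop.conf.Configuration;

    // Sketch: standard ipc.<port>.* keys for FairCallQueue with backoff.
    public final class FcqConfigExample {

      public static Configuration fairCallQueueConf(int port) {
        Configuration conf = new Configuration();
        String prefix = "ipc." + port + ".";
        // Use FairCallQueue instead of the default FIFO call queue.
        conf.set(prefix + "callqueue.impl",
            "org.apache.hadoop.ipc.FairCallQueue");
        // DecayRpcScheduler assigns each call a priority based on the
        // caller's recent request volume.
        conf.set(prefix + "scheduler.impl",
            "org.apache.hadoop.ipc.DecayRpcScheduler");
        // Four priority queues (the default mentioned above).
        conf.setInt(prefix + "scheduler.priority.levels", 4);
        // With backoff enabled, a request is rejected only when it cannot
        // be queued, i.e. the queues are full.
        conf.setBoolean(prefix + "backoff.enable", true);
        return conf;
      }

      private FcqConfigExample() { }
    }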

mohan3d (Contributor, Author) commented Mar 20, 2023

@xBis7 Good example. What should happen to the user in that same example when they are making too many requests, assuming FairCallQueue is being used with backoff enabled?

I think the queues will end up full (if the request rate is high enough) and it will start rejecting requests from this single user?

xBis7 (Contributor) commented Mar 20, 2023

What should happen to the user in that same example when they are making too many requests, assuming FairCallQueue is being used with backoff enabled?

@mohan3d To be honest, I haven't tested such a scenario, but if requests fail with a single user, it means the system itself is failing. I would expect it to slow down processing rather than fail entirely.

Also, I can see that the number of handlers is updated while processing requests, but is the number of requests per user also getting decremented?

mohan3d (Contributor, Author) commented Mar 20, 2023

@xBis7 So the correct approach would be to slow down processing when the system is too busy, instead of rejecting requests? I am trying to understand the expected behavior better; I might be able to update the solution.

Also, I can see that the number of handlers is updated while processing requests, but is the number of requests per user also getting decremented?

Yes, both will be updated the way you mentioned. The requests per user will be reduced by a decay factor after each interval.
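
As a rough illustration of that decay (class names, the factor, and the interval are hypothetical, not the actual throttler code): a scheduled task periodically multiplies each user's counter by a decay factor, so older requests gradually stop counting against the user.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical sketch of the per-user counter decay described above.
    public class DecayingCounters {

      private final ConcurrentHashMap<String, AtomicLong> counts =
          new ConcurrentHashMap<>();
      private final double decayFactor = 0.5;    // illustrative
      private final long decayPeriodMs = 5_000;  // illustrative

      private final ScheduledExecutorService scheduler =
          Executors.newSingleThreadScheduledExecutor();

      public DecayingCounters() {
        // Every interval, shrink each user's count and drop users whose
        // count has decayed to zero.
        scheduler.scheduleAtFixedRate(this::decay,
            decayPeriodMs, decayPeriodMs, TimeUnit.MILLISECONDS);
      }

      public long increment(String user) {
        return counts.computeIfAbsent(user, k -> new AtomicLong())
            .incrementAndGet();
      }

      private void decay() {
        // Not perfectly race-free (an increment can slip in between get and
        // set); kept simple for illustration.
        counts.forEach((user, counter) -> {
          long decayed = (long) (counter.get() * decayFactor);
          if (decayed == 0) {
            counts.remove(user, counter);
          } else {
            counter.set(decayed);
          }
        });
      }
    }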

xBis7 (Contributor) commented Mar 23, 2023

So the correct approach would be to slow down processing when the system is too busy, instead of rejecting requests?

@mohan3d Yes, you could also reject requests, but only at the point where the system is at full capacity. When you have only one user, FCQ doesn't slow them down or drop their requests.

adoroszlai (Contributor) commented

@mohan3d Thanks a lot for the patch. Please let us know if you plan to continue working on this. The tests will need to be migrated to use JUnit5 -- we can do that for you if the PR is not abandoned.
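
For reference, the usual shape of that migration (illustrative only, not the actual diff that was later pushed): JUnit 4 annotations and assertions are swapped for their JUnit 5 (Jupiter) equivalents.

    // JUnit 4 style:  org.junit.Test, org.junit.Before, org.junit.Assert
    // JUnit 5 style:  the Jupiter equivalents below.
    import static org.junit.jupiter.api.Assertions.assertTrue;

    import org.junit.jupiter.api.BeforeEach;
    import org.junit.jupiter.api.Test;

    class ExampleThrottlerTest {         // hypothetical test class name

      @BeforeEach                        // was @Before in JUnit 4
      void setUp() {
        // set up the object under test
      }

      @Test                              // now org.junit.jupiter.api.Test
      void rejectsWhenOverLimit() {
        assertTrue(true);                // placeholder assertion
      }
    }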

mohan3d (Contributor, Author) commented Jan 28, 2024

@adoroszlai Yeah, I would like to continue working on it. Aside from migrating the tests, what else do I need to update to get the PR merged?

adoroszlai (Contributor) commented

Thanks @mohan3d for your continued interest in this.

migrating the tests

Done, pushed to your fork.

what else do I need to update to get the PR merged?

I'm not familiar with FCQ, so I'll defer review to @duongkame, @kerneltime and @xBis7.

ivandika3 (Contributor) left a comment

I have left a minor comment.

Comment on lines 53 to 56
ctx.abortWith(Response
.status(Response.Status.SERVICE_UNAVAILABLE)
.entity("Too many requests")
.build());
ivandika3 (Contributor) commented

Just a minor suggestion: we can wrap this in an OS3Exception corresponding to the SlowDown error code. A new OS3Exception can be defined in S3ErrorTable (https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html).

The AuthorizationFilter#wrapOS3Exception can be reused by moving it to S3Utils.
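
A self-contained sketch of what the rejection would look like with the SlowDown error (code "SlowDown", HTTP 503, message "Reduce your request rate."). In the actual change this would go through S3ErrorTable / OS3Exception and the relocated wrapOS3Exception helper as suggested above, rather than building the XML body by hand; the class and method names below are illustrative.

    import javax.ws.rs.container.ContainerRequestContext;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    // Illustrative only: shows the shape of an S3 "SlowDown" rejection.
    public final class SlowDownResponse {

      static void abortWithSlowDown(ContainerRequestContext ctx, String resource) {
        String body =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<Error>\n"
            + "  <Code>SlowDown</Code>\n"
            + "  <Message>Reduce your request rate.</Message>\n"
            + "  <Resource>" + resource + "</Resource>\n"
            + "</Error>";
        ctx.abortWith(Response
            .status(Response.Status.SERVICE_UNAVAILABLE)
            .type(MediaType.APPLICATION_XML)
            .entity(body)
            .build());
      }

      private SlowDownResponse() { }
    }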

mohan3d (Contributor, Author) commented

Thanks a lot @ivandika3 for your review and the suggested improvement.

mohan3d (Contributor, Author) commented

@ivandika3 I pushed some updates, please let me know if it needs further changes.

ivandika3 (Contributor) commented

Thanks for the update. Looks good.

mohan3d requested a review from ivandika3 on February 29, 2024 at 13:35.