HDDS-8773. [S3G] Improve list performance in FSO bucket #4868

Merged
merged 1 commit into from
Nov 9, 2023

whbing
Contributor

@whbing whbing commented Jun 10, 2023

What changes were proposed in this pull request?

The --delimiter '/' option of ListObjectsV2 is commonly used. Typical scenarios include:

  • Simple aws s3 commands, such as aws s3 --endpoint http://<ip>:9878 ls buk/dir/
  • aws s3api commands, such as aws s3api --endpoint http://<ip>:9878 list-objects-v2 --bucket buk1-ln --prefix '' --delimiter '/'
  • Mounting an S3 bucket through FUSE and running ls

In all of these scenarios, only the immediate children of the prefix need to be listed.
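To make the delimiter semantics concrete, here is a minimal sketch of how a delimiter-based listing groups flat keys into direct entries and common prefixes. The function name list_with_delimiter and the sample keys are hypothetical and only illustrate the general S3-style behavior; this is not the Ozone or S3 Gateway code.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Group flat keys the way a delimiter-based listing does (illustrative only)."""
    contents = []          # keys that sit directly under the prefix
    common_prefixes = []   # rolled-up "directories" one level below the prefix
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No delimiter after the prefix: the key is an immediate child.
            contents.append(key)
        else:
            # Everything up to the first delimiter is reported once as a prefix.
            rolled = prefix + rest[:cut + len(delimiter)]
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)
    return contents, common_prefixes


# Hypothetical sample keys, not taken from the test cluster.
sample_keys = ["test/a.txt", "test/sub/b.txt", "test0/c.txt", "top.txt"]
root_contents, root_prefixes = list_with_delimiter(sample_keys, prefix="", delimiter="/")
sub_contents, sub_prefixes = list_with_delimiter(sample_keys, prefix="test/", delimiter="/")
print(root_contents, root_prefixes)
print(sub_contents, sub_prefixes)
```

Whatever lies deeper than one level below the prefix never appears in the response, which is why a listing with a delimiter only needs the immediate children.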

In the current FSO bucket implementation, objects are listed with a depth-first search over the whole subtree and only then filtered by the delimiter, which greatly reduces performance.
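The cost difference between the two strategies can be sketched on a toy in-memory tree. This is an assumption-laden illustration, not the actual Ozone Manager code: the tree layout, the function names dfs_then_filter and immediate_children, and the visited counter are all hypothetical.

```python
# Toy directory tree: dict values are subdirectories, None marks a file.
toy_tree = {
    "test": {"a.txt": None, "sub": {"b.txt": None, "c.txt": None}},
    "test0": {"d.txt": None},
    "top.txt": None,
}


def dfs_then_filter(root):
    """Walk the entire tree depth-first, then keep only first-level results."""
    visited = 0
    keys = []
    stack = [("", root)]
    while stack:
        path, node = stack.pop()
        for name, child in node.items():
            visited += 1
            if child is None:
                keys.append(path + name)
            else:
                stack.append((path + name + "/", child))
    # Apply the '/' delimiter only after every key has been collected.
    result = set()
    for key in keys:
        cut = key.find("/")
        result.add(key if cut == -1 else key[:cut + 1])
    return sorted(result), visited


def immediate_children(root):
    """Look only at the direct children of the listed directory."""
    visited = 0
    result = []
    for name, child in root.items():
        visited += 1
        result.append(name if child is None else name + "/")
    return sorted(result), visited


dfs_result, dfs_visited = dfs_then_filter(toy_tree)
direct_result, direct_visited = immediate_children(toy_tree)
print(dfs_result, dfs_visited)
print(direct_result, direct_visited)
```

Both functions return the same listing, but the depth-first variant touches every entry in the subtree while the direct variant touches only the first level, so the gap grows with the number of nested files.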

After the optimization, the listing time in my test environment dropped from tens of seconds to about 3 seconds.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8773

How was this patch tested?

  • Unit test
  • CLI commands in an Ozone cluster environment, as follows.
    Setup: s3v/buk1-ln is a bucket linked to an FSO bucket, and multiple calls to the iterator are simulated by reducing the parameter ozone.client.list.cache.
$ hadoop fs -count -v ofs://om/s3v/buk1-ln/*
   DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
          14         1703               2177 ofs://om/s3v/buk1-ln/test
           7            1                238 ofs://om/s3v/buk1-ln/test0
          93       373113             373113 ofs://om/s3v/buk1-ln/test1
          31       227379             227379 ofs://om/s3v/buk1-ln/test2

Before optimization:

time aws s3api --endpoint http://`hostname -i`:9878 list-objects-v2 --bucket buk1-ln --prefix '' --delimiter '/'
{
    "CommonPrefixes": [
        {
            "Prefix": "test/"
        },
        {
            "Prefix": "test0/"
        },
        {
            "Prefix": "test1/"
        },
        {
            "Prefix": "test2/"
        }
    ]
}

real	0m35.005s
user	0m0.961s
sys	0m0.181s

After optimization:

time aws s3api --endpoint http://`hostname -i`:9878 list-objects-v2 --bucket buk1-ln --prefix '' --delimiter '/'
{
    "CommonPrefixes": [
        {
            "Prefix": "test/"
        },
        {
            "Prefix": "test0/"
        },
        {
            "Prefix": "test1/"
        },
        {
            "Prefix": "test2/"
        }
    ]
}

real	0m3.106s
user	0m0.915s
sys	0m0.131s

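Detailed timings are shown in the following table:

| Time with prefix    | test0/ | test/  | test1/  | test2/  | ''      |
|---------------------|--------|--------|---------|---------|---------|
| Before optimization | 3.171s | 3.268s | 21.861s | 15.120s | 35.005s |
| After optimization  | 3.127s | 3.143s | 3.182s  | 3.170s  | 3.106s  |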
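As a quick reading aid, the timings reported above can be converted into speedup factors. The numbers are copied from the table; the variable names are only for illustration.

```python
# Wall-clock times in seconds, copied from the table above.
before = {"test0/": 3.171, "test/": 3.268, "test1/": 21.861, "test2/": 15.120, "''": 35.005}
after = {"test0/": 3.127, "test/": 3.143, "test1/": 3.182, "test2/": 3.170, "''": 3.106}

# Speedup factor per prefix: time before divided by time after.
speedup = {prefix: before[prefix] / after[prefix] for prefix in before}

for prefix, factor in speedup.items():
    print(f"{prefix:>7}: {factor:.2f}x")
```

The small prefixes (test0/, test/) are essentially unchanged, while the large ones (test1/, test2/, and the bucket root) improve several-fold. The "after" times stay close to 3.1 seconds regardless of how many files sit below the prefix.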

@adoroszlai
Contributor

@tanvipenumudy please take a look

@whbing
Contributor Author

whbing commented Jun 11, 2023

@adoroszlai I added a test; please help trigger CI. Thanks! (CI passed on my branch: https://github.com/whbing/ozone/actions/runs/5233785220)

@ChenSammi
Contributor

Hi @kerneltime, could you please help review this patch?

@adoroszlai
Contributor

@duongkame please review

Contributor

@zhtttylz zhtttylz left a comment

Great catch and patch here. I just left some nit comments inline, JFYI.

@ChenSammi
Contributor

Hi @whbing, it looks like the build has some issues. Could you take a look?

@whbing whbing marked this pull request as draft July 9, 2023 14:02
@whbing whbing marked this pull request as ready for review August 16, 2023 07:15
@whbing
Contributor Author

whbing commented Aug 16, 2023

Thanks @captainzmc for merging #5003, on which this PR depends. This PR is now ready, and I have successfully tested some scenarios in my environment. @captainzmc @ChenSammi @adoroszlai, if you have time, I would appreciate your help reviewing this PR.

@adoroszlai
Contributor

@captainzmc @duongkame @kerneltime @tanvipenumudy please review

@whbing
Contributor Author

whbing commented Oct 16, 2023

Thanks, I look forward to the review. The PR is working well in our cluster. I just fine-tuned it and added comments in the new commit above.

@kerneltime
Contributor

Thank you @whbing for this very useful contribution. I will get this review done this week. cc @tanvipenumudy @duongkame

Contributor

@tanvipenumudy tanvipenumudy left a comment

Thank you @whbing for this important change, please find a comment.

Contributor

@ChenSammi ChenSammi left a comment

Thanks @whbing, the patch looks good overall.

@ChenSammi ChenSammi merged commit 8e52a3a into apache:master Nov 9, 2023
ibrusentsev pushed a commit to ibrusentsev/ozone that referenced this pull request Nov 14, 2023
6 participants