Skip to content

fetchTree: shallow git fetching by default#10028

Merged
edolstra merged 1 commit into
NixOS:masterfrom
DavHau:fetchTree-shallow-default
Jun 3, 2024
Merged

fetchTree: shallow git fetching by default#10028
edolstra merged 1 commit into
NixOS:masterfrom
DavHau:fetchTree-shallow-default

Conversation

@DavHau

@DavHau DavHau commented Feb 17, 2024

Copy link
Copy Markdown
Member

Motivation:
make git fetching more efficient for most repos by default

@DavHau DavHau requested a review from edolstra as a code owner February 17, 2024 13:21
Comment thread tests/nixos/fetch-git/test-cases/fetchTree-shallow/default.nix
@edolstra edolstra added the idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome. label Feb 23, 2024
@edolstra edolstra self-assigned this Feb 23, 2024
@edolstra

Copy link
Copy Markdown
Member

Team discussion:

  • Idea approved.
  • This needs a release notes entry.
  • The VM tests should be changed to a functional test since those are much faster.
  • Are there any potential performance issues with switching to shallow fetching? E.g. is there a possibility that incremental fetching doesn't work as well if the client doesn't have the whole history?

@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-02-03-nix-team-meeting-127/40309/1

@DavHau DavHau force-pushed the fetchTree-shallow-default branch from 7e4eea5 to 9cd889c Compare February 28, 2024 05:39
@DavHau

DavHau commented Feb 28, 2024

Copy link
Copy Markdown
Member Author

This needs a release notes entry.

Done

The VM tests should be changed to a functional test since those are much faster.

This wouldn't make much sense here, as basically all code paths related to shallow cloning and caching are not triggered when a local git repository is fetched. Only for remote repos this test makes sense, but this would be hard to integrate with the functional tests, as it needs a git server which is not trivial to set up without nixos modules.

Are there any potential performance issues with switching to shallow fetching? E.g. is there a possibility that incremental fetching doesn't work as well if the client doesn't have the whole history?

Yes, generally it is possible to have scenarios where the lack of incrementality of shallow fetching leads to more overall network traffic and disk I/O, like for example when fetching many different revisions of the same repo. Though it is worth mentioning that:

  • the network traffic for shallow fetching always scales linearly with the amount of different revisions fetched, which seems bearable, while full cloning can get totally out of control on repos with a large history. For some ecosystems it is common for the repository history to be hundreds of times larger than a single checkout. Some repositories have a history of several GB. Using shallow cloning as a default strategy avoids hitting these worst case scenarios regularly.
  • Non-incremental fetching currently already seems to be the preferred choice overall. There is probably a reason why github flake inputs are fetched via the tarball API and not using git full cloning. The performance of fetching the full nixpkgs repo wouid probably not be great on a raspberry pi.

I believe for now, this is the better default, but thinking long term it might be better if the fetching strategy would be determined by nix automatically with possibility of overriding the behavior via nix.conf. The nix expression doesn't seem to be the best place for these options, as the author usually doesn't have enough information to make the best decision.

Motivation:
make git fetching more efficient for most repos by default
@DavHau DavHau force-pushed the fetchTree-shallow-default branch from 9cd889c to 358c26f Compare February 28, 2024 06:27
@DavHau

DavHau commented Mar 9, 2024

Copy link
Copy Markdown
Member Author

Generally I believe this is ready.

@edolstra If I'm missing something that would allow me to add a meaningful functional test instead of the nixos tests, please let me know.

Right know, I don't see a good way, as fetchTree doesn't return any attributes which would indicate if a repo was fetched shallowly or not (which is great BTW as it simplifies future upgrade paths). So the only way left to determine if a clone was shallow is looking at the cached git tree, but there is no cache when fetching a local repo.

@tomberek tomberek self-assigned this Mar 25, 2024

@tomberek tomberek left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed and tested.

I'm okay with the test being where it is, provides a clean slate and avoids cache effects.

It is not quite clear to the user when this will kick in, or if it has worked.

@edolstra edolstra merged commit ac3e5d2 into NixOS:master Jun 3, 2024
@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-06-03-nix-team-meeting-minutes-149/46582/1

@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/handling-git-submodules-in-flakes-from-nix-2-18-to-2-22-nar-hash-mismatch-issues/45118/5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

idea approved The given proposal has been discussed and approved by the Nix team. An implementation is welcome.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

4 participants