Skip to content

Normalize each import before it is cached. #392

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 17, 2018

Conversation

sellout
Copy link
Collaborator

@sellout sellout commented May 16, 2018

In tests on Dhall configs we use at work, I saw speedups between 33–90% on
extremely complicated configs with this change.

It mostly just adds a line calling normalize, but also threads the Normalizer
through a bunch of places to make sure we normalize properly for the context.

This is based on #390, because this optimization is likely less effective if
we’re doing more imports.

Gabriella439 and others added 3 commits May 15, 2018 20:38
This is one step in a series of commits to update the Haskell
implementation to match the import semantics standardized in:

dhall-lang/dhall-lang#127

This is a BREAKING CHANGE mostly due to restricting the grammar for
valid file paths and URLs.  However, in practice most existing code
should still work after these changes.

Specifically, this change:

* Updates the grammar for file paths and URLs
* Simplifies and cleans up the semantics for canonicalizing paths (and
  chains of paths)

The main difference in the grammar is that file paths and URL paths are
unified to have the same syntax and they are both more restricted than
before.

For example, the following URLs are no longer valid:

* `http://example.com` (missing a path to a file)
* `http://example.com/()` (invalid path characters)

... and the following paths are no longer valid:

* `/?#` (invalid path characters)

Note that this actually deviates from the standard in one way: file
paths no longer permit `?` and `#` characters (whereas the standard
allows them).  However, in the course of testing these changes I
realized that if file paths permit `?` and `#` characters and you unify
URLs to also permit them then you get an ambiguous grammar where a `?`
or `#` in a URL could be interpreted as either part of a path component
or part of a query parameter or fragment.

The semantics for canonicalizing paths is a straightforward translation
of the standard to Haskell code.
In tests on Dhall configs we use at work, I saw speedups between 33–90% on
extremely complicated configs with this change.

It mostly just adds a line calling `normalize`, but also threads the Normalizer
through a bunch of places to make sure we normalize properly for the context.

This is based on dhall-lang#390, because this optimization is likely less effective if
we’re doing more imports.
@Gabriella439 Gabriella439 merged commit 429aeaf into dhall-lang:master May 17, 2018
@Gabriella439
Copy link
Collaborator

Thanks! :)

@sellout sellout deleted the normalize-imports branch May 17, 2018 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants