Skip to content

Add encode & decode operations to base62 helper #10038

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

sgmiller
Copy link
Collaborator

@sgmiller sgmiller commented Sep 24, 2020

Implement Decode operations in the base62 helper by treating the []byte as a big.Int and then then doing a general base conversion using div/remainder.

Implement Encode as the inverse, using Mul/Add.

@@ -48,3 +53,65 @@ func RandomWithReader(length int, reader io.Reader) (string, error) {
}
}
}

// Encode encodes bytes to base62. This does *not* scale linearly with input as base64, so use caution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a caution that will be a) unknown to the user, b) leave them stuck if they do know about it. Is there an alternative but safe/stable version of these processes such that maybe we conditionally use one or the other alg based on input size?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently not. There seems to be no solution to 62^a = 256^b unlike with say base64, so you can't iteratively process the input to keep the number sizes small. Base62 is in short pretty weird. I've found other implementations on the web that do strange things to make it work, but they don't decode in a way that's compatible with what we produce in Random*

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could certainly rename the functions or even have an arbitrary limit (or both). Say EncodeSmall, with a 1000 character limit or something.

@jefferai
Copy link
Member

FWIW somewhere else we are using base58. Since it's used for cryptocurrency it's much more standardized (so you can encode and decode with different libraries) and disambiguates characters that otherwise look alike. If this had existed at the time we may have used this instead. But for some of the open questions some inspiration might be found there. Or that may be an existing, tested alternative.

@sgmiller
Copy link
Collaborator Author

sgmiller commented Oct 2, 2020

FWIW somewhere else we are using base58. Since it's used for cryptocurrency it's much more standardized (so you can encode and decode with different libraries) and disambiguates characters that otherwise look alike. If this had existed at the time we may have used this instead. But for some of the open questions some inspiration might be found there. Or that may be an existing, tested alternative.

Thanks. Lifted the changes from base58 for efficiency into this PR, but between 58 and 62 it really doesn't make much difference for tokenization. I'll leave this PR open as a nice bit of additional functionality for our use, but 58 seems like a winner for transform, so I'll switch to that tomorrow. Both still have the problem of being unsuitable for encoding/decoding large values.

@aphorise
Copy link
Contributor

I was curious what's the status with this request or when it may be released?

@VioletHynes VioletHynes added the hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed label Jul 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants