Add check for degenerate padded case in decode #33
add better padding validation + note about validation strategy
d2402e1 to 5d36588
| otherwise -> err "Base64-encoded bytestring has invalid size"
| r == 0 -> validateLastPad bs noPad $ go bs
| r == 2 -> validateLastPad bs noPad $ go (B.append bs (B.replicate 2 0x3d))
| r == 3 -> validateLastPad bs noPad $ go (B.append bs (B.replicate 1 0x3d))
It's all written down in the $Validation note, but just to recap:
Let bs be a bytestring of length l. Then the following properties hold:
- l == 0 mod 4: The input bytestring is assumed to be well-formed. This will always be the expected case for padded Base64 and Base64url values, or for unpadded Base64url values whose pre-encoded length happens to be a multiple of 3 bytes. In either case, these go through the standard decode routine, and any existing padding chars are validated in the final quantum (see: finalChunk).
- l == 1 mod 4: This is never a valid length for Base64 or Base64url-encoded values. The specification requires that the unpadded length of the encoded string be l == 0 mod 4, l == 2 mod 4, or l == 3 mod 4. As a result, there is never a valid unpadded input of length l == 1 mod 4, and it can be rejected outright.
- l == 2 mod 4: In this case, two padding chars must appear in the final quantum. If any additional padding chars exist in the string, they will fail in the final quantum, as we require the final four bytes (say, (a b '=' '=')) to satisfy a /= '=' and b /= '='. Additional pads will fail that clause of finalChunk. Thus, it's safe to add 2 padding chars to the end of a supposedly unpadded input of length l == 2 mod 4, since the addition will never produce a well-formed input if the unpadded string is already malformed.
- l == 3 mod 4: This is the only tricky case. When inputs have this length, we expect that adding a padding char will result in the form (a b c '='). However, if the unpadded input has '=' in the c position, adding the padding char can "complete" the input, in the sense that it forms a valid input in which the unpadded fragment reads as a bytestring of length l == 2 mod 4. This could potentially be an attack vector, and constitutes a security risk. Thankfully, this is also easy to check, since we only need to validate that the last char of an unpadded bytestring of length l == 3 mod 4 is not '='. If any additional padding chars are present, there is no risk that they will contribute to a well-formed input, since they will fail in the final quantum in the a and b positions. So really, the requirement when padding bytestrings of length l == 3 mod 4 is that they are of the form (a b c '='), c /= '=' after padding.
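To make the four cases above concrete, here is a minimal sketch of the length-based strategy. It is not the library's code: padForDecode, validFinalQuantum, and acceptsShape are hypothetical names, the "invalid padding" message is made up for illustration, and the sketch only checks the shape of the input (length and pad positions), not the alphabet or the decoded bits.

```haskell
import qualified Data.ByteString.Char8 as B

-- | Hypothetical helper (not the library's API): pad a possibly unpadded
-- input up to a multiple of 4, or reject it, following the four cases.
padForDecode :: B.ByteString -> Either String B.ByteString
padForDecode bs = case B.length bs `mod` 4 of
  -- l == 0 mod 4: assumed well-formed, handed on unchanged.
  0 -> Right bs
  -- l == 2 mod 4: always safe to append two pads.
  2 -> Right (B.append bs (B.replicate 2 '='))
  -- l == 3 mod 4: the tricky case; the last char must not already be '='.
  3 | B.last bs == '=' -> Left "Base64-encoded bytestring has invalid padding"
    | otherwise        -> Right (B.append bs (B.replicate 1 '='))
  -- l == 1 mod 4: never a valid length.
  _ -> Left "Base64-encoded bytestring has invalid size"

-- | Hypothetical stand-in for the final-quantum check: in the last four
-- bytes (a b c d), pads may not appear in the a or b positions, and a pad
-- in the c position requires a pad in the d position.
validFinalQuantum :: B.ByteString -> Bool
validFinalQuantum padded
  | B.null padded = True
  | otherwise =
      case B.unpack (B.drop (B.length padded - 4) padded) of
        [a, b, c, d] -> a /= '=' && b /= '=' && (c /= '=' || d == '=')
        _            -> False

-- | Shape check only: pad according to length, then inspect the final quantum.
acceptsShape :: B.ByteString -> Bool
acceptsShape = either (const False) validFinalQuantum . padForDecode
```

Under these assumptions, acceptsShape (B.pack "ZE") and acceptsShape (B.pack "ZGE") are True, while the degenerate input "ZE=" is rejected by the l == 3 mod 4 check, "Z" is rejected for its size, and "Z===" and "=E" are rejected by the final-quantum check.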
23Skidoo
left a comment
As far as I can tell, changes in this PR look good.
Thanks @23Skidoo, merging.
This removes the odd inconsistency between failure modes for URL where degenerate inputs like ZE= pass decode, but not decodeUnpadded and decodePadded. Addresses #35
before:
after (note the subtlety in the messages returned):
TODO: