[ruby-core:114608] [Ruby master Bug#19784] String#delete_prefix! problem
From: "jhawthorn (John Hawthorn) via ruby-core" <ruby-core@...>
Date: 2023-09-01 00:07:33 UTC
List: ruby-core #114608
Issue #19784 has been updated by jhawthorn (John Hawthorn).
@ywenc and I found a regression from this patch. We have some code handling a broken UTF-8 String with a combination of valid and invalid bytes (UTF-8 followed by binary, which IMO should probably be binary encoded, but it's surprising that the behaviour changed).
```
"hello\xBE".start_with?("hello") #=> false in trunk, was true on 3.2
"hello\xFE".start_with?("hello") #=> true (both 3.2 and trunk, intended behaviour)
"hello\xBE".delete_prefix("hello") #=> "\xBE" (both 3.2 and trunk), because we skip the check when the prefix is valid
"\xFFhello\xBE".delete_prefix("\xFFhello") #=> "\xFFhello\xBE" in trunk
```
This is because we look at the byte following the prefix, observe that it looks like a UTF-8 continuation byte, and so return false.
This approach might work for end_with?/delete_suffix, where we don't break on an invalid character in the suffix, but it doesn't feel right for prefixes. It sounds like the intended design is that, to the user, this should feel as if we were comparing from the start of the strings, char-by-char for valid sequences and byte-by-byte for invalid ones.
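To make that reading of the intended semantics concrete, here is a rough pure-Ruby model. It is only a sketch, not how string.c does it: `model_start_with?` and `model_delete_prefix` are hypothetical names, and it relies on the assumption that `String#chars` splits a broken string into whole valid characters plus stray invalid bytes.

```ruby
# Hypothetical reference model, NOT Ruby's actual implementation.
# Assumption: String#chars yields each valid character whole and each
# invalid byte as its own one-byte element.
def model_start_with?(str, prefix)
  prefix_units = prefix.chars
  # Compare unit by unit from the start: chars where valid, bytes where not.
  str.chars.first(prefix_units.size) == prefix_units
end

def model_delete_prefix(str, prefix)
  return str.dup unless model_start_with?(str, prefix)
  # The prefix matched unit by unit, so it also matched byte for byte.
  str.byteslice(prefix.bytesize, str.bytesize)
end

p model_start_with?("hello\xBE", "hello")           # valid prefix, stray byte after it
p model_delete_prefix("\xFFhello\xBE", "\xFFhello") # invalid byte inside the prefix
p model_start_with?("\xE3\x81\x82", "\xE3")         # prefix would split a valid character
```

Under this model the first two cases match and the third does not, because the lone `\xE3` is compared against the whole three-byte character rather than its first byte.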
We added tests and tried using the end of the previous character, rather than the "start" of the current one, to determine whether the prefix ends at a char boundary: https://github.com/ruby/ruby/pull/8348
----------------------------------------
Bug #19784: String#delete_prefix! problem
https://bugs.ruby-lang.org/issues/19784#change-104432
* Author: inversion (Yura Babak)
* Status: Closed
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Here is the snippet; the question is in the comments:
``` ruby
fp = 'with_BOM_16.txt'
body = File.read(fp).force_encoding('UTF-8')
p body # "\xFF\xFE1\u00001\u0000"
p body.start_with?("\xFF\xFE") # true
body.delete_prefix!("\xFF\xFE") # !!! why doesn't this work?
p body # "\xFF\xFE1\u00001\u0000"
p body.start_with?("\xFF\xFE") # true
body[0, 2] = ''
p body # "1\u00001\u0000"
p body.start_with?("\xFF\xFE") # false
```
It behaves the same
on Linux (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux])
and Windows (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x64-mingw-ucrt]).
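Independently of how `delete_prefix!` treats broken UTF-8, one possible workaround is to do the comparison on raw bytes. This is a minimal sketch; `strip_byte_prefix` is a hypothetical helper, not part of Ruby, and the sample string just reproduces the bytes printed above.

```ruby
# Hypothetical helper: remove a prefix by comparing raw bytes only.
# In binary (ASCII-8BIT) encoding every byte is its own character, so
# character boundaries cannot affect the match.
def strip_byte_prefix(str, prefix)
  bytes = str.b            # binary-encoded copy of the string
  prefix_bytes = prefix.b
  return str.dup unless bytes.start_with?(prefix_bytes)

  rest = bytes.byteslice(prefix_bytes.bytesize, bytes.bytesize)
  rest.force_encoding(str.encoding) # restore the caller's encoding label
end

body = "\xFF\xFE1\x001\x00" # same bytes as the UTF-16LE file read as UTF-8
p strip_byte_prefix(body, "\xFF\xFE")
```

If the file really is UTF-16LE, decoding it as such instead of forcing UTF-8 would probably be the cleaner fix; the helper only covers the case where the bytes have to stay as they are.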