[ruby-core:72108] [Ruby trunk - Feature #11814] String#valid_encoding? without force_encoding

From: knu@...
Date: 2015-12-14 08:29:41 UTC
List: ruby-core #72108
Issue #11814 has been updated by Akinori MUSHA.


I have given up on this idea for now: I concluded that the use cases would not be as broad as I had expected, and that adding valid_encoding?(enc) alone would not be enough for anyone serious about encoding detection. (Sorry usa-san!)

However, since this issue has been raised, let me share one good use case for future readers.

Suppose you have a list of byte arrays whose encoding you don't know, for example when you want to guess the encoding of the file names stored in a zip file.

If you had String#valid_encoding?(enc), you could do that like this, without modifying, copying or concatenating strings:

~~~
POSSIBLE_ENCODINGS = [Encoding::UTF_8, Encoding::Windows_31J, Encoding::ISO_8859_1, Encoding::ASCII_8BIT]

encoding = byte_arrays.inject(POSSIBLE_ENCODINGS) { |encs, b|
  encs & POSSIBLE_ENCODINGS.select { |enc| b.valid_encoding?(enc) }
}.first
~~~
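For comparison, here is a rough sketch of the same guess using only methods that exist in current Ruby. It relies on `dup` plus `force_encoding`, so it does make a temporary copy for each check, which is exactly the cost the proposed argument was meant to avoid. The names `CANDIDATE_ENCODINGS` and `guess_common_encoding` are made up for this illustration.

```ruby
# Candidate encodings in order of preference (same list as above).
CANDIDATE_ENCODINGS = [Encoding::UTF_8, Encoding::Windows_31J,
                       Encoding::ISO_8859_1, Encoding::ASCII_8BIT].freeze

# Hypothetical helper: returns the first candidate encoding in which
# every byte string is valid, or nil if there is none.
def guess_common_encoding(byte_arrays, candidates = CANDIDATE_ENCODINGS)
  candidates.find do |enc|
    # dup leaves the caller's string (and its encoding) untouched.
    byte_arrays.all? { |b| b.dup.force_encoding(enc).valid_encoding? }
  end
end

# "\x93\xFA\x96\x7B" is a two-character Shift_JIS sequence that is not valid UTF-8.
names = ["\x93\xFA\x96\x7B".b, "abc".b]
guess_common_encoding(names)  # => #<Encoding:Windows-31J>
```

Taking the first candidate that is valid for every string gives the same result as intersecting the per-string candidate lists and taking `.first`.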

----------------------------------------
Feature #11814: String#valid_encoding? without force_encoding
https://bugs.ruby-lang.org/issues/11814#change-55523

* Author: Usaku NAKAMURA
* Status: Rejected
* Priority: Normal
* Assignee: 
----------------------------------------
Currently we have to set an encoding on a string in order to validate it, like this:

```ruby
str.force_encoding('euc-jp').valid_encoding?  # => true or false
```

But modifying the string is not very elegant.
knu-san needs a way to validate a string without modifying it [*1].

Therefore, I propose adding an optional encoding parameter to `String#valid_encoding?`.

```ruby
str.valid_encoding?('euc-jp')  # => true or false
```
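Until something like this exists, roughly the same behavior can be approximated with a small helper that validates a copy. This is only a sketch and not the attached patch: `valid_in_encoding?` is a hypothetical name, and unlike the proposal it makes a temporary copy of the string.

```ruby
# Hypothetical helper (not part of Ruby): checks whether the bytes of
# +str+ form a valid sequence in +enc+ without changing str's encoding.
def valid_in_encoding?(str, enc)
  # dup returns an unfrozen copy, so this also works for frozen strings.
  str.dup.force_encoding(enc).valid_encoding?
end

bytes = "\xA4\xA2".b                  # one hiragana character in EUC-JP
valid_in_encoding?(bytes, 'euc-jp')   # => true
valid_in_encoding?(bytes, 'utf-8')    # => false
bytes.encoding                        # => #<Encoding:ASCII-8BIT> (unchanged)
```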

A patch is attached.

[*1] https://twitter.com/knu/status/676009662655934465 (in Japanese)

---Files--------------------------------
valid_encoding.patch (4.4 KB)


-- 
https://bugs.ruby-lang.org/
