[ruby-core:72124] Re: [ruby-cvs:60264] duerst:r53112 (trunk): * enc/ebcdic.h: new dummy encoding EBCDIC-US

From: Martin J. Dürst <duerst@...>
Date: 2015-12-14 23:39:26 UTC
List: ruby-core #72124
On 2015/12/14 22:34, U.NAKAMURA wrote:
> Hi,
>
> In message "[ruby-cvs:60264] duerst:r53112 (trunk): * enc/ebcdic.h: new dummy encoding EBCDIC-US"
>    on Mon, 14 Dec 2015 22:11:33 +0900 (JST), [email protected] wrote:
>>      * enc/ebcdic.h: new dummy encoding EBCDIC-US
>
> This is great!

Thanks. I did it because somebody at the Ruby Kaigi party said that they 
needed it, and I had received code earlier that (almost) did it.

> But I want to know where the name 'EBCDIC-US' came from.

The name came from 
http://www.iana.org/assignments/character-sets/character-sets.xhtml, as 
MIB 2078.

> I guess this is the encoding that is named as 'IBM037' by IANA,
> and its aliases are 'ebcdic-cp-us' and so on.

Looking at http://tools.ietf.org/html/rfc1345, the code conversion is 
indeed much closer to ebcdic-cp-us than to EBCDIC-US. So using EBCDIC-US 
is definitely wrong. Thanks for pointing this out.

We need some more careful checks, but it seems that the data that I used 
is the same as on https://en.wikipedia.org/wiki/EBCDIC_037, whereas 
'IBM037' in http://tools.ietf.org/html/rfc1345 has the C2 area all 
unassigned.
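
For anyone who wants to spot-check the conversion data, here is a minimal 
sketch. It assumes the build provides a transcoder under one of the names 
discussed in this thread, which may not be the case; try_decode is a 
hypothetical helper, not part of Ruby. In code page 037 the bytes 0xC1, 
0xC2, 0xC3 are the letters A, B, C, so they make an easy first probe.

```ruby
# Hypothetical helper: decode raw bytes from a named source encoding
# to UTF-8, returning nil when the name or the transcoder is missing.
def try_decode(bytes, source)
  # force_encoding raises ArgumentError for an unknown encoding name;
  # encode raises a subclass of EncodingError when no converter exists
  # or a byte has no mapping.
  bytes.dup.force_encoding(source).encode("UTF-8")
rescue ArgumentError, EncodingError
  nil
end

sample = "\xC1\xC2\xC3".b  # A, B, C in EBCDIC code page 037

# Names taken from this thread; which ones resolve depends on the build.
%w[EBCDIC-US IBM037 ebcdic-cp-us].each do |name|
  p [name, try_decode(sample, name)]
end
```

A nil result only says that the name or converter is unavailable in the 
Ruby being run; it says nothing about whether the table itself is right.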


> I'm not against 'EBCDIC-US'.
> But, if we use this as the real name, we should explain the reason.

We definitely need to fix this, most probably to make the name 'IBM037', 
and add some alias(es). I don't think we need all the aliases from IANA 
(cp037, ebcdic-cp-us, ebcdic-cp-ca, ebcdic-cp-wt, ebcdic-cp-nl, 
csIBM037), but I would like to have one alias with the famous 6 letters 
in it so that it's easier for people to find. I'd probably go with 
'ebcdic-cp-us'.
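
To see how a given build currently resolves these names, something like 
the following sketch may help. describe_encoding is a hypothetical helper; 
it only uses Encoding.find, Encoding#names and Encoding#dummy?, and it 
makes no assumption about which of the candidate names are defined.

```ruby
# Hypothetical helper: report how an encoding name resolves in this build.
def describe_encoding(name)
  enc = Encoding.find(name)  # raises ArgumentError for unknown names
  { name: enc.name, names: enc.names, dummy: enc.dummy? }
rescue ArgumentError
  nil
end

# Candidate names from this thread and from the IANA alias list.
%w[EBCDIC-US IBM037 ebcdic-cp-us cp037].each do |candidate|
  info = describe_encoding(candidate)
  if info
    puts "#{candidate} -> #{info[:name]} (names: #{info[:names].join(', ')}, dummy: #{info[:dummy]})"
  else
    puts "#{candidate} -> not defined in this build"
  end
end
```

If the rename goes ahead as proposed, 'IBM037' would be the reported name 
and 'ebcdic-cp-us' would show up in the names list.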

Any advice appreciated! The last time I used EBCDIC was in the 
early 1980s, so that's a long time ago.

Regards,   Martin.
