Bug #1681: Integer#chr Should Infer Encoding of Given Codepoint - Ruby - Ruby Issue Tracking System

Bug #1681

closed

Integer#chr Should Infer Encoding of Given Codepoint

Added by runpaint (Run Paint Run Run) almost 16 years ago. Updated about 14 years ago.

Status: Closed
Assignee:
Target version:
ruby -v: ruby 1.9.2dev (2009-06-21 trunk 23774) [i686-linux]
Backport:

[ruby-core:23997]

Description

=begin
String#ord and Integer#chr are symmetrical operations on ASCII Strings:

 'a'.ord.chr   #=> "a"

But Integer#chr fails to round-trip when the given codepoint is outside the range of ASCII:

 "\u{2563}".ord.chr #=> RangeError: 9571 out of char range

To fix this, the codepoint's encoding needs to be specified:

 "\u{2563}".ord.chr('utf-8')  #=> "╣"

This seems needlessly verbose given that Ruby already knows that my source encoding is UTF-8. I suggest, then, that when invoked with no argument, Integer#chr interprets the given codepoint with respect to the current encoding, raising a RangeError only if the codepoint is out of bounds for this inferred encoding.
=end
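
The behaviour described in the report can be reproduced with a short script. This is a minimal sketch, assuming Encoding.default_internal is nil (Ruby's default unless it is configured otherwise); the variable names are illustrative only.

```ruby
# Assumption: Encoding.default_internal is nil (the default unless
# it is set on the command line or in code).
codepoint = "\u{2563}".ord          # 9571, the codepoint of "╣"

# ASCII round-trips without any encoding argument.
ascii_round_trip = 'a'.ord.chr

# Without an argument, a codepoint above 0xFF cannot be turned
# back into a character, so the RangeError is captured here.
no_arg_result =
  begin
    codepoint.chr
  rescue RangeError => e
    e.class
  end

# Naming the encoding explicitly makes the round trip work.
explicit = codepoint.chr('utf-8')
```

Under the stated assumption, `no_arg_result` holds `RangeError`, while `explicit` is the original character in UTF-8.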

Updated by nobu (Nobuyoshi Nakada) almost 16 years ago

=begin
Hi,

At Wed, 24 Jun 2009 06:42:29 +0900,
Run Paint Run Run wrote in [ruby-core:23997]:

> This seems needlessly verbose given that Ruby already knows
> that my source encoding is UTF-8.

It's irrelevant to source encoding. A possibility would be
Encoding.default_internal?

--
Nobu Nakada

=end

Updated by runpaint (Run Paint Run Run) almost 16 years ago

=begin

>> This seems needlessly verbose given that Ruby already knows
>> that my source encoding is UTF-8.
>
> It's irrelevant to source encoding. A possibility would be
> Encoding.default_internal?

Indeed; my mistake. :-)

--
Run Paint Run Run

=end

Updated by matz (Yukihiro Matsumoto) almost 16 years ago

=begin
Hi,

In message "Re: [ruby-core:24001] Re: [Bug #1681] Integer#chr Should Infer Encoding of Given Codepoint"
on Wed, 24 Jun 2009 09:54:06 +0900, Run Paint Run Run writes:
|
|>> This seems needlessly verbose given that Ruby already knows
|>> that my source encoding is UTF-8.
|>
|> It's irrelevant to source encoding. A possibility would be
|> Encoding.default_internal?
|
|Indeed; my mistake. :-)

Source encoding may be different from default internal encoding.
Since a codepoint number does not carry any encoding information,
information is lost. I am not sure it is OK to use possibly
wrong encoding information (default internal), even as a default.

I'd like to hear opinions from others.

						matz.

=end
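
The information loss mentioned above can be illustrated by giving the same integer to Integer#chr with two different encodings. This is only a sketch; the value 0xA4 and the two ISO-8859 encodings are chosen here for illustration and do not come from the thread.

```ruby
# The same number names different characters depending on the encoding
# it is interpreted in (illustrative choice of value and encodings).
number = 0xA4

in_latin1 = number.chr(Encoding::ISO_8859_1)   # currency sign in ISO-8859-1
in_latin9 = number.chr(Encoding::ISO_8859_15)  # euro sign in ISO-8859-15

# Transcode both to UTF-8 so they can be compared directly.
latin1_as_utf8 = in_latin1.encode('UTF-8')     # "¤" (U+00A4)
latin9_as_utf8 = in_latin9.encode('UTF-8')     # "€" (U+20AC)

same_character = (latin1_as_utf8 == latin9_as_utf8)  # false
```

Both strings hold the single byte 0xA4, yet they denote different characters, which is why an integer alone cannot say which character was meant.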

Updated by duerst (Martin Dürst) almost 16 years ago

=begin
We have String#encode (without any arguments), which transcodes to
default_internal (and in addition, doesn't raise an exception for
invalid byte sequences,..., which may be a security issue), so I don't
think using Integer#chr with a default encoding of default_internal
would be such a big problem.

Regards, Martin.


--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp

=end
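
The precedent of String#encode called without arguments can be sketched as follows. The sample string and variable names are assumptions made for illustration; the script sets Encoding.default_internal temporarily and restores it afterwards.

```ruby
# An "é" stored in ISO-8859-1 (illustrative sample data).
latin1 = "\u00E9".encode(Encoding::ISO_8859_1)

previous = Encoding.default_internal
verbose, $VERBOSE = $VERBOSE, nil   # silence the warning Ruby may print in verbose mode
begin
  # With no default internal encoding, encode returns an untranscoded copy.
  Encoding.default_internal = nil
  unchanged = latin1.encode

  # With a default internal encoding set, encode transcodes to it.
  Encoding.default_internal = Encoding::UTF_8
  transcoded = latin1.encode
ensure
  Encoding.default_internal = previous
  $VERBOSE = verbose
end

unchanged.encoding   # Encoding::ISO_8859_1
transcoded.encoding  # Encoding::UTF_8
```

In this sketch the no-argument call follows default_internal in the same way that is being proposed for Integer#chr.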

Updated by matz (Yukihiro Matsumoto) almost 16 years ago

Status changed from Open to Closed
% Done changed from 0 to 100

=begin
Applied in changeset r23865.
=end
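
In current Ruby releases, Integer#chr called without an argument falls back to Encoding.default_internal for codepoints above 0xFF. The sketch below assumes this is the behaviour the changeset introduced, which the thread itself does not spell out; `with_default_internal` is a hypothetical helper, not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): temporarily sets
# Encoding.default_internal and restores the previous value afterwards.
def with_default_internal(encoding)
  previous = Encoding.default_internal
  verbose, $VERBOSE = $VERBOSE, nil   # silence the warning Ruby may print in verbose mode
  Encoding.default_internal = encoding
  yield
ensure
  Encoding.default_internal = previous
  $VERBOSE = verbose
end

# With UTF-8 as the default internal encoding, no argument is needed.
inferred = with_default_internal(Encoding::UTF_8) { "\u{2563}".ord.chr }

# A value beyond the Unicode range is still rejected.
out_of_range =
  with_default_internal(Encoding::UTF_8) do
    begin
      0x110000.chr
    rescue RangeError => e
      e.class
    end
  end

inferred           # "╣"
inferred.encoding  # Encoding::UTF_8
out_of_range       # RangeError
```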
