From: nobu@...
Date: 2014-08-19T13:22:56+00:00
Subject: [ruby-core:64458] [ruby-trunk - Bug #10149] [Closed] Some characters in EUC-KR does not encode to UTF-8 properly

Issue #10149 has been updated by Nobuyoshi Nakada.

Status changed from Open to Closed
% Done changed from 0 to 100

Applied in changeset r47221.

----------
euckr-tbl.rb: euro and registered signs

* enc/trans/euckr-tbl.rb (EUCKR_TO_UCS_TBL): add missing euro and
  registered signs. [ruby-core:64452] [Bug #10149]

----------------------------------------
Bug #10149: Some characters in EUC-KR does not encode to UTF-8 properly
https://bugs.ruby-lang.org/issues/10149#change-48407

* Author: Eric Seo
* Status: Closed
* Priority: Normal
* Assignee:
* Category: core
* Target version:
* ruby -v: ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]
* Backport: 2.0.0: REQUIRED, 2.1: REQUIRED
----------------------------------------
This bug is confirmed on 2.1.2p95.

There are (at least) two valid euc-kr characters that do not get converted to utf-8 properly.

**1. "\xA2\xE6" should convert to U+20AC (Euro Sign)**

Current behavior:

~~~ruby
irb(main):001:0> "\xA2\xE6".encode('UTF-8', 'EUC-KR')
Encoding::UndefinedConversionError: "\xA2\xE6" from EUC-KR to UTF-8
~~~

**2. "\xA2\xE7" should convert to U+00AE (Registered Sign)**

Current behavior:

~~~ruby
irb(main):002:0> "\xA2\xE7".encode('UTF-8', 'EUC-KR')
Encoding::UndefinedConversionError: "\xA2\xE7" from EUC-KR to UTF-8
~~~

I confirmed both characters convert correctly on python:

~~~python
>>> "\xA2\xE7".decode('euc-kr')
u'\xae'
~~~

I am guessing this is because these two characters are missing in this mapping:
http://svn.ruby-lang.org/repos/ruby/trunk/enc/trans/euckr-tbl.rb

--
https://bugs.ruby-lang.org/
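
The two conversions described in the report can be checked with a short script along the following lines. This is only a sketch: the byte sequences and expected code points come from the report, while the helper name `euckr_to_codepoint` and the `CASES` table are hypothetical names introduced for illustration. On an interpreter without the r47221 fix, the helper is expected to return `nil` for both sequences.

```ruby
# Hypothetical helper: transcode one EUC-KR byte sequence to UTF-8 and
# return its code point as "U+XXXX", or nil when the transcoding table
# has no entry for it (Encoding::UndefinedConversionError).
def euckr_to_codepoint(bytes)
  utf8 = bytes.dup.force_encoding(Encoding::EUC_KR).encode(Encoding::UTF_8)
  format('U+%04X', utf8.ord)
rescue Encoding::UndefinedConversionError
  nil
end

# Byte sequences and expected code points as stated in the report.
CASES = {
  "\xA2\xE6".b => 'U+20AC', # Euro Sign
  "\xA2\xE7".b => 'U+00AE', # Registered Sign
}

CASES.each do |bytes, expected|
  actual = euckr_to_codepoint(bytes)
  status = actual == expected ? 'ok' : 'MISSING'
  puts format('%-10s expected %s, got %s (%s)',
              bytes.inspect, expected, actual.inspect, status)
end
```

Returning `nil` instead of re-raising keeps the loop running so both sequences are reported in a single pass.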