From: naruse@...
Date: 2016-07-19T08:31:51+00:00
Subject: [ruby-core:76432] [Ruby trunk Bug#12577][Rejected] Is '$' punctuation or not? Inconsistency between us-ascii and UTF-8

Issue #12577 has been updated by Yui NARUSE.

Status changed from Open to Rejected

This is because of their respective specs, quoted below:

POSIX
> punct
> Define characters to be classified as punctuation characters.
> In the POSIX locale, neither the <space> nor any characters in classes alpha, digit, or cntrl shall be included.
>
> In a locale definition file, no character specified for the keywords upper, lower, alpha, digit, cntrl, xdigit, or as the <space> shall be specified.
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07

Unicode
> \p{gc=Punctuation} \p{gc=Symbol} -- \p{alpha}
http://unicode.org/reports/tr18/#punct

----------------------------------------
Bug #12577: Is '$' punctuation or not? Inconsistency between us-ascii and UTF-8
https://bugs.ruby-lang.org/issues/12577#change-59676

* Author: Martin Dürst
* Status: Rejected
* Priority: Normal
* Assignee:
* ruby -v: ruby 2.4.0dev (2016-07-09 trunk 55618) [x86_64-cygwin]
* Backport: 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: UNKNOWN
----------------------------------------
US-ASCII thinks '$' is punctuation. UTF-8 thinks it's not. This means that the following two scripts:

```
# encoding: us-ascii
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

and

```
# encoding: utf-8
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

produce different results. It also means that the output of the single-line script

```
puts '$' =~ /\p{Punct}/ ? 'match' : 'no match'
```

changed when we changed the default script encoding from US-ASCII to UTF-8.

This may be okay as it is, but I'm reporting it here to check what others think.

--
https://bugs.ruby-lang.org/
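
For anyone who wants to compare the two encodings from a single file rather than two scripts with different magic comments, here is a minimal sketch. It is not part of the original report: `punct_match` is a hypothetical helper name, and the sketch assumes that `Regexp.new` compiles a pattern containing `\p{...}` in the encoding of the string it is given. The report above observed 'match' for US-ASCII and 'no match' for UTF-8 on ruby 2.4.0dev; other Ruby versions may print something different.

```ruby
# Hypothetical helper (not from the original report): test '$' against
# \p{Punct} with both the pattern and the subject in the given encoding.
def punct_match(encoding_name)
  # Assumption: Regexp.new compiles a pattern containing \p{...} in the
  # encoding of its source string, so this mimics the magic comment.
  pattern = Regexp.new('\p{Punct}'.encode(encoding_name))
  subject = '$'.encode(encoding_name)
  subject =~ pattern ? 'match' : 'no match'
end

# Print one result line per encoding so the two can be compared directly.
%w[US-ASCII UTF-8].each do |name|
  puts "#{name}: #{punct_match(name)}"
end
```

Transcoding both the pattern and the subject keeps the comparison close to the two scripts in the report, where the magic comment sets the encoding of every literal in the file.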