From: "headius (Charles Nutter)"
Date: 2013-10-09T14:53:06+09:00
Subject: [ruby-core:57764] [ruby-trunk - Feature #8992] Use String#freeze and compiler tricks to replace "str"f suffix

Issue #8992 has been updated by headius (Charles Nutter).

normalperson (Eric Wong) wrote:
> "headius (Charles Nutter)" wrote:
> > So here's the same question I asked in the #frozen feature: why can't
> > #freeze just use the fstring table?
>
> That would be an interesting experiment. After all, it is #freeze and
> not #freeze!, so maybe we have some leverage there.

I think we do. The worst case scenario is that *while referenced* we have more entries in the table, which may include strings that become "shady" and pass out to C exts. But those strings would stay alive under the current definition of "shady", and even under older Ruby versions with a purely conservative GC the effects are no worse.

So basically:

* If the string is long-lived normally, it will take up X bytes for its lifetime.
* If the string gets stored in the fstring table, it will last no longer than it would without the fstring table.
* If the string is short-lived, it will have a bit more overhead for dealing with the fstring table, but very little; hash calculation and table management at most.

It seems acceptable to have #freeze basically be Java's #intern.

> > * fstrings will GC and clear themselves from that table
>
> I think this needs some work for the non-parser case, there seems to
> be a bad interaction with lazy sweep. My analysis of my failed
> patch for Feature #8998:
> http://mid.gmane.org/20131009021547.GA1839@dcvr.yhbt.net

I don't doubt your analysis, but I don't think it's any worse with #freeze using the fstring table. It's just multiplied by the number of strings that get frozen. Critical failure * N is still a critical failure.

> I also get (identical?)
> segfaults with the following:
>
> diff --git a/object.c b/object.c
> --- a/object.c
> +++ b/object.c
> @@ -1029,6 +1029,8 @@ VALUE
>  rb_obj_freeze(VALUE obj)
>  {
>      if (!OBJ_FROZEN(obj)) {
> +        if (TYPE(obj) == T_STRING)
> +            return rb_fstring(obj);
>          OBJ_FREEZE(obj);
>          if (SPECIAL_CONST_P(obj)) {
>              if (!immediate_frozen_tbl) {

Same argument here, I think.

----------------------------------------
Feature #8992: Use String#freeze and compiler tricks to replace "str"f suffix
https://bugs.ruby-lang.org/issues/8992#change-42365

Author: headius (Charles Nutter)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: current: 2.1.0

BACKGROUND:

In https://bugs.ruby-lang.org/issues/8579 @charliesome introduced the "f" suffix for creating already-frozen strings. A string like "str"f would have the following characteristics:

* It would be frozen before the expression returned
* It would be the same object everywhere, pulling from a global "fstring" table

To avoid memory leaks, these pooled strings would remove themselves from the "fstring" table on GC.

However, there are problems with this new syntax:

* It will never parse in Ruby 2.0 and earlier.
* It's not particularly attractive, though this is a subjective matter.
* It does not lend itself well to use in other scenarios, such as for arrays and hashes (http://bugs.ruby-lang.org/issues/8909)

PROPOSAL:

I propose that we eliminate the new "f" suffix and just make the compiler smart enough to see literal strings with .freeze the same way. So this code:

str = "mystring".freeze

Would be equivalent in the compiler to this code:

str = "mystring"f

And the fstring table would still be used to return pooled instances.

IMPLEMENTATION NOTES:

The fstring table already exists on master and would be used for these pooled strings.
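To make the pooling semantics concrete, here is a minimal pure-Ruby sketch of what "return the pooled instance" means. The class name FStringPool and its methods are hypothetical and exist only for illustration; the real fstring table lives in MRI's C code. Unlike the real table, this model holds strong references, so entries are never removed on GC.

```ruby
# Hypothetical model of the "fstring" table (illustration only, not MRI's
# implementation). It maps string content to one canonical frozen String.
class FStringPool
  def initialize
    @table = {} # frozen String => the same frozen String
  end

  # Return the canonical frozen string whose content equals +str+.
  # The first string seen with a given content becomes the pooled instance.
  def intern(str)
    @table.fetch(str) do
      pooled = str.frozen? ? str : str.dup.freeze
      @table[pooled] = pooled
    end
  end

  # Number of distinct contents currently pooled.
  def size
    @table.size
  end
end

pool = FStringPool.new
a = pool.intern("mystring")
b = pool.intern("mystring")

puts a.equal?(b) # same object for equal content
puts a.frozen?   # pooled strings are always frozen
```

Under the proposal, the compiler would perform the equivalent of pool.intern at compile time for a literal followed by .freeze, so no method call would be needed at run time.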
An open question is whether the compiler should forever optimize "str".freeze to return the pooled version or whether it should check (inline-cache style) whether String#freeze has been replaced. I am ok with either, but the best potential comes from ignoring String#freeze redefinitions...or making it impossible to redefine String#freeze.

BONUS BIKESHEDDING:

If we do not want to overload the existing .freeze method in this way, we could follow suggestions in http://bugs.ruby-lang.org/issues/8977 to add a new "frozen" method (or some other name) that the compiler would understand.

If it were "frozen", the following two lines would be equivalent:

str = "mystring".frozen
str = "mystring"f

In addition, using .frozen on any string would put it in the fstring table and return that pooled version.

I also propose one alternative method name: the unary ~ operator. There is no ~ on String right now, and it has no meaning for strings that we'd be overriding. So the following two lines would be equivalent:

str = ~"mystring"
str = "mystring"f

JUSTIFICATION:

Making the compiler aware of normal method-based String freezing has the following advantages:

* It will parse in all versions of Ruby.
* It will behave the same in all versions of Ruby, apart from the fstring pooling.
* It extends neatly to Array and Hash; the compiler can see a frozen Array or Hash with literal elements and return the same object.
* It does not require a pragma (http://bugs.ruby-lang.org/issues/8976)
* It looks like Ruby.

-- 
http://bugs.ruby-lang.org/