From: lsegal@...
Date: 2020-03-06T21:15:57+00:00
Subject: [ruby-core:97386] [Ruby master Bug#16675] Regression on Ripper in Ruby 2.7 when parsing new line

Issue #16675 has been updated by lsegal (Loren Segal).


> Ripper doesn't fire the events in the order of the source, typically around here-documents.

Can you explain the issue with here-documents? AFAIK, having used Ripper for 5+ years now, this is the first time we've identified Ripper firing lexical events out of order. Given that YARD is probably one of the earliest adopters of the library and has likely parsed a huge chunk of all publicly distributed Ruby code, we have a pretty wide range of data indicating that this is a *new* problem.


----------------------------------------
Bug #16675: Regression on Ripper in Ruby 2.7 when parsing new line 
https://bugs.ruby-lang.org/issues/16675#change-84512

* Author: Benoit_Tigeot (Benoit Tigeot)
* Status: Closed
* Priority: Normal
* ruby -v: 2.7
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
Hello

While migrating the RSpec documentation to the latest YARD, I noticed an issue with code parsing in Ripper. The regression appears on Ruby 2.7 and HEAD.

``` ruby
require 'pp'
require 'ripper'

SOURCE = "def name\n  # comment\nend"

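class RipperParser < Ripper
  attr_accessor :tokens

  # Override every scanner (lexical) event handler so that each token is
  # printed and recorded in the order the parser fires its event.
  SCANNER_EVENTS.each do |event|
    define_method("on_#{event}") do |*args|
      puts "TOKEN:     #{event}"
      (@tokens ||= []) << [event, args]
      super(*args)
    end
  end
end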

parser = RipperParser.new(SOURCE, '(stdin)')

puts "PARSING:"
parser.parse

puts "\nTOKENS:"
pp parser.tokens

puts "\nRIPPER SAYS"
pp Ripper.lex(SOURCE)
```


```diff
--- a/2-6-3_ripper_lex.txt
+++ b/2-7-0_ripper_lex.txt
@@ -1,27 +1,27 @@
->> RUBY_VERSION: 2.6.3
+>> RUBY_VERSION: 2.7.0
 PARSING:
 TOKEN:     kw
 TOKEN:     sp
 TOKEN:     ident
-TOKEN:     nl
 TOKEN:     sp
 TOKEN:     comment
+TOKEN:     nl
 TOKEN:     kw

 TOKENS:
 [[:kw, ["def"]],
  [:sp, [" "]],
  [:ident, ["name"]],
- [:nl, ["\n"]],
  [:sp, ["  "]],
  [:comment, ["# comment\n"]],
+ [:nl, ["\n"]],
  [:kw, ["end"]]]

 RIPPER SAYS
-[[[1, 0], :on_kw, "def", EXPR_FNAME],
- [[1, 3], :on_sp, " ", EXPR_FNAME],
- [[1, 4], :on_ident, "name", EXPR_ENDFN],
- [[1, 8], :on_nl, "\n", EXPR_BEG],
- [[2, 0], :on_sp, "  ", EXPR_BEG],
- [[2, 2], :on_comment, "# comment\n", EXPR_BEG],
- [[3, 0], :on_kw, "end", EXPR_END]]
+[[[1, 0], :on_kw, "def", FNAME],
+ [[1, 3], :on_sp, " ", FNAME],
+ [[1, 4], :on_ident, "name", ENDFN],
+ [[1, 8], :on_nl, "\n", BEG],
+ [[2, 0], :on_sp, "  ", ENDFN],
+ [[2, 2], :on_comment, "# comment\n", ENDFN],
+ [[3, 0], :on_kw, "end", END]]
```

As Loren Segal [mentioned](https://github.com/lsegal/yard/issues/1313#issuecomment-595458928):
> Note that "comment" is detected before "nl" in both the event and the collected tokens, which is different from the results in Ripper.lex
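
For anyone who wants to check this on their own Ruby build, here is a minimal sketch that compares the order in which a `Ripper` subclass receives scanner events with the order reported by `Ripper.lex`. The names `EventOrderRecorder`, `scanner_events_in_parse_order` and `scanner_events_in_lex_order` are made up for this illustration and are not part of Ripper or YARD. The result depends on the Ruby version, so no expected output is shown.

``` ruby
require 'ripper'

# Hypothetical helper (not part of Ripper): records each scanner event
# in the order the parser fires it.
class EventOrderRecorder < Ripper
  attr_reader :events

  def initialize(*)
    super
    @events = []
  end

  SCANNER_EVENTS.each do |event|
    define_method("on_#{event}") do |token|
      @events << [:"on_#{event}", token]
      token
    end
  end
end

# Scanner events as seen by the event handlers during a full parse.
def scanner_events_in_parse_order(source)
  recorder = EventOrderRecorder.new(source)
  recorder.parse
  recorder.events
end

# Scanner events as reported by Ripper.lex, reduced to [event, token] pairs.
def scanner_events_in_lex_order(source)
  Ripper.lex(source).map { |(_pos, event, token, _state)| [event, token] }
end

source = "def name\n  # comment\nend"
parse_order = scanner_events_in_parse_order(source)
lex_order = scanner_events_in_lex_order(source)

puts "Ruby #{RUBY_VERSION}"
puts "parse order: #{parse_order.map(&:first).inspect}"
puts "lex order:   #{lex_order.map(&:first).inspect}"
puts "same order?  #{parse_order == lex_order}"
```

If the last line prints `false`, the event handlers are seeing tokens in a different order than `Ripper.lex` reports, which is the symptom described in this report.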


-- 
https://bugs.ruby-lang.org/
