From: lsegal@...
Date: 2020-03-06T21:15:57+00:00
Subject: [ruby-core:97386] [Ruby master Bug#16675] Regression on Ripper in Ruby 2.7 when parsing new line

Issue #16675 has been updated by lsegal (Loren Segal).

> Ripper doesn't fire the events in the order of the source, typically around here-documents.

Can you explain the issue with here-documents? AFAIK, having used Ripper for 5+ years now, this is the first time we've identified Ripper firing lexical events out of order. Given that YARD is probably one of the earliest adopters of the library and has likely parsed a huge chunk of all publicly distributed Ruby code (i.e., an enormous amount of Ruby code), we have a pretty wide range of data that indicates that this is a *new* problem.

----------------------------------------
Bug #16675: Regression on Ripper in Ruby 2.7 when parsing new line
https://bugs.ruby-lang.org/issues/16675#change-84512

* Author: Benoit_Tigeot (Benoit Tigeot)
* Status: Closed
* Priority: Normal
* ruby -v: 2.7
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
Hello

While migrating the RSpec documentation to the latest YARD, I noticed an issue in code parsing and Ripper. The regression appears on Ruby 2.7 and Head.
``` ruby
require 'pp'
require 'ripper'

SOURCE = "def name\n  # comment\nend"

class RipperParser < Ripper
  attr_accessor :tokens

  # Log every scanner (lexer) event in the order Ripper dispatches it.
  SCANNER_EVENTS.each do |event|
    define_method("on_#{event}") do |*args|
      puts "TOKEN: #{event}"
      (@tokens ||= []) << [event, args]
      super(*args)
    end
  end
end

parser = RipperParser.new(SOURCE, '(stdin)')
puts "PARSING:"
parser.parse
puts "\nTOKENS:"
pp parser.tokens
puts "\nRIPPER SAYS"
pp Ripper.lex(SOURCE)
```

```diff
--- a/2-6-3_ripper_lex.txt
+++ b/2-7-0_ripper_lex.txt
@@ -1,27 +1,27 @@
->> RUBY_VERSION: 2.6.3
+>> RUBY_VERSION: 2.7.0
 PARSING:
 TOKEN: kw
 TOKEN: sp
 TOKEN: ident
-TOKEN: nl
 TOKEN: sp
 TOKEN: comment
+TOKEN: nl
 TOKEN: kw
 
 TOKENS:
 [[:kw, ["def"]],
  [:sp, [" "]],
  [:ident, ["name"]],
- [:nl, ["\n"]],
  [:sp, ["  "]],
  [:comment, ["# comment\n"]],
+ [:nl, ["\n"]],
  [:kw, ["end"]]]
 
 RIPPER SAYS
-[[[1, 0], :on_kw, "def", EXPR_FNAME],
- [[1, 3], :on_sp, " ", EXPR_FNAME],
- [[1, 4], :on_ident, "name", EXPR_ENDFN],
- [[1, 8], :on_nl, "\n", EXPR_BEG],
- [[2, 0], :on_sp, "  ", EXPR_BEG],
- [[2, 2], :on_comment, "# comment\n", EXPR_BEG],
- [[3, 0], :on_kw, "end", EXPR_END]]
+[[[1, 0], :on_kw, "def", FNAME],
+ [[1, 3], :on_sp, " ", FNAME],
+ [[1, 4], :on_ident, "name", ENDFN],
+ [[1, 8], :on_nl, "\n", BEG],
+ [[2, 0], :on_sp, "  ", ENDFN],
+ [[2, 2], :on_comment, "# comment\n", ENDFN],
+ [[3, 0], :on_kw, "end", END]]
```

As Loren Segal [mentioned](https://github.com/lsegal/yard/issues/1313#issuecomment-595458928)

> Note that "comment" is detected before "nl" in both the event and the collected tokens, which is different from the results in Ripper.lex
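One way to check for out-of-order dispatch without comparing dumps by eye is to record the position Ripper reports inside each scanner handler and compare the dispatch order with the position-sorted order. The following is only a minimal sketch and not part of the original report: `OrderRecorder`, `dispatch_order`, and `in_source_order?` are hypothetical names, and it assumes that `lineno` and `column` describe the token currently being dispatched.

``` ruby
require 'ripper'

# Hypothetical helper: records every scanner event together with the
# position Ripper reports while the handler is running.
class OrderRecorder < Ripper
  def events
    @events ||= []
  end

  SCANNER_EVENTS.each do |event|
    define_method("on_#{event}") do |tok|
      events << [[lineno, column], event, tok]
      tok # return the token unchanged, like the default handler
    end
  end
end

# Returns the events in the order Ripper dispatched them.
def dispatch_order(source)
  recorder = OrderRecorder.new(source)
  recorder.parse
  recorder.events
end

# True when the dispatch order matches the [line, column] order.
def in_source_order?(events)
  positions = events.map(&:first)
  positions == positions.sort
end

source = "def name\n  # comment\nend"
events = dispatch_order(source)
events.each do |pos, event, tok|
  puts format('%-8s %-10s %p', pos.inspect, event, tok)
end
puts "in source order: #{in_source_order?(events)}"
```

The final line should presumably report `false` on an interpreter affected by this issue and `true` otherwise, since the result depends on the Ruby version running the script. The same helpers could be run over a here-document sample to see whether that case behaves differently.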