From: "matz (Yukihiro Matsumoto) via ruby-core"
Date: 2023-02-09T05:48:25+00:00
Subject: [ruby-core:112290] [Ruby master Feature#19061] Proposal: make a concept of "consuming enumerator" explicit

Issue #19061 has been updated by matz (Yukihiro Matsumoto).

Regarding the concrete proposals:

1. Introduce an `Enumerator#consuming?` method

   The consuming information is not reliable especially with I/O (some IO may not be rewindable, but lseek(2) may not return error for the IO, e.g. on MacOS). Thus we cannot implement trust-worthy `consuming?` method.

2. Introduce `consuming: true` parameter for Enumerator.new

   Since `consuming?` state of the enumerators are unreliable, this keyword argument is useless.

3. Introduce Enumerator#consuming method to produce a consuming enumerator from a non-consuming one

   The original PoC code modifies the original, the modified one raising error for duping internal fiber. It's not acceptable behavior (but former may be). In theory, we can overhaul the implementation of enumerators, but I don't think it's worth the cost.

The final decision may be up to the actual use-case. But I doubt the benefit.

Matz.

----------------------------------------
Feature #19061: Proposal: make a concept of "consuming enumerator" explicit
https://bugs.ruby-lang.org/issues/19061#change-101722

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------
**The problem**

Let's imagine this synthetic data:

```ruby
lines = [
  "--EMAIL--",
  "From: zverok.offline@gmail.com",
  "To; bugs@ruby-lang.org",
  "Subject: Consuming Enumerators",
  "",
  "Here, I am presenting the following proposal.",
  "Let's talk about consuming enumerators..."
]
```

The logic of parsing it is more or less clear:

* skip the first line
* take lines until an empty one is met, to read the header
* take the rest of the lines to read the body

It can be easily translated into Ruby code, almost literally:

```ruby
def parse(enumerator)
  puts "Testing: #{enumerator.inspect}"
  enumerator.next
  p enumerator.take_while { !_1.empty? }
  p enumerator.to_a
end
```

Now, let's try this code with two different enumerators on those lines:

```ruby
require 'stringio'

enumerator1 = lines.each
enumerator2 = StringIO.new(lines.join("\n")).each_line(chomp: true)

puts "Array#each"
parse(enumerator1)
puts
puts "StringIO#each_line"
parse(enumerator2)
```

Output (as you probably already guessed):

```
Array#each
Testing: #<Enumerator: [...]:each>
["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"]
["--EMAIL--", "From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators", "", "Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."]

StringIO#each_line
Testing: #<Enumerator: #<StringIO:0x...>:each_line(chomp: true)>
["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"]
["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."]
```

Only the second enumerator behaves the way we wanted it to.

Things to notice here:

1. Both enumerators are of the same class, "just enumerator," but they behave differently: one of them is **consuming** data on each iteration method, the other is not; and there is no programmatic way to tell whether some enumerator instance is consuming
2. There is no easy way to **make a non-consuming enumerator behave in a consuming way**, to open a possibility of a sequence of processing "skip this, take that, take the rest"

**Concrete proposal**

1. Introduce an `Enumerator#consuming?` method that will allow telling one from the other (and make core enumerators like `#each_line` properly report they are consuming).
2. Introduce a `consuming: true` parameter for `Enumerator.new` so it would be easy for user's code to specify the flag
3. Introduce an `Enumerator#consuming` method to produce a consuming enumerator from a non-consuming one:

```ruby
# reference implementation is trivial:
class Enumerator
  def consuming
    source = self
    Enumerator.new { |y| loop { y << source.next } }
  end
end

enumerator3 = lines.each.consuming
parse(enumerator3)
```

Output:

```
["From: zverok.offline@gmail.com", "To; bugs@ruby-lang.org", "Subject: Consuming Enumerators"]
["Here, I am presenting the following proposal.", "Let's talk about consuming enumerators..."]
```
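
A minimal sketch of what the objection to item 3 may look like in practice, assuming the reference implementation shown in the proposal. `consuming_sketch` is a hypothetical stand-alone helper with the same body (used here so that `Enumerator` itself is not reopened), and the sample data is a shortened, made-up variant of `lines`. Because the wrapper pulls every value through `source.next`, the original enumerator is advanced as well, and while its internal fiber is alive it cannot be duplicated:

```ruby
# Hypothetical helper with the same body as the proposal's reference
# implementation, written as a plain method instead of a core extension.
def consuming_sketch(source)
  Enumerator.new { |y| loop { y << source.next } }
end

# Shortened sample data (an assumption for illustration).
sample = ["--EMAIL--", "From: a@example.com", "", "Body line 1", "Body line 2"]

source  = sample.each
wrapped = consuming_sketch(source)

wrapped.next                                         # skips "--EMAIL--"
header = wrapped.take_while { |line| !line.empty? }  # also consumes the "" separator

# Every value went through source.next, so the original has moved too.
source_position = source.peek

# source.next started an internal fiber in the original enumerator;
# copying an enumerator in that state raises TypeError.
dup_error =
  begin
    source.dup
    nil
  rescue TypeError => e
    e
  end

body = wrapped.to_a

p header           # => ["From: a@example.com"]
p source_position  # => "Body line 1"
p dup_error.class  # => TypeError
p body             # => ["Body line 1", "Body line 2"]
```

Whether the shared position is acceptable presumably depends on the use case; the `dup` failure appears to be the "raising error for duping internal fiber" behavior mentioned in the reply.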