From: "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" Date: 2024-01-13T02:58:32+00:00 Subject: [ruby-core:116188] [Ruby master Bug#20169] `GC.compact` can raises `EFAULT` on IO Issue #20169 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). Well I did a bit more thinking about this. Firstly, I had a very unproductive morning trying to see if Mach exceptions on MacOS could catch invalid accesses in system calls, the way that `userfaultfd` can on Linux. Short answer: no. The second insight I had, though, is that if these objects are on the machine stack, then they actually should be pinned anyway. If you have a C extension and you take a pointer to some internal part of an object, it's already a requirement that you ensure the Ruby value gets spilled to the stack - i.e. you need to do something like this. ``` VALUE str = rb_sprintf("i am a cool string"); write(fd, RSTRING_PTR(str), RSTRING_LEN(str)); RB_GC_GUARD(str); // spill string to stack; NOT OPTIONAL! ``` Mabye we could change the GC compaction algorithm to not move _any_ objects on a page (and hence skip protecting the page) if any objects in the page are live on the machine stack? That _would_ substantially lessen the effectiveness of GC compaction I suppose, but we could maybe get that effectiveness back if userfaultfd is available? ---------------------------------------- Bug #20169: `GC.compact` can raises `EFAULT` on IO https://bugs.ruby-lang.org/issues/20169#change-106203 * Author: ko1 (Koichi Sasada) * Status: Open * Priority: Normal * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN ---------------------------------------- 1. `GC.compact` introduces read barriers to detect read accesses to the pages. 2. I/O operations release GVL to pass the control while their execution, and another thread can call `GC.compact` (or auto compact feature I guess, but not checked yet). 3. 
Call `write(ptr)` can return `EFAULT` when `GC.compact` is running because `ptr` can point read-barrier protected pages (embed strings). Reproducible steps: Apply the following patch to increase possibility: ```patch diff --git a/io.c b/io.c index f6cd2c1a56..83d67ba2dc 100644 --- a/io.c +++ b/io.c @@ -1212,8 +1212,12 @@ internal_write_func(void *ptr) } } + int cnt = 0; retry: - do_write_retry(write(iis->fd, iis->buf, iis->capa)); + for (; cnt < 1000; cnt++) { + do_write_retry(write(iis->fd, iis->buf, iis->capa)); + if (result <= 0) break; + } if (result < 0 && !iis->nonblock) { int e = errno; ``` Run the following code: ```ruby t1 = Thread.new{ 10_000.times.map{"#{_1}"}; GC.compact while true } t2 = Thread.new{ i=0 $stdout.write "<#{i+=1}>" while true } t2.join ``` and ``` $ make run (snip) 4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4># terminated with exception (report_on_exception is true): ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in
' ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in
' make: *** [uncommon.mk:1383: run] Error 1 ``` I think this is why we get `EFAULT` on CI. To increase possibilities running many busy processes (`ruby -e 'loop{}'` for example) will help (and on CI environment there are such busy processes accidentally). -- https://bugs.ruby-lang.org/ ______________________________________________ ruby-core mailing list -- ruby-core@ml.ruby-lang.org To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
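
Editor's sketch of the failure mode in step 3 of the report, outside of Ruby. It assumes the compactor's read barrier works by revoking access to a heap page (modelled here with `mprotect(..., PROT_NONE)`); that is an assumption for illustration, not something stated in the thread. The helper name `errno_of_write_from_protected_page` is hypothetical. A pipe is used as the destination because the kernel has to copy the user buffer into it.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of Ruby. Returns the errno seen when
 * write(2) is given a buffer that lives on an inaccessible page,
 * 0 if the write unexpectedly succeeded, or -1 if setup failed. */
int errno_of_write_from_protected_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    int fds[2];
    int observed = -1;

    if (page_size <= 0 || pipe(fds) != 0) return -1;

    char *buf = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) goto close_pipe;

    /* Plays the role of an embedded string stored inside a heap page. */
    memcpy(buf, "<1>", 3);

    /* Stand-in for the read barrier: revoke all access to the page. */
    if (mprotect(buf, (size_t)page_size, PROT_NONE) != 0) goto unmap;

    /* The kernel cannot read the buffer, so the call fails instead of
     * delivering a signal to the process. */
    ssize_t n = write(fds[1], buf, 3);
    observed = (n < 0) ? errno : 0;

unmap:
    munmap(buf, (size_t)page_size);
close_pipe:
    close(fds[0]);
    close(fds[1]);
    return observed;
}
```

Calling the helper from a small `main` and printing `strerror()` of its result should show "Bad address" on Linux. The point of the sketch is that a fault taken inside a system call is reported as an `EFAULT` return value and never reaches a user-space fault handler, which is why a barrier that relies on such a handler cannot intercept it.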