Bug #17304

closed

Ruby stuck calling sched_yield on fork

Added by thinline (THINline s.r.o.) over 4 years ago. Updated over 4 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-linux-gnu]
[ruby-core:100691]

Description

We have been encountering an intermittent bug when using fork: the interpreter process gets stuck in a loop that keeps calling sched_yield. It happens seemingly at random every few days, while fork works correctly most of the time. (Summed over all machines, we are talking about roughly one successful invocation every second, with this error occurring once every two to three days.) If I searched correctly, the loop is this code from thread_pthread.c, in native_stop_timer_thread(void):

while (ATOMIC_CAS(timer_thread_pipe.writing, (rb_atomic_t)0, 0)) {
  native_thread_yield();
}

If I remember correctly, the first time we were hit by this was with version 2.1.x; it was definitely happening with 2.3.x. (Ruby distributed by Debian in all cases.) I found a similar issue at https://bugs.ruby-lang.org/issues/13794, but that is supposed to be fixed. The patch from revision 60079 mentioned there is applied in the Debian sources.

The Ruby backtrace from the affected process (obtained via gdb) is:

from /usr/bin/app:102:in `<main>'
from /usr/bin/app:102:in `new'
from /usr/lib/app/klass.rb:119:in `initialize'
from /usr/lib/app/klass.rb:119:in `new'
from /usr/lib/app/instance/instance.rb:81:in `initialize'
from /usr/lib/app/instance/instance.rb:81:in `fork'
from /usr/lib/app/instance/instance.rb:103:in `block in initialize'
from /usr/lib/app/instance/instance.rb:103:in `exec'

Attached are a backtrace from GDB (thread apply all bt) and a simple program that is a simplified version of what our app does. I wasn't able to actually reproduce the bug with it (as I said, it only happens randomly and somewhat rarely).

I realize this might be pretty difficult to hunt down, so if you need any other information, let me know and I will try to obtain it the next time the bug is hit.
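For catching the hang the next time it occurs, a parent-side watchdog is one option. The sketch below is not the attached reproducer.rb. It assumes only what the Ruby backtrace shows (a fork block that calls exec). The method name run_with_watchdog, the limit and poll parameters, and the commands passed in are hypothetical. It waits for the child with a deadline and reports a child that has not exited in time, so that it could be inspected before being killed.

```ruby
# Hypothetical helper: fork a child that immediately execs `argv`, then
# wait at most `limit` seconds for it to exit.
# Returns [:exited, Process::Status] or [:stuck, pid].
def run_with_watchdog(argv, limit: 5.0, poll: 0.05)
  # Same shape as the backtrace: exec is called inside the fork block.
  pid = fork { exec(*argv) }

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit
  loop do
    # With WNOHANG, wait2 returns nil while the child is still running.
    result = Process.wait2(pid, Process::WNOHANG)
    return [:exited, result[1]] if result
    break if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep poll
  end

  # The child did not exit in time. This is the point where it could be
  # inspected (for example by attaching gdb to `pid`) before it is killed.
  Process.kill("KILL", pid)
  Process.wait(pid) # reap the child so it does not linger as a zombie
  [:stuck, pid]
end
```

A call such as run_with_watchdog(["sleep", "30"], limit: 0.3) takes the :stuck path, while a command that finishes quickly returns :exited together with its exit status. Polling with WNOHANG is used here so that the parent never blocks indefinitely on a child that may never exit.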


Files

reproducer.rb (710 Bytes), simplified reproducer, thinline (THINline s.r.o.), 11/02/2020 12:44 PM
backtrace.txt (18.3 KB), backtrace, thinline (THINline s.r.o.), 11/02/2020 12:45 PM

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

  • Status changed from Open to Feedback

Can you reproduce this issue with the master branch, or at least Ruby 2.6 or 2.7? The code you posted from Ruby 2.5 is no longer present in Ruby 2.6 or later versions. Ruby 2.5 is in security maintenance mode, and this doesn't appear to be a security issue.

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

  • Status changed from Feedback to Closed

Updated by thinline (THINline s.r.o.) over 4 years ago

jeremyevans0 (Jeremy Evans) wrote in #note-1:

Can you reproduce this issue with the master branch, or at least Ruby 2.6 or 2.7? The code you posted from Ruby 2.5 is no longer present in Ruby 2.6 or later versions. Ruby 2.5 is in security maintenance mode, and this doesn't appear to be a security issue.

Hello, thanks for the reply, and sorry for the delayed response. I was expecting an e-mail notification and completely forgot to check manually.

Unfortunately, testing with a newer version of Ruby would be quite problematic: Debian doesn't have a backport branch for Ruby, and I don't want to upgrade production machines to the Debian testing branch. (I haven't been able to reproduce this anywhere except in production.)

However, since the posted code is not present in newer Ruby versions, I agree that spending time on this would be a waste. A new Debian stable version with Ruby 2.7 should be released in about six months, and upgrading a few machines to a preview of that in February or March should be feasible. If the bug persists then, I'll open a new report.
