Bug #5811: Ruby Process Deadlocks With Fork on Mac OS X Lion - Ruby - Ruby Issue Tracking System

Ruby Process Deadlocks With Fork on Mac OS X Lion

Added by netshade (Chris Zelenak) over 13 years ago. Updated about 12 years ago.

Status:

Closed

Assignee:

akr (Akira Tanaka)

Target version:

ruby -v:

ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-darwin11.2.0]

Backport:

[ruby-core:<unknown>]

Description

=begin
Given a Ruby process that does the following:

* Spawns a new thread that initializes a TCPSocket
* Executes a script using backticks in the main thread

there is a chance that it will deadlock on Lion. The GDB traces for the threads show:

* The TCP connecting thread is stuck in native_cond_wait (thread_pthread.c:321), reached by way of rsock_getaddrinfo (raddrinfo.c:359)
* The main thread is stuck in read(), reached by way of rb_f_backquote (io.c:7266)

Meanwhile, in the forked process from rb_f_backquote:

The main thread is stuck at (longer trace):
#0 0x00007fff9160c6b6 in semaphore_wait_trap ()
#1 0x00007fff8fc03bc2 in _dispatch_thread_semaphore_wait ()
#2 0x00007fff8fc04286 in dispatch_once_f ()
#3 0x00007fff95e12f20 in si_module_static_search ()
#4 0x00007fff95e16a3d in si_module_with_name ()
#5 0x00007fff95e0eac8 in getpwuid ()
#6 0x00007fff90daa842 in getgroups$DARWIN_EXTSN ()
#7 0x000000010b82b020 in rb_group_member (gid=0) at file.c:1002
#8 0x000000010b82b10f in eaccess (path=0x7fff6b3d3570 "/bin/hostname", mode=1) at file.c:1052
...

The documentation for getpwuid in Mac OS X Lion states that getpwuid is now thread-safe, much like getpwuid_r. The difference is that the values returned by getpwuid are thread-local and disposed of automatically, whereas getpwuid_r writes its results into storage supplied by the caller. The disassembly of semaphore_wait_trap and __psynch_cvwait shows that both make syscalls (I don't know how to go much further here), and both functions appear to take no arguments (void) when inspected in GDB. I believe that the POSIX condition wait and the semaphore_wait are in fact making syscalls to wait on a condition variable of the same value, and that the value is the same because the forked child starts with a copy of the parent's memory state.
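
One property of fork that bears on the paragraph above can be checked from Ruby itself: the child receives a copy of the parent's memory, but only the thread that called fork keeps running there. The sketch below is an illustration only, not a diagnosis of this bug; `thread_counts_across_fork` is a hypothetical helper name, and it assumes a platform where `fork` is available.

```ruby
# Hypothetical helper: count live Ruby threads in the parent and in a
# forked child, reporting the child's count back through a pipe.
def thread_counts_across_fork
  worker = Thread.new { sleep }     # extra thread that stays alive in the parent
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    writer.puts Thread.list.size    # threads still alive inside the child
    writer.close
    exit!(0)                        # leave the child without running at_exit hooks
  end

  writer.close
  child_count = reader.read.to_i    # blocks until the child closes its end
  reader.close
  Process.wait(pid)

  parent_count = Thread.list.size
  worker.kill
  worker.join
  { parent: parent_count, child: child_count }
end
```

Any lock that a non-forking thread held at the moment of the fork is copied in its locked state, and no thread remains in the child to release it. Whether that is what happens inside getpwuid here is not established by the traces above.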

When an artificial delay ("sleep 1") is introduced after the creation of the TCP connect thread, this deadlock no longer occurs.
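
The pattern described above, including the delay workaround, can be sketched with the standard library alone. This is not the attached test.rb (which uses the Instrumental Agent gem); `connect_then_backtick`, its parameters, and the timeout are hypothetical names introduced for illustration, and the sketch is not guaranteed to trigger the deadlock on any particular system.

```ruby
require "socket"
require "timeout"

# Hypothetical helper: start a thread that opens a TCP connection (which
# goes through the resolver), then run a command with backticks (which
# forks) from the calling thread.
def connect_then_backtick(host, port, command: "hostname", delay: 0, limit: 10)
  connector = Thread.new do
    begin
      TCPSocket.new(host, port).close
    rescue SystemCallError, SocketError
      # A failed connection is fine; only the attempt matters here.
    end
  end

  # A non-zero delay corresponds to the "sleep 1" workaround in the report.
  sleep(delay) if delay > 0

  # Treat a command that does not return within `limit` seconds as a hang.
  output = Timeout.timeout(limit) { `#{command}` }
  connector.join(limit)
  output
end
```

Calling the helper in a loop with `delay: 0` is the racy case; passing `delay: 1` mirrors the workaround, and a `Timeout::Error` signals that the backtick call never returned.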

Attached is a test script that uses the Instrumental Agent gem for the TCP connect and can reliably cause the deadlock under 1.9.3.
=end

Files

test.rb (331 Bytes) - netshade (Chris Zelenak), 12/27/2011 03:42 AM
socket_backtick_test.rb (270 Bytes) - Reproduction Case for Ruby Core Issue 5811 - samg (Sam Goldstein), 02/08/2013 05:02 AM

Updated by ko1 (Koichi Sasada) about 13 years ago

Updated by shyouhei (Shyouhei Urabe) about 13 years ago

Updated by mrkn (Kenta Murata) almost 13 years ago

Updated by drbrain (Eric Hodel) over 12 years ago

Updated by samg (Sam Goldstein) over 12 years ago

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

Updated by akr (Akira Tanaka) over 12 years ago

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

Updated by samg (Sam Goldstein) about 12 years ago