From: stuartdhadfield@...
Date: 2018-03-01T15:56:28+00:00
Subject: [ruby-core:85885] [Ruby trunk Bug#14561] Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread

Issue #14561 has been updated by stuarthadfield (Stuart Hadfield).

harbirg (Harbir G) wrote:
> I can also reproduce the same crash on MacOS High Sierra 10.13.2. Occurs on both 2.5.0 and 2.6.0-preview1.

+1. Reliably reproducible on Mac, ruby 2.5.0

OSX El Capitan 10.11.6, Ruby 2.5.0

Currently causes our unit tests to seg fault on production code. Does not occur on Circle Env which is Linux based.

----------------------------------------
Bug #14561: Consistent 2.5.0 seg fault in GC, related to accessing an enumerator in a thread
https://bugs.ruby-lang.org/issues/14561#change-70740

* Author: dazuma (Daniel Azuma)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
This seg fault happens consistently on OSX (specifically I'm reproing it on a late 2015 Macbook pro running 10.13.3, but it seems to happen on similar machines as well). It happens only on Ruby 2.5.0.
Small repro case:

####
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek }  # enum.next also causes the segfault, but not enum.size
thread.join
GC.start  # <- seg fault here
####

The C-level backtrace identifies this as within the mark phase of GC:

-- C level backtrace information -------------------------------------------
0   ruby                                0x000000010f77ced7 rb_vm_bugreport + 135
1   ruby                                0x000000010f602628 rb_bug_context + 472
2   ruby                                0x000000010f6f1491 sigsegv + 81
3   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
4   ruby                                0x000000010f61bb93 rb_gc_mark_machine_stack + 99
5   ruby                                0x000000010f76bf39 rb_execution_context_mark + 137
6   ruby                                0x000000010f5ea32b cont_mark + 27
7   ruby                                0x000000010f626a02 gc_marks_rest + 146
8   ruby                                0x000000010f6253c0 gc_start + 2816
9   ruby                                0x000000010f61d628 garbage_collect + 184
10  ruby                                0x000000010f622215 gc_start_internal + 485
11  ruby                                0x000000010f7703be vm_call_cfunc + 286
12  ruby                                0x000000010f759af4 vm_exec_core + 12260
13  ruby                                0x000000010f76ac8e vm_exec + 142
14  ruby                                0x000000010f60c101 ruby_exec_internal + 177
15  ruby                                0x000000010f60bff8 ruby_run_node + 56
16  ruby                                0x000000010f592d1f main + 79

I also ran this against Ruby recompiled with -O0, and got a more detailed backtrace:

-- C level backtrace information -------------------------------------------
0   libruby.2.5.dylib                   0x000000010c416e19 rb_print_backtrace + 25
1   libruby.2.5.dylib                   0x000000010c416f28 rb_vm_bugreport + 136
2   libruby.2.5.dylib                   0x000000010c2096f2 rb_bug_context + 450
3   libruby.2.5.dylib                   0x000000010c35b4ee sigsegv + 94
4   libsystem_platform.dylib            0x00007fff6a779f5a _sigtramp + 26
5   libruby.2.5.dylib                   0x000000010c2395a1 mark_locations_array + 49
6   libruby.2.5.dylib                   0x000000010c22a5bb gc_mark_locations + 75
7   libruby.2.5.dylib                   0x000000010c22a7d9 mark_stack_locations + 41
8   libruby.2.5.dylib                   0x000000010c22a79f rb_gc_mark_machine_stack + 79
9   libruby.2.5.dylib                   0x000000010c3f8868 rb_execution_context_mark + 264
10  libruby.2.5.dylib                   0x000000010c1e263e cont_mark + 46
11  libruby.2.5.dylib                   0x000000010c1e2572 fiber_mark + 146
12  libruby.2.5.dylib                   0x000000010c22f4c6 gc_mark_children + 1094
13  libruby.2.5.dylib                   0x000000010c23734c gc_mark_stacked_objects + 108
14  libruby.2.5.dylib                   0x000000010c237a5b gc_mark_stacked_objects_all + 27
15  libruby.2.5.dylib                   0x000000010c236cb1 gc_marks_rest + 129
16  libruby.2.5.dylib                   0x000000010c238787 gc_marks + 103
17  libruby.2.5.dylib                   0x000000010c2352e2 gc_start + 802
18  libruby.2.5.dylib                   0x000000010c22ca18 garbage_collect + 56
19  libruby.2.5.dylib                   0x000000010c231f7d gc_start_internal + 493
20  libruby.2.5.dylib                   0x000000010c401f2a call_cfunc_m1 + 42
21  libruby.2.5.dylib                   0x000000010c400d1d vm_call_cfunc_with_frame + 605
22  libruby.2.5.dylib                   0x000000010c3fc41d vm_call_cfunc + 173
23  libruby.2.5.dylib                   0x000000010c3fb8fe vm_call_method_each_type + 190
24  libruby.2.5.dylib                   0x000000010c3fb690 vm_call_method + 160
25  libruby.2.5.dylib                   0x000000010c3fb5e5 vm_call_general + 53
26  libruby.2.5.dylib                   0x000000010c3e784e vm_exec_core + 8974
27  libruby.2.5.dylib                   0x000000010c3f6fe6 vm_exec + 182
28  libruby.2.5.dylib                   0x000000010c3f7d5b rb_iseq_eval_main + 43
29  libruby.2.5.dylib                   0x000000010c214208 ruby_exec_internal + 232
30  libruby.2.5.dylib                   0x000000010c214111 ruby_exec_node + 33
31  libruby.2.5.dylib                   0x000000010c2140d0 ruby_run_node + 64
32  ruby                                0x000000010c16ff2f main + 95

As far as I can tell, the C instruction triggering the segfault is here in gc.c (around line 4064):

    static void
    mark_locations_array(rb_objspace_t *objspace, register const VALUE *x, register long n)
    {
        VALUE v;
        while (n--) {
            v = *x;   // <----- Seems to be crashing here?
            gc_mark_maybe(objspace, v);
            x++;
        }
    }

Indicating a bad pointer in the machine stack.

I'm not sufficiently familiar with the VM internals to make much further progress, but I hope the repro case is helpful. It seems to require accessing an Enumerator element within a separate thread, and then waiting for the thread to end.
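
For anyone checking whether a particular interpreter is affected, it may help to run the repro in a child process, so that a segfault there does not take down the test runner itself. What follows is only a minimal sketch, assuming a POSIX-like system: `REPRO_SCRIPT` and `run_isolated` are hypothetical names introduced for illustration, not part of Ruby or of the original report. Only the standard `open3` and `rbconfig` libraries are used.

```ruby
require "open3"
require "rbconfig"

# The repro from this report, kept as a string so that it runs in a child
# interpreter and a crash there cannot kill the calling process.
REPRO_SCRIPT = <<~RUBY
  enum = Enumerator.new { |y| y << 1 }
  thread = Thread.new { enum.peek }
  thread.join
  GC.start
RUBY

# Hypothetical helper (an assumption, not an existing API): runs `script`
# with the given interpreter and reports how the child process ended.
def run_isolated(script, ruby: RbConfig.ruby)
  output, status = Open3.capture2e(ruby, "-e", script)
  {
    # A crashing interpreter prints a "[BUG]" report and is then
    # terminated by a signal, so check for either symptom.
    crashed: status.signaled? || output.include?("[BUG]"),
    success: status.success?,
    termsig: status.termsig,
    output: output,
  }
end

if $PROGRAM_NAME == __FILE__
  result = run_isolated(REPRO_SCRIPT)
  verdict = result[:crashed] ? "crashed" : "no crash"
  puts "ruby #{RUBY_VERSION} (#{RUBY_PLATFORM}): #{verdict}"
end
```

Passing another interpreter path via the `ruby:` keyword would allow comparing several installed versions from one script; which versions and platforms actually crash is left for the reader to observe rather than assumed here.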