From: Vladimir Makarov Date: 2018-02-08T10:14:22-05:00 Subject: [ruby-core:85483] Re: [Ruby trunk Bug#14357] thread_safe tests suite segfaults On 02/07/2018 06:19 PM, Eric Wong wrote: > eregontp@gmail.com wrote: >> The GIL guarantees all C code is executed with the GIL held, so C code always perfectly sees effects performed by C code of other threads >> (except rb_thread_call_without_gvl but it is not used here, is it?). > Correct, GVL protects common cases (string/fixnum/etc) > >> In this case: is the code during a Hash lookup, after calling Ruby's #hash/#eql? expecting some state to not have changed since before the call? >> If so, it should be enough to re-check that state after the call. > Yes, probably doable. I thought about this while eating; > but we can probably use rebuilds_num as a seqlock: > > unsigned int seq; > > retry: > seq = tab->rebuilds_num; /* needs barriers */ > ... > ...->compare(a, b); /* may rebuild */ > if (tab->rebuilds_num != seq) > goto retry; > > Not free in terms of overhead, but should still be cheap and > shouldn't need atomics on the reader side; only memory barriers > to ensure correct ordering with some CPUs. > >> When growing/shrinking a Hash table, the GIL should be held, so that should be observed atomically by other threads. >> Or is it the problem that #rehash needs to call back to Ruby code? > Right, the rebuild is done entirely with the GVL held, so that's > not a problem. It needs to update rebuild_num atomically > for the above reader pseudocode to work, but that's a cold path. > > Vladimir: thoughts? > Yes, I think it could work. It is probably a solution with minimal possible overhead. Besides compare, there are other external functions called by st.c (e.g. func in st_insert2 or st_general_foreach) but probably they don't go from C land to Ruby land (where the thread switch can happen and the table modified or even eql method would be written to modify the table which would be a very weird code). As I saw these functions are used to inform GC only. But it worth to add additional asserts to check it. Strange that the old hash tables had no such code already. Probably we were lucky that the bug did not triggered before. I'll work on it. I think the patch will be ready on the next week. I need to evaluate a performance impact too. Unsubscribe: