[#61171] Re: [ruby-changes:33145] normal:r45224 (trunk): gc.c: fix build for testing w/o RGenGC — SASADA Koichi <ko1@...>
(2014/03/01 16:15), normal wrote:
[#61243] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops — normalperson@...
Issue #9425 has been updated by Eric Wong.
[#61359] [ruby-trunk - Bug #9609] [Open] [PATCH] vm_eval.c: fix misplaced RB_GC_GUARDs — normalperson@...
Issue #9609 has been reported by Eric Wong.
(2014/03/07 19:09), [email protected] wrote:
SASADA Koichi <[email protected]> wrote:
[#61424] [REJECT?] xmalloc/xfree: reduce atomic ops w/ thread-locals — Eric Wong <normalperson@...>
I'm unsure about this. I _hate_ the extra branches this adds;
Hi Eric,
SASADA Koichi <[email protected]> wrote:
(2014/03/14 2:12), Eric Wong wrote:
SASADA Koichi <[email protected]> wrote:
(2014/03/15 17:34), Eric Wong wrote:
[#61452] [ruby-trunk - Feature #9632] [Open] [PATCH 0/2] speedup IO#close with linked-list from ccan — normalperson@...
Issue #9632 has been reported by Eric Wong.
[#61496] [ruby-trunk - Feature #9638] [Open] [PATCH] limit IDs to 32-bits on 64-bit systems — normalperson@...
Issue #9638 has been reported by Eric Wong.
[#61568] hash function for global method cache — Eric Wong <normalperson@...>
I came upon this because I noticed existing st numtable worked poorly
(2014/03/18 8:03), Eric Wong wrote:
SASADA Koichi <[email protected]> wrote:
what's the profit from using binary tree in place of hash?
Юрий Соколов <[email protected]> wrote:
[#61687] [ruby-trunk - Bug #9606] Ocassional SIGSEGV inTestException#test_machine_stackoverflow on OpenBSD — normalperson@...
Issue #9606 has been updated by Eric Wong.
[#61760] [ruby-trunk - Feature #9632] [PATCH 0/2] speedup IO#close with linked-list from ccan — normalperson@...
Issue #9632 has been updated by Eric Wong.
[ruby-core:61639] [ruby-trunk - Feature #9425] [PATCH] st: use power-of-two sizes to avoid slow modulo ops
Issue #9425 has been updated by Eric Wong. [email protected] wrote: > OK, I was right about compare_by_identity being worse with power-of-two, > but I fixed it by tweaking numhash: > > http://bogomips.org/ruby.git/patch?id=1579e9d0d82789 I tweaked that further, adding +3 instead of +1 to RUBY_SPECIAL_SHIFT in r45384. Also updated NEWS in case some extensions need tweaking for performance. Haswell Xeon E3-1230 v3 numbers: hash_aref_miss 1.166 hash_aref_str 1.167 hash_aref_sym 1.224 hash_aref_sym_long 1.270 hash_flatten 1.656 hash_ident_num 1.142 hash_ident_obj 1.193 hash_ident_str 1.194 hash_ident_sym 1.171 hash_keys 1.002 hash_shift 1.122 hash_values 1.006 loop_whileloop2 1.001 vm2_bighash* 1.233 ---------------------------------------- Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops https://bugs.ruby-lang.org/issues/9425#change-45904 * Author: Eric Wong * Status: Closed * Priority: Low * Assignee: * Category: core * Target version: current: 2.2.0 ---------------------------------------- Prime number-sized hash tables are only needed to compensate for bad hash functions. Ruby has good hash functions nowadays, so reduce our code size with power-of-two-sized hash tables which allows us to avoid the slow modulo operation. I expected numhash performance to be worse, but it seems most of those hashes are either too-small-to-matter or well-distributed anyways. If we find problems with some existing numhashes we should start using a proper hash function (Murmur via st_hash_uint) on those. This consistently speeds up the bm_hash_flatten and bm_vm2_bighash. target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems" target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems" Benchmarks on a Xeon E3-1230 v3 CPU: minimum results in each 10 measurements. Execution time (sec) name trunk st-noprime hash_flatten 0.500 0.345 hash_keys 0.191 0.192 hash_shift 0.019 0.018 hash_values 0.201 0.200 loop_whileloop2 0.090 0.090 vm2_bighash* 4.457 3.578 Speedup ratio: compare with the result of `trunk' (greater is better) name st-noprime hash_flatten 1.451 hash_keys 0.998 hash_shift 1.046 hash_values 1.003 loop_whileloop2 1.000 vm2_bighash* 1.246 Somewhat less impressive on an AMD FX 8320: minimum results in each 10 measurements. Execution time (sec) name trunk st-noprime hash_flatten 0.633 0.596 hash_keys 0.236 0.232 hash_shift 0.031 0.032 hash_values 0.234 0.238 loop_whileloop2 0.135 0.135 vm2_bighash* 8.198 6.982 Speedup ratio: compare with the result of `trunk' (greater is better) name st-noprime hash_flatten 1.063 hash_keys 1.020 hash_shift 0.976 hash_values 0.982 loop_whileloop2 1.000 vm2_bighash* 1.174 ---------------------------------------------------------------- The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf: test_thread.rb: stop at once (2014-01-16 06:34:47 +0000) are available in the git repository at: git://80x24.org/ruby.git st-noprime for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479: st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000) ---------------------------------------------------------------- Eric Wong (1): st: use power-of-two sizes to avoid slow modulo ops st.c | 100 +++++++++++++++++-------------------------------------------------- 1 file changed, 25 insertions(+), 75 deletions(-) ---Files-------------------------------- 0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB) -- https://bugs.ruby-lang.org/