[ruby-core:92435] [Ruby trunk Misc#15800] Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
From: lourens@...
Date: 2019-04-27 13:00:55 UTC
List: ruby-core #92435
Issue #15800 has been reported by methodmissing (Lourens Naudé).
----------------------------------------
Misc #15800: Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
https://bugs.ruby-lang.org/issues/15800
* Author: methodmissing (Lourens Naudé)
* Status: Open
* Priority: Normal
* Assignee:
----------------------------------------
References PR https://github.com/ruby/ruby/pull/2135 - it's a very small change, but I'm running due diligence past the list too for discussion.
I noticed that `onig_region_resize` (called from `onig_region_copy`) defaults to allocating a `10 * 8` byte block on 64-bit for each of the `beg` and `end` members of `OnigRegion`.
Preliminary testing with Rails and the benchmark suite suggests that most pattern matches are `<=` 4 results.
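To make the sizes concrete, here is a rough Ruby sketch of the arithmetic for the two benchmark patterns used below. It assumes 8-byte entries (the `10 * 8` figure above) and that capacity is never smaller than the configured minimum; `region_array_bytes` is a hypothetical helper for illustration, not part of Ruby or Onigmo.

```ruby
# Assumption: each beg/end entry is 8 bytes on 64-bit, matching "10 * 8" above.
ENTRY_BYTES = 8

# Hypothetical helper: bytes for ONE of the two arrays (beg or end),
# assuming the capacity is max(registers needed, configured minimum).
def region_array_bytes(num_regs, min_regs)
  [num_regs, min_regs].max * ENTRY_BYTES
end

single = 'haystack'.match(/hay/)
multi  = /(.)(.)(\d+)(\d)/.match("THX1138.")

[single, multi].each do |m|
  regs = m.size # whole match plus one register per capture group
  puts "#{m.regexp.inspect}: #{regs} regs, " \
       "#{region_array_bytes(regs, 10)} -> #{region_array_bytes(regs, 4)} bytes per array"
end
```

Under these assumptions a match with no capture groups drops from 80 to 32 bytes per array, and a four-group match from 80 to 40.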
#### Due diligence with debug counters
A few requests on a blank Redmine instance:
```
[RUBY_DEBUG_COUNTER] obj_match_under4 10650 <<<<<<<<<<
[RUBY_DEBUG_COUNTER] obj_match_ge4 1589 <<<<<<<<<<
[RUBY_DEBUG_COUNTER] obj_match_ge8 66
[RUBY_DEBUG_COUNTER] obj_match_ptr 12305
```
Single match: `1000000.times { 'haystack'.match(/hay/) }`
```
[RUBY_DEBUG_COUNTER] obj_match_under4 999366 <<<<<<<<<<
[RUBY_DEBUG_COUNTER] obj_match_ge4 473 <<<<<<<<<<
[RUBY_DEBUG_COUNTER] obj_match_ge8 0
[RUBY_DEBUG_COUNTER] obj_match_ptr 999839
```
Multiple matches (`> 4`): `1000000.times { /(.)(.)(\d+)(\d)/.match("THX1138.") }`
```
[RUBY_DEBUG_COUNTER] obj_match_under4 353 <<<<<<<<<<
[RUBY_DEBUG_COUNTER] obj_match_ge4 997579 <<<<<<<<<<
[RUBY_DEBUG_COUNTER] obj_match_ge8 0
[RUBY_DEBUG_COUNTER] obj_match_ptr 997932
```
#### Memory and ips benchmarks, MatchData specific
```
lourens@CarbonX1:~/src/ruby/ruby$ /usr/local/bin/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver --executables="compare-ruby::~/src/ruby/trunk/ruby --disable=gems -I.ext/common --disable-gem" --executables="built-ruby::./miniruby -I./lib -I. -I.ext/common -r./prelude --disable-gem" -v --repeat-count=24 -r memory $(ls ./benchmark/*match*.{yml,rb} 2>/dev/null)
compare-ruby: ruby 2.7.0dev (2019-04-19 trunk 67619) [x86_64-linux]
built-ruby: ruby 2.7.0dev (2019-04-19 reduce-onig-de.. 67619) [x86_64-linux]
last_commit=Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
Calculating -------------------------------------
                     compare-ruby  built-ruby
           match_gt4      11.936M     11.600M bytes - 1.000 times
         match_small      11.848M     11.608M bytes - 1.000 times
Comparison:
                          match_gt4
          built-ruby:  11600000.0 bytes
        compare-ruby:  11936000.0 bytes - 1.03x larger
                        match_small
          built-ruby:  11608000.0 bytes
        compare-ruby:  11848000.0 bytes - 1.02x larger
lourens@CarbonX1:~/src/ruby/ruby$ /usr/local/bin/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver --executables="compare-ruby::~/src/ruby/trunk/ruby --disable=gems -I.ext/common --disable-gem" --executables="built-ruby::./miniruby -I./lib -I. -I.ext/common -r./prelude --disable-gem" -v --repeat-count=24 -r ips $(ls ./benchmark/*match*.{yml,rb} 2>/dev/null)
compare-ruby: ruby 2.7.0dev (2019-04-19 trunk 67619) [x86_64-linux]
built-ruby: ruby 2.7.0dev (2019-04-19 reduce-onig-de.. 67619) [x86_64-linux]
last_commit=Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
Calculating -------------------------------------
                     compare-ruby  built-ruby
           match_gt4        1.664       1.754 i/s - 1.000 times in 0.600793s 0.570031s
         match_small        1.856       2.047 i/s - 1.000 times in 0.538838s 0.488407s
Comparison:
                          match_gt4
          built-ruby:         1.8 i/s
        compare-ruby:         1.7 i/s - 1.05x slower
                        match_small
          built-ruby:         2.0 i/s
        compare-ruby:         1.9 i/s - 1.10x slower
```
I am fine with removing the debug counters; I committed them for now because that makes it easier for reviewers to reproduce the numbers locally.
For additional context, I noticed that character offsets are bounded by the `num_regs` member, as per https://github.com/ruby/ruby/blob/trunk/re.c#L989-L1005, and therefore investigated bringing `allocated` and `num_regs` closer together for the common cases.
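The same bound can be observed from Ruby through `MatchData#offset`, which only answers for registers that actually exist. This is a small illustration of that user-visible behaviour, not a description of the linked C code:

```ruby
m = /(.)(.)(\d+)(\d)/.match("THX1138.")

# Registers 0...m.size are valid: 0 is the whole match, 1..4 are the groups.
offsets = (0...m.size).map { |i| m.offset(i) }
offsets.each_with_index { |(b, e), i| puts "reg #{i}: chars #{b}...#{e}" }

# Asking for a register past the last one raises instead of reading further.
out_of_range =
  begin
    m.offset(m.size)
    false
  rescue IndexError
    true
  end
puts "offset(#{m.size}) raises IndexError: #{out_of_range}"
```

Only five begin/end pairs are ever consulted for this match, however many slots were allocated behind them.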
Here are some more of the 80-byte allocations, this time from strscan, with only the first chunk referenced:
```
==24182== -------------------- 283 of 1000 --------------------
==24182== max-live: 19,520 in 244 blocks
==24182== tot-alloc: 30,480 in 381 blocks (avg size 80.00)
==24182== deaths: 381, at avg age 423,950,747 (3.96% of prog lifetime)
==24182== acc-ratios: 1.95 rd, 4.98 wr (59,728 b-read, 151,920 b-written)
==24182==    at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
==24182==    by 0x2561E6: onig_region_resize (regexec.c:260)
==24182==    by 0x2561E6: onig_region_resize_clear (regexec.c:298)
==24182==    by 0x2561E6: onig_match (regexec.c:3882)
==24182==    by 0xA4C376B: strscan_do_scan (strscan.c:472)
==24182==    by 0xA4C376B: strscan_skip (strscan.c:570)
==24182==    by 0x2E5B4E: vm_call_cfunc_with_frame (vm_insnhelper.c:2207)
==24182==    by 0x2E5B4E: vm_call_cfunc (vm_insnhelper.c:2225)
==24182==
==24182== Aggregated access counts by offset:
==24182==
==24182== [  0] 26456 26456 26456 26456 26456 26456 26456 26456 0 0 0 0 0 0 0 0
==24182== [ 16] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
==24182== [ 32] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
==24182== [ 48] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
==24182== [ 64] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
```
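As a back-of-the-envelope reading of the DHAT record above, the following Ruby sketch works out how much of each block is touched and what the same allocation site would total with 4 entries per block. The 4-entry projection is an assumption for illustration (8-byte entries, every block sized at the proposed minimum), not a measured result.

```ruby
block_bytes   = 80   # "avg size 80.00" in the record above
touched_bytes = 8    # only offsets 0..7 show non-zero access counts
blocks        = 381  # "tot-alloc: 30,480 in 381 blocks"

utilization = touched_bytes.to_f / block_bytes
current     = blocks * block_bytes
projected   = blocks * 4 * 8 # hypothetical: 4 entries of 8 bytes per block

puts format("touched: %.0f%% of each block", utilization * 100)
puts "total allocated: #{current} -> #{projected} bytes (projected)"
```

With these inputs a tenth of each block is ever accessed, and the recorded total of 30,480 bytes would fall to 12,192 under the stated assumption.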