[ruby-core:65960] [ruby-trunk - Feature #10440] Optimize keyword and splat argument
From: normalperson@...
Date: 2014-10-28 23:31:54 UTC
List: ruby-core #65960
Issue #10440 has been updated by Eric Wong.
Cool.
My only concern is making call_info and iseq structs bigger.
I think most of the iseq->arg_keyword_* fields can be moved to a
separate allocation (like catch table) because they are not common
and space may be saved that way.
We may do that after merging this optimization.
call_info is harder to shrink (but more common than iseq, so
size changes have more effect...)
----------------------------------------
Feature #10440: Optimize keyword and splat argument
https://bugs.ruby-lang.org/issues/10440#change-49694
* Author: Koichi Sasada
* Status: Open
* Priority: Normal
* Assignee: Koichi Sasada
* Category: core
* Target version: current: 2.2.0
----------------------------------------
# Abstract
Change the data structure of call_info and rewrite all of the method argument fitting code to optimize keyword arguments and splat arguments. My measurements show that method dispatch with keyword arguments becomes about 10 times faster than with the current code.
# Background
This feature focuses on two issues, concerning keyword arguments and splat arguments.
## (1) Keyword arguments
Caller-side keyword arguments were introduced in Ruby 1.9.3: a method is called like foo(k1: v1, k2: v2). This invocation passes one Hash object as an argument of method foo, like foo({k1: v1, k2: v2}).
Callee-side keyword arguments were introduced in Ruby 2.0.0. We can write a method definition like "def foo(k1: v1, k2: v2)". This is compiled to:
```ruby
def foo(_kw) # _kw is the implicit keyword Hash
  # implicit prologue code
  k1 = _kw.key?(:k1) ? _kw[:k1] : v1
  k2 = _kw.key?(:k2) ? _kw[:k2] : v2
  # method body
  ...
end
```
foo(k1: v1, ...) creates one Hash object, and the defined method receives one Hash object. This is consistent between the caller side and the callee side.
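The equivalence can be checked from plain Ruby. This is a hand-written sketch, not code from the patch: `foo_emulated`, `foo_native` and `show_args` are hypothetical names, and the defaults 1 and 2 are placeholders.

```ruby
# Hand-written emulation of the implicit prologue (hypothetical names).
def foo_emulated(_kw = {})
  k1 = _kw.key?(:k1) ? _kw[:k1] : 1
  k2 = _kw.key?(:k2) ? _kw[:k2] : 2
  [k1, k2]
end

# The same method written with real keyword parameters.
def foo_native(k1: 1, k2: 2)
  [k1, k2]
end

# A method without keyword parameters receives the keywords as one Hash.
def show_args(arg)
  arg
end

p show_args(k1: 10, k2: 20) == { k1: 10, k2: 20 }
p foo_emulated(k1: 10) == foo_native(k1: 10)
```

Both lines print `true`: the caller builds one Hash, and the emulated prologue fills in the missing key with its default just as the keyword parameter does.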
However, there are two sources of overhead.
(1-1) A Hash object is created for each method invocation.
(1-2) The Hash access code in the implicit prologue is itself overhead.
I measured this overhead; the result is at <http://www.atdot.net/~ko1/diary/201410.html#d11>.
```
def foo0
end
def foo3 a, b, c
end
def foo6 a, b, c, d, e, f
end
def foo_kw6 k1: nil, k2: nil, k3: nil, k4: nil, k5: nil, k6: nil
end
ruby 2.2.0dev (2014-10-10 trunk 47867) [i386-mswin32_110]
user system total real
call foo0
0.140000 0.000000 0.140000 ( 0.134481)
call foo3
0.141000 0.000000 0.141000 ( 0.140427)
call foo6
0.171000 0.000000 0.171000 ( 0.180837)
call foo_kw6 without keywords
0.593000 0.000000 0.593000 ( 0.595162)
call foo_kw6 with 1 keyword
1.778000 0.016000 1.794000 ( 1.787873)
call foo_kw6 with 2 keyword, and so on.
2.028000 0.000000 2.028000 ( 2.034146)
2.247000 0.000000 2.247000 ( 2.255171)
2.464000 0.000000 2.464000 ( 2.470283)
2.621000 0.000000 2.621000 ( 2.639155)
2.855000 0.000000 2.855000 ( 2.863643)
```
You can see that "call foo6" is about 16 times faster than "call foo_kw6 with 6 keywords" (0.18 s versus 2.86 s of real time).
The fact is that calling a method with keyword arguments is slower than normal method dispatch.
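A comparison of this kind could be reproduced with a script along the following lines. It is only a sketch: the helper names `time_it` and `measure_calls`, the iteration count, and the labels are assumptions, not the script used for the numbers above.

```ruby
def foo6(a, b, c, d, e, f); end
def foo_kw6(k1: nil, k2: nil, k3: nil, k4: nil, k5: nil, k6: nil); end

# Returns the elapsed wall-clock seconds taken by the block.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Returns a Hash of label => elapsed seconds for n calls of each kind.
def measure_calls(n)
  {
    'foo6' => time_it { n.times { foo6(1, 2, 3, 4, 5, 6) } },
    'foo_kw6 without keywords' => time_it { n.times { foo_kw6 } },
    'foo_kw6 with 6 keywords' => time_it do
      n.times { foo_kw6(k1: 1, k2: 2, k3: 3, k4: 4, k5: 5, k6: 6) }
    end
  }
end

measure_calls(100_000).each do |label, seconds|
  printf("%-28s %.6f\n", label, seconds)
end
```

The absolute times depend on the machine and the Ruby version, so only the ratios between the rows are meaningful.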
Even a small method like the following is compiled to the VM code shown after it.
```ruby
def foo k1: 1, k2: 2
end
```
```
== disasm: <RubyVM::InstructionSequence:foo@../../trunk/test.rb>========
local table (size: 4, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, keyword: 2 @2] s0)
[ 4] k1 [ 3] k2 [ 2] ?
0000 getlocal_OP__WC__0 2 ( 1)
0002 dup
0003 putobject :k1
0005 opt_send_simple <callinfo!mid:key?, argc:1, ARGS_SKIP>
0007 branchunless 18
0009 dup
0010 putobject :k1
0012 opt_send_simple <callinfo!mid:delete, argc:1, ARGS_SKIP>
0014 setlocal_OP__WC__0 4
0016 jump 21
0018 putobject_OP_INT2FIX_O_1_C_
0019 setlocal_OP__WC__0 4
0021 dup
0022 putobject :k2
0024 opt_send_simple <callinfo!mid:key?, argc:1, ARGS_SKIP>
0026 branchunless 37
0028 dup
0029 putobject :k2
0031 opt_send_simple <callinfo!mid:delete, argc:1, ARGS_SKIP>
0033 setlocal_OP__WC__0 3
0035 jump 41
0037 putobject 2
0039 setlocal_OP__WC__0 3
0041 pop
0042 trace 8
0044 putnil
0045 trace 16 ( 2)
0047 leave ( 1)
```
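A listing like this can be produced with `RubyVM::InstructionSequence`, which is specific to CRuby. The snippet below is a sketch; the variable names are arbitrary, and the instruction names in the output vary between Ruby versions, so it will not match the listing above exactly.

```ruby
source = <<~RUBY
  def foo(k1: 1, k2: 2)
  end
RUBY

# Compile the source without running it, then print the disassembly,
# which includes the nested instruction sequence for foo.
iseq = RubyVM::InstructionSequence.compile(source)
disasm_text = iseq.disasm
puts disasm_text
```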
## (2) A Splat argument and a rest parameter
A splat argument is an Array of length N that is handled as N normal arguments.
```ruby
ary = [1, 2, 3]
foo(*ary)
foo(1, 2, 3) # These two method invocations are exactly the same.
```
A method can also accept any number of arguments through a rest parameter.
```ruby
def foo(*rest)
p rest
end
foo(1, 2) #=> [1, 2]
foo(1, 2, 3) #=> [1, 2, 3]
```
By combining a splat argument with a rest parameter, we should be able to pass a very long array.
```ruby
def foo(*rest)
rest.size
end
foo(*(1..1_000_000).to_a) #=> should be 1000000
```
However, the current implementation tries to put all elements of a splat argument onto the VM stack, which causes a stack overflow error.
```
test.rb:5:in `<main>': stack level too deep (SystemStackError)
```
Delegation methods, which receive a rest parameter and pass it on as a splat argument, could also run faster if the elements were not all splatted onto the VM stack.
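A delegation method of this kind looks like the following sketch; `target` and `forward_to_target` are hypothetical names chosen for illustration.

```ruby
def target(*rest)
  rest.size
end

# Delegation: receive everything as a rest parameter, pass it on as a splat.
def forward_to_target(*args)
  target(*args)
end

ary = [1, 2, 3]
p forward_to_target(*ary) == forward_to_target(1, 2, 3)
p forward_to_target(*ary)
```

This prints `true` and then `3`. Each call goes through two splat expansions, one at the call site and one inside `forward_to_target`, which is why avoiding the expansion pays off for delegation-heavy code.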
# Proposal: change call_info and rewrite argument fitting code
The basic idea is to pass the caller's arguments without any modification, together with structure (meta) data that describes them.
(In other words, a "Let it go" patch. Sorry, this patch doesn't increase frozen objects.)
The patch is here: https://github.com/ko1/ruby/compare/kwopt
## For keyword arguments
```ruby
# example code
def foo(k1: default_v1, k2: default_v2)
# method body
end
foo(k1: v1, k2: v2) # line 6
```
On line 6, only the values (v1 and v2) are pushed, and the key information (k1 and k2) is passed via a call_info structure.
In the `send' instruction (line 2), the local variables (k1, k2) are filled with the passed keyword values (v1, v2), using the key information in call_info.
In particular, when the default values (default_v1, default_v2) are immediate values such as nil or a Fixnum, we record them at compile time and set them in the send instruction. This technique reduces the checking overhead in the prologue code.
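The idea can be modelled in plain Ruby. This is a toy model under stated assumptions: `CallInfo` and `fill_keyword_locals` are hypothetical names, and the real patch works on C structures inside the VM, not on Ruby objects.

```ruby
# Toy model only: CallInfo and fill_keyword_locals are hypothetical names,
# not the structures or functions used in the actual patch.
CallInfo = Struct.new(:mid, :argc, :kw_keys)

# param_defaults maps each keyword parameter to its compile-time default.
def fill_keyword_locals(values, call_info, param_defaults)
  locals = param_defaults.dup
  call_info.kw_keys.zip(values).each do |key, value|
    raise ArgumentError, "unknown keyword: #{key}" unless locals.key?(key)
    locals[key] = value
  end
  locals
end

# Models foo(k1: 100, k2: 200) calling def foo(k1: 1, k2: 2).
ci = CallInfo.new(:foo, 2, [:k1, :k2])
pushed = [100, 200]
locals = fill_keyword_locals(pushed, ci, { k1: 1, k2: 2 })
p locals == { k1: 100, k2: 200 }
```

This prints `true`. No Hash is built at the call site in this model: the caller supplies only the values, the keys travel in the call-site metadata, and omitted keys keep the defaults recorded ahead of time.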
In this case, the disassembled code is as follows:
```
# ----------------------------------------------------------------------
# target program:
# ----------------------------------------------------------------------
# example code
def foo(k1: 1, k2: 2)
# method body
end
foo(k1: 100, k2: 200) # line 6
# ----------------------------------------------------------------------
# disasm result:
# ----------------------------------------------------------------------
== disasm: <RubyVM::InstructionSequence:<main>@../../gitruby/test.rb>===
0000 trace 1 ( 2)
0002 putspecialobject 1
0004 putspecialobject 2
0006 putobject :foo
0008 putiseq foo
0010 opt_send_simple <callinfo!mid:core#define_method, argc:3, ARGS_SKIP>
0012 pop
0013 trace 1 ( 6)
0015 putself
0016 putobject 100
0018 putobject 200
0020 opt_send_simple <callinfo!mid:foo, argc:2, kw:2, FCALL|ARGS_SKIP>
0022 leave
== disasm: <RubyVM::InstructionSequence:foo@../../gitruby/test.rb>======
local table (size: 4, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, keyword: 2@2] s0)
[ 4] k1 [ 3] k2 [ 2] ?
0000 trace 8 ( 2)
0002 putnil
0003 trace 16 ( 4)
0005 leave
# ----------------------------------------------------------------------
```
## Splat argument and rest argument
Instead of pushing all elements of a splat argument onto the stack, we pass the argument as it is, together with a flag (meta-data).
# Evaluation
Benchmark with the same program on a different platform.
```
current trunk: ruby 2.2.0dev (2014-10-27 trunk 48154) [x86_64-linux]
user system total real
0.070000 0.000000 0.070000 ( 0.063836)
0.070000 0.000000 0.070000 ( 0.067525)
0.070000 0.000000 0.070000 ( 0.074835)
0.270000 0.000000 0.270000 ( 0.271872)
1.170000 0.000000 1.170000 ( 1.166828)
1.320000 0.000000 1.320000 ( 1.322710)
1.480000 0.000000 1.480000 ( 1.484837)
1.680000 0.000000 1.680000 ( 1.675304)
1.780000 0.000000 1.780000 ( 1.785633)
1.970000 0.000000 1.970000 ( 1.966972)
modified: ruby 2.2.0dev (2014-10-27 trunk 48158) [x86_64-linux]
user system total real
0.080000 0.000000 0.080000 ( 0.074382)
0.090000 0.000000 0.090000 ( 0.095778)
0.080000 0.000000 0.080000 ( 0.078085)
0.110000 0.000000 0.110000 ( 0.114086)
0.110000 0.000000 0.110000 ( 0.111416)
0.120000 0.000000 0.120000 ( 0.118595)
0.130000 0.000000 0.130000 ( 0.129644)
0.140000 0.000000 0.140000 ( 0.136531)
0.160000 0.000000 0.160000 ( 0.157686)
0.150000 0.000000 0.150000 ( 0.154985)
```
The performance of keyword arguments is dramatically improved.
And now we can pass a big splat argument to a method with a rest parameter.
```ruby
def foo(*rest)
rest.size
end
p foo(*(1..1_000_000).to_a) #=> 1_000_000
```
Current evaluation of benchmark set is here: http://www.atdot.net/sp/view/gxr4en/readonly
The numbers are ratios compared with current trunk. Higher is faster (lower is slower than the current implementation). The results show that this patch introduces some overhead, especially for yield(). This is because I unified the method invocation code and the block invocation code, and eliminated the fast path for simple block invocation. I will add this fast path back, and the results should recover.
--
https://bugs.ruby-lang.org/