Revision ac6f08df

Added by naruse (Yui NARUSE) over 8 years ago

Use ADD instead of MUL

  • On recent CPUs, the latency of a 2-operand MUL is 3 cycles, while ADD takes 1 cycle.
  • clang optimizes MUL rax,2 into ADD rax,rax, but gcc7 doesn't.
  • LONG2FIX is compiled into lea r14,[r15+r15*1+0x1]; this takes 1 cycle
    and runs in parallel if the branch prediction is correct.
  • Note that the old (RB_POSFIXABLE(f) && RB_NEGFIXABLE(f)) check usually
    compiles to the following instructions:
    • movabs rax,0x4000000000000000
    • add rax,rdi
    • js
    This sequence needs a large immediate, and Macro-Fusion is not applied.
    ADD and JO are much smaller, though they are also Macro-Fusion unfriendly.
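
The idea in the list above can be sketched roughly as follows. This is only an illustration, not the code changed by this revision: the names SKETCH_FIXNUM_MAX, SKETCH_FIXNUM_MIN, tag_with_range_check and tag_with_add_overflow are hypothetical, and __builtin_add_overflow is assumed to be available (GCC 5 or later, or clang). A long v fits the tagged form 2*v+1 exactly when v+v does not overflow, so a single ADD followed by an overflow check can stand in for the explicit two-sided range check.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the fixnum range; not Ruby's actual macros. */
#define SKETCH_FIXNUM_MAX (LONG_MAX / 2)
#define SKETCH_FIXNUM_MIN (LONG_MIN / 2)

/* Older style: explicit two-sided range check, then tag as 2*v + 1. */
static bool
tag_with_range_check(long v, long *out)
{
    if (v > SKETCH_FIXNUM_MAX || v < SKETCH_FIXNUM_MIN) return false;
    *out = v + v + 1;   /* cannot overflow: v was range-checked above */
    return true;
}

/* ADD-based style: v + v overflows exactly when v is out of range,
 * so the overflow flag of one ADD replaces the range comparison. */
static bool
tag_with_add_overflow(long v, long *out)
{
    long doubled;
    if (__builtin_add_overflow(v, v, &doubled)) return false;
    *out = doubled + 1; /* doubled is even, so adding 1 cannot overflow */
    return true;
}
```

Both helpers accept and reject the same inputs; which instruction sequence a compiler actually emits for each depends on the compiler and its version.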

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57793 b2dd03c8-39d4-4d8f-98ff-823fe69b080e