[ruby-core:99588] [Ruby master Feature#17115] Optimize String#casecmp? for ASCII strings
Issue #17115 has been updated by Dan0042 (Daniel DeLorme).
In the benchmark you'd need to change the regexp from `/\Afoo\Z/i` to
`/\Aconnection\z/i`; if you do so you'll find the regexp performance is
similar to `casecmp?`.
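
As a quick sanity check of this point, here is a minimal sketch that assumes the same `str` value as the benchmark quoted below. It only shows that the original regexp never matches the test string (so it measures the failing path), while the suggested one does:

```ruby
str = "Connection"

# The regexp from the original benchmark never matches `str`.
original_matches = /\Afoo\Z/i.match?(str)

# The suggested regexp matches, so it does work comparable to casecmp?.
suggested_matches = /\Aconnection\z/i.match?(str)

puts original_matches   # false
puts suggested_matches  # true
```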
+1 for special-casing ASCII strings though.
Related: #13750, #14055
----------------------------------------
Feature #17115: Optimize String#casecmp? for ASCII strings
https://bugs.ruby-lang.org/issues/17115#change-87065
* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
Patch: https://github.com/ruby/ruby/pull/3369
`casecmp?` is kind of a performance trap as it's much slower than using a
case-insensitive regexp or just `casecmp == 0`.
```
require 'benchmark/ips'

str = "Connection"
cmp = "connection"
Benchmark.ips do |x|
  x.report('/\A\z/i.match?') { /\Afoo\Z/i.match?(str) }
  x.report('casecmp?') { cmp.casecmp?(str) }
  x.report('casecmp') { cmp.casecmp(str) == 0 }
  x.compare!
end
Calculating -------------------------------------
/\A\z/i.match?     11.447M (± 1.3%) i/s -     57.814M in   5.051489s
casecmp?            6.197M (± 0.9%) i/s -     31.138M in   5.025252s
casecmp            12.753M (± 1.2%) i/s -     64.636M in   5.069195s
Comparison:
casecmp:        12752791.6 i/s
/\A\z/i.match?: 11446996.1 i/s - 1.11x  (± 0.00) slower
casecmp?:        6196886.0 i/s - 2.06x  (± 0.00) slower
```
This is because, contrary to the others, it tries to be correct with regard
to Unicode case folding.
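
To illustrate the behavioral difference, here is a small sketch based on the non-ASCII example from the `String#casecmp` documentation. The variable names are mine, and the strings are written with `\u` escapes so the snippet does not depend on the source file encoding:

```ruby
lower = "\u00e4\u00f6\u00fc" # "äöü"
upper = "\u00c4\u00d6\u00dc" # "ÄÖÜ"

# casecmp? applies Unicode case folding before comparing.
folded_equal = lower.casecmp?(upper)

# casecmp only treats A-Z and a-z as equivalent, so these strings differ.
ascii_equal = lower.casecmp(upper) == 0

puts folded_equal  # true
puts ascii_equal   # false
```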
However, there are cases where a fast case-insensitive equality check of
known ASCII strings is useful, for instance when matching HTTP headers.
This patch checks whether both strings use a single-byte encoding and, if
so, does a simple iterative comparison with `TOLOWER()`.
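
The idea can be sketched in plain Ruby. This is only an illustration, not the code from the patch (which is C): `ascii_tolower` and `ascii_casecmp?` are hypothetical helper names, and `ascii_only?` is used here as a stand-in for the patch's single-byte encoding check.

```ruby
# Hypothetical helper, not part of Ruby: lowercases a byte if it is in A-Z,
# similar in spirit to what a C TOLOWER() does for ASCII.
def ascii_tolower(byte)
  byte.between?(65, 90) ? byte + 32 : byte
end

# Hypothetical helper: returns nil when the fast path does not apply,
# so a caller could fall back to the regular String#casecmp?.
def ascii_casecmp?(a, b)
  # Assumption: ascii_only? approximates the "single-byte encoding" check.
  return nil unless a.ascii_only? && b.ascii_only?
  return false unless a.bytesize == b.bytesize

  # Simple byte-by-byte comparison after ASCII lowercasing.
  a.bytes.zip(b.bytes).all? { |x, y| ascii_tolower(x) == ascii_tolower(y) }
end

puts ascii_casecmp?("Connection", "connection")        # true
puts ascii_casecmp?("Content-Type", "Content-Length")  # false
```

Returning `nil` for non-ASCII input keeps the sketch honest about its scope: anything outside the fast path would still need the full Unicode-aware comparison.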
This makes `casecmp?` slightly faster than `casecmp == 0` when both strings
are ASCII.
```
| |compare-ruby|built-ruby|
|:-----------------------|-----------:|---------:|
|casecmp-1 | 11.618M| 10.757M|
| | 1.08x| -|
|casecmp-10 | 1.849M| 1.723M|
| | 1.07x| -|
|casecmp-100 | 204.490k| 186.798k|
| | 1.09x| -|
|casecmp-1000 | 20.413k| 20.184k|
| | 1.01x| -|
|casecmp-nonascii1 | 19.541M| 20.100M|
| | -| 1.03x|
|casecmp-nonascii10 | 19.489M| 19.914M|
| | -| 1.02x|
|casecmp-nonascii100 | 19.479M| 20.155M|
| | -| 1.03x|
|casecmp-nonascii1000 | 19.462M| 20.064M|
| | -| 1.03x|
|casecmp_p-1 | 2.214M| 12.030M|
| | -| 5.43x|
|casecmp_p-10 | 1.373M| 2.150M|
| | -| 1.57x|
|casecmp_p-100 | 249.292k| 231.041k|
| | 1.08x| -|
|casecmp_p-1000 | 16.173k| 23.592k|
| | -| 1.46x|
|casecmp_p-nonascii1 | 651.921k| 650.572k|
| | 1.00x| -|
|casecmp_p-nonascii10 | 108.253k| 109.006k|
| | -| 1.01x|
|casecmp_p-nonascii100 | 11.749k| 11.889k|
| | -| 1.01x|
|casecmp_p-nonascii1000 | 1.140k| 1.138k|
| =
```