[ruby-core:68329] [Ruby trunk - Feature #10882] Provide Levenshtein distance implementation as part of stdlib

From: shevegen@...
Date: 2015-02-26 15:56:29 UTC
List: ruby-core #68329
Issue #10882 has been updated by Robert A. Heiler.


I would like to see this too. Levenshtein distance is used a lot in bioinformatics to calculate the edit distance between two sequences (in Ruby, thus, two strings). The three operations are insert, delete, and replace (+1, -1, or change the information at that position).

I have no particularly strong feeling about the name (e.g. distance), though perhaps it should be

require 'math/distance'

or something like that?

----------------------------------------
Feature #10882: Provide Levenshtein distance implementation as part of stdlib
https://bugs.ruby-lang.org/issues/10882#change-51680

* Author: Yuki Nishijima
* Status: Open
* Priority: Normal
* Assignee:
----------------------------------------
The [Levenshtein distance algorithm](http://en.wikipedia.org/wiki/Levenshtein_distance) is used by Rubygems, Bundler, did_you_mean and Rails, and I think it's popular enough to provide as part of Ruby's stdlib. It may still seem a bit too high-level, but it is definitely useful (e.g. [adding "did you mean?" to rake](https://github.com/ruby/rake/pull/29)).

API-wise, I would like to propose something like the following, but I'm totally open to hearing the core team's opinions, as I'm not sure if this is great.

```ruby
require 'distance'

Distance.levenshtein(str1, str2)
```
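For illustration, here is a minimal sketch of what `Distance.levenshtein` could compute, assuming a plain row-by-row dynamic-programming approach. The module body is hypothetical: it is not taken from Rubygems, Bundler, did_you_mean or Rails, and only the `Distance.levenshtein(str1, str2)` call shape comes from the proposal.

```ruby
# Hypothetical sketch; the proposed stdlib module does not exist yet.
module Distance
  module_function

  # Returns the minimum number of single-character insertions,
  # deletions and substitutions needed to turn str1 into str2.
  def levenshtein(str1, str2)
    a = str1.chars
    b = str2.chars

    # prev[j] is the distance between the prefix of a processed so far
    # and the first j characters of b. It starts as the empty prefix.
    prev = (0..b.length).to_a

    a.each_with_index do |ca, i|
      curr = [i + 1] # turning i + 1 characters into "" takes i + 1 deletions
      b.each_with_index do |cb, j|
        cost = ca == cb ? 0 : 1
        curr << [
          prev[j + 1] + 1, # deletion
          curr[j] + 1,     # insertion
          prev[j] + cost   # substitution (free when the characters match)
        ].min
      end
      prev = curr
    end

    prev.last
  end
end

Distance.levenshtein("kitten", "sitting") # => 3
```

Keeping only two rows at a time needs memory proportional to the length of `str2` rather than a full table, which seems a reasonable default for short strings such as command or method names.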

It would also be interesting to have a `#distance` method on `String`:

```ruby
"word".distance("other")
```

which is implemented as:

```ruby
def distance(str, algorithm = :levenshtein)
  # calculate the distance here.
end
```

so that the algorithm can be changed when we add more (e.g. [Jaro–Winkler distance](http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance)).
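One way the `algorithm` argument might be dispatched is sketched below, assuming a lookup table keyed by symbol. The `DistanceAlgorithms` module, its `REGISTRY` constant and the `ArgumentError` for unknown names are hypothetical choices made for this example, and only Levenshtein is registered.

```ruby
# Hypothetical sketch: String#distance and DistanceAlgorithms are not part of Ruby.
module DistanceAlgorithms
  # Row-by-row dynamic programming over the two strings.
  LEVENSHTEIN = lambda do |a, b|
    prev = (0..b.length).to_a
    a.each_char.with_index do |ca, i|
      curr = [i + 1]
      b.each_char.with_index do |cb, j|
        cost = ca == cb ? 0 : 1
        curr << [prev[j + 1] + 1, curr[j] + 1, prev[j] + cost].min
      end
      prev = curr
    end
    prev.last
  end

  # Maps an algorithm name to a callable that takes two strings.
  # More entries (e.g. :jaro_winkler) could be added here later.
  REGISTRY = { levenshtein: LEVENSHTEIN }
end

class String
  def distance(str, algorithm = :levenshtein)
    impl = DistanceAlgorithms::REGISTRY.fetch(algorithm) do
      raise ArgumentError, "unknown distance algorithm: #{algorithm.inspect}"
    end
    impl.call(self, str)
  end
end

"word".distance("other")               # => 5
"word".distance("other", :levenshtein) # => 5
```

With a table like this, adding an algorithm would mean registering one more callable, and callers that pass an unsupported name fail loudly instead of silently falling back to Levenshtein.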




--
https://bugs.ruby-lang.org/
