library(isub) implements a similarity measure
between strings, comparable in purpose to the Levenshtein distance.
The measure is based on the length of common substrings.
?- isub('E56.Language', 'languange', D, [normalize(true)]).
D = 0.4226950354609929.    % [-1,1] range

?- isub('E56.Language', 'languange', D, [normalize(true),zero_to_one(true)]).
D = 0.7113475177304964.    % [0,1] range

?- isub('E56.Language', 'languange', D, []).    % without normalization
D = 0.19047619047619047.   % [-1,1] range

?- isub(aa, aa, D, []).    % does not work for short substrings
D = -0.8.

?- isub(aa, aa, D, [substring_threshold(0)]).   % works with short substrings
D = 1.0.                   % but may give unwanted values
                           % between e.g. 'store' and 'spore'.

?- isub(joe, hoe, D, [substring_threshold(0)]).
D = 0.5315315315315314.

?- isub(joe, hoe, D, []).
D = -1.0.
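The two normalized results in the examples above (0.4226... in the [-1,1] range and 0.7113... in the [0,1] range) are consistent with a plain linear rescaling, D01 = (D + 1) / 2. The sketch below checks that relation for a given pair of texts. The helper name rescaling_holds/2 is hypothetical, and the relation is an observation drawn from the example values, not a documented guarantee of the library.

```prolog
:- use_module(library(isub)).

% rescaling_holds(+Text1, +Text2)
% Hypothetical helper: succeeds if the zero_to_one(true) score equals
% the default [-1,1] score mapped linearly onto [0,1], up to a small
% floating point tolerance. Assumes both calls use normalize(true).
rescaling_holds(Text1, Text2) :-
    isub(Text1, Text2, D,   [normalize(true)]),
    isub(Text1, Text2, D01, [normalize(true), zero_to_one(true)]),
    abs(D01 - (D + 1) / 2) < 1.0e-9.
```

For instance, `?- rescaling_holds('E56.Language', languange).` should succeed given the values shown in the examples.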
This is a new version of isub/4 which replaces the old version while providing backwards compatibility. This new version allows several options to tweak the algorithm.
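As an illustration of calling isub/4 with an option list, the following sketch picks the candidate most similar to a query. The predicate best_match/4 is hypothetical and not part of library(isub). It uses only the normalize and zero_to_one options shown in the examples, written as a literal list in the source.

```prolog
:- use_module(library(isub)).
:- use_module(library(lists)).

% best_match(+Query, +Candidates, -Best, -Score)
% Hypothetical helper: Best is the element of Candidates with the
% highest isub/4 similarity to Query, and Score is that similarity
% in the [0,1] range. Fails if Candidates is empty.
best_match(Query, Candidates, Best, Score) :-
    findall(S-C,
            ( member(C, Candidates),
              isub(Query, C, S, [normalize(true), zero_to_one(true)])
            ),
            Scored),
    % Pairs are compared in standard order, so the score is compared first.
    max_member(Score-Best, Scored).
```

A query such as `?- best_match(language, [spore, language], Best, Score).` would be expected to select `language`, since identical texts of this length are maximally similar.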
|Text1||and Text2 are each an atom, a string, or a list of characters or character codes.|
|Similarity||is a float in the range [-1,1], where 1.0 means most similar. The range can be set to [0,1] with the zero_to_one option described below.|
|Options||is a list with elements described
below. Please note that the options are processed at compile time using
goal_expansion to provide much better speed. Supported options are: