vim-patch:8.2.1726: fuzzy matching only works on strings

Problem:    Fuzzy matching only works on strings.
Solution:   Support passing a dict.  Add matchfuzzypos() to also get the match
            positions. (Yegappan Lakshmanan, closes vim/vim#6947)
4f73b8e9cc

Also remove some N/A and seemingly useless NULL checks -- Nvim allocs can't
return NULL. I'm not sure why the retmatchpos stuff in match_fuzzy checks for
NULL too, given that Vim checks for NULL alloc in do_fuzzymatch; assert that the
li stuff is not NULL as that's the one check I'm ever-so-slightly unsure about.

Adjust tests. Note that the text_cb tests actually throw E6000 in Nvim, but we
also can't assert that error due to v8.2.1183 not being ported yet.
This commit is contained in:
Sean Dewar 2022-01-01 16:40:28 +00:00
parent fba00b5e7e
commit 960ea01972
No known key found for this signature in database
GPG Key ID: 08CC2C83AD41B581
8 changed files with 530 additions and 162 deletions

View File

@ -296,7 +296,10 @@ matcharg({nr}) List arguments of |:match|
matchdelete({id} [, {win}]) Number delete match identified by {id}
matchend({expr}, {pat}[, {start}[, {count}]])
Number position where {pat} ends in {expr}
matchfuzzy({list}, {str}) List fuzzy match {str} in {list}
matchfuzzy({list}, {str} [, {dict}])
List fuzzy match {str} in {list}
matchfuzzypos({list}, {str} [, {dict}])
List fuzzy match {str} in {list}
matchlist({expr}, {pat}[, {start}[, {count}]])
List match and submatches of {pat} in {expr}
matchstr({expr}, {pat}[, {start}[, {count}]])
@ -4858,12 +4861,25 @@ matchend({expr}, {pat} [, {start} [, {count}]]) *matchend()*
Can also be used as a |method|: >
GetText()->matchend('word')
matchfuzzy({list}, {str}) *matchfuzzy()*
Returns a list with all the strings in {list} that fuzzy
match {str}. The strings in the returned list are sorted
based on the matching score. {str} is treated as a literal
string and regular expression matching is NOT supported.
The maximum supported {str} length is 256.
matchfuzzy({list}, {str} [, {dict}]) *matchfuzzy()*
If {list} is a list of strings, then returns a list with all
the strings in {list} that fuzzy match {str}. The strings in
the returned list are sorted based on the matching score.
If {list} is a list of dictionaries, then the optional {dict}
argument supports the following items:
key key of the item which is fuzzy matched against
{str}. The value of this item should be a
string.
text_cb |Funcref| that will be called for every item
in {list} to get the text for fuzzy matching.
This should accept a dictionary item as the
argument and return the text for that item to
use for fuzzy matching.
{str} is treated as a literal string and regular expression
matching is NOT supported. The maximum supported {str} length
is 256.
If there are no matching strings or there is an error, then an
empty list is returned. If length of {str} is greater than
@ -4874,11 +4890,38 @@ matchfuzzy({list}, {str}) *matchfuzzy()*
< results in ["clay"]. >
:echo getbufinfo()->map({_, v -> v.name})->matchfuzzy("ndl")
< results in a list of buffer names fuzzy matching "ndl". >
:echo getbufinfo()->matchfuzzy("ndl", {'key' : 'name'})
< results in a list of buffer information dicts with buffer
names fuzzy matching "ndl". >
:echo getbufinfo()->matchfuzzy("spl",
\ {'text_cb' : {v -> v.name}})
< results in a list of buffer information dicts with buffer
names fuzzy matching "spl". >
:echo v:oldfiles->matchfuzzy("test")
< results in a list of file names fuzzy matching "test". >
:let l = readfile("buffer.c")->matchfuzzy("str")
< results in a list of lines in "buffer.c" fuzzy matching "str".
matchfuzzypos({list}, {str} [, {dict}]) *matchfuzzypos()*
Same as |matchfuzzy()|, but returns the list of matched
strings and the list of character positions where characters
in {str} matches.
If {str} matches multiple times in a string, then only the
positions for the best match is returned.
If there are no matching strings or there is an error, then a
list with two empty list items is returned.
Example: >
:echo matchfuzzypos(['testing'], 'tsg')
< results in [['testing'], [[0, 2, 6]]] >
:echo matchfuzzypos(['clay', 'lacy'], 'la')
< results in [['lacy', 'clay'], [[0, 1], [1, 2]]] >
:echo [{'text': 'hello', 'id' : 10}]
\ ->matchfuzzypos('ll', {'key' : 'text'})
< results in [{'id': 10, 'text': 'hello'}] [[2, 3]]
matchlist({expr}, {pat} [, {start} [, {count}]]) *matchlist()*
Same as |match()|, but return a |List|. The first item in the
list is the matched string, same as what matchstr() would

View File

@ -609,6 +609,7 @@ String manipulation: *string-functions*
match() position where a pattern matches in a string
matchend() position where a pattern match ends in a string
matchfuzzy() fuzzy matches a string in a list of strings
matchfuzzypos() fuzzy matches a string in a list of strings
matchstr() match of a pattern in a string
matchstrpos() match and positions of a pattern in a string
matchlist() like matchstr() and also return submatches

View File

@ -249,7 +249,8 @@ return {
matcharg={args=1, base=1},
matchdelete={args={1, 2}, base=1},
matchend={args={2, 4}, base=1},
matchfuzzy={args=2, base=1},
matchfuzzy={args={2, 3}, base=1},
matchfuzzypos={args={2, 3}, base=1},
matchlist={args={2, 4}, base=1},
matchstr={args={2, 4}, base=1},
matchstrpos={args={2,4}, base=1},

View File

@ -98,7 +98,6 @@ PRAGMA_DIAG_POP
#endif
static char *e_listarg = N_("E686: Argument of %s must be a List");
static char *e_listblobarg = N_("E899: Argument of %s must be a List or Blob");
static char *e_invalwindow = N_("E957: Invalid window number");

View File

@ -979,6 +979,7 @@ EXTERN char e_invalidreg[] INIT(= N_("E850: Invalid register name"));
EXTERN char e_dirnotf[] INIT(= N_("E919: Directory not found in '%s': \"%s\""));
EXTERN char e_au_recursive[] INIT(= N_("E952: Autocommand caused recursive behavior"));
EXTERN char e_autocmd_close[] INIT(= N_("E813: Cannot close autocmd window"));
EXTERN char e_listarg[] INIT(= N_("E686: Argument of %s must be a List"));
EXTERN char e_unsupportedoption[] INIT(= N_("E519: Option not supported"));
EXTERN char e_fnametoolong[] INIT(= N_("E856: Filename too long"));
EXTERN char e_float_as_string[] INIT(= N_("E806: using Float as a String"));

View File

@ -48,6 +48,8 @@
#include "nvim/vim.h"
#include "nvim/window.h"
typedef uint32_t matchidx_T;
#ifdef INCLUDE_GENERATED_DECLARATIONS
# include "search.c.generated.h"
#endif
@ -4811,24 +4813,109 @@ the_end:
/// There is not an explicit bonus for an exact match. Unmatched letters receive
/// a penalty. So shorter strings and closer matches are worth more.
typedef struct {
const listitem_T *item;
listitem_T *item;
int score;
list_T *lmatchpos;
} fuzzyItem_T;
static bool fuzzy_match_recursive(const char_u *fuzpat, const char_u *str, int *const outScore,
const char_u *const strBegin, const char_u *const srcMatches,
char_u *const matches, const int maxMatches, int nextMatch,
int *const recursionCount, const int recursionLimit)
FUNC_ATTR_NONNULL_ARG(1, 2, 3, 4, 6, 9) FUNC_ATTR_WARN_UNUSED_RESULT
/// bonus for adjacent matches
#define SEQUENTIAL_BONUS 15
/// bonus if match occurs after a separator
#define SEPARATOR_BONUS 30
/// bonus if match is uppercase and prev is lower
#define CAMEL_BONUS 30
/// bonus if the first letter is matched
#define FIRST_LETTER_BONUS 15
/// penalty applied for every letter in str before the first match
#define LEADING_LETTER_PENALTY -5
/// maximum penalty for leading letters
#define MAX_LEADING_LETTER_PENALTY -15
/// penalty for every letter that doesn't match
#define UNMATCHED_LETTER_PENALTY -1
/// Score for a string that doesn't fuzzy match the pattern
#define SCORE_NONE -9999
#define FUZZY_MATCH_RECURSION_LIMIT 10
/// Maximum number of characters that can be fuzzy matched
#define MAXMATCHES 256
/// Compute a score for a fuzzy matched string. The matching character locations
/// are in 'matches'.
static int fuzzy_match_compute_score(const char_u *const str, const int strSz,
const matchidx_T *const matches, const int numMatches)
FUNC_ATTR_NONNULL_ALL FUNC_ATTR_WARN_UNUSED_RESULT FUNC_ATTR_PURE
{
// Initialize score
int score = 100;
// Apply leading letter penalty
int penalty = LEADING_LETTER_PENALTY * matches[0];
if (penalty < MAX_LEADING_LETTER_PENALTY) {
penalty = MAX_LEADING_LETTER_PENALTY;
}
score += penalty;
// Apply unmatched penalty
const int unmatched = strSz - numMatches;
score += UNMATCHED_LETTER_PENALTY * unmatched;
// Apply ordering bonuses
for (int i = 0; i < numMatches; i++) {
const matchidx_T currIdx = matches[i];
if (i > 0) {
const matchidx_T prevIdx = matches[i - 1];
// Sequential
if (currIdx == prevIdx + 1) {
score += SEQUENTIAL_BONUS;
}
}
// Check for bonuses based on neighbor character value
if (currIdx > 0) {
// Camel case
const char_u *p = str;
int neighbor;
for (matchidx_T sidx = 0; sidx < currIdx; sidx++) {
neighbor = utf_ptr2char(p);
mb_ptr2char_adv(&p);
}
const int curr = utf_ptr2char(p);
if (mb_islower(neighbor) && mb_isupper(curr)) {
score += CAMEL_BONUS;
}
// Separator
const bool neighborSeparator = neighbor == '_' || neighbor == ' ';
if (neighborSeparator) {
score += SEPARATOR_BONUS;
}
} else {
// First letter
score += FIRST_LETTER_BONUS;
}
}
return score;
}
static bool fuzzy_match_recursive(const char_u *fuzpat, const char_u *str, matchidx_T strIdx,
int *const outScore, const char_u *const strBegin,
const int strLen, const matchidx_T *const srcMatches,
matchidx_T *const matches, const int maxMatches, int nextMatch,
int *const recursionCount)
FUNC_ATTR_NONNULL_ARG(1, 2, 4, 5, 8, 11) FUNC_ATTR_WARN_UNUSED_RESULT
{
// Recursion params
bool recursiveMatch = false;
char_u bestRecursiveMatches[256];
matchidx_T bestRecursiveMatches[MAXMATCHES];
int bestRecursiveScore = 0;
// Count recursions
(*recursionCount)++;
if (*recursionCount >= recursionLimit) {
if (*recursionCount >= FUZZY_MATCH_RECURSION_LIMIT) {
return false;
}
@ -4839,119 +4926,59 @@ static bool fuzzy_match_recursive(const char_u *fuzpat, const char_u *str, int *
// Loop through fuzpat and str looking for a match
bool first_match = true;
while (*fuzpat != '\0' && *str != '\0') {
while (*fuzpat != NUL && *str != NUL) {
const int c1 = utf_ptr2char(fuzpat);
const int c2 = utf_ptr2char(str);
// Found match
if (mb_tolower(*fuzpat) == mb_tolower(*str)) {
if (mb_tolower(c1) == mb_tolower(c2)) {
// Supplied matches buffer was too short
if (nextMatch >= maxMatches) {
return false;
}
// "Copy-on-Write" srcMatches into matches
if (first_match && srcMatches) {
memcpy(matches, srcMatches, nextMatch);
if (first_match && srcMatches != NULL) {
memcpy(matches, srcMatches, nextMatch * sizeof(srcMatches[0]));
first_match = false;
}
// Recursive call that "skips" this match
char_u recursiveMatches[256];
matchidx_T recursiveMatches[MAXMATCHES];
int recursiveScore = 0;
if (fuzzy_match_recursive(fuzpat, str + 1, &recursiveScore, strBegin, matches,
recursiveMatches, sizeof(recursiveMatches), nextMatch,
recursionCount, recursionLimit)) {
const char_u *const next_char = str + utfc_ptr2len(str);
if (fuzzy_match_recursive(fuzpat, next_char, strIdx + 1, &recursiveScore, strBegin, strLen,
matches, recursiveMatches,
sizeof(recursiveMatches) / sizeof(recursiveMatches[0]), nextMatch,
recursionCount)) {
// Pick best recursive score
if (!recursiveMatch || recursiveScore > bestRecursiveScore) {
memcpy(bestRecursiveMatches, recursiveMatches, 256);
memcpy(bestRecursiveMatches, recursiveMatches, MAXMATCHES * sizeof(recursiveMatches[0]));
bestRecursiveScore = recursiveScore;
}
recursiveMatch = true;
}
// Advance
matches[nextMatch++] = (char_u)(str - strBegin);
fuzpat++;
matches[nextMatch++] = strIdx;
mb_ptr2char_adv(&fuzpat);
}
str++;
mb_ptr2char_adv(&str);
strIdx++;
}
// Determine if full fuzpat was matched
const bool matched = *fuzpat == '\0';
const bool matched = *fuzpat == NUL;
// Calculate score
if (matched) {
// bonus for adjacent matches
const int sequential_bonus = 15;
// bonus if match occurs after a separator
const int separator_bonus = 30;
// bonus if match is uppercase and prev is lower
const int camel_bonus = 30;
// bonus if the first letter is matched
const int first_letter_bonus = 15;
// penalty applied for every letter in str before the first match
const int leading_letter_penalty = -5;
// maximum penalty for leading letters
const int max_leading_letter_penalty = -15;
// penalty for every letter that doesn't matter
const int unmatched_letter_penalty = -1;
// Iterate str to end
while (*str != '\0') {
str++;
}
// Initialize score
*outScore = 100;
// Apply leading letter penalty
int penalty = leading_letter_penalty * matches[0];
if (penalty < max_leading_letter_penalty) {
penalty = max_leading_letter_penalty;
}
*outScore += penalty;
// Apply unmatched penalty
const int unmatched = (int)(str - strBegin) - nextMatch;
*outScore += unmatched_letter_penalty * unmatched;
// Apply ordering bonuses
for (int i = 0; i < nextMatch; i++) {
const char_u currIdx = matches[i];
if (i > 0) {
const char_u prevIdx = matches[i - 1];
// Sequential
if (currIdx == (prevIdx + 1)) {
*outScore += sequential_bonus;
}
}
// Check for bonuses based on neighbor character value
if (currIdx > 0) {
// Camel case
const char_u neighbor = strBegin[currIdx - 1];
const char_u curr = strBegin[currIdx];
if (islower(neighbor) && isupper(curr)) {
*outScore += camel_bonus;
}
// Separator
const bool neighborSeparator = neighbor == '_' || neighbor == ' ';
if (neighborSeparator) {
*outScore += separator_bonus;
}
} else {
// First letter
*outScore += first_letter_bonus;
}
}
*outScore = fuzzy_match_compute_score(strBegin, strLen, matches, nextMatch);
}
// Return best result
if (recursiveMatch && (!matched || bestRecursiveScore > *outScore)) {
// Recursive score is better than "this"
memcpy(matches, bestRecursiveMatches, maxMatches);
memcpy(matches, bestRecursiveMatches, maxMatches * sizeof(matches[0]));
*outScore = bestRecursiveScore;
return true;
} else if (matched) {
@ -4969,21 +4996,22 @@ static bool fuzzy_match_recursive(const char_u *fuzpat, const char_u *str, int *
/// normalized and varies with pattern.
/// Recursion is limited internally (default=10) to prevent degenerate cases
/// (fuzpat="aaaaaa" str="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa").
/// Uses char_u for match indices. Therefore patterns are limited to 256
/// Uses char_u for match indices. Therefore patterns are limited to MAXMATCHES
/// characters.
///
/// Returns true if fuzpat is found AND calculates a score.
static bool fuzzy_match(const char_u *const str, const char_u *const fuzpat, int *const outScore)
/// @return true if 'fuzpat' matches 'str'. Also returns the match score in
/// 'outScore' and the matching character positions in 'matches'.
static bool fuzzy_match(char_u *const str, const char_u *const fuzpat, int *const outScore,
matchidx_T *const matches, const int maxMatches)
FUNC_ATTR_NONNULL_ALL FUNC_ATTR_WARN_UNUSED_RESULT
{
char_u matches[256];
int recursionCount = 0;
int recursionLimit = 10;
const int len = mb_charlen(str);
*outScore = 0;
return fuzzy_match_recursive(fuzpat, str, outScore, str, NULL, matches, sizeof(matches), 0,
&recursionCount, recursionLimit);
return fuzzy_match_recursive(fuzpat, str, 0, outScore, str, len, NULL, matches, maxMatches, 0,
&recursionCount);
}
/// Sort the fuzzy matches in the descending order of the match score.
@ -4996,70 +5024,188 @@ static int fuzzy_item_compare(const void *const s1, const void *const s2)
return v1 == v2 ? 0 : v1 > v2 ? -1 : 1;
}
/// Fuzzy search the string 'str' in 'strlist' and return the matching strings
/// in 'fmatchlist'.
static void match_fuzzy(const list_T *const strlist, const char_u *const str,
list_T *const fmatchlist)
FUNC_ATTR_NONNULL_ARG(2, 3)
/// Fuzzy search the string 'str' in a list of 'items' and return the matching
/// strings in 'fmatchlist'.
/// If 'items' is a list of strings, then search for 'str' in the list.
/// If 'items' is a list of dicts, then either use 'key' to lookup the string
/// for each item or use 'item_cb' Funcref function to get the string.
/// If 'retmatchpos' is true, then return a list of positions where 'str'
/// matches for each item.
static void match_fuzzy(list_T *const items, char_u *const str, const char_u *const key,
Callback *const item_cb, const bool retmatchpos, list_T *const fmatchlist)
FUNC_ATTR_NONNULL_ARG(2, 4, 6)
{
const long len = tv_list_len(strlist);
const long len = tv_list_len(items);
if (len == 0) {
return;
}
fuzzyItem_T *const ptrs = xmalloc(sizeof(fuzzyItem_T) * len);
fuzzyItem_T *const ptrs = xcalloc(len, sizeof(fuzzyItem_T));
long i = 0;
bool found_match = false;
matchidx_T matches[MAXMATCHES];
// For all the string items in strlist, get the fuzzy matching score
TV_LIST_ITER_CONST(strlist, li, {
// For all the string items in items, get the fuzzy matching score
TV_LIST_ITER(items, li, {
ptrs[i].item = li;
ptrs[i].score = -9999;
// ignore non-string items in the list
ptrs[i].score = SCORE_NONE;
char_u *itemstr = NULL;
typval_T rettv;
rettv.v_type = VAR_UNKNOWN;
const typval_T *const tv = TV_LIST_ITEM_TV(li);
if (tv->v_type == VAR_STRING && tv->vval.v_string != NULL) {
int score;
if (fuzzy_match(tv->vval.v_string, str, &score)) {
ptrs[i].score = score;
found_match = true;
if (tv->v_type == VAR_STRING) { // list of strings
itemstr = tv->vval.v_string;
} else if (tv->v_type == VAR_DICT && (key != NULL || item_cb->type != kCallbackNone)) {
// For a dict, either use the specified key to lookup the string or
// use the specified callback function to get the string.
if (key != NULL) {
itemstr = (char_u *)tv_dict_get_string(tv->vval.v_dict, (const char *)key, false);
} else {
typval_T argv[2];
// Invoke the supplied callback (if any) to get the dict item
tv->vval.v_dict->dv_refcount++;
argv[0].v_type = VAR_DICT;
argv[0].vval.v_dict = tv->vval.v_dict;
argv[1].v_type = VAR_UNKNOWN;
if (callback_call(item_cb, 1, argv, &rettv)) {
if (rettv.v_type == VAR_STRING) {
itemstr = rettv.vval.v_string;
}
}
tv_dict_unref(tv->vval.v_dict);
}
}
int score;
if (itemstr != NULL
&& fuzzy_match(itemstr, str, &score, matches, sizeof(matches) / sizeof(matches[0]))) {
// Copy the list of matching positions in itemstr to a list, if
// 'retmatchpos' is set.
if (retmatchpos) {
const int strsz = mb_charlen(str);
ptrs[i].lmatchpos = tv_list_alloc(strsz);
for (int j = 0; j < strsz; j++) {
tv_list_append_number(ptrs[i].lmatchpos, matches[j]);
}
}
ptrs[i].score = score;
found_match = true;
}
i++;
tv_clear(&rettv);
});
if (found_match) {
// Sort the list by the descending order of the match score
qsort(ptrs, (size_t)len, sizeof(fuzzyItem_T), fuzzy_item_compare);
qsort(ptrs, len, sizeof(fuzzyItem_T), fuzzy_item_compare);
// Copy the matching strings with 'score != -9999' to the return list
// For matchfuzzy(), return a list of matched strings.
// ['str1', 'str2', 'str3']
// For matchfuzzypos(), return a list with two items.
// The first item is a list of matched strings. The second item
// is a list of lists where each list item is a list of matched
// character positions.
// [['str1', 'str2', 'str3'], [[1, 3], [1, 3], [1, 3]]]
list_T *l;
if (retmatchpos) {
const listitem_T *const li = tv_list_find(fmatchlist, 0);
assert(li != NULL && TV_LIST_ITEM_TV(li)->vval.v_list != NULL);
l = TV_LIST_ITEM_TV(li)->vval.v_list;
} else {
l = fmatchlist;
}
// Copy the matching strings with a valid score to the return list
for (i = 0; i < len; i++) {
if (ptrs[i].score == -9999) {
if (ptrs[i].score == SCORE_NONE) {
break;
}
const typval_T *const tv = TV_LIST_ITEM_TV(ptrs[i].item);
tv_list_append_string(fmatchlist, (const char *)tv->vval.v_string, -1);
tv_list_append_tv(l, TV_LIST_ITEM_TV(ptrs[i].item));
}
// next copy the list of matching positions
if (retmatchpos) {
const listitem_T *const li = tv_list_find(fmatchlist, -1);
assert(li != NULL && TV_LIST_ITEM_TV(li)->vval.v_list != NULL);
l = TV_LIST_ITEM_TV(li)->vval.v_list;
for (i = 0; i < len; i++) {
if (ptrs[i].score == SCORE_NONE) {
break;
}
tv_list_append_list(l, ptrs[i].lmatchpos);
}
}
}
xfree(ptrs);
}
/// "matchfuzzy()" function
void f_matchfuzzy(typval_T *argvars, typval_T *rettv, FunPtr fptr)
/// Do fuzzy matching. Returns the list of matched strings in 'rettv'.
/// If 'retmatchpos' is true, also returns the matching character positions.
static void do_fuzzymatch(const typval_T *const argvars, typval_T *const rettv,
const bool retmatchpos)
FUNC_ATTR_NONNULL_ALL
{
if (argvars[0].v_type != VAR_LIST) {
emsg(_(e_listreq));
return;
}
if (argvars[0].vval.v_list == NULL) {
// validate and get the arguments
if (argvars[0].v_type != VAR_LIST || argvars[0].vval.v_list == NULL) {
semsg(_(e_listarg), retmatchpos ? "matchfuzzypos()" : "matchfuzzy()");
return;
}
if (argvars[1].v_type != VAR_STRING || argvars[1].vval.v_string == NULL) {
semsg(_(e_invarg2), tv_get_string(&argvars[1]));
return;
}
match_fuzzy(argvars[0].vval.v_list, (const char_u *)tv_get_string(&argvars[1]),
tv_list_alloc_ret(rettv, kListLenUnknown));
Callback cb = CALLBACK_NONE;
const char_u *key = NULL;
if (argvars[2].v_type != VAR_UNKNOWN) {
if (argvars[2].v_type != VAR_DICT || argvars[2].vval.v_dict == NULL) {
emsg(_(e_dictreq));
return;
}
// To search a dict, either a callback function or a key can be
// specified.
dict_T *const d = argvars[2].vval.v_dict;
const dictitem_T *const di = tv_dict_find(d, "key", -1);
if (di != NULL) {
if (di->di_tv.v_type != VAR_STRING || di->di_tv.vval.v_string == NULL
|| *di->di_tv.vval.v_string == NUL) {
semsg(_(e_invarg2), tv_get_string(&di->di_tv));
return;
}
key = (const char_u *)tv_get_string(&di->di_tv);
} else if (!tv_dict_get_callback(d, "text_cb", -1, &cb)) {
semsg(_(e_invargval), "text_cb");
return;
}
}
// get the fuzzy matches
tv_list_alloc_ret(rettv, retmatchpos ? 2 : kListLenUnknown);
if (retmatchpos) {
// For matchfuzzypos(), a list with two items are returned. First item
// is a list of matching strings and the second item is a list of
// lists with matching positions within each string.
tv_list_append_list(rettv->vval.v_list, tv_list_alloc(kListLenUnknown));
tv_list_append_list(rettv->vval.v_list, tv_list_alloc(kListLenUnknown));
}
match_fuzzy(argvars[0].vval.v_list, (char_u *)tv_get_string(&argvars[1]), key, &cb, retmatchpos,
rettv->vval.v_list);
callback_free(&cb);
}
/// "matchfuzzy()" function
void f_matchfuzzy(typval_T *argvars, typval_T *rettv, FunPtr fptr)
{
do_fuzzymatch(argvars, rettv, false);
}
/// "matchfuzzypos()" function
void f_matchfuzzypos(typval_T *argvars, typval_T *rettv, FunPtr fptr)
{
do_fuzzymatch(argvars, rettv, true);
}
/// Find identifiers or defines in included files.

View File

@ -1731,32 +1731,6 @@ func Test_nr2char()
call assert_equal("\x80\xfc\b\xfd\x80\xfeX\x80\xfeX\x80\xfeX\x80\xfeX\x80\xfeX", eval('"\<M-' .. nr2char(0x40000000) .. '>"'))
endfunc
" Test for matchfuzzy()
func Test_matchfuzzy()
call assert_fails('call matchfuzzy(10, "abc")', 'E714:')
" Needs v8.2.1183.
" call assert_fails('call matchfuzzy(["abc"], [])', 'E730:')
call assert_fails('call matchfuzzy(["abc"], [])', 'E475:')
call assert_equal([], matchfuzzy([], 'abc'))
call assert_equal([], matchfuzzy(['abc'], ''))
call assert_equal(['abc'], matchfuzzy(['abc', 10], 'ac'))
call assert_equal([], matchfuzzy([10, 20], 'ac'))
call assert_equal(['abc'], matchfuzzy(['abc'], 'abc'))
call assert_equal(['crayon', 'camera'], matchfuzzy(['camera', 'crayon'], 'cra'))
call assert_equal(['aabbaa', 'aaabbbaaa', 'aaaabbbbaaaa', 'aba'], matchfuzzy(['aba', 'aabbaa', 'aaabbbaaa', 'aaaabbbbaaaa'], 'aa'))
call assert_equal(['one'], matchfuzzy(['one', 'two'], 'one'))
call assert_equal(['oneTwo', 'onetwo'], matchfuzzy(['onetwo', 'oneTwo'], 'oneTwo'))
call assert_equal(['one_two', 'onetwo'], matchfuzzy(['onetwo', 'one_two'], 'oneTwo'))
call assert_equal(['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'], matchfuzzy(['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'], 'aa'))
call assert_equal([], matchfuzzy([repeat('a', 300)], repeat('a', 257)))
%bw!
eval ['somebuf', 'anotherone', 'needle', 'yetanotherone']->map({_, v -> bufadd(v) + bufload(v)})
let l = getbufinfo()->map({_, v -> v.name})->matchfuzzy('ndl')
call assert_equal(1, len(l))
call assert_match('needle', l[0])
endfunc
" Test for getcurpos() and setpos()
func Test_getcurpos_setpos()
new

View File

@ -0,0 +1,203 @@
" Tests for fuzzy matching
source shared.vim
source check.vim
" Test for matchfuzzy()
func Test_matchfuzzy()
call assert_fails('call matchfuzzy(10, "abc")', 'E686:')
" Needs v8.2.1183; match the final error that's thrown for now
" call assert_fails('call matchfuzzy(["abc"], [])', 'E730:')
call assert_fails('call matchfuzzy(["abc"], [])', 'E475:')
call assert_fails("let x = matchfuzzy(v:_null_list, 'foo')", 'E686:')
call assert_fails('call matchfuzzy(["abc"], v:_null_string)', 'E475:')
call assert_equal([], matchfuzzy([], 'abc'))
call assert_equal([], matchfuzzy(['abc'], ''))
call assert_equal(['abc'], matchfuzzy(['abc', 10], 'ac'))
call assert_equal([], matchfuzzy([10, 20], 'ac'))
call assert_equal(['abc'], matchfuzzy(['abc'], 'abc'))
call assert_equal(['crayon', 'camera'], matchfuzzy(['camera', 'crayon'], 'cra'))
call assert_equal(['aabbaa', 'aaabbbaaa', 'aaaabbbbaaaa', 'aba'], matchfuzzy(['aba', 'aabbaa', 'aaabbbaaa', 'aaaabbbbaaaa'], 'aa'))
call assert_equal(['one'], matchfuzzy(['one', 'two'], 'one'))
call assert_equal(['oneTwo', 'onetwo'], matchfuzzy(['onetwo', 'oneTwo'], 'oneTwo'))
call assert_equal(['one_two', 'onetwo'], matchfuzzy(['onetwo', 'one_two'], 'oneTwo'))
call assert_equal(['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'], matchfuzzy(['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'], 'aa'))
call assert_equal(256, matchfuzzy([repeat('a', 256)], repeat('a', 256))[0]->len())
call assert_equal([], matchfuzzy([repeat('a', 300)], repeat('a', 257)))
" Tests for match preferences
" preference for camel case match
call assert_equal(['oneTwo', 'onetwo'], ['onetwo', 'oneTwo']->matchfuzzy('onetwo'))
" preference for match after a separator (_ or space)
call assert_equal(['one_two', 'one two', 'onetwo'], ['onetwo', 'one_two', 'one two']->matchfuzzy('onetwo'))
" preference for leading letter match
call assert_equal(['onetwo', 'xonetwo'], ['xonetwo', 'onetwo']->matchfuzzy('onetwo'))
" preference for sequential match
call assert_equal(['onetwo', 'oanbectdweo'], ['oanbectdweo', 'onetwo']->matchfuzzy('onetwo'))
" non-matching leading letter(s) penalty
call assert_equal(['xonetwo', 'xxonetwo'], ['xxonetwo', 'xonetwo']->matchfuzzy('onetwo'))
" total non-matching letter(s) penalty
call assert_equal(['one', 'onex', 'onexx'], ['onexx', 'one', 'onex']->matchfuzzy('one'))
%bw!
eval ['somebuf', 'anotherone', 'needle', 'yetanotherone']->map({_, v -> bufadd(v) + bufload(v)})
let l = getbufinfo()->map({_, v -> v.name})->matchfuzzy('ndl')
call assert_equal(1, len(l))
call assert_match('needle', l[0])
let l = [{'id' : 5, 'val' : 'crayon'}, {'id' : 6, 'val' : 'camera'}]
call assert_equal([{'id' : 6, 'val' : 'camera'}], matchfuzzy(l, 'cam', {'text_cb' : {v -> v.val}}))
call assert_equal([{'id' : 6, 'val' : 'camera'}], matchfuzzy(l, 'cam', {'key' : 'val'}))
call assert_equal([], matchfuzzy(l, 'day', {'text_cb' : {v -> v.val}}))
call assert_equal([], matchfuzzy(l, 'day', {'key' : 'val'}))
call assert_fails("let x = matchfuzzy(l, 'cam', 'random')", 'E715:')
call assert_equal([], matchfuzzy(l, 'day', {'text_cb' : {v -> []}}))
call assert_equal([], matchfuzzy(l, 'day', {'text_cb' : {v -> 1}}))
call assert_fails("let x = matchfuzzy(l, 'day', {'text_cb' : {a, b -> 1}})", 'E119:')
call assert_equal([], matchfuzzy(l, 'cam'))
" Nvim's callback implementation is different, so E6000 is expected instead,
" but we need v8.2.1183 to assert it
" call assert_fails("let x = matchfuzzy(l, 'cam', {'text_cb' : []})", 'E921:')
" call assert_fails("let x = matchfuzzy(l, 'cam', {'text_cb' : []})", 'E6000:')
call assert_fails("let x = matchfuzzy(l, 'cam', {'text_cb' : []})", 'E475:')
" call assert_fails("let x = matchfuzzy(l, 'foo', {'key' : []})", 'E730:')
call assert_fails("let x = matchfuzzy(l, 'foo', {'key' : []})", 'E475:')
call assert_fails("let x = matchfuzzy(l, 'cam', v:_null_dict)", 'E715:')
call assert_fails("let x = matchfuzzy(l, 'foo', {'key' : v:_null_string})", 'E475:')
" Nvim doesn't have null functions
" call assert_fails("let x = matchfuzzy(l, 'foo', {'text_cb' : test_null_function()})", 'E475:')
let l = [{'id' : 5, 'name' : 'foo'}, {'id' : 6, 'name' : []}, {'id' : 7}]
call assert_fails("let x = matchfuzzy(l, 'foo', {'key' : 'name'})", 'E730:')
" Test in latin1 encoding
let save_enc = &encoding
" Nvim supports utf-8 encoding only
" set encoding=latin1
call assert_equal(['abc'], matchfuzzy(['abc'], 'abc'))
let &encoding = save_enc
endfunc
" Test for the fuzzymatchpos() function
func Test_matchfuzzypos()
call assert_equal([['curl', 'world'], [[2,3], [2,3]]], matchfuzzypos(['world', 'curl'], 'rl'))
call assert_equal([['curl', 'world'], [[2,3], [2,3]]], matchfuzzypos(['world', 'one', 'curl'], 'rl'))
call assert_equal([['hello', 'hello world hello world'],
\ [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]],
\ matchfuzzypos(['hello world hello world', 'hello', 'world'], 'hello'))
call assert_equal([['aaaaaaa'], [[0, 1, 2]]], matchfuzzypos(['aaaaaaa'], 'aaa'))
call assert_equal([[], []], matchfuzzypos(['world', 'curl'], 'ab'))
let x = matchfuzzypos([repeat('a', 256)], repeat('a', 256))
call assert_equal(range(256), x[1][0])
call assert_equal([[], []], matchfuzzypos([repeat('a', 300)], repeat('a', 257)))
call assert_equal([[], []], matchfuzzypos([], 'abc'))
" match in a long string
call assert_equal([[repeat('x', 300) .. 'abc'], [[300, 301, 302]]],
\ matchfuzzypos([repeat('x', 300) .. 'abc'], 'abc'))
" preference for camel case match
call assert_equal([['xabcxxaBc'], [[6, 7, 8]]], matchfuzzypos(['xabcxxaBc'], 'abc'))
" preference for match after a separator (_ or space)
call assert_equal([['xabx_ab'], [[5, 6]]], matchfuzzypos(['xabx_ab'], 'ab'))
" preference for leading letter match
call assert_equal([['abcxabc'], [[0, 1]]], matchfuzzypos(['abcxabc'], 'ab'))
" preference for sequential match
call assert_equal([['aobncedone'], [[7, 8, 9]]], matchfuzzypos(['aobncedone'], 'one'))
" best recursive match
call assert_equal([['xoone'], [[2, 3, 4]]], matchfuzzypos(['xoone'], 'one'))
let l = [{'id' : 5, 'val' : 'crayon'}, {'id' : 6, 'val' : 'camera'}]
call assert_equal([[{'id' : 6, 'val' : 'camera'}], [[0, 1, 2]]],
\ matchfuzzypos(l, 'cam', {'text_cb' : {v -> v.val}}))
call assert_equal([[{'id' : 6, 'val' : 'camera'}], [[0, 1, 2]]],
\ matchfuzzypos(l, 'cam', {'key' : 'val'}))
call assert_equal([[], []], matchfuzzypos(l, 'day', {'text_cb' : {v -> v.val}}))
call assert_equal([[], []], matchfuzzypos(l, 'day', {'key' : 'val'}))
call assert_fails("let x = matchfuzzypos(l, 'cam', 'random')", 'E715:')
call assert_equal([[], []], matchfuzzypos(l, 'day', {'text_cb' : {v -> []}}))
call assert_equal([[], []], matchfuzzypos(l, 'day', {'text_cb' : {v -> 1}}))
call assert_fails("let x = matchfuzzypos(l, 'day', {'text_cb' : {a, b -> 1}})", 'E119:')
call assert_equal([[], []], matchfuzzypos(l, 'cam'))
" Nvim's callback implementation is different, so E6000 is expected instead,
" but we need v8.2.1183 to assert it
" call assert_fails("let x = matchfuzzypos(l, 'cam', {'text_cb' : []})", 'E921:')
" call assert_fails("let x = matchfuzzypos(l, 'cam', {'text_cb' : []})", 'E6000:')
call assert_fails("let x = matchfuzzypos(l, 'cam', {'text_cb' : []})", 'E475:')
" call assert_fails("let x = matchfuzzypos(l, 'foo', {'key' : []})", 'E730:')
call assert_fails("let x = matchfuzzypos(l, 'foo', {'key' : []})", 'E475:')
call assert_fails("let x = matchfuzzypos(l, 'cam', v:_null_dict)", 'E715:')
call assert_fails("let x = matchfuzzypos(l, 'foo', {'key' : v:_null_string})", 'E475:')
" Nvim doesn't have null functions
" call assert_fails("let x = matchfuzzypos(l, 'foo', {'text_cb' : test_null_function()})", 'E475:')
let l = [{'id' : 5, 'name' : 'foo'}, {'id' : 6, 'name' : []}, {'id' : 7}]
call assert_fails("let x = matchfuzzypos(l, 'foo', {'key' : 'name'})", 'E730:')
endfunc
func Test_matchfuzzy_mbyte()
CheckFeature multi_lang
call assert_equal(['ンヹㄇヺヴ'], matchfuzzy(['ンヹㄇヺヴ'], 'ヹヺ'))
" reverse the order of characters
call assert_equal([], matchfuzzy(['ンヹㄇヺヴ'], 'ヺヹ'))
call assert_equal(['αβΩxxx', 'xαxβxΩx'],
\ matchfuzzy(['αβΩxxx', 'xαxβxΩx'], 'αβΩ'))
call assert_equal(['ππbbππ', 'πππbbbπππ', 'ππππbbbbππππ', 'πbπ'],
\ matchfuzzy(['πbπ', 'ππbbππ', 'πππbbbπππ', 'ππππbbbbππππ'], 'ππ'))
" preference for camel case match
call assert_equal(['oneĄwo', 'oneąwo'],
\ ['oneąwo', 'oneĄwo']->matchfuzzy('oneąwo'))
" preference for match after a separator (_ or space)
call assert_equal(['Ⅱa_bㄟㄠ', 'Ⅱa bㄟㄠ', 'Ⅱabㄟㄠ'],
\ ['Ⅱabㄟㄠ', 'Ⅱa_bㄟㄠ', 'Ⅱa bㄟㄠ']->matchfuzzy('Ⅱabㄟㄠ'))
" preference for leading letter match
call assert_equal(['ŗŝţũŵż', 'xŗŝţũŵż'],
\ ['xŗŝţũŵż', 'ŗŝţũŵż']->matchfuzzy('ŗŝţũŵż'))
" preference for sequential match
call assert_equal(['ㄞㄡㄤfffifl', 'ㄞaㄡbㄤcffdfiefl'],
\ ['ㄞaㄡbㄤcffdfiefl', 'ㄞㄡㄤfffifl']->matchfuzzy('ㄞㄡㄤfffifl'))
" non-matching leading letter(s) penalty
call assert_equal(['xㄞㄡㄤfffifl', 'xxㄞㄡㄤfffifl'],
\ ['xxㄞㄡㄤfffifl', 'xㄞㄡㄤfffifl']->matchfuzzy('ㄞㄡㄤfffifl'))
" total non-matching letter(s) penalty
call assert_equal(['ŗŝţ', 'ŗŝţx', 'ŗŝţxx'],
\ ['ŗŝţxx', 'ŗŝţ', 'ŗŝţx']->matchfuzzy('ŗŝţ'))
endfunc
func Test_matchfuzzypos_mbyte()
CheckFeature multi_lang
call assert_equal([['こんにちは世界'], [[0, 1, 2, 3, 4]]],
\ matchfuzzypos(['こんにちは世界'], 'こんにちは'))
call assert_equal([['ンヹㄇヺヴ'], [[1, 3]]], matchfuzzypos(['ンヹㄇヺヴ'], 'ヹヺ'))
" reverse the order of characters
call assert_equal([[], []], matchfuzzypos(['ンヹㄇヺヴ'], 'ヺヹ'))
call assert_equal([['αβΩxxx', 'xαxβxΩx'], [[0, 1, 2], [1, 3, 5]]],
\ matchfuzzypos(['αβΩxxx', 'xαxβxΩx'], 'αβΩ'))
call assert_equal([['ππbbππ', 'πππbbbπππ', 'ππππbbbbππππ', 'πbπ'],
\ [[0, 1], [0, 1], [0, 1], [0, 2]]],
\ matchfuzzypos(['πbπ', 'ππbbππ', 'πππbbbπππ', 'ππππbbbbππππ'], 'ππ'))
call assert_equal([['ααααααα'], [[0, 1, 2]]],
\ matchfuzzypos(['ααααααα'], 'ααα'))
call assert_equal([[], []], matchfuzzypos(['ンヹㄇ', 'ŗŝţ'], 'fffifl'))
let x = matchfuzzypos([repeat('Ψ', 256)], repeat('Ψ', 256))
call assert_equal(range(256), x[1][0])
call assert_equal([[], []], matchfuzzypos([repeat('✓', 300)], repeat('✓', 257)))
" match in a long string
call assert_equal([[repeat('♪', 300) .. '✗✗✗'], [[300, 301, 302]]],
\ matchfuzzypos([repeat('♪', 300) .. '✗✗✗'], '✗✗✗'))
" preference for camel case match
call assert_equal([['xѳѵҁxxѳѴҁ'], [[6, 7, 8]]], matchfuzzypos(['xѳѵҁxxѳѴҁ'], 'ѳѵҁ'))
" preference for match after a separator (_ or space)
call assert_equal([['xちだx_ちだ'], [[5, 6]]], matchfuzzypos(['xちだx_ちだ'], 'ちだ'))
" preference for leading letter match
call assert_equal([[ѵҁxѳѵҁ'], [[0, 1]]], matchfuzzypos([ѵҁxѳѵҁ'], 'ѳѵ'))
" preference for sequential match
call assert_equal([['aンbヹcㄇdンヹㄇ'], [[7, 8, 9]]], matchfuzzypos(['aンbヹcㄇdンヹㄇ'], 'ンヹㄇ'))
" best recursive match
call assert_equal([['xффйд'], [[2, 3, 4]]], matchfuzzypos(['xффйд'], 'фйд'))
endfunc
" vim: shiftwidth=2 sts=2 expandtab