mirror of
https://github.com/neovim/neovim.git
synced 2025-02-25 18:55:25 -06:00
perf(filetype): implement parent pattern pre-matching (#29660)
Problem: calling `vim.filetype.match()` has a performance bottleneck: it has to match many Lua patterns against several versions of the input file name. This can be a problem if users need to call it synchronously many times.

Solution: add "parent pattern pre-matching", which can quickly reject several potential pattern matches at the (usually rare) cost of one extra Lua pattern match.

A "parent pattern" is a manually added and tracked grouping of filetype patterns which should have two properties:
- Match at least the same set of strings as its filetype patterns, but not too much more.
- Be fast to match.

To be effective, a group should consist of at least three filetype patterns.

Example: for a filetype pattern ".*/etc/a2ps/.*%.cfg", both "/etc/" and "%.cfg" are good parent patterns (prefer the one which can group more filetype patterns).

After this commit, `vim.filetype.match()` on most inputs runs ~3.4 times faster (some inputs may see less impact if they match many parent patterns).
This commit is contained in:
parent
c69ea53c9d
commit
f61efe3fe7
@@ -302,4 +302,40 @@ used in new documentation:
- `{Only when compiled with ...}`: the vast majority of features have been
  made non-optional (see https://github.com/neovim/neovim/wiki/Introduction)

==============================================================================
FILETYPE DETECTION					*dev-vimpatch-filetype*

Nvim's filetype detection behavior matches Vim, but is implemented as part of
|vim.filetype| (see $VIMRUNTIME/lua/vim/filetype.lua).

Pattern matching has several differences:
- It is done using explicit Lua patterns (without implicit anchoring) instead
  of Vim regexes: >
    "*/debian/changelog" -> "/debian/changelog$"
    "*/bind/db.*"        -> "/bind/db%."
<
- Filetype patterns are grouped by their parent pattern to improve matching
  performance. For this to work properly, a parent pattern should:
  - Match at least the same set of strings as the filetype patterns inside
    it, but not too much more.
  - Be fast to match.
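
The grouping described above can be sketched roughly as follows. This is a
minimal illustration, not the actual implementation in filetype.lua: the
`groups` table layout, the `match_grouped` helper, and the filetype names are
assumptions made for the example.

```lua
-- Illustrative data only: the table layout and filetype names are assumptions.
local groups = {
  {
    parent = "/etc/",
    patterns = {
      { "/etc/a2ps/.*%.cfg$", "a2ps" },
      { "/etc/asound%.conf$", "alsaconf" },
      { "/etc/httpd/.*%.conf$", "apache" },
    },
  },
  {
    parent = "/debian/",
    patterns = {
      { "/debian/changelog$", "debchangelog" },
      { "/debian/control$", "debcontrol" },
      { "/debian/copyright$", "debcopyright" },
    },
  },
}

-- Hypothetical helper: returns the filetype (or nil) and the number of Lua
-- pattern matches that were attempted.
local function match_grouped(path)
  local attempts = 0
  for _, group in ipairs(groups) do
    attempts = attempts + 1
    -- One cheap parent match decides whether the whole group is tried.
    if path:find(group.parent) then
      for _, entry in ipairs(group.patterns) do
        attempts = attempts + 1
        if path:find(entry[1]) then
          return entry[2], attempts
        end
      end
    end
  end
  return nil, attempts
end

local ft, attempts = match_grouped("/srv/project/debian/changelog")
print(ft, attempts)
```

In this sketch a path that matches no parent costs one match per group (2
here) instead of one match per filetype pattern (6 here). A path that does
match pays one extra match for its parent, which is the trade-off named above.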

When adding a new filetype with pattern matching, consider the following:
- If there is already a group with an appropriate parent pattern, use it.
- If a fast and specific enough pattern can group at least 3 filetype
  patterns, add it as a separate grouped entry.

A good new parent pattern should be:
- Fast. A good rule of thumb is that it should be a short explicit string
  (i.e. no quantifiers or character sets).
- Specific. Good rules of thumb (from best to worst):
  - Full directory name (like "/etc/", "/log/").
  - Part of a rare enough directory name (like "/conf", "git/").
  - A string that rarely occurs in real full paths (like "nginx").
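
The "fast" rule of thumb can be approximated mechanically. The helper below
is a rough heuristic written for this note, not part of |vim.filetype|. The
name `looks_fast` is hypothetical. It accepts escaped punctuation (like "%.")
and the "^" / "$" anchors, and rejects quantifiers, character sets, character
classes and the "." wildcard. It does not check length or specificity.

```lua
-- Hypothetical heuristic: true if `parent` is an explicit string, i.e. it
-- contains no quantifiers, character sets, classes or wildcards.
local function looks_fast(parent)
  -- Drop escaped punctuation such as "%." or "%-": these are plain literals.
  local stripped = parent:gsub("%%[^%w]", "")
  -- A remaining "%" introduces a class or special item such as %a, %d, %b.
  if stripped:find("%%") then
    return false
  end
  -- Unescaped quantifiers (* + - ?), sets ([ ]) and the "." wildcard.
  if stripped:find("[%*%+%-%?%[%]%.]") then
    return false
  end
  return true
end

print(looks_fast("/etc/"), looks_fast("%.cfg$"), looks_fast("/a2ps/.*%."))
```

Note that "%." passes this check as well: it is fast but not specific, so the
check only covers the first of the two criteria.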

Example:
- Filetype pattern: ".*/etc/a2ps/.*%.cfg"
  - Good parent: "/etc/"; "%.cfg$"
  - Bad parent: "%." - fast, not specific; "/a2ps/.*%." - slow, specific
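
The difference between these candidates can be checked on a few paths. The
sample paths and the helper names `count_matches` and `covers` are made up for
illustration. The sketch only shows behavior on these samples, not a general
proof.

```lua
local filetype_pattern = ".*/etc/a2ps/.*%.cfg"

-- Hypothetical sample paths.
local samples = {
  "/etc/a2ps/main.cfg",
  "/usr/local/etc/a2ps/site/a.cfg",
  "/home/user/notes.txt",
  "/var/log/nginx/access.log",
  "/opt/app/settings.cfg",
}

-- Number of sample paths in which `pattern` is found (no implicit anchoring).
local function count_matches(pattern)
  local n = 0
  for _, path in ipairs(samples) do
    if path:find(pattern) then
      n = n + 1
    end
  end
  return n
end

-- True if every sample matched by the filetype pattern also matches `parent`.
local function covers(parent)
  for _, path in ipairs(samples) do
    if path:find(filetype_pattern) and not path:find(parent) then
      return false
    end
  end
  return true
end

for _, parent in ipairs({ "/etc/", "%.cfg$", "%.", "/a2ps/.*%." }) do
  print(parent, covers(parent), count_matches(parent))
end
```

On these samples every candidate covers the two a2ps paths, but "%." also
matches all five paths, so it rejects nothing. "/a2ps/.*%." matches only the
two a2ps paths, yet it needs a quantifier to do so.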

 vim:tw=78:ts=8:noet:ft=help:norl:
File diff suppressed because it is too large