If the `python-levenshtein` 3rd-party package is installed, it will improve the calculation time. refs #1426.

Takayuki Shimizukawa 2014-10-05 22:25:50 +09:00
parent ee98decec1
commit fa30e7b52a
3 changed files with 16 additions and 1 deletions


@@ -22,6 +22,8 @@ Incompatible changes
method to make the `any` role work properly.
* gettext builder: gettext doesn't emit uuid information to generated pot files
by default. Please set `gettext_uuid` to ``True`` to emit uuid information.
Additionally, if the ``python-levenshtein`` 3rd-party package is installed,
it will improve the calculation time.
* gettext builder: disable extracting/apply 'index' node by default. Please set
'index' to :confval:`gettext_enables` to enable extracting index entries.


@@ -430,6 +430,10 @@ documentation on :ref:`intl` for details.
* Calculate similarity between new msgids and previously saved old msgids.
This calculation takes a long time.
If you need the calculation to run faster, you can install the
``python-levenshtein`` 3rd-party package, written in C, with
:command:`pip install python-levenshtein`.
The default is ``False``.
.. versionadded:: 1.3
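
For illustration only, enabling the option described in this hunk might look like the following in a project's `conf.py`. This is a minimal sketch: only the `gettext_uuid` name and its ``False`` default come from the text above, and the rest of the file is assumed.

```python
# conf.py (excerpt) -- hypothetical project configuration

# Emit uuid information into the generated pot files.
# The documented default is False.
gettext_uuid = True
```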


@@ -16,6 +16,11 @@ from itertools import product
from six import iteritems
from six.moves import range, zip_longest
try:
    import Levenshtein
    IS_SPEEDUP = True
except ImportError:
    IS_SPEEDUP = False
# anything below that ratio is considered equal/changed
VERSIONING_RATIO = 65
@@ -109,6 +114,10 @@ def get_ratio(old, new):
    """
    if not all([old, new]):
        return VERSIONING_RATIO
    if IS_SPEEDUP:
        return Levenshtein.distance(old, new) / (len(old) / 100.0)
    else:
        return levenshtein_distance(old, new) / (len(old) / 100.0)
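
The optional-dependency pattern in this change can be sketched as a standalone script. This is an illustrative sketch rather than the module's actual code: `pure_levenshtein` is a hypothetical stand-in for the project's own `levenshtein_distance`, which is not shown in this diff, while `IS_SPEEDUP`, `VERSIONING_RATIO` and `get_ratio` mirror the names above.

```python
try:
    # Optional C extension provided by the python-levenshtein package.
    import Levenshtein
    IS_SPEEDUP = True
except ImportError:
    IS_SPEEDUP = False

VERSIONING_RATIO = 65


def pure_levenshtein(a, b):
    """Hypothetical pure-Python fallback (not the project's implementation)."""
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                     # deletion
                current[j - 1] + 1,                  # insertion
                previous[j - 1] + (char_a != char_b) # substitution
            ))
        previous = current
    return previous[-1]


def get_ratio(old, new):
    """Return the edit distance as a percentage of the old string's length."""
    if not all([old, new]):
        return VERSIONING_RATIO
    # Use the C implementation when it was importable, else the fallback.
    distance = Levenshtein.distance if IS_SPEEDUP else pure_levenshtein
    return distance(old, new) / (len(old) / 100.0)
```

Because both code paths compute the same edit distance, callers should see the same ratios whether or not the extension is installed; only the running time differs.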