If the `python-levenshtein` 3rd-party package is installed, it will improve the calculation time. refs #1426.

2025-02-25 18:55:22 -06:00 · 2014-10-05 22:25:50 +09:00 · 2014-10-05 22:25:50 +09:00 · fa30e7b52a
commit fa30e7b52a
parent ee98decec1
3 changed files with 16 additions and 1 deletions
--- a/2
+++ b/2
@@ -22,6 +22,8 @@ Incompatible changes
  method to make the `any` role work properly.
 * gettext builder: gettext doesn't emit uuid information to generated pot files
  by default. Please set ``True`` to `gettext_uuid` to emit uuid information.
+  Additionally, if the ``python-levenshtein`` 3rd-party package is installed,
+  it will improve the calculation time.
 * gettext builder: disable extracting/apply 'index' node by default. Please set
  'index' to :confval:`gettext_enables` to enable extracting index entries.

--- a/doc/config.rst
+++ b/doc/config.rst
@@ -430,6 +430,10 @@ documentation on :ref:`intl` for details.
   * Calculate similarity between new msgids and previously saved old msgids.
     This calculation take many time.

+   To speed up the calculation, you can install the ``python-levenshtein``
+   3rd-party package, which is written in C, with
+   :command:`pip install python-levenshtein`.
+
   The default is ``False``.

   .. versionadded:: 1.3
--- a/sphinx/versioning.py
+++ b/sphinx/versioning.py
@@ -16,6 +16,11 @@ from itertools import product
 from six import iteritems
 from six.moves import range, zip_longest

+try:
+    import Levenshtein
+    IS_SPEEDUP = True
+except ImportError:
+    IS_SPEEDUP = False

 # anything below that ratio is considered equal/changed
 VERSIONING_RATIO = 65
@@ -109,6 +114,10 @@ def get_ratio(old, new):
    """
    if not all([old, new]):
        return VERSIONING_RATIO
+
+    if IS_SPEEDUP:
+        return Levenshtein.distance(old, new) / (len(old) / 100.0)
+    else:
        return levenshtein_distance(old, new) / (len(old) / 100.0)