sphinx/tests/test_util_inventory.py

117 lines
4.0 KiB
Python
Raw Normal View History

2022-02-19 21:05:56 -06:00
"""Test inventory util functions."""
import os
import posixpath
2018-02-19 07:39:14 -06:00
import zlib
from io import BytesIO
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
from sphinx.testing.util import SphinxTestApp
from sphinx.util.inventory import InventoryFile
inventory_v1 = b'''\
# Sphinx inventory version 1
# Project: foo
# Version: 1.0
module mod foo.html
module.cls class foo.html
'''
inventory_v2 = b'''\
# Sphinx inventory version 2
# Project: foo
# Version: 2.0
# The remainder of this file is compressed with zlib.
''' + zlib.compress(b'''\
module1 py:module 0 foo.html#module-module1 Long Module desc
module2 py:module 0 foo.html#module-$ -
module1.func py:function 1 sub/foo.html#$ -
module1.Foo.bar py:method 1 index.html#foo.Bar.baz -
CFunc c:function 2 cfunc.html#CFunc -
std cpp:type 1 index.html#std -
std::uint8_t cpp:type 1 index.html#std_uint8_t -
foo::Bar cpp:class 1 index.html#cpp_foo_bar -
foo::Bar::baz cpp:function 1 index.html#cpp_foo_bar_baz -
foons cpp:type 1 index.html#foons -
foons::bartype cpp:type 1 index.html#foons_bartype -
a term std:term -1 glossary.html#term-a-term -
ls.-l std:cmdoption 1 index.html#cmdoption-ls-l -
docname std:doc -1 docname.html -
foo js:module 1 index.html#foo -
foo.bar js:class 1 index.html#foo.bar -
foo.bar.baz js:method 1 index.html#foo.bar.baz -
foo.bar.qux js:data 1 index.html#foo.bar.qux -
a term including:colon std:term -1 glossary.html#term-a-term-including-colon -
''')
inventory_v2_not_having_version = b'''\
# Sphinx inventory version 2
# Project: foo
# Version:
# The remainder of this file is compressed with zlib.
''' + zlib.compress(b'''\
module1 py:module 0 foo.html#module-module1 Long Module desc
''')
def test_read_inventory_v1():
f = BytesIO(inventory_v1)
invdata = InventoryFile.load(f, '/util', posixpath.join)
assert invdata['py:module']['module'] == \
('foo', '1.0', '/util/foo.html#module-module', '-')
assert invdata['py:class']['module.cls'] == \
('foo', '1.0', '/util/foo.html#module.cls', '-')
def test_read_inventory_v2():
f = BytesIO(inventory_v2)
invdata = InventoryFile.load(f, '/util', posixpath.join)
assert len(invdata['py:module']) == 2
assert invdata['py:module']['module1'] == \
('foo', '2.0', '/util/foo.html#module-module1', 'Long Module desc')
assert invdata['py:module']['module2'] == \
('foo', '2.0', '/util/foo.html#module-module2', '-')
assert invdata['py:function']['module1.func'][2] == \
'/util/sub/foo.html#module1.func'
assert invdata['c:function']['CFunc'][2] == '/util/cfunc.html#CFunc'
assert invdata['std:term']['a term'][2] == \
'/util/glossary.html#term-a-term'
assert invdata['std:term']['a term including:colon'][2] == \
'/util/glossary.html#term-a-term-including-colon'
def test_read_inventory_v2_not_having_version():
f = BytesIO(inventory_v2_not_having_version)
invdata = InventoryFile.load(f, '/util', posixpath.join)
assert invdata['py:module']['module1'] == \
('foo', '', '/util/foo.html#module-module1', 'Long Module desc')
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
def _write_appconfig(dir, language, prefix=None):
prefix = prefix or language
os.makedirs(dir / prefix, exist_ok=True)
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
(dir / prefix / 'conf.py').write_text(f'language = "{language}"', encoding='utf8')
(dir / prefix / 'index.rst').write_text('index.rst', encoding='utf8')
assert sorted(os.listdir(dir / prefix)) == ['conf.py', 'index.rst']
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
assert (dir / prefix / 'index.rst').exists()
return dir / prefix
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
def _build_inventory(srcdir):
app = SphinxTestApp(srcdir=srcdir)
app.build()
app.cleanup()
return (app.outdir / 'objects.inv')
def test_inventory_localization(tmp_path):
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
# Build an app using Estonian (EE) locale
srcdir_et = _write_appconfig(tmp_path, "et")
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
inventory_et = _build_inventory(srcdir_et)
# Build the same app using English (US) locale
srcdir_en = _write_appconfig(tmp_path, "en")
Disable localisation when ``SOURCE_DATE_EPOCH`` is set (#10949) This commit disables Sphinx's localisation features when reproducible builds are requested, as determined by a non-empty SOURCE_DATE_EPOCH_ environment variable. The `Reproducible Builds`_ project aims to provide confidence to consumers of packaged software that the artefacts they're downloading and installing have not been altered by the environment they were built in, and can be replicated at a later date if required. Builds of localised documentation using Sphinx currently account for a large category of reproducible build testing failures, because the builders intentionally use varying environment locales at build-time. This can affect the contents of the ``objects.inv`` file. During investigation, it turned out that many ``gettext``-localised values (particularly in Python modules under ``sphinx.domains``) were being translated at module-load-time and would not subsequently be re-localised. This creates two unusual effects: 1. Attempting to write a test case to build the same application in two different languages was not initially possible, as the first-loaded translation catalogue (as found in the ``sphinx.locale.translators`` global variable) would remain in-use for subsequent application builds under different locales. 2. Localisation of strings could vary depending on whether the relevant modules were loaded before or after the resource catalogues were populated. We fix this by performing all translations lazily so that module imports can occur in any order and localisation of inventory entries should occur only when translations of those items are requested. Localisation can then be disabled by configuring the ``gettext`` language to the ISO-639-3 'undetermined' code (``'und'``), as this should not have an associated translation catalogue. We also want to prevent ``gettext`` from attempting to determine the host's locale from environment variables (including ``LANGUAGE``). .. _SOURCE_DATE_EPOCH: https://reproducible-builds.org/docs/source-date-epoch/ .. _Reproducible Builds: https://www.reproducible-builds.org/
2023-04-07 11:49:36 -05:00
inventory_en = _build_inventory(srcdir_en)
# Ensure that the inventory contents differ
assert inventory_et.read_bytes() != inventory_en.read_bytes()