The .po files we use for translations have two shortcomings when used in Git:
- They include file locations, which change each time the source is updated.
This results in large, unreadable diffs that don't merge well.
- They include source strings for untranslated messages, wasting space
unnecessarily.
Update the Makefile so that the extraneous information is stripped when the
files are updated or pulled form Transifex, and empty translation files are
removed entirely.
Also, translations are normalized to a common style. This should help diffs
and merges.
The validator requires file location comments to identify the programming
language, and to produce good error reports.
To make this work, merge the comments in before validation.
First patch for: https://fedorahosted.org/freeipa/ticket/2435
* Add bootstrap-autogen depdenency to lint target to force
generated files to be created.
* Add validate-src-strings to lint rules
* Add validate-src-strings as dependency to lint targett
* Remove obsolete test_lang frm test target
* Add diagnostic message to validation command in i18n.py
that outputs how many objects were scanned. Formerly it only
output a message if there were errors. This made it impossible to
distinguish an empty file from one with no errors.
* While adding the validation counts it was discovered plurals had
been omitted for some of the validation checks. Added the missing
checks for plural forms.
* Also distinguished between errors and warnings. Permit warnings to
be emitted but do not fail the validatition unless actual errors
were also detected.
We use custom gettext classes (e.g. GettextFactory &
NGettextFactory). We should exercise those classes with an installed
binary mo file to demonstrate we are actually returning the expected
translated strings for all strings defined as being translatable.
The test logic in install/po/test_i18n.py was recently enhanced to
make this type of testing easier and more complete.
tests/test_ipalib/test_text.py should import the new i18n test support
and run it.
Previously tests/test_ipalib/test_text.py made a feeble but incomplete
attempt to do the above but even that was often not run because the
test would skip because the necessary test files were not available
unless they had been manually created in the install/po subdir. It is
now possible to correct those deficiencies in the test.
This patch does the following:
* Moves the location of i18n test code and adjust references to it.
install/po/test_i18n.py was moved to tests/i18n.py. This permits
tests/test_ipalib/test_text.py to import the i18n test utilities
in a clean fashion. The Makefile in install/po now calls this
same file.
* Modfies test function in test_i18n.py to accept function pointers
for retreiving a translation.
* Imports test_i18n.py from the install/po directory in the tree
* Creates a tmp directory for the test localedir
* Parses the current ipa.pot file in install/po and generates
a test po and mo file with special unicode markers. It installs
the test mo file in the tmp localedir. This is accomplished by
calling create_po() from the test_i18n.py file.
* If any of the above does not work it raises nose.SkipTest with
the reason, and skips the test.
* It sets up functions to get a translation and a plural translation
via our text.GettextFactory class and text.NGettextFactory class
respectively. This are the functions we use intenally to get
translations. It set the localdir and lang which are used by those
classes to match our test configuration. It then runs a validation
test on every translation and it's plural found in the test.po file
by calling po_file_iterate and passed it the function pointers to
our internal routines.
* At the conclusion of the test it cleans up after itself.
Note: extraneous files are not created in the tree, only a tmp
directory is utilized.
Validating msgid's in C code was insufficient.
* Make the discovery of format conversions much more robust by authoring
a new function parse_printf_fmt() that is able to discover each
format conversion in a string and break it into it's individual
subparts. One of those subparts is the argument selector index. In c
code we need to know if the argumenet selector index is present to
know if translator can reorder the substitution strings.
This replaces the simplistic python_anonymous_substitutions_regexp
which was insufficient to deal with other programming languages
(e.g. c).
* Add get_prog_langs() function to return the set of programming
languages a msgid appears in. This is necessar because the msdid
validation is programming language specific.
https://fedorahosted.org/freeipa/ticket/2582
We had been using shell scripts and sed to test our translations. But
trying to edit pot and po files with sed is nearly impossible because
the file format can vary significantly and the sed editing was failing
and gettext tools were complaining about our test strategy. We had
been using a Python script (test_i18n.py) to perform the actual test
after using shell, sed, and gettext tools to create the files. There
is a Python library (polib) which can read/write/edit pot/po/mo files
(used internally by Transifex, our translation portal). The strategy
now is to do everything in Python (in test_i18n.py). This is easier,
more robust and allows us to do more things.
* add python-polib to BuildRequires
* Remove the logic for creating the test lang from Makefile.in and
replace it with calls to test_i18n.py
* add argument parsing, usage, configuration parameters, etc. to
test_i18n.py to make it easier to use and configurable.
* add function to generate a test po and mo file. It also
writes the files and creates the test directory structure.
* Took the existing validate code and refactored it into validation
function. It used to just pick one string and test it, now it
iterates over all strings and all plural forms.
* Validate anonymous Python format substitutions in pot file
* added support for plural forms.
* Add pot po file validation for variable substitution
* In install/po subdir you can now do:
$ make test
$ make validate-pot
$ make validate-po
* The options for running test_i18n.py are:
$ ./test_i18n.py --help
Usage:
test_i18n.py --test-gettext
test_i18n.py --create-test
test_i18n.py --validate-pot [pot_file1, ...]
test_i18n.py --validate-po po_file1 [po_file2, ...]
Options:
-h, --help show this help message and exit
-s, --show-strings show the offending string when an error is detected
--pedantic be aggressive when validating
-v, --verbose be informative
--traceback print the traceback when an exception occurs
Operational Mode:
You must select one these modes to run in
-g, --test-gettext create the test translation file(s) and exercise them
-c, --create-test create the test translation file(s)
-P, --validate-pot validate pot file(s)
-p, --validate-po validate po file(s)
Run Time Parameters:
These may be used to modify the run time defaults
--test-lang=TEST_LANG
test po file uses this as it's basename (default=test)
--lang=LANG lang used for locale, MUST be a valid lang
(default=xh_ZA)
--domain=DOMAIN translation domain used during test (default=ipa)
--locale=LOCALE locale used during test (default=test_locale)
--pot-file=POT_FILE
default pot file, used when validating pot file or
generating test po and mo files (default=ipa.pot)
https://fedorahosted.org/freeipa/ticket/2044
This patch reverts the use of pygettext for i18n string extraction. It
was originally introduced because the help documentation for commands
are in the class docstring and module docstring.
Docstrings are a Python construct whereby any string which immediately
follows a class declaration, function/method declaration or appears
first in a module is taken to be the documentation for that
object. Python automatically assigns that string to the __doc__
variable associated with the object. Explicitly assigning to the
__doc__ variable is equivalent and permitted.
We mark strings in the source for i18n translation by embedding them
in _() or ngettext(). Specialized extraction tools (e.g. xgettext)
scan the source code looking for strings with those markers and
extracts the string for inclusion in a translation catalog.
It was mistakingly assumed one could not mark for translation Python
docstrings. Since some docstrings are vital for our command help
system some method had to be devised to extract docstrings for the
translation catalog. pygettext has the ability to locate and extract
docstrings and it was introduced to acquire the documentation for our
commands located in module and class docstrings.
However pygettext was too large a hammer for this task, it lacked any
fined grained ability to extract only the docstrings we were
interested in. In practice it extracted EVERY docstring in each file
it was presented with. This caused a large number strings to be
extracted for translation which had no reason to be translated, the
string might have been internal code documentation never meant to be
seen by users. Often the superfluous docstrings were long, complex and
likely difficult to translate. This placed an unnecessary burden on
our volunteer translators.
Instead what is needed is some method to extract only those strings
intended for translation. We already have such a mechanism and it is
already widely used, namely wrapping strings intended for translation
in calls to _() or _negettext(), i.e. marking a string for i18n
translation. Thus the solution to the docstring translation problem is
to mark the docstrings exactly as we have been doing, it only requires
that instead of a bare Python docstring we instead assign the marked
string to the __doc__ variable. Using the hypothetical class foo as
an example.
class foo(Command):
'''
The foo command takes out the garbage.
'''
Would become:
class foo(Command):
__doc__ = _('The foo command takes out the garbage.')
But which docstrings need to be marked for translation? The makeapi
tool knows how to iterate over every command in our public API. It was
extended to validate every command's documentation and report if any
documentation is missing or not marked for translation. That
information was then used to identify each docstring in the code which
needed to be transformed.
In summary what this patch does is:
* Remove the use of pygettext (modification to install/po/Makefile.in)
* Replace every docstring with an explicit assignment to __doc__ where
the rhs of the assignment is an i18n marking function.
* Single line docstrings appearing in multi-line string literals
(e.g. ''' or """) were replaced with single line string literals
because the multi-line literals were introducing unnecessary
whitespace and newlines in the string extracted for translation. For
example:
'''
The foo command takes out the garbage.
'''
Would appear in the translation catalog as:
"\n
The foo command takes out the garbage.\n
"
The superfluous whitespace and newlines are confusing to translators
and requires us to strip leading and trailing whitespace from the
translation at run time.
* Import statements were moved from below the docstring to above
it. This was necessary because the i18n markers are imported
functions and must be available before the the doc is
parsed. Technically only the import of the i18n markers had to
appear before the doc but stylistically it's better to keep all the
imports together.
* It was observed during the docstring editing process that the
command documentation was inconsistent with respect to the use of
periods to terminate a sentence. Some doc had a trailing period,
others didn't. Consistency was enforced by adding a period to end of
every docstring if one was missing.
ticket 1650 (https://fedorahosted.org/freeipa/ticket/1650) has
an extensive discussion of the issues, please refer to that.
This patch does the following:
* does not count fuzzy translations when computing translation
statistics via the "msg-stats" make target in install/po
* adds a new make target called "pull-po" which pulls updated po files
from Transifex (configure.ac includes some trailing whitespace fixes)
* turns off the generation of fuzzy translation suggestions during the
message merge phase.
Pull the new translations for Spanish (es) and Ukrainian (uk)
Update the LINGUAS file to add comment showing the friendly
name for the language abbreviation.
The make target msg-stats which produces a report about the state
of the translations no longer maintained it's column alignment
due to larger numbers so the formating was tweaked to maintain
column alignment.
A dogtag replica file is created as usual. When the replica is installed
dogtag is optional and not installed by default. Adding the --setup-ca
option will configure it when the replica is installed.
A new tool ipa-ca-install will configure dogtag if it wasn't configured
when the replica was initially installed.
This moves a fair bit of code out of ipa-replica-install into
installutils and cainstance to avoid duplication.
https://fedorahosted.org/freeipa/ticket/1251
When connection between a master machine and future replica is not
sane, the replica installation may fail unexpectedly with
inconvenient error messages. One common problem is misconfigured
firewall.
This patch adds a program ipa-replica-conncheck which tests the
connection using the following procedure:
1) Execute the on-replica check testing the connection to master
2) Open required ports on local machine
3) Ask user to run the on-master part of the check OR run it
automatically:
a) kinit to master as default admin user with given password
b) run the on-master part using ssh
4) When master part is executed, it checks connection back to
the replica and prints the check result
This program is run by ipa-replica-install as mandatory part. It
can, however, be skipped using --skip-conncheck option.
ipa-replica-install now requires password for admin user to run
the command on remote master.
https://fedorahosted.org/freeipa/ticket/1107
This patch replaces xgettext with a custom pygettext to generate
translatable strings from plugin files in ipalib/plugins. pygettext
was modified to handle plural forms (credit goes to Jan Hendrik Goellner)
and had some bugs fixed by myself. We only use it for plugins, because
it's the only place where we need to extract docstrings for the built-in
help system.
I also had to make some changes to the way the built-in documentation
systems gets docstrings from modules for this to work.
This has been completely abandoned since ipa v1 and is not built by default.
Instead of carrying dead weight, let's remove it for now.
Fixes: https://fedorahosted.org/freeipa/ticket/761
Add automatic creation of python an C file lists for potfiles
Deletes useless copy of Makefile in install/po
Remove duplicate maintainer-clean target
Add debug target that prints file lists
Unbreak update-po target, merges in patch from John
Performing I18N completely on the server, to leverage the
existing gettext architecture.
Also, the browser does not have access to the Language header.
Added the additional po files for a set of required languages
conflict with install/static/ipa.js was resolved.
Note that the addition of the .po files in this patch is necessary.
In order to get Transifex support, we need to update the LINGUAS
file with the languages for which we want support. If we don't
add the .po files in, they get automatically generated by the rpmbuild
process. Our implementation of gettext has a bug in it (It might
be F13 thing) where the the Plurals line is not getting correctly
transformed, which causes a build failure. However, since the
RPM would have the .po files anyway, we should revision control
the ones we have, even if they are empty.
Fixed the Bug reporting url to the original value.
Corrected the Chartype encoding for UK
We want to manually make the .pot file, we shouldn't have anything
in the Makefile which will cause the .pot file to be rebuilt
because of dependencies.
There was a typo in install/po/Makefile.in which caused (some) of
the .po files to be overwritten because the test to see if a po
file existed had a typo in it.
This patch also removes the unnecessary rebuilding of the pot which was
happening when using the "all" target (the default). The pot file now
must be manually remade, which is what we want.
Added a new target "mo-files" to manually generate the .mo files.
This is useful to run before checking in a new .po file just to
assure it "compiles" and we don't have to discover this during a
build.
Formerly running 'make update-pot' would write an insignificantly
different pot file even if the msgid's xgettext found were unmodfied.
Now the result of running xgettext is compared to the existing
pot file after adjusting for things like timestamps, and only
copies the result of xgettext to the new pot file if there
were differences.
This will help eliminate git commits on the pot file if all one
did was see if the pot file was up to date, if it was up to date
git won't see any modifications. It used to be that timestamps
would be different in the pot file just by virtue of checking
if the pot file was current.
fix exit status; replace POT with $(DOMAN).pot
Add new Russian translation.
Update the Polish translation.
Add count of how many po translations we have in msg-stats.
Current translation statistics with this patch:
ipa.pot has 133 messages. There are 5 po translation files.
bn_IN: 14/133 10.5% 106 po untranslated, 13 missing, 119 untranslated
id: 107/133 80.5% 13 po untranslated, 13 missing, 26 untranslated
kn: 4/133 3.0% 116 po untranslated, 13 missing, 129 untranslated
pl: 133/133 100.0% 0 po untranslated, 0 missing, 0 untranslated
ru: 120/133 90.2% 0 po untranslated, 13 missing, 13 untranslated
The goal is to get the statistics to line up in
columns and not exceed an 80 character line which might
cause wrapping. Removing .po suffix from the translation
name gives us 3 extra characters. Formatting problems were
observed when bn_IN.po was added.
The Makefile in install/po has a new target "msg-stats" which
prints out statistics concerning the current pot and po files.
Here is an example:
% make msg-stats
ipa.pot has 133 messages
id.po: 107/133 80.5% 13 po untranslated, 13 missing, 26 untranslated
kn.po: 4/133 3.0% 116 po untranslated, 13 missing, 129 untranslated
pl.po: 120/133 90.2% 0 po untranslated, 13 missing, 13 untranslated
Also update configure.ac to search for msgcmp, awk & sed programs.
msginit should have been passed the locale because the resulting
.po file is parameterized from the locale. Also, if the target
locale is not specified it defaults to the current locale.
If the target locale is Engish msgid's are copied to their msgstr's
resulting in a fully translated .po instead of a fully untranslated
.po.
Add some comments to better explain some of the cryptic sed commands.
A new directory install/po has been added which contains all
the translations for all files in IPA.
The build has been agumented to build these files. Also the
autogen.sh script was mostly replaced by autoreconf, the preferred
method. The old autogen.sh sript also had some serious bugs in the
way it compared versions which caused it to run old versions of some
of the tools, using standared autoreconf is much better.