API Reference
This page contains the documentation of all the members exported by toolforge_i18n. Not all of them are useful to tools; some should arguably not really be used by any tools. Members are also not listed in any useful order. For these reasons, it doesn’t really make sense to read through this document front to back. Please follow the other documentation instead, and consult this page only if you are interested in details on a particular member.
- class toolforge_i18n.CommaSeparatedListFormatter(*, locale_identifier: str, **kwargs: object)
A string formatter supporting a !l list conversion.
Format string example:
    "We went to {cities!l}."
For iterable values converted with !l, the format spec is applied to each list element. Afterwards, the list elements are joined into a standard list using the locale specified in the constructor. (For English, this means separating most items with an ASCII comma plus a space, and the final two with an extra “and”; Chinese and Japanese, for instance, use a fullwidth comma instead.) Attempting to convert non-iterable values with !l is an error.
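The interplay of the per-element format spec and the final join can be illustrated with a standard-library-only sketch. This is not the toolforge_i18n implementation: the names SketchListFormatter, _ListValue and join_english are hypothetical, and the join is hard-coded to an assumed English pattern instead of using the locale from the constructor.

```python
import string


class _ListValue:
    """Marker wrapping a value that was converted with !l (hypothetical helper)."""

    def __init__(self, items):
        # list() raises TypeError for non-iterable values
        self.items = list(items)


def join_english(parts):
    """Join already-formatted parts using the English pattern assumed by this sketch."""
    if len(parts) <= 1:
        return ''.join(parts)
    if len(parts) == 2:
        return f'{parts[0]} and {parts[1]}'
    return ', '.join(parts[:-1]) + f', and {parts[-1]}'


class SketchListFormatter(string.Formatter):
    def convert_field(self, value, conversion):
        if conversion == 'l':
            return _ListValue(value)
        return super().convert_field(value, conversion)

    def format_field(self, value, format_spec):
        if isinstance(value, _ListValue):
            # the format spec applies to each element, not to the list as a whole
            return join_english([format(item, format_spec) for item in value.items])
        return super().format_field(value, format_spec)


formatter = SketchListFormatter()
print(formatter.format('We went to {cities!l}.', cities=['Berlin', 'Paris', 'Rome']))
print(formatter.format('Prices: {prices!l:.2f}', prices=[1, 2.5]))
```

In the second call, the .2f spec formats each price before the two results are joined.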
- class toolforge_i18n.GenderFormatter(*, get_gender: Callable[[object], Literal['m', 'f', 'n']], **kwargs: object)
A string formatter supporting a !g grammatical gender conversion.
Format string examples:
    "Leave a message on {user!g:m=his:f=her:n=their} talk page."
    "Ci dispiace, ma non sei {user!g:m=autorizzato:f=autorizzata:n=autorizzato/a} a usare il caricamento di massa."
The formatted value, which can be anything as far as this formatter is concerned, is passed into a function specified in the constructor, which should return one of the values "m", "f", or "n", to select the grammatically masculine, feminine, or neutral replacement, respectively. The format spec specifies these three replacements separated by colons. Gender values not specified in the format spec fall back to "m".
- class toolforge_i18n.HyperlinkFormatter(**_kwargs: object)
A string formatter supporting an !h hyperlink conversion.
Format string example:
    "You need to {url!h:log in} before you can edit."
The formatted value is interpreted as the href attribute of an HTML <a> element, whose inner HTML is given by the format spec.
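As a rough illustration of how a value can become the href while the format spec becomes the link text, here is a sketch built only on the standard library. The class name SketchHyperlinkFormatter and the _Href marker are hypothetical, and escaping the URL with html.escape is an assumption of this sketch, not a statement about what toolforge_i18n does internally.

```python
import html
import string


class _Href:
    """Marker wrapping a value that was converted with !h (hypothetical helper)."""

    def __init__(self, url):
        self.url = url


class SketchHyperlinkFormatter(string.Formatter):
    def convert_field(self, value, conversion):
        if conversion == 'h':
            return _Href(str(value))
        return super().convert_field(value, conversion)

    def format_field(self, value, format_spec):
        if isinstance(value, _Href):
            # assumption: escape the URL for attribute context;
            # the format spec is used verbatim as the inner HTML
            href = html.escape(value.url, quote=True)
            return f'<a href="{href}">{format_spec}</a>'
        return super().format_field(value, format_spec)


formatter = SketchHyperlinkFormatter()
print(formatter.format(
    'You need to {url!h:log in} before you can edit.',
    url='/login?next=/edit&from=home',
))
```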
- class toolforge_i18n.I18nFormatter(*, locale_identifier: str, **kwargs: object)
A string formatter supporting !p (plural), !l (list), !g (gender) and !h (hyperlink) conversions.
See PluralFormatter, CommaSeparatedListFormatter, HyperlinkFormatter and GenderFormatter for details.
Flask-based tools don’t need to use this class directly (it’s used by message()).
- class toolforge_i18n.PluralFormatter(*, locale_identifier: str, **kwargs: object)
A string formatter supporting a !p plural conversion.
Format string examples:
    "I ate {count!p:0=no apples:one={count} apple:other={count} apples}."
    "{size} (0x{size:04X}) {size!p:one=bajt:two=bajtaj:few=bajty:other=bajtow}"
For numeric values converted with !p, the format spec is interpreted differently: it consists of a set of key=text specs, separated by colons. The key should be one of the CLDR plural rule tags, currently “zero”, “one”, “two”, “few”, “many”, or “other”, or an explicit value. The text for the matching value or tag, according to the plural rules of the locale specified in the constructor, is substituted into the message. Attempting to convert non-numeric values with !p is an error.
Note that most languages do not use all possible tags, and only exactly those tags used in a language should occur in the format string. For example, even though there is a “zero” tag, English only uses the “one” and “other” ones, and to make a special case for a value of zero with a PluralFormatter(locale_identifier='en'), you need to use the key “0”, not “zero”. On the other hand, failing to specify all tags used in a language may make the formatter raise a KeyError: for instance, if the first example above used the key “1” instead of “one”, then it would fail when given a count of -1 or 1.0.
Value keys always take precedence over tag keys, no matter in which order they are specified in the format spec. To match the value, they must be identical to the str() of the value: for instance, a “1” key will not match a 1.0 value or vice versa.
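The key selection rules above (value keys first, matched by str(), then the plural tag, with a KeyError if the tag is missing) can be sketched without any locale data. The function names below are hypothetical, and english_plural_tag is a deliberately simplified stand-in for the CLDR rules of the configured locale.

```python
def english_plural_tag(value):
    """Simplified English plural rule assumed by this sketch: 'one' for 1, -1 and 1.0."""
    return 'one' if abs(value) == 1 else 'other'


def select_plural_text(value, format_spec, tag_for=english_plural_tag):
    """Pick the text for value from an already-expanded key=text:key=text spec."""
    options = dict(part.split('=', 1) for part in format_spec.split(':'))
    value_key = str(value)
    if value_key in options:
        # explicit value keys win, wherever they appear in the spec
        return options[value_key]
    # raises KeyError if the spec does not cover the tag for this value
    return options[tag_for(value)]


spec = '0=no apples:one={count} apple:other={count} apples'
for count in (0, 1, 5):
    # expand the nested {count} field first, as a formatter would
    print(select_plural_text(count, spec.format(count=count)))
```

With a spec that uses the key “1” instead of “one”, calling select_plural_text(-1, ...) raises a KeyError in this sketch, mirroring the failure mode described above.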
- class toolforge_i18n.ToolforgeI18n(app: flask.app.Flask | None = None, interface_language_code: collections.abc.Callable[[dict[str, dict[str, str]]], str] = <function interface_language_code_from_request>)
Flask extension for toolforge_i18n.
Basic usage:
    import flask
    from toolforge_i18n import ToolforgeI18n

    app = flask.Flask(__name__)
    i18n = ToolforgeI18n(app)
- class toolforge_i18n.TranslationsConfig(directory: str = 'i18n/', variables: collections.abc.Mapping[str, collections.abc.Sequence[str]] = <factory>, derived_messages: collections.abc.Mapping[str, tuple[str, collections.abc.Callable[[str], str]]] = <factory>, language_code_to_babel: collections.abc.Callable[[str], str] = <function language_code_to_babel>, allowed_html_elements: dict[str, set[str]] = <factory>, allowed_global_attributes: set[str] = <factory>, get_gender: collections.abc.Callable[[typing.Any], typing.Literal['m', 'f', 'n']] = <function get_gender_by_user_name>, check_translations: bool = True)
Configuration for loading message translations.
To use this library, a tool should define a tool_translations_config module which exports a config member of this type, like so:
    # tool_translations_config.py
    from toolforge_i18n import TranslationsConfig

    config = TranslationsConfig(
        # ...
    )
The most important config to define is variables, which most tools will need (unless none of your messages have variables); the others may or may not be necessary depending on the tool.
- allowed_global_attributes: set[str]
HTML attributes that should be allowed on any element in messages.
This is similar to allowed_html_elements, but the given attribute names are allowed regardless of element name.
- allowed_html_elements: dict[str, set[str]]
HTML elements that should be allowed in messages.
The key is an element name, and the value is a set of attributes that are allowed on that element. All other elements and attributes will cause a test failure. (See also allowed_global_attributes.)
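To make the shape of these two settings concrete, the following sketch checks a message against an element allow-list and a set of global attributes using only html.parser from the standard library. It is not the check that toolforge_i18n runs; the names _DisallowedMarkupFinder and find_disallowed_markup, and the wording of the reported problems, are hypothetical.

```python
from html.parser import HTMLParser


class _DisallowedMarkupFinder(HTMLParser):
    """Collects elements and attributes that are not on the allow-lists."""

    def __init__(self, allowed_elements, allowed_global_attributes):
        super().__init__()
        self.allowed_elements = allowed_elements
        self.allowed_global_attributes = allowed_global_attributes
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed_elements:
            self.problems.append(f'element <{tag}> is not allowed')
            return
        # per-element attributes plus attributes allowed on any element
        allowed = self.allowed_elements[tag] | self.allowed_global_attributes
        for name, _value in attrs:
            if name not in allowed:
                self.problems.append(f'attribute {name} is not allowed on <{tag}>')


def find_disallowed_markup(message, allowed_elements, allowed_global_attributes=frozenset()):
    finder = _DisallowedMarkupFinder(allowed_elements, allowed_global_attributes)
    finder.feed(message)
    finder.close()
    return finder.problems


# same shape as allowed_html_elements: element name -> allowed attribute names
allowed_html_elements = {'abbr': {'title'}, 'em': set()}
print(find_disallowed_markup('<abbr title="HyperText">HT</abbr>', allowed_html_elements))
print(find_disallowed_markup('<em onclick="x()">hi</em>', allowed_html_elements))
```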
- check_translations: bool = True
Whether to check translations when they are loaded.
By default, translations are checked as soon as they are loaded, and if there is a problem with the translations, an error is raised and the translations cannot be used. (This generally means that the tool cannot run; you will probably have to revert the latest localisation updates and fix the translation on translatewiki.net.) This protects against broken or even malicious messages.
If you have set up Continuous Integration (CI), e.g. using GitLab CI or GitHub Actions, and you are running pytest as part of your tests, then the translation checks will also be registered as tests (you should see various i18n/*.json files in pytest’s output). In this case, assuming CI also runs on translatewiki.net exports (and you won’t merge any localisation updates where CI fails), you can set this config to False to disable the runtime checks; this will speed up translation loading and therefore the tool’s startup (for a well-translated tool, by more than a second).
Beware that, if you set this to False, only the pytest integration in CI protects your tool from malicious translations. You must be confident that CI will run, and will run all pytest tests, and if possible you should configure your repository so that localisation updates cannot be merged if CI fails (in GitLab: Settings > Merge requests > Pipelines must succeed; no direct equivalent in GitHub). Otherwise, it is always safe to leave this set to True.
- derived_messages: Mapping[str, tuple[str, Callable[[str], str]]]
Messages that are derived from other messages.
The key is a message key that is not expected in the JSON files, but that is instead generated by taking another message (whose key is the first element of the tuple) and sending it through the callable in the second element of the tuple. Examples for that callable include the identity function (to copy a message) or simple case transformations.
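The mapping shape can be illustrated with plain dictionaries. The helper add_derived_messages and the message keys below are hypothetical, and skipping a derived message whose source is missing is an assumption of this sketch, not documented library behavior.

```python
def add_derived_messages(messages, derived_messages):
    """Return a copy of messages with the derived messages added."""
    result = dict(messages)
    for derived_key, (source_key, transform) in derived_messages.items():
        if source_key in messages:
            result[derived_key] = transform(messages[source_key])
    return result


def lowercase_first(message):
    """Simple case transformation: lower-case the first character."""
    return message[:1].lower() + message[1:]


# same shape as derived_messages: derived key -> (source key, callable)
derived_messages = {
    'nav-settings': ('settings', lambda message: message),  # identity: plain copy
    'settings-inline': ('settings', lowercase_first),
}

print(add_derived_messages({'settings': 'Settings'}, derived_messages))
```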
- directory: str = 'i18n/'
The path to the directory to load message files from.
- get_gender: Callable[[Any], Literal['m', 'f', 'n']]
Function used to get the grammatical gender ("m", "f" or "n") of a user. Defaults to get_gender_by_user_name(), documented further down this page, which looks up the gender of a named user on Meta-Wiki.
- language_code_to_babel: Callable[[str], str]
Function mapping a MediaWiki language code to a Babel one. Defaults to toolforge_i18n.language_code_to_babel(), documented further down this page. If your tool is translated into a language that the default implementation cannot map without losing information, configure a custom implementation here that delegates to the default first and then picks a lossy fallback.
- variables: Mapping[str, Sequence[str]]
Variable names used in messages.
The source messages use $1, $2 etc., but the Python format strings use named variables, whose names are specified here. The variable name (or its prefix) encodes the type:
  - url, url_* - hyperlink: [$1 text] => {url!h:text}
  - user_name, user_name_* - gender: {{GENDER:$1|he|she|they}} => {user_name!g:m=he:f=she:n=they}
  - num, num_* - plural: {{PLURAL:$1|one egg|$1 eggs}} => {num_eggs!p:one=one egg:other={num_eggs} eggs}
  - list, list_* - list: $1 => {list_chicken_names!l}
  - anything else - markup without further formatting: $1 => {description}
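The naming convention can be summarized as a small lookup from variable name to conversion. The function conversion_for_variable and the message keys in the example mapping are hypothetical; the sketch only restates the prefix rules listed above and does not perform the actual message conversion.

```python
def conversion_for_variable(name):
    """Return the conversion character implied by a variable name, or None."""
    prefixes = (('url', 'h'), ('user_name', 'g'), ('num', 'p'), ('list', 'l'))
    for prefix, conversion in prefixes:
        # either the bare prefix or the prefix followed by an underscore
        if name == prefix or name.startswith(prefix + '_'):
            return conversion
    return None


# same shape as variables: message key -> names for $1, $2, ...
variables = {
    'egg-count': ['num_eggs'],
    'talk-page-hint': ['user_name', 'url'],
    'chickens': ['list_chicken_names', 'description'],
}

for message_key, names in variables.items():
    for position, name in enumerate(names, start=1):
        print(message_key, f'${position}', name, conversion_for_variable(name))
```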
- exception toolforge_i18n.UnknownMessageWarning(message_code: str, language_codes: list[str])
Warning issued by message() when a message is not defined.
This warning usually indicates one of two problems:
  - a typo in the message key (whether in the message() call or in i18n/en.json), or
  - a message that was not added to i18n/en.json yet.
- toolforge_i18n.add_lang_if_needed(message: Markup, language_code: str) -> Markup
Wrap the given message in a language-tagged element if necessary.
Given a (formatted) message in a certain language (MediaWiki language code), wrap it in a <span> with lang= and dir= attributes if the current language on top of the stack is different. Note that message() calls this function automatically, so you generally don’t need to use this function yourself.
- toolforge_i18n.get_gender_by_user_name(user_name: str | None) -> Literal['m', 'f', 'n']
Get the gender of a named user on Wikimedia sites.
This gets the gender from Meta-Wiki; hopefully the user set it as a global preference, not just on one other wiki.
None may be used to represent an unknown user (e.g. not logged in), who will be treated as having neuter gender.
- toolforge_i18n.get_user_agent() -> str
Get the user agent string used by toolforge_i18n.
The user agent string may be set by set_user_agent(); otherwise, this function tries to get a user agent previously set up by toolforge.set_user_agent().
Code outside of toolforge_i18n generally shouldn’t use this function.
- toolforge_i18n.interface_language_code_from_request(translations: dict[str, dict[str, str]]) -> str
Default implementation to determine the language code of a request.
This function supports the ?uselang= URL parameter and otherwise determines the language based on the request’s Accept-Language header. You may want to override this function to implement a persistent language preference; to keep the features mentioned above, your implementation should generally look like this:
    import flask
    from toolforge_i18n import ToolforgeI18n, interface_language_code_from_request

    def interface_language_code(translations):
        # ?uselang= takes precedence if present
        if 'uselang' in flask.request.args:
            return interface_language_code_from_request(translations)
        # try persistent language preference (e.g. from flask.session) next
        # ...
        # finally, fall back to Accept-Language:
        return interface_language_code_from_request(translations)

    # ...later, pass the implementation into ToolforgeI18n:
    i18n = ToolforgeI18n(app, interface_language_code)
- toolforge_i18n.lang_autonym(code: str) -> str | None
Get the autonym of the given language code, according to MediaWiki.
- toolforge_i18n.lang_bcp47_to_mw(code: str) -> str
Get the MediaWiki language code of the given BCP-47 language code.
- toolforge_i18n.lang_dir(code: str) -> Literal['ltr', 'rtl', 'auto']
Get the directionality of the given language code, according to MediaWiki.
- toolforge_i18n.lang_fallbacks(code: str) -> list[str]
Get the fallback languages of the given language code, according to MediaWiki.
- toolforge_i18n.lang_mw_to_bcp47(code: str) -> str
Get the BCP-47 language code of the given MediaWiki language code.
- toolforge_i18n.language_code_to_babel(code: str) -> str
Default implementation to map a MediaWiki language code to Babel.
This implementation is conservative and only maps language codes where Babel has an alternative that does not lose any information (at least as far as toolforge_i18n is concerned). MediaWiki also supports many language codes that (as far as I know) have no lossless equivalent in Babel, such as (at the time of writing) sh-latn (Serbo-Croatian in Latin script). If your tool is translated into one of those languages, you will have to configure a custom language_code_to_babel implementation in your tool_translations_config and pick some lossy fallback (e.g. hr, Croatian, for sh-latn). Your implementation should generally delegate to this one first, for instance:
    import toolforge_i18n

    def language_code_to_babel(code: str) -> str:
        # prefer the lossless default mapping where one exists
        mapped = toolforge_i18n.language_code_to_babel(code)
        if mapped != code:
            return mapped
        return {
            'sh-latn': 'hr',
            # ...
        }.get(code, code.partition('-')[0])
- toolforge_i18n.load_translations(config: TranslationsConfig) -> tuple[dict[str, dict[str, str]], dict[str, str]]
Load the translations according to the given config.
Returns a tuple of translations, documentation where translations is a nested dict from language code to message key to message, and documentation is a dict from message key to message documentation. The messages in translations are Python format strings intended to be formatted by I18nFormatter.
If check_translations is enabled in the config, the translation checks are run before this function returns, ensuring that the translations are safe to use.
Flask-based tools don’t need to call this function directly (it’s called by ToolforgeI18n).
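To show the shape of the two returned values, the sketch below uses hand-written dictionaries in place of a real load_translations() call. The message keys, texts, the helper find_message and the explicit language chain are all hypothetical; how toolforge_i18n itself resolves fallback languages is not shown here.

```python
# translations: language code -> message key -> Python format string
translations = {
    'en': {'greeting': 'Hello, {user_name}!', 'farewell': 'Goodbye!'},
    'de': {'greeting': 'Hallo, {user_name}!'},
}
# documentation: message key -> message documentation
documentation = {
    'greeting': 'Greeting shown to logged-in users.',
    'farewell': 'Shown after logging out.',
}


def find_message(translations, message_code, language_codes):
    """Return (language code, message) for the first language that has the message."""
    for language_code in language_codes:
        message = translations.get(language_code, {}).get(message_code)
        if message is not None:
            return language_code, message
    raise LookupError(message_code)


# illustrative chain: requested language first, then assumed fallbacks
chain = ['de-at', 'de', 'en']
print(find_message(translations, 'greeting', chain))
print(find_message(translations, 'farewell', chain))
```

Returning the language code alongside the message lets a caller tag the text with the language it is actually written in.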
- toolforge_i18n.message(message_code: str, **kwargs: object) -> Markup
Format an interface message in the user interface language.
The kwargs may contain (named) arguments, using the argument names defined in variables.
This function is available as a template global, and is usually used there (but may also be imported and called from Python code).
- toolforge_i18n.pop_html_lang(language_code: str) -> Markup
Pop an HTML language code from the stack.
See push_html_lang() for details.
- toolforge_i18n.push_html_lang(language_code: str) -> Markup
Push an HTML language code to the stack.
Many tools will not need to call this, as it’s called by the message() function automatically. However, if you also add localized text from other sources than messages, you should call this function with the MediaWiki language code you are using; for example, in a Jinja2 template:
    <span {{ push_html_lang(label.language) }}>
      {{ label.value }}
    </span{{ pop_html_lang(label.language) }}>
- toolforge_i18n.set_user_agent(user_agent: str) -> None
Set the user agent string used by toolforge_i18n.
Most tools should call toolforge.set_user_agent() instead, which also sets the user agent for other code. It is typically called during early initialization.
See the User-Agent policy for the format.