storage¶
Classes that represent various storage formats for localization.
base¶
Base classes for storage interfaces.
-
exception
translate.storage.base.
ParseError
(inner_exc)¶ -
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
class
translate.storage.base.
TranslationStore
(unitclass=None, encoding=None)¶ Base class for stores for multiple translation units of type UnitClass.
-
Extensions
= None¶ A list of file extensions associated with this store type
-
Mimetypes
= None¶ A list of MIME types associated with this store type
-
Name
= 'Base translation store'¶ The human usable name of this store type
-
UnitClass
¶ The class of units that will be instantiated and used by this class
alias of
TranslationUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
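A BOM-based fallback of this kind can be pictured with a short standard-library sketch. The function name `detect_bom_encoding` and the table of byte order marks it checks are assumptions made for illustration; this is not the toolkit's own implementation.

```python
import codecs

# Longer marks come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM,
# so testing UTF-16 first would misreport UTF-32 data.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(data, default=None):
    """Return a codec name guessed from a leading BOM, else ``default``."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding
    return default


raw = codecs.BOM_UTF8 + "Datei".encode("utf-8")
guessed = detect_bom_encoding(raw)
text = raw.decode(guessed)  # the utf-8-sig codec strips the BOM while decoding
```

Data without a BOM yields the `default`, which is where a caller would fall back to a list of candidate encodings.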
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
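The relationship between makeindex(), require_index() and the find methods can be sketched with a tiny stand-in class. `SketchStore` and its tuple-based units are hypothetical and much simpler than the real store, which also maintains location and id indexes; here the index is simply dropped on every addition and rebuilt on the next lookup.

```python
class SketchStore:
    """Hypothetical stand-in for a store; not the toolkit's TranslationStore."""

    def __init__(self):
        self.units = []          # list of (source, target) pairs
        self.sourceindex = None  # built lazily by require_index()

    def addunit(self, source, target):
        self.units.append((source, target))
        self.sourceindex = None  # invalidate so the next lookup rebuilds

    def makeindex(self):
        self.sourceindex = {}
        for unit in self.units:
            # Keep every unit per source so duplicates stay reachable.
            self.sourceindex.setdefault(unit[0], []).append(unit)

    def require_index(self):
        if self.sourceindex is None:
            self.makeindex()

    def findunits(self, source):
        self.require_index()
        return self.sourceindex.get(source, [])

    def findunit(self, source):
        found = self.findunits(source)
        return found[0] if found else None


store = SketchStore()
store.addunit("Open", "Abrir")
store.addunit("Open", "Apri")
first = store.findunit("Open")
```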
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(data)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
suggestions_in_format
= False¶ Indicates if format can store suggestions and alternative translation for a unit
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.base.
TranslationUnit
(source=None)¶ Base class for translation units.
Our concept of a translation unit is influenced heavily by XLIFF.
As such, most of the method and variable names borrow from XLIFF terminology.
A translation unit consists of the following:
- A source string. This is the original translatable text.
- A target string. This is the translation of the source.
- Zero or more notes on the unit. Notes would typically be some comments from a translator on the unit, or some comments originating from the source code.
- Zero or more locations. Locations indicate where in the original source code this unit came from.
- Zero or more errors. Some tools (e.g.
pofilter
) can run checks on translations and produce error messages.
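The five parts listed above can be pictured as a small data class. `SketchUnit` is a hypothetical illustration, not the toolkit's TranslationUnit; its notion of "translated" is deliberately naive, and the sample strings are invented.

```python
from dataclasses import dataclass, field


@dataclass
class SketchUnit:
    """Hypothetical unit holding the five parts of a translation unit."""

    source: str = ""
    target: str = ""
    notes: list = field(default_factory=list)
    locations: list = field(default_factory=list)
    errors: dict = field(default_factory=dict)

    def adderror(self, errorname, errortext):
        # One short name per error, mapped to its descriptive text.
        self.errors[errorname] = errortext

    def istranslated(self):
        # Simplification: "translated" here just means a non-empty target.
        return bool(self.target)


unit = SketchUnit(source="File", target="Fichier")
unit.locations.append("menu.c:42")
unit.notes.append("Menu title")
unit.adderror("endpunc", "Different punctuation at the end")
```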
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_parsers
= []¶ A list of functions to use for parsing a string into a rich string tree.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
benchmark¶
-
class
translate.storage.benchmark.
TranslateBenchmarker
(test_dir, storeclass)¶ class to aid in benchmarking Translate Toolkit stores
-
clear_test_dir
()¶ removes the given directory
-
create_sample_files
(num_dirs, files_per_dir, strings_per_file, source_words_per_string, target_words_per_string)¶ creates sample files for benchmarking
-
parse_files
(file_dir=None)¶ parses all the files in the test directory into memory
-
parse_placeables
()¶ parses placeables
-
bundleprojstore¶
-
class
translate.storage.bundleprojstore.
BundleProjectStore
(fname)¶ Represents a translate project bundle (zip archive).
-
append_file
(afile, fname, ftype='trans', delete_orig=False)¶ Append the given file to the project with the given filename, marked to be of type ftype (‘src’, ‘trans’, ‘tgt’).
Parameters: delete_orig – If True, as set by convert_forward(), afile is deleted after appending, if possible.
Note
For this implementation, the appended file will be deleted from disk if delete_orig is True.
-
cleanup
()¶ Clean up our mess: remove temporary files.
-
get_file
(fname)¶ Retrieve a project file (source, translation or target file) from the project archive.
-
get_filename_type
(fname)¶ Get the type of file (‘src’, ‘trans’, ‘tgt’) with the given name.
-
get_proj_filename
(realfname)¶ Try and find a project file name for the given real file name.
-
load
(zipname)¶ Load the bundle project from the zip file of the given name.
-
remove_file
(fname, ftype=None)¶ Remove the file with the given project name from the project.
-
save
(filename=None)¶ Save all project files to the bundle zip file.
-
sourcefiles
¶ Read-only access to
self._sourcefiles
.
-
targetfiles
¶ Read-only access to
self._targetfiles
.
-
transfiles
¶ Read-only access to
self._transfiles
.
-
update_file
(pfname, infile)¶ Updates the file with the given project file name with the contents of infile.
Returns: the results from BundleProjectStore.append_file().
-
catkeys¶
Manage the Haiku catkeys translation format
The Haiku catkeys format is the translation format used for localisation of the Haiku operating system.
It is a bilingual base class derived format with CatkeysFile and CatkeysUnit providing file and unit level access. The file format is described here: http://www.haiku-os.org/blog/pulkomandy/2009-09-24_haiku_locale_kit_translator_handbook
- Implementation
The implementation covers the full requirements of a catkeys file. The files are simple Tab Separated Value (TSV) files that can be read by Microsoft Excel and other spreadsheet programs. They use the .txt extension, which makes it more difficult to identify such files automatically.
The dialect of the TSV files is specified by CatkeysDialect.
- Encoding
The files are UTF-8 encoded.
- Header
CatkeysHeader provides header management support.
- Escaping
catkeys seems to escape things as in C++ (strings appear to be extracted from the source code unchanged).
Functions allow for _escape() and _unescape().
-
class
translate.storage.catkeys.
CatkeysDialect
¶ Describe the properties of a catkeys generated TAB-delimited file.
-
class
translate.storage.catkeys.
CatkeysFile
(inputfile=None, **kwargs)¶ A catkeys translation memory file
-
UnitClass
¶ alias of
CatkeysUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ parse the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(newlang)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.catkeys.
CatkeysHeader
(header=None)¶ A catkeys translation memory header
-
settargetlanguage
(newlang)¶ Set a human readable target language
-
-
class
translate.storage.catkeys.
CatkeysUnit
(source=None)¶ A catkeys translation memory unit
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
dict
¶ Get the dictionary of values for a catkeys line
-
getcontext
()¶ Get the message context.
-
getdict
()¶ Get the dictionary of values for a catkeys line
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(present=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setdict
(newdict)¶ Set the dictionary of values for a catkeys line
Parameters: newdict (Dict) – a new dictionary with catkeys line elements
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.catkeys.
FIELDNAMES
= ['source', 'context', 'comment', 'target']¶ Field names for a catkeys TU
-
translate.storage.catkeys.
FIELDNAMES_HEADER
= ['version', 'language', 'mimetype', 'checksum']¶ Field names for the catkeys header
-
translate.storage.catkeys.
FIELDNAMES_HEADER_DEFAULTS
= {'checksum': '', 'language': '', 'mimetype': '', 'version': '1'}¶ Default or minimum header entries for a catkeys file
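Since the unit lines are plain tab-separated values, the documented field order can be tried out with the standard `csv` module. This is only a sketch: the sample rows are invented, the header line that a real catkeys file starts with is left out, and the choice of `csv.QUOTE_NONE` is an assumption about the dialect rather than something stated here.

```python
import csv
import io

FIELDNAMES = ["source", "context", "comment", "target"]  # as documented above

# Invented sample rows; a real catkeys file also begins with a header line,
# which this sketch leaves out.
sample = "Open\tFileMenu\t\tAbrir\nQuit\tFileMenu\tmenu item\tSalir\n"

reader = csv.DictReader(
    io.StringIO(sample),
    fieldnames=FIELDNAMES,
    delimiter="\t",
    quoting=csv.QUOTE_NONE,  # assumption: no quote character in the dialect
)
rows = list(reader)
for row in rows:
    print(row["source"], "->", row["target"])
```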
cpo¶
Classes that hold units of .po files (pounit) or entire files (pofile).
Gettext-style .po (or .pot) files are used in translations for KDE, GNOME and many other projects.
This uses libgettextpo from the gettext package. Any version before 0.17 will at least cause some subtle bugs or may not work at all. Developers might want to have a look at gettext-tools/libgettextpo/gettext-po.h from the gettext package for the public API of the library.
-
translate.storage.cpo.
get_libgettextpo_version
()¶ Returns the libgettextpo version
Return type: three-value tuple
Returns: libgettextpo version in the following format: (major version, minor version, subminor version)
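A three-value tuple makes the 0.17 requirement mentioned above easy to check, because Python compares tuples element by element. The sketch below assumes the library exposes its version as one integer packed as (major << 16) + (minor << 8) + subminor; that packing and the helper name `unpack_version` are assumptions made for illustration.

```python
def unpack_version(packed):
    """Split a packed integer into (major, minor, subminor)."""
    return (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF


MINIMUM = (0, 17, 0)  # versions older than 0.17 are the ones warned about

version = unpack_version(0x001100)  # hypothetical packed value for 0.17.0
too_old = version < MINIMUM         # tuples compare field by field
```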
-
translate.storage.cpo.
lsep
= ' '¶ Separator for #: entries
-
class
translate.storage.cpo.
po_error_handler
¶
-
class
translate.storage.cpo.
po_file
¶
-
translate.storage.cpo.
po_file_t
¶ A po_file_t represents a PO file.
alias of
translate.storage.cpo.LP_po_file
-
class
translate.storage.cpo.
po_filepos
¶
-
translate.storage.cpo.
po_filepos_t
¶ A po_filepos_t represents the position in a PO file.
alias of
translate.storage.cpo.LP_po_filepos
-
class
translate.storage.cpo.
po_iterator
¶
-
translate.storage.cpo.
po_iterator_t
¶ A po_iterator_t represents an iterator through a PO file.
alias of
translate.storage.cpo.LP_po_iterator
-
class
translate.storage.cpo.
po_message
¶
-
translate.storage.cpo.
po_message_t
¶ A po_message_t represents a message in a PO file.
alias of
translate.storage.cpo.LP_po_message
-
class
translate.storage.cpo.
po_xerror_handler
¶
-
class
translate.storage.cpo.
pofile
(inputfile=None, noheader=False, **kwargs)¶ -
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit, new=True)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getheaderplural
()¶ Returns the nplural and plural values from the header.
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Return the project based on information in the header.
- The project is determined in the following sequence:
- Use the ‘X-Project-Style’ entry in the header.
- Use the ‘Report-Msgid-Bugs-To’ entry
- Use the ‘X-Accelerator’ entry
- Use the Project ID
- Analyse the file itself (not yet implemented)
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Return the target language based on information in the header.
- The target language is determined in the following sequence:
- Use the ‘Language’ entry in the header.
- Poedit’s custom headers.
- Analysing the ‘Language-Team’ entry.
-
getunits
()¶ Return a list of all units in this store.
-
header
()¶ Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
-
init_headers
(charset='UTF-8', encoding='8bit', **kwargs)¶ sets default values for po headers
-
isempty
()¶ Returns True if the object doesn’t contain any translation units.
-
makeheader
(**kwargs)¶ Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
-
makeheaderdict
(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)¶ Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
Returns: Dictionary with the header items
Return type: dict of strings
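The date rules can be restated as a small function. This is a sketch of the rules as worded above, not the toolkit's code: `resolve_dates` and `PLACEHOLDER` are hypothetical names, the placeholder text standing in for the "form" value is an assumption, and the time zone suffix of real PO dates is omitted.

```python
from datetime import datetime

PLACEHOLDER = "YEAR-MO-DA HO:MI+ZONE"  # assumed stand-in for the "form" value


def _as_text(value):
    """Format datetimes; pass strings through unchanged."""
    if isinstance(value, datetime):
        return value.strftime("%Y-%m-%d %H:%M")
    return value


def resolve_dates(pot_creation_date=None, po_revision_date=None, now=None):
    now = now or datetime.now()
    # None means "use the current date" for the creation date.
    created = _as_text(now if pot_creation_date is None else pot_creation_date)
    if po_revision_date is None:
        revised = PLACEHOLDER        # leave a form for the translator to fill in
    elif po_revision_date is False:
        revised = created            # same as the creation date
    elif po_revision_date is True:
        revised = _as_text(now)      # stamp with the current time
    else:
        revised = _as_text(po_revision_date)
    return created, revised


fixed = datetime(2020, 1, 2, 3, 4)
example = resolve_dates(po_revision_date=True, now=fixed)
```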
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
-
mergeheaders
(otherstore)¶ Merges another header with this header.
This header is assumed to be the template.
-
parse
(input)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
parseheader
()¶ Parses the PO header and returns the interpreted values as a dictionary.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
removeduplicates
(duplicatestyle='merge')¶ Make sure each msgid is unique; merge comments etc. from duplicates into the original.
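The idea behind the 'merge' style can be sketched on plain dictionaries. `merge_duplicates` and the dictionary layout are hypothetical; the sketch only shows folding comments from later duplicates into the first unit with the same msgid and does not cover other duplicate styles or the finer points of real PO units.

```python
def merge_duplicates(units):
    """units: list of dicts with 'msgid', 'msgstr' and 'comments' keys."""
    merged = {}  # msgid -> first unit seen; dicts keep insertion order
    for unit in units:
        msgid = unit["msgid"]
        if msgid not in merged:
            merged[msgid] = {
                "msgid": msgid,
                "msgstr": unit["msgstr"],
                "comments": list(unit["comments"]),
            }
            continue
        original = merged[msgid]
        for comment in unit["comments"]:
            if comment not in original["comments"]:
                original["comments"].append(comment)
        if not original["msgstr"]:
            # Fill a blank translation from the duplicate.
            original["msgstr"] = unit["msgstr"]
    return list(merged.values())


units = [
    {"msgid": "Open", "msgstr": "", "comments": ["toolbar"]},
    {"msgid": "Quit", "msgstr": "Salir", "comments": []},
    {"msgid": "Open", "msgstr": "Abrir", "comments": ["menu", "toolbar"]},
]
result = merge_duplicates(units)
```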
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project in the header.
Parameters: project_style (str) – the new project
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(lang)¶ Set the target language in the header.
This removes any custom Poedit headers if they exist.
Parameters: lang (str) – the new target language code
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
updatecontributor
(name, email=None)¶ Add contribution comments if necessary.
-
updateheader
(add=False, **kwargs)¶ Updates the fields in the PO style header.
This will create a header if add == True.
-
updateheaderplural
(nplurals, plural)¶ Update the Plural-Form PO header.
-
-
class
translate.storage.cpo.
pounit
(source=None, encoding='utf-8', gpo_message=None)¶ -
CPO_ENC
= 'utf-8'¶ fixed encoding that is always used for cPO structure (self._gpo_message)
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit, encoding=None)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
-
getid
()¶ The unique identifier for this unit according to the conventions in .mo files.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(present=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review. Adds an optional explanation as a note.
-
merge
(otherpo, overwrite=False, comments=True, authoritative=False)¶ Merges the otherpo (with the same msgid) into this one.
Overwrite non-blank self.msgstr only if overwrite is True; merge comments only if comments is True.
-
msgidcomment
¶ Extract KDE style msgid comments from the unit.
Return type: String
Returns: the extracted msgidcomments found in this unit’s msgid.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
csvl10n¶
Classes that hold units of comma-separated values (.csv) files (csvunit) or entire files (csvfile) for use with localisation.
-
class
translate.storage.csvl10n.
DefaultDialect
¶
-
class
translate.storage.csvl10n.
csvfile
(inputfile=None, fieldnames=None, encoding='auto')¶ This class represents a .csv file with various lines. The default format contains three columns: location, source, target
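The default three-column layout can be explored with the standard `csv` module alone. The sketch below is not the csvfile class: it reads one invented row using the column names given above and writes it back out, showing that a source string containing a comma survives the round trip because the writer quotes it.

```python
import csv
import io

FIELDNAMES = ["location", "source", "target"]  # default columns named above

# Invented sample row; the source text contains a comma, so it is quoted.
sample = 'menu.c:42,"Open, then save",Abrir y guardar\n'
rows = list(csv.DictReader(io.StringIO(sample), fieldnames=FIELDNAMES))

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDNAMES, lineterminator="\n")
writer.writerows(rows)  # no header row is written, matching the sample
```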
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(csvsrc)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write to file
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.csvl10n.
csvunit
(source=None)¶ -
add_spreadsheet_escapes
(source, target)¶ add common spreadsheet escapes to two strings
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string Returns: an identifier for this unit that is unique in the store Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
match_header
()¶ see if unit might be a header
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
remove_spreadsheet_escapes
(source, target)¶ remove common spreadsheet escapes from two strings
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(value)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.csvl10n.
detect_header
(sample, dialect, fieldnames)¶ Test if file has a header or not, also returns number of columns in first row
-
translate.storage.csvl10n.
valid_fieldnames
(fieldnames)¶ Check if fieldnames are valid, that is at least one field is identified as the source.
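The two checks above can be pictured roughly as follows. This is a hedged sketch under stated assumptions: the alias set `SOURCE_NAMES` and both function names are hypothetical, the real functions take a `dialect` argument that is ignored here, and the header test (every cell of the first row is a known field name) is just one plausible heuristic.

```python
import csv
import io

# Hypothetical set of names that count as "the source column".
SOURCE_NAMES = {"source", "original", "untranslated"}


def has_source_field(fieldnames):
    """True if at least one field name identifies the source column."""
    return any(name.strip().lower() in SOURCE_NAMES for name in fieldnames)


def looks_like_header(sample, fieldnames):
    """Return (has_header, number_of_columns_in_first_row)."""
    first_row = next(csv.reader(io.StringIO(sample)), [])
    known = {name.lower() for name in fieldnames}
    has_header = bool(first_row) and all(
        cell.strip().lower() in known for cell in first_row
    )
    return has_header, len(first_row)
```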
directory¶
This module provides functionality to work with directories.
-
class
translate.storage.directory.
Directory
(dir=None)¶ This class represents a directory.
-
file_iter
()¶ Iterator over (dir, filename) for all files in this directory.
-
getfiles
()¶ Returns a list of (dir, filename) tuples for all the file names in this directory.
-
getunits
()¶ List of all the units in all the files in this directory.
-
scanfiles
()¶ Populate the internal file data.
-
unit_iter
()¶ Iterator over all the units in all the files in this directory.
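An iterator yielding (dir, filename) pairs, as file_iter() is described above, can be sketched with `os.walk`. Whether the real class recurses into subdirectories, and whether it reports absolute or relative directory names, is not stated here, so both are assumptions of this sketch; the module-level function below is a stand-in, not the class method.

```python
import os
import tempfile


def file_iter(root):
    """Yield (dir, filename) for every file below root, in sorted order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # makes the traversal order deterministic
        for name in sorted(filenames):
            yield dirpath, name


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("a.po", os.path.join("sub", "b.po")):
        open(os.path.join(tmp, rel), "w").close()
    found = [(os.path.relpath(d, tmp), name) for d, name in file_iter(tmp)]
```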
-
dtd¶
Classes that hold units of .dtd files (dtdunit) or entire files (dtdfile).
These are specific .dtd files for localisation used by mozilla.
- Specifications
The following information is provided by Mozilla:
There is a grammar for entity definitions, which isn’t really precise, as the spec says. There’s no formal specification for DTD files, it’s just “whatever makes this work” basically. The whole piece is clearly not the strongest point of the xml spec
XML elements are allowed in entity values. A number of things that are allowed will just break the resulting document; Mozilla forbids these in their DTD parser.
- Dialects
There are two dialects:
- Regular DTD
- Android DTD
Both dialects are similar, but the Android DTD uses some particular escapes that regular DTDs don’t have.
- Escaping in regular DTD
Entities in DTD files usually contain escaped characters. To ease translation, some of these escaped characters are unescaped when reading from, or converting, the DTD, and are escaped again when saving, or converting to, a DTD.
In regular DTD the following characters are usually or sometimes escaped:
- The % character is escaped using &#037; or &#37; or &#x25;
- The " character is escaped using &quot;
- The ' character is escaped using &apos; (partial roundtrip)
- The & character is escaped using &amp;
- The < character is escaped using &lt; (not yet implemented)
- The > character is escaped using &gt; (not yet implemented)
Besides the previous ones there are a lot of escapes for a huge number of characters. These escapes usually have the form &#NUMBER; where NUMBER represents the numerical code for the character.
There are a few particularities in DTD escaping. Some of the escapes are not yet implemented since they are not really necessary, or because their implementation is too hard.
A special case is the ' escaping using &apos; which doesn’t provide a full roundtrip conversion in order to support some special Mozilla DTD files.
Also the " character is never escaped when the previous character is = (that is, when the sequence =" is present in the string), in order to avoid escaping the " character that indicates an attribute assignment, for example in an href attribute for an a tag in HTML (anchor tag).
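The percent and double-quote rules for regular DTD can be sketched as a pair of helper functions. This is a simplified illustration, not the module's quotefordtd(): it assumes the numeric form &#037; for % and the named form &quot; for the double quote, reads the =" rule as "leave every double quote alone when the string contains =" anywhere", and the function names are made up.

```python
def escape_for_dtd(text):
    """Escape % always, and double quotes only when no =" sequence is present."""
    text = text.replace("%", "&#037;")
    if '="' not in text:
        text = text.replace('"', "&quot;")
    return text


def unescape_from_dtd(text):
    """Undo the escapes above, accepting all three spellings of the % escape."""
    for escaped in ("&#037;", "&#37;", "&#x25;"):
        text = text.replace(escaped, "%")
    return text.replace("&quot;", '"')
```

For a string such as `<a href="x">100%</a>` only the percent sign is touched, so the attribute quoting survives a round trip.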
- Escaping in Android DTD
It has the same escapes as regular DTD, plus these:
- The ' character is escaped using \&apos; or \' or \u0027
- The " character is escaped using \&quot;
-
translate.storage.dtd.
accesskeysuffixes
= ('.accesskey', '.accessKey', '.akey')¶ Accesskey Suffixes: entries with this suffix may be combined with labels ending in
labelsuffixes
into accelerator notation
-
class
translate.storage.dtd.
dtdfile
(inputfile=None, android=False)¶ A .dtd file made up of dtdunits.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ makes self.id_index dictionary keyed on entities
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(dtdsrc)¶ read the source code of a dtd file in and include them as dtdunits in self.units
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write content to file
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.dtd.
dtdunit
(source='', android=False)¶ An entity definition from a DTD file (and any associated comments).
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Set the entity to the given “location”.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string Returns: an identifier for this unit that is unique in the store Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ Return the entity as location (identifier).
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
getoutput
()¶ convert the dtd entity back to string form
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isnull
()¶ returns whether this dtdunit doesn’t actually have an entity definition
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
parse
(dtdsrc)¶ read the first dtd element from the source code into this object, return linesprocessed
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(new_id)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
source
¶ gets the unquoted source string
-
target
¶ gets the unquoted target string
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.dtd.
labelsuffixes
= ('.label', '.title')¶ Label suffixes: entries with this suffix can be combined with accesskeys found in entries ending with
accesskeysuffixes
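How label and accesskey entries might be paired into accelerator notation can be sketched as follows. This is only an illustration under stated assumptions: the function `combine`, the "&" marker and the sample entity names are hypothetical, and the real converters handle more cases (such as an accesskey that does not occur in the label).

```python
LABEL_SUFFIXES = (".label", ".title")
ACCESSKEY_SUFFIXES = (".accesskey", ".accessKey", ".akey")


def combine(entities, marker="&"):
    """Merge label/accesskey pairs that share a base name into one string."""
    combined = {}
    for name, label in entities.items():
        for lsuffix in LABEL_SUFFIXES:
            if not name.endswith(lsuffix):
                continue
            base = name[: -len(lsuffix)]
            for asuffix in ACCESSKEY_SUFFIXES:
                key = entities.get(base + asuffix)
                if key and key in label:
                    pos = label.index(key)
                    # Place the marker directly before the access key.
                    combined[base] = label[:pos] + marker + label[pos:]
    return combined
```

Entries without a matching accesskey are simply left out of the result in this sketch.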
-
translate.storage.dtd.
quoteforandroid
(source)¶ Escapes a line for Android DTD files.
-
translate.storage.dtd.
quotefordtd
(source)¶ Quotes and escapes a line for regular DTD files.
-
translate.storage.dtd.
removeinvalidamps
(name, value)¶ Find and remove ampersands that are not part of an entity definition.
A stray & in a DTD file can break an application’s ability to parse the file. In Mozilla localisation this is very important and these can break the parsing of files used in XUL and thus break interface rendering. Tracking down the problem is very difficult, thus by removing potential broken ampersand and warning the users we can ensure that the output DTD will always be parsable.
Parameters: - name (String) – Entity name
- value (String) – Entity text value
Return type: String
Returns: Entity value without bad ampersands
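The idea of dropping ampersands that do not start an entity reference can be sketched with a single regular expression. This is not the library's implementation: the function name is made up, the pattern for what counts as a valid entity is an assumption, and unlike the description above this sketch does not warn the user.

```python
import re

# An ampersand is kept only when it starts something shaped like an
# entity reference: &name; or &#123; or &#x1F;
_STRAY_AMP = re.compile(r"&(?!(?:#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z_][\w.\-]*);)")


def strip_stray_ampersands(value):
    """Return value with ampersands that are not part of an entity removed."""
    return _STRAY_AMP.sub("", value)
```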
-
translate.storage.dtd.
unquotefromandroid
(source)¶ Unquotes a quoted Android DTD definition.
-
translate.storage.dtd.
unquotefromdtd
(source)¶ unquotes a quoted dtd definition
_factory_classes¶
Py2exe can’t find stuff that we import dynamically, so we have this file just for the sake of the Windows installer to easily pick up all the stuff that we need and ensure they make it into the installer.
factory¶
factory methods to build real storage objects that conform to base.py
-
translate.storage.factory.
getclass
(storefile, localfiletype=None, ignore=None, classes=None, classes_str=None, hiddenclasses=None)¶ Factory that returns the applicable class for the type of file presented. Specify ignore to ignore some part at the back of the name (like .gz).
-
translate.storage.factory.
getobject
(storefile, localfiletype=None, ignore=None, classes=None, classes_str=None, hiddenclasses=None)¶ Factory that returns a usable object for the type of file presented.
Parameters: storefile (file or str) – File object or file name. Specify ignore to ignore some part at the back of the name (like .gz).
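The role of the ignore parameter can be illustrated with a small extension lookup. The registry `CLASSES`, the function `guess_class` and the error message are hypothetical stand-ins for illustration; the real factory returns store classes and has additional fallbacks.

```python
import os

# Hypothetical registry mapping extensions to store class names.
CLASSES = {"po": "pofile", "pot": "pofile", "csv": "csvfile", "dtd": "dtdfile"}


def guess_class(filename, ignore=None, classes=CLASSES):
    """Pick a class name from the file extension, after dropping `ignore`."""
    if ignore and filename.endswith(ignore):
        filename = filename[: -len(ignore)]
    _root, ext = os.path.splitext(filename)
    ext = ext[1:].lower()
    try:
        return classes[ext]
    except KeyError:
        raise ValueError("Unknown filetype (%s)" % filename)
```

With ignore=".gz", a name such as nl.po.gz is treated as nl.po before the extension is looked up.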
-
translate.storage.factory.
supported_files
()¶ Returns data about all supported files
Returns: list of type that include (name, extensions, mimetypes) Return type: list
fpo¶
Classes for the support of Gettext .po and .pot files.
This implementation assumes that cpo is working. This should not be used directly, but can be used once cpo has been established to work.
-
translate.storage.fpo.
lsep
= ' '¶ Separator for #: entries
-
class
translate.storage.fpo.
pofile
(inputfile=None, **kwargs)¶ A .po file containing various units
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getheaderplural
()¶ Returns the nplural and plural values from the header.
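A Plural-Forms header value has the shape `nplurals=2; plural=(n != 1);`. Splitting it into its two parts can be sketched as below; the function name and the choice to return strings (or None when a part is missing) are assumptions of this sketch, not a statement about getheaderplural() itself.

```python
import re


def parse_plural_forms(value):
    """Split a Plural-Forms header value into (nplurals, plural)."""
    nplurals = re.search(r"\bnplurals\s*=\s*(\d+)", value)
    plural = re.search(r"\bplural\s*=\s*([^;]+)", value)
    return (
        nplurals.group(1) if nplurals else None,
        plural.group(1).strip() if plural else None,
    )
```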
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Return the project based on information in the header.
- The project is determined in the following sequence:
- Use the ‘X-Project-Style’ entry in the header.
- Use the ‘Report-Msgid-Bugs-To’ entry
- Use the ‘X-Accelerator’ entry
- Use the Project ID
- Analyse the file itself (not yet implemented)
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Return the target language based on information in the header.
- The target language is determined in the following sequence:
- Use the ‘Language’ entry in the header.
- Poedit’s custom headers.
- Analysing the ‘Language-Team’ entry.
-
getunits
()¶ Return a list of all units in this store.
-
header
()¶ Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
-
init_headers
(charset='UTF-8', encoding='8bit', **kwargs)¶ sets default values for po headers
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeheader
(**kwargs)¶ Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
-
makeheaderdict
(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)¶ Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string). po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
Returns: Dictionary with the header items Return type: dict of strings
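The date rules above can be read as a small decision table. The sketch below is one interpretation, not the method's code: it assumes that "form" means the unfilled gettext placeholder text, and the names `resolve_dates`, `as_text` and `PLACEHOLDER` are invented for illustration.

```python
from datetime import datetime, timezone

PLACEHOLDER = "YEAR-MO-DA HO:MI+ZONE"  # assumed text of the unfilled form


def as_text(value):
    """Format datetimes in PO header style; pass strings through unchanged."""
    if isinstance(value, str):
        return value
    return value.strftime("%Y-%m-%d %H:%M%z")


def resolve_dates(pot_creation_date=None, po_revision_date=None, now=None):
    """Return (pot_creation_date, po_revision_date) as header strings."""
    now = now or datetime.now(timezone.utc)
    pot = as_text(now if pot_creation_date is None else pot_creation_date)
    if po_revision_date is None:
        rev = PLACEHOLDER          # None: leave the form unfilled
    elif po_revision_date is False:
        rev = pot                  # False: same as the creation date
    elif po_revision_date is True:
        rev = as_text(now)         # True: now
    else:
        rev = as_text(po_revision_date)
    return pot, rev
```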
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
-
mergeheaders
(otherstore)¶ Merges another header with this header.
This header is assumed to be the template.
-
parse
(input)¶ Parses the given file or file source string.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
parseheader
()¶ Parses the PO header and returns the interpreted values as a dictionary.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
removeduplicates
(duplicatestyle='merge')¶ Make sure each msgid is unique; merge comments etc. from duplicates into the original
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write content to file
-
setprojectstyle
(project_style)¶ Set the project in the header.
Parameters: project_style (str) – the new project
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(lang)¶ Set the target language in the header.
This removes any custom Poedit headers if they exist.
Parameters: lang (str) – the new target language code
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
updatecontributor
(name, email=None)¶ Add contribution comments if necessary.
-
updateheader
(add=False, **kwargs)¶ Updates the fields in the PO style header.
This will create a header if add == True.
-
updateheaderplural
(nplurals, plural)¶ Update the Plural-Form PO header.
-
-
class
translate.storage.fpo.
pounit
(source=None, **kwargs)¶ -
adderror
(errorname, errortext)¶ Adds an error message to this unit.
-
addlocation
(location)¶ Add a location to sourcecomments in the PO unit.
Parameters: location (String) – Text location e.g. ‘file.c:23’ does not include #:
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ This is modeled on the XLIFF method. See xliff.py::xliffunit.addnote
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
-
getid
()¶ Returns a unique identifier for this unit.
-
getlocations
()¶ Get a list of locations from sourcecomments in the PO unit.
Return type: List Returns: A list of the locations with ‘#: ‘ stripped
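Location comments in a PO file are lines starting with "#:" that hold one or more entries such as file.c:23. Adding and reading them can be sketched as below; the two function names and the plain list of comment lines are assumptions for illustration, with the single-space separator matching the lsep value documented for this module.

```python
LSEP = " "  # separator between entries on one "#:" line


def add_location(sourcecomments, location):
    """Append a location (given without the '#: ' prefix) as a new comment line."""
    sourcecomments.append("#: %s\n" % location)


def get_locations(sourcecomments):
    """Return all locations with the '#: ' prefix stripped."""
    locations = []
    for line in sourcecomments:
        if line.startswith("#:"):
            entries = line[2:].strip().split(LSEP)
            locations.extend(entry for entry in entries if entry)
    return locations
```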
-
getnotes
(origin=None)¶ Return comments based on origin value (programmer, developer, source code and translator)
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasmarkedcomment
(commentmarker)¶ Check whether the given comment marker is present as # (commentmarker) …
-
hasplural
()¶ returns whether this pounit contains plural strings…
-
hastypecomment
(typecomment)¶ Check whether the given type comment is present
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Makes this unit obsolete
-
markfuzzy
(present=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review. Adds an optional explanation as a note.
-
merge
(otherpo, overwrite=False, comments=True, authoritative=False)¶ Merges the otherpo (with the same msgid) into this one.
Overwrite non-blank self.msgstr only if overwrite is True; merge comments only if comments is True.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes (other comments)
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
settypecomment
(typecomment, present=True)¶ Alters whether a given typecomment is present
-
target
¶ Returns the unescaped msgstr
-
unit_iter
()¶ Iterator that only returns this unit.
-
html¶
module for parsing html files for translation
-
class
translate.storage.html.
POHTMLParser
(includeuntaggeddata=None, inputfile=None, callback=None)¶ -
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
buildtag
(tag, attrs=None, startend=False)¶ Create an HTML tag
-
close
()¶ Handle any buffered data.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
do_encoding
(htmlsrc)¶ Return the html text properly encoded based on a charset.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
feed
(data)¶ Feed data to the parser.
Call this as often as you want, with as little or as much text as you want (may include ‘\n’).
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
get_starttag_text
()¶ Return full source of start tag: ‘<…>’.
-
getids
(filename=None)¶ return a list of unit ids
-
getpos
()¶ Return current line number and offset.
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
guess_encoding
(htmlsrc)¶ Returns the encoding of the html text.
We look for ‘charset=’ within a meta tag to do this.
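Looking for charset= inside a meta tag can be sketched with one regular expression over the raw bytes. This is a rough illustration of the approach only; the pattern and the lower-cased return value are assumptions, and the module-level function below is a stand-in for the method.

```python
import re

# Matches charset=... inside a <meta ...> tag, with or without quotes.
META_CHARSET = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""", re.IGNORECASE
)


def guess_encoding(htmlsrc):
    """Return the declared charset of an HTML byte string, or None."""
    match = META_CHARSET.search(htmlsrc)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

Both the older `http-equiv="Content-Type"` form and the shorter `<meta charset="...">` form are covered by this pattern.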
-
handle_charref
(name)¶ Handle entries in the form &#NNNN; e.g. &#8417;
-
handle_entityref
(name)¶ Handle named entities of the form &aaaa; e.g. &rsquo;
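What a parser can do with the name passed to these two handlers can be sketched with `html.entities` from the standard library. The helper names are made up, and turning references into characters is only one possible handling; the class here may instead keep the references as written.

```python
from html.entities import name2codepoint


def charref_to_text(name):
    """Convert the body of a numeric reference, e.g. '8417' or 'x20e1'."""
    if name.lower().startswith("x"):
        return chr(int(name[1:], 16))
    return chr(int(name))


def entityref_to_text(name):
    """Convert a named entity; unknown names are returned as written."""
    codepoint = name2codepoint.get(name)
    if codepoint is None:
        return "&%s;" % name
    return chr(codepoint)
```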
-
has_translatable_content
(text)¶ Check if the supplied HTML snippet has any content that needs to be translated.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(htmlsrc)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
pi_escape
(text)¶ Replaces all instances of processing instructions with placeholders, and returns the new text and a dictionary of tags. The current implementation replaces <?foo?> with <?md5(foo)?>. The hash => code conversions are stored in self.pidict for later use in restoring the real PHP.
The purpose of this is to remove all potential “tag-like” code from inside PHP. The hash looks nothing like an HTML tag, but the following PHP:
$a < $b ? $c : ($d > $e ? $f : $g)
looks like it contains an HTML tag:
< $b ? $c : ($d >
to nearly any regex. Hence, we replace all contents of PHP with simple strings to help our regexes out.
-
pi_unescape
(text)¶ Replaces the PHP placeholders in text with the real code
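The escape and unescape pair described above can be sketched with `hashlib` and a regular expression. This is a simplified stand-in rather than the class's code: the pattern, the explicit `pidict` argument (in place of self.pidict) and the UTF-8 encoding before hashing are assumptions.

```python
import hashlib
import re

# Non-greedy match of everything between <? and the next ?>
PI_RE = re.compile(r"<\?(.*?)\?>", re.DOTALL)


def pi_escape(text, pidict):
    """Replace each <?code?> with <?md5(code)?> and remember the mapping."""
    def replace(match):
        code = match.group(1)
        digest = hashlib.md5(code.encode("utf-8")).hexdigest()
        pidict[digest] = code
        return "<?%s?>" % digest
    return PI_RE.sub(replace, text)


def pi_unescape(text, pidict):
    """Put the original code back in place of the hash placeholders."""
    for digest, code in pidict.items():
        text = text.replace("<?%s?>" % digest, "<?%s?>" % code)
    return text


pidict = {}
src = "<p><?php echo $a < $b ? $c : ($d > $e ? $f : $g); ?></p>"
escaped = pi_escape(src, pidict)
restored = pi_unescape(escaped, pidict)
```

After escaping, the text between `<?` and `?>` is a hexadecimal digest, so it no longer contains the `<` and `>` characters that could be mistaken for a tag.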
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
reset
()¶ Reset this instance. Loses all unprocessed data.
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.html.
htmlfile
(includeuntaggeddata=None, inputfile=None, callback=None)¶ -
INCLUDEATTRS
= ['alt', 'abbr', 'content', 'standby', 'summary', 'title']¶ Text from these attributes is extracted
-
MARKINGATTRS
= []¶ Text from tags with these attributes will be extracted from the HTML document
-
MARKINGTAGS
= ['address', 'caption', 'div', 'dt', 'dd', 'figcaption', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'li', 'p', 'pre', 'title', 'th', 'td']¶ Text in these tags will be extracted from the HTML document
-
SELF_CLOSING_TAGS
= ['area', 'base', 'basefont', 'br', 'col', 'frame', 'hr', 'img', 'input', 'link', 'meta', 'param']¶ HTML self-closing tags: tags that should be written as <img /> but might appear as <img>.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
buildtag
(tag, attrs=None, startend=False)¶ Create an HTML tag
-
close
()¶ Handle any buffered data.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
do_encoding
(htmlsrc)¶ Return the html text properly encoded based on a charset.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
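BOM-based detection of the kind fallback_detection() describes can be sketched with the standard codecs module. The function name bom_encoding, the BOMS table and the UTF-8 default are assumptions made for illustration; the toolkit's own list of encodings may differ.

```python
import codecs

# Longer BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
# The codec names returned here consume the BOM when decoding.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def bom_encoding(data, default="utf-8"):
    """Return a codec name based on a leading BOM, or the default."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return default


raw = codecs.BOM_UTF8 + "caf\u00e9".encode("utf-8")
text = raw.decode(bom_encoding(raw))
```

Checking the four-byte marks before the two-byte ones matters, because a UTF-32 little-endian file would otherwise be reported as UTF-16.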
-
feed
(data)¶ Feed data to the parser.
Call this as often as you want, with as little or as much text as you want (may include ‘\n’).
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
get_starttag_text
()¶ Return full source of start tag: ‘<…>’.
-
getids
(filename=None)¶ return a list of unit ids
-
getpos
()¶ Return current line number and offset.
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
guess_encoding
(htmlsrc)¶ Returns the encoding of the html text.
We look for ‘charset=’ within a meta tag to do this.
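Looking for ‘charset=’ inside a meta tag could be done along these lines. The regular expression and the name guess_charset are assumptions for this sketch, not the pattern used by guess_encoding() itself.

```python
import re

# Assumed pattern: a <meta ...> tag containing charset=VALUE, with or
# without quotes around VALUE. Works on raw bytes, before decoding.
CHARSET_RE = re.compile(rb'<meta[^>]*charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)


def guess_charset(htmlsrc):
    """Return the declared charset in lower case, or None if absent."""
    match = CHARSET_RE.search(htmlsrc)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


old_style = b'<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
new_style = b'<meta charset="utf-8">'
declared = guess_charset(old_style)
```

Searching the undecoded bytes avoids the chicken-and-egg problem of needing the encoding in order to read the declaration that names it.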
-
handle_charref
(name)¶ Handle entries in the form &#NNNN; e.g. &#8417;
-
handle_entityref
(name)¶ Handle named entities of the form &aaaa; e.g. &rsquo;
-
has_translatable_content
(text)¶ Check if the supplied HTML snippet has any content that needs to be translated.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(htmlsrc)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
pi_escape
(text)¶ Replaces all processing instructions with placeholders, and returns the new text and a dictionary of tags. The current implementation replaces <?foo?> with <?md5(foo)?>. The hash => code conversions are stored in self.pidict for later use in restoring the real PHP.
The purpose of this is to remove all potential “tag-like” code from inside PHP. The hash looks nothing like an HTML tag, but the following PHP:
$a < $b ? $c : ($d > $e ? $f : $g)
looks like it contains an HTML tag:
< $b ? $c : ($d >
to nearly any regex. Hence, we replace all contents of PHP with simple strings to help our regexes out.
-
pi_unescape
(text)¶ Replaces the PHP placeholders in text with the real code
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
reset
()¶ Reset this instance. Loses all unprocessed data.
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.html.
htmlunit
(source=None)¶ A unit of translatable/localisable HTML content
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.html.
normalize_html
(text)¶ Remove double spaces from HTML snippets
-
translate.storage.html.
safe_escape
(html)¶ Escape &, < and >
-
translate.storage.html.
strip_html
(text)¶ Strip unnecessary html from the text.
An HTML tag is deemed unnecessary if it fully encloses the translatable text, e.g. ‘<a href="index.html">Home Page</a>’.
HTML tags that occur within the normal flow of text will not be removed, e.g. ‘This is a link to the <a href="index.html">Home Page</a>.’
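The rule stated for strip_html() (drop a tag only when it wraps the whole text) can be illustrated with a small regex-based sketch. The name strip_enclosing, the pattern and the recursive peeling are assumptions; the real function may treat nesting and whitespace differently.

```python
import re

# Assumed pattern: one opening tag at the very start and the matching
# closing tag at the very end of the snippet.
ENCLOSING_RE = re.compile(r"^<(\w+)[^>]*>(.*)</\1>$", re.DOTALL)


def strip_enclosing(text):
    """Remove tag pairs that fully enclose the text; keep inline tags."""
    text = text.strip()
    match = ENCLOSING_RE.match(text)
    if match is None:
        return text
    tag, body = match.group(1), match.group(2)
    # If the same tag closes inside the body, the first and last tags are
    # not one wrapping pair (e.g. '<a>x</a> and <a>y</a>'), so keep them.
    if "</" + tag + ">" in body:
        return text
    return strip_enclosing(body)  # peel further wrappers, if any


wrapped = strip_enclosing('<a href="index.html">Home Page</a>')
inline = strip_enclosing('This is a link to the <a href="index.html">Home Page</a>.')
```

This sketch errs on the side of keeping markup: nested tags of the same name, such as a div inside a div, are left untouched rather than risk removing an unrelated pair.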
ical¶
Class that manages iCalendar files for translation.
iCalendar files follow the RFC2445 specification.
The iCalendar specification uses the following naming conventions:
- Component: an event, journal entry, timezone, etc
- Property: a property of a component: summary, description, start time, etc
- Attribute: an attribute of a property, e.g. language
The following are localisable in this implementation:
- VEVENT component: SUMMARY, DESCRIPTION, COMMENT and LOCATION properties
While other items could be localised this is not seen as important until use cases arise. In such a case simply adjusting the component.name and property.name lists to include these will allow expanded localisation.
- LANGUAGE Attribute
- While the iCalendar format allows items to have a language attribute, it is not used here. Most of the items that we localise may occur at most once per component. So although ‘summary’ would ideally be present in multiple languages in one file, the format does not allow such multiple entries. This is unfortunate, as it prevents the creation of a single multilingual iCalendar file.
- Future Format Support
- As this format uses vobject, which supports various formats including vCard, it is possible to expand this format to understand those if needed.
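The selection rule above (only SUMMARY, DESCRIPTION, COMMENT and LOCATION inside a VEVENT) can be sketched without vobject. This is a deliberately simplified line-based reader written for illustration: extract_vevent_strings and LOCALISABLE are hypothetical names, and folded lines and escaped characters are not handled.

```python
# Properties of a VEVENT component that are treated as localisable.
LOCALISABLE = {"SUMMARY", "DESCRIPTION", "COMMENT", "LOCATION"}


def extract_vevent_strings(ical_text):
    """Return (property name, value) pairs found inside VEVENT blocks."""
    units = []
    in_event = False
    for line in ical_text.splitlines():
        if line == "BEGIN:VEVENT":
            in_event = True
        elif line == "END:VEVENT":
            in_event = False
        elif in_event and ":" in line:
            name_and_params, value = line.split(":", 1)
            # Attributes such as LANGUAGE follow the name after ';'.
            name = name_and_params.split(";", 1)[0].upper()
            if name in LOCALISABLE:
                units.append((name, value))
    return units


SAMPLE = "\n".join([
    "BEGIN:VCALENDAR",
    "BEGIN:VEVENT",
    "SUMMARY:Team meeting",
    "LOCATION;LANGUAGE=en:Room 4",
    "DTSTART:20240101T090000Z",
    "END:VEVENT",
    "END:VCALENDAR",
])
strings = extract_vevent_strings(SAMPLE)
```

Extending the set of property names is all it takes to widen what is extracted, which mirrors the note about adjusting the component and property lists.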
-
class
translate.storage.ical.
icalfile
(inputfile=None, **kwargs)¶ An ical file
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ parse the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.ical.
icalunit
(source=None, **kwargs)¶ An ical entry that is translatable
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
ini¶
Class that manages .ini files for translation
# a comment
; a comment

[Section]
a = a string
b : a string
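To see how such a file breaks down into translatable entries, the sample can be read with Python's standard configparser. This is only an illustration of the format: it does not use the toolkit's own parser, and the "[Section]key" identifier scheme is an assumption made for the sketch.

```python
import configparser

SAMPLE = """\
# a comment
; a comment

[Section]
a = a string
b : a string
"""

# configparser accepts both '=' and ':' as delimiters and both '#' and ';'
# as comment prefixes by default, which matches the sample.
parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# One entry per key, identified by its section and key name (assumed scheme).
units = {
    f"[{section}]{key}": value
    for section in parser.sections()
    for key, value in parser[section].items()
}
```

Note that configparser discards comments on reading, so a round trip through it would lose them; a translation store generally needs to preserve them.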
-
class
translate.storage.ini.
Dialect
¶ Base class for differentiating dialect options and functions
-
class
translate.storage.ini.
DialectDefault
¶
-
class
translate.storage.ini.
DialectInno
¶
-
class
translate.storage.ini.
inifile
(inputfile=None, dialect='default', **kwargs)¶ An INI file
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ Parse the given file or file source string.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.ini.
iniunit
(source=None, **kwargs)¶ An INI file entry
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.ini.
register_dialect
(dialect)¶ Decorator that registers the dialect.
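A registering decorator of this kind is usually a few lines around a module-level dictionary. The sketch below shows the general pattern only: the DIALECTS registry, the name and delimiter attributes and the class bodies are assumptions, not the module's actual contents.

```python
# Hypothetical registry mapping a dialect name to its class.
DIALECTS = {}


def register_dialect(cls):
    """Class decorator: record the dialect under its name, return it unchanged."""
    DIALECTS[cls.name] = cls
    return cls


class Dialect:
    """Base class holding options that differ between dialects."""

    name = None
    delimiter = "="  # assumed option, for illustration only


@register_dialect
class DialectDefault(Dialect):
    name = "default"


@register_dialect
class DialectInno(Dialect):
    name = "inno"


# A store could then resolve the dialect='default' argument by name.
chosen = DIALECTS["default"]
```

Because the decorator returns the class unchanged, registered dialects can still be subclassed and instantiated as usual.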
jsonl10n¶
Class that manages JSON data files for translation
JSON is an acronym for JavaScript Object Notation. It is an open standard designed for human-readable data interchange.
JSON basic types:
- Number (integer or real)
- String (double-quoted Unicode with backslash escaping)
- Boolean (true or false)
- Array (an ordered sequence of values, comma-separated and enclosed in square brackets)
- Object (a collection of key:value pairs, comma-separated and enclosed in curly braces)
- null
Example:
{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}
TODO:
- Handle \u and other escapes in Unicode
- Manage data type storage and conversion. True –> “True” –> True
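One way to turn nested JSON into flat, addressable strings is a recursive walk that builds a path for each string value. The flatten helper and its dotted and bracketed key notation are assumptions for this sketch; the identifiers the module actually generates may look different.

```python
import json


def flatten(node, prefix=""):
    """Yield (path, string) pairs for every string value in the tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}[{index}]")
    elif isinstance(node, str):
        yield prefix, node
    # Numbers, booleans and null are skipped: they are not translatable.


DATA = json.loads(
    '{"firstName": "John", "age": 25, "phoneNumber": [{"type": "home"}]}'
)
units = dict(flatten(DATA))
```

Skipping non-string values sidesteps the type conversion problem mentioned in the TODO list, since nothing is converted to a string and back.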
-
class
translate.storage.jsonl10n.
I18NextFile
(inputfile=None, filter=None, **kwargs)¶ An i18next v3 format, this is nested JSON with several additions.
-
UnitClass
¶ alias of
I18NextUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ parse the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.jsonl10n.
I18NextUnit
(source=None, item=None, notes=None, **kwargs)¶ An i18next v3 format, JSON with plurals.
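As a rough illustration of "JSON with plurals", the sketch below groups a singular key with its plural companion. It assumes the common i18next v3 convention of a "_plural" suffix for languages with two plural forms; the helper group_plurals is hypothetical, and numbered suffixes for languages with more forms are not covered.

```python
PLURAL_SUFFIX = "_plural"  # assumed i18next v3 suffix for two-form languages


def group_plurals(flat):
    """Map each base key to a list of its forms: [singular] or [singular, plural]."""
    grouped = {}
    for key, value in flat.items():
        if key.endswith(PLURAL_SUFFIX):
            continue  # picked up together with its base key below
        plural = flat.get(key + PLURAL_SUFFIX)
        grouped[key] = [value] if plural is None else [value, plural]
    return grouped


flat = {
    "item": "{{count}} item",
    "item_plural": "{{count}} items",
    "title": "Shop",
}
grouped = group_plurals(flat)
```

A plural key with no matching singular key is silently dropped in this sketch; a real store would need to decide how to report or preserve such entries.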
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
getvalue
()¶ Return value to be stored in JSON file.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
class
translate.storage.jsonl10n.
JsonFile
(inputfile=None, filter=None, **kwargs)¶ A JSON file
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
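The "try to decode" half of this strategy can be sketched with the standard library alone. This is a minimal illustration, not the toolkit's implementation: the helper name `guess_encoding` and its candidate list are assumptions, and the chardet step is left out.

```python
def guess_encoding(data, candidates=("utf-8", "utf-16", "latin-1")):
    """Return the first candidate that decodes `data` without error, else None.

    Hypothetical helper for illustration only. Order matters: latin-1
    accepts any byte sequence, so it has to come last.
    """
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


utf8_bytes = "na\u00efve".encode("utf-8")
utf16_bytes = "hi".encode("utf-16")  # the encoder prepends a byte order mark

print(guess_encoding(utf8_bytes))
print(guess_encoding(utf16_bytes))
```

A successful decode only shows that the bytes are valid in that encoding, not that the guess is right, which is presumably why a statistical detector is tried where available.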
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
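To make the role of the source index concrete, here is a toy sketch of a store that maps source strings to units. `ToyUnit` and `ToyStore` are hypothetical stand-ins written for this example; the real index layout in the toolkit may differ.

```python
class ToyUnit:
    """Hypothetical stand-in for a translation unit."""

    def __init__(self, source, target=None):
        self.source = source
        self.target = target


class ToyStore:
    """Hypothetical stand-in for a store with a source index."""

    def __init__(self):
        self.units = []
        self.sourceindex = {}

    def addunit(self, unit):
        self.units.append(unit)

    def makeindex(self):
        # Rebuild the index from scratch: source string -> list of units.
        self.sourceindex = {}
        for unit in self.units:
            self.sourceindex.setdefault(unit.source, []).append(unit)

    def findunit(self, source):
        matches = self.sourceindex.get(source)
        return matches[0] if matches else None

    def translate(self, source):
        unit = self.findunit(source)
        return unit.target if unit is not None else None


store = ToyStore()
store.addunit(ToyUnit("File", "Lêer"))
store.addunit(ToyUnit("Edit"))
store.makeindex()
print(store.translate("File"))
```

With an index in place, lookups by source string avoid scanning the whole unit list.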
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ parse the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
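The round trip between serialize() and parsestring() can be illustrated with a toy class. This is a sketch under stated assumptions: `ToyJsonStore` is hypothetical and only mirrors the shape of the two methods (bytes written to a file-like object, then parsed back), not the toolkit's own JSON handling.

```python
import io
import json


class ToyJsonStore:
    """Hypothetical store holding a flat key -> string mapping."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def serialize(self, out):
        # Write a bytes representation to an open binary file-like object.
        text = json.dumps(self.entries, ensure_ascii=False, sort_keys=True)
        out.write(text.encode("utf-8"))

    @classmethod
    def parsestring(cls, storestring):
        # Accept either bytes or text and rebuild the object.
        if isinstance(storestring, bytes):
            storestring = storestring.decode("utf-8")
        return cls(json.loads(storestring))


buffer = io.BytesIO()
ToyJsonStore({"greeting": "Hello"}).serialize(buffer)
restored = ToyJsonStore.parsestring(buffer.getvalue())
print(restored.entries)
```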
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.jsonl10n.
JsonNestedFile
(inputfile=None, filter=None, **kwargs)¶ A JSON file with nested keys
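As an illustration of what "nested keys" means, the sketch below walks a nested JSON object and yields one entry per leaf string. The dotted path used as an identifier is an assumption made for this example; the id format the class actually produces is not stated here.

```python
import json


def flatten(data, prefix=""):
    """Yield (path, value) pairs for every leaf in nested dicts.

    The dotted path is a hypothetical id scheme chosen for illustration.
    """
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


document = json.loads(
    '{"menu": {"file": {"open": "Open"}, "quit": "Quit"}, "title": "App"}'
)
for path, text in flatten(document):
    print(path, "=", text)
```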
-
UnitClass
¶ alias of
JsonNestedUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ parse the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.jsonl10n.
JsonNestedUnit
(source=None, item=None, notes=None, **kwargs)¶ -
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
getvalue
()¶ Return value to be stored in JSON file.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
class
translate.storage.jsonl10n.
JsonUnit
(source=None, item=None, notes=None, **kwargs)¶ A JSON entry
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
getvalue
()¶ Return value to be stored in JSON file.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
class
translate.storage.jsonl10n.
WebExtensionJsonFile
(inputfile=None, filter=None, **kwargs)¶ WebExtension JSON file
See following URLs for doc:
https://developer.chrome.com/extensions/i18n https://developer.mozilla.org/en-US/Add-ons/WebExtensions/Internationalization
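For orientation, the layout described at those URLs keys each entry by message name and stores the text under `message`, with an optional `description`. The sketch below reads such a document with the standard `json` module. The sample entries and the helper `iter_messages` are hypothetical, and how the toolkit maps these fields onto units is not shown here.

```python
import json

# Hypothetical sample in the messages.json layout used by WebExtensions.
MESSAGES = """
{
  "extensionName": {
    "message": "My Extension",
    "description": "Name shown in the add-on manager."
  },
  "greeting": {
    "message": "Hello, $USER$!"
  }
}
"""


def iter_messages(text):
    """Yield (key, message, description) for each entry in the document."""
    for key, entry in json.loads(text).items():
        yield key, entry["message"], entry.get("description", "")


for key, message, description in iter_messages(MESSAGES):
    print(key, "->", message)
```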
-
UnitClass
¶ alias of
WebExtensionJsonUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ parse the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.jsonl10n.
WebExtensionJsonUnit
(source=None, item=None, notes=None, **kwargs)¶ -
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
getvalue
()¶ Return value to be stored in JSON file.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
lisa¶
Parent class for LISA standards (TMX, TBX, XLIFF)
-
class
translate.storage.lisa.
LISAfile
(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)¶ A class representing a file store for one of the LISA file formats.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addheader
()¶ Method to be overridden to initialise headers, etc.
-
addsourceunit
(source)¶ Adds and returns a new unit with the given string as first entry.
-
addunit
(unit, new=True)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
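BOM-based detection of this kind can be sketched with the constants in the standard `codecs` module. The helper `detect_bom` and the codec names it returns are assumptions for illustration; the method's actual return values are not documented here.

```python
import codecs

# Longer marks are checked first: the UTF-32 LE mark begins with the
# same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return a codec name if `data` starts with a known BOM, else None."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding
    return None


print(detect_bom(codecs.BOM_UTF8 + b"<xliff/>"))
print(detect_bom(b"<xliff/>"))
```

Files without a byte order mark return `None` in this sketch, so a caller would still need a default encoding.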
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
initbody
()¶ Initialises self.body so it never needs to be retrieved from the XML again.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
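Clark notation is simply the namespace URI in braces followed by the local name. The sketch below uses the standard library's `xml.etree.ElementTree`, which accepts the same notation as lxml. Note that the stand-alone helper here takes the namespace as an explicit argument, which is an assumption of this example; the documented method takes only `name`.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def namespaced(namespace, name):
    """Return `name` in Clark notation, or unchanged if there is no namespace."""
    if namespace:
        return "{%s}%s" % (namespace, name)
    return name


doc = ET.fromstring(
    '<xliff xmlns="%s"><file><body><trans-unit>'
    "<source>Hi</source>"
    "</trans-unit></body></file></xliff>" % XLIFF_NS
)

# Element lookups need the fully qualified tag name.
source = doc.find(".//" + namespaced(XLIFF_NS, "source"))
print(namespaced(XLIFF_NS, "source"))
print(source.text)
```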
-
parse
(xml)¶ Populates this object from the given xml string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out=None)¶ Converts to a string containing the file’s XML
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.lisa.
LISAunit
(source, empty=False, **kwargs)¶ A single unit in the file. Provisional work is done to make several languages possible.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
createlanguageNode
(lang, text, purpose=None)¶ Returns an XML element set up with the given parameters to represent a single language entry. Has to be overridden.
-
getNodeText
(languageNode, xml_space='preserve')¶ Retrieves the term from the given
languageNode
.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlanguageNode
(lang=None, index=None)¶ Retrieves a
languageNode
either by language or by index.
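A lookup "by language or by index" can be pictured on a small TMX-like fragment. This is a sketch only: the element names `tu`, `tuv` and `seg`, the sample text, and the helper `get_language_node` are assumptions for illustration, and it uses the standard library's ElementTree rather than lxml.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined; ElementTree expands it to this URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

TU = ET.fromstring(
    "<tu>"
    '<tuv xml:lang="en"><seg>File</seg></tuv>'
    '<tuv xml:lang="af"><seg>Lêer</seg></tuv>'
    "</tu>"
)


def get_language_node(unit, lang=None, index=None):
    """Return the per-language child matching `lang`, else the one at `index`."""
    nodes = unit.findall("tuv")
    if lang is not None:
        for node in nodes:
            if node.get(XML_LANG) == lang:
                return node
        return None
    if index is not None and 0 <= index < len(nodes):
        return nodes[index]
    return None


print(get_language_node(TU, lang="af").find("seg").text)
print(get_language_node(TU, index=0).get(XML_LANG))
```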
-
getlanguageNodes
()¶ Returns a list of all nodes that contain per language information.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettarget
(lang=None)¶ Retrieves the “target” text (second entry), or the entry in the specified language, if it exists.
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
settarget
(target, lang='xx', append=False)¶ Sets the “target” string (second language), or alternatively appends to the list
-
unit_iter
()¶ Iterator that only returns this unit.
-
mo¶
Module for parsing Gettext .mo files for translation.
The coding of .mo files was produced from Gettext documentation, Python's msgfmt.py and by observing and testing existing .mo files in the wild.
The hash algorithm is implemented for MO files, which should result in faster access to the MO file. The hash is optional for Gettext and is not needed for reading or writing MO files. In this implementation it is always on, and in very small files it sometimes produces results that differ from Gettext's.
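To make the file layout concrete, here is a minimal sketch of a little-endian MO writer and reader using only the standard library. It follows the layout described in the GNU gettext manual (magic number, revision, string count, two offset tables, hash table size and offset) and deliberately writes no hash table, which the format permits; this module, by contrast, always writes one. The function names build_mo and read_mo are hypothetical and are not part of translate.storage.mo.

```python
import struct

MO_MAGIC = 0x950412DE  # magic number of a little-endian MO file


def build_mo(messages):
    """Serialize a {msgid: msgstr} dict to MO bytes without a hash table."""
    items = sorted(messages.items())  # originals are stored sorted
    originals = [key.encode("utf-8") for key, _ in items]
    translations = [value.encode("utf-8") for _, value in items]
    count = len(items)
    orig_table = 7 * 4                    # tables start right after the header
    trans_table = orig_table + count * 8  # each table entry is (length, offset)
    data_start = trans_table + count * 8
    out = struct.pack("<7I", MO_MAGIC, 0, count, orig_table, trans_table,
                      0, data_start)      # hash table size 0: no hash table
    entries, blob, offset = [], b"", data_start
    for raw in originals + translations:
        entries.append((len(raw), offset))
        blob += raw + b"\0"               # strings are NUL-terminated
        offset += len(raw) + 1
    for length, position in entries:
        out += struct.pack("<2I", length, position)
    return out + blob


def read_mo(data):
    """Parse MO bytes produced above back into a {msgid: msgstr} dict."""
    magic, _rev, count, orig_table, trans_table, _hsize, _hoff = \
        struct.unpack_from("<7I", data, 0)
    if magic != MO_MAGIC:
        raise ValueError("not a little-endian MO file")
    result = {}
    for i in range(count):
        olen, ooff = struct.unpack_from("<2I", data, orig_table + i * 8)
        tlen, toff = struct.unpack_from("<2I", data, trans_table + i * 8)
        msgid = data[ooff:ooff + olen].decode("utf-8")
        result[msgid] = data[toff:toff + tlen].decode("utf-8")
    return result
```

Because the reader walks the offset tables directly, it never consults the hash table, which matches the statement that the hash is not needed for reading or writing.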
-
class
translate.storage.mo.
mofile
(inputfile=None, **kwargs)¶ A class representing a .mo file.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit (TranslationUnit) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
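A BOM-based fallback of this kind can be sketched with the standard library's codecs constants. This is an illustration of the general technique, not the method's actual code; the function name detect_from_bom and the returned codec names are assumptions.

```python
import codecs

# UTF-32 must be checked before UTF-16: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_from_bom(data, default=None):
    """Return a codec name if data (bytes) starts with a known BOM."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return default
```

The codec names chosen here consume the BOM when decoding, so data.decode(detect_from_bom(data)) yields text without a stray U+FEFF at the start.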
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getheaderplural
()¶ Returns the nplural and plural values from the header.
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Return the project based on information in the header.
- The project is determined in the following sequence:
- Use the ‘X-Project-Style’ entry in the header.
- Use the ‘Report-Msgid-Bugs-To’ entry
- Use the ‘X-Accelerator’ entry
- Use the Project ID
- Analyse the file itself (not yet implemented)
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Return the target language based on information in the header.
- The target language is determined in the following sequence:
- Use the ‘Language’ entry in the header.
- Poedit’s custom headers.
- Analysing the ‘Language-Team’ entry.
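The lookup sequence above can be pictured as a simple fallback chain over the header entries. The sketch below is only an illustration under stated assumptions: the header is modelled as a plain dict, the Poedit header is assumed to be X-Poedit-Language, the name-to-code table is a made-up two-entry sample, and the Language-Team heuristic (reading the code from a team address) is invented for the example.

```python
import re

# Hypothetical, tiny name-to-code table; a real mapping would be much larger.
POEDIT_NAMES = {"Afrikaans": "af", "French": "fr"}


def guess_target_language(header):
    """Walk the three fallbacks in order; return a language code or None."""
    # 1. An explicit 'Language' entry wins.
    language = header.get("Language")
    if language:
        return language
    # 2. Poedit's custom header stores a language name rather than a code.
    poedit_name = header.get("X-Poedit-Language")
    if poedit_name in POEDIT_NAMES:
        return POEDIT_NAMES[poedit_name]
    # 3. Assumed heuristic: a team address such as <af@li.org> hints at the code.
    match = re.search(r"<([a-z]{2,3})@", header.get("Language-Team", ""))
    return match.group(1) if match else None
```

Ordering the checks from most to least explicit means a guessed value never overrides one the translator set deliberately.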
-
getunits
()¶ Return a list of all units in this store.
-
header
()¶ Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
-
init_headers
(charset='UTF-8', encoding='8bit', **kwargs)¶ sets default values for po headers
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeheader
(**kwargs)¶ Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
-
makeheaderdict
(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)¶ Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string). po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
Returns: Dictionary with the header items
Return type: dict of strings
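The date rules are easier to follow when written out. The following sketch mirrors the description of the two date parameters only; it is not the method's code. The function name resolve_header_dates, the timestamp format and the placeholder text standing in for "form" are assumptions.

```python
import time

PLACEHOLDER = "YEAR-MO-DA HO:MI+ZONE"  # assumed "form" text for an unset date
DATE_FORMAT = "%Y-%m-%d %H:%M%z"


def _as_text(value):
    """Format datetime-like values; pass strings through unchanged."""
    return value.strftime(DATE_FORMAT) if hasattr(value, "strftime") else str(value)


def resolve_header_dates(pot_creation_date=None, po_revision_date=None):
    now = time.strftime(DATE_FORMAT)
    # None means "use the current date" for the creation date.
    pot = now if pot_creation_date is None else _as_text(pot_creation_date)
    if po_revision_date is None:
        revision = PLACEHOLDER   # leave the form text in place
    elif po_revision_date is False:
        revision = pot           # same as the creation date
    elif po_revision_date is True:
        revision = now           # stamp with the current time
    else:
        revision = _as_text(po_revision_date)
    return {"POT-Creation-Date": pot, "PO-Revision-Date": revision}
```

The identity checks (is None, is False, is True) matter here, since a plain truthiness test could not tell None apart from False.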
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
mergeheaders
(otherstore)¶ Merges another header with this header.
This header is assumed to be the template.
-
parse
(input)¶ parses the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
parseheader
()¶ Parses the PO header and returns the interpreted values as a dictionary.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Output a string representation of the MO data file
-
setprojectstyle
(project_style)¶ Set the project in the header.
Parameters: project_style (str) – the new project
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(lang)¶ Set the target language in the header.
This removes any custom Poedit headers if they exist.
Parameters: lang (str) – the new target language code
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
updatecontributor
(name, email=None)¶ Add contribution comments if necessary.
-
updateheader
(add=False, **kwargs)¶ Updates the fields in the PO style header.
This will create a header if add == True.
-
updateheaderplural
(nplurals, plural)¶ Update the Plural-Forms PO header.
-
-
class
translate.storage.mo.
mounit
(source=None, **kwargs)¶ A class representing a .mo translation message.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might rather want to move in that direction and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Is this a header entry?
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Is this message translatable?
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.mo.
mounpack
(filename='messages.mo')¶ Helper to unpack Gettext MO files into a Python string
mozilla_lang¶
A class to manage Mozilla .lang files.
See https://github.com/mozilla-l10n/langchecker/wiki/.lang-files-format for specifications on the format.
-
class
translate.storage.mozilla_lang.
LangStore
(inputfile=None, mark_active=False, **kwargs)¶ We extend TxtFile, since that has a lot of useful stuff for encoding
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit (TranslationUnit) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(lines)¶ Read in text lines and create txtunits from the blocks of text
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.mozilla_lang.
LangUnit
(source=None)¶ This is just a normal unit with a weird string output
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might rather want to move in that direction and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
odf_io¶
omegat¶
Manage the OmegaT glossary format
OmegaT glossary format is used by the OmegaT computer aided translation tool.
It is a bilingual base class derived format with OmegaTFile and OmegaTUnit providing file and unit level access.
- Format Implementation
The OmegaT glossary format is a simple Tab Separated Value (TSV) file with the columns: source, target, comment.
The dialect of the TSV files is specified by OmegaTDialect.
- Encoding
The files are either UTF-8 or encoded using the system default. UTF-8 encoded files use the .utf8 extension while system encoded files use the .tab extension.
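Since the format is plain TSV with three columns, it can be read and written with the standard csv module. The sketch below illustrates the file layout only and does not use OmegaTFile or OmegaTDialect; the function names, the sample rows and the choice of csv.QUOTE_NONE are assumptions made for the example.

```python
import csv
import io

FIELDNAMES = ["source", "target", "comment"]  # column order of the glossary


def read_glossary(text):
    """Parse tab-separated glossary text into a list of row dicts."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDNAMES,
                            delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


def write_glossary(rows):
    """Serialize row dicts back to tab-separated text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, delimiter="\t",
                            quoting=csv.QUOTE_NONE, lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


sample = "file\tl\u00eaer\tnoun\nsave\tstoor\t\n"
rows = read_glossary(sample)
```

When working with files on disk, the encoding would follow the extension rule described above: open .utf8 files as UTF-8 and .tab files with the system default.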
-
translate.storage.omegat.
OMEGAT_FIELDNAMES
= ['source', 'target', 'comment']¶ Field names for an OmegaT glossary unit
-
class
translate.storage.omegat.
OmegaTDialect
¶ Describe the properties of an OmegaT generated TAB-delimited glossary file.
-
class
translate.storage.omegat.
OmegaTFile
(inputfile=None, **kwargs)¶ An OmegaT glossary file
-
UnitClass
¶ alias of
OmegaTUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit (TranslationUnit) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ parses the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.omegat.
OmegaTFileTab
(inputfile=None, **kwargs)¶ An OmegaT glossary file in the default system encoding
-
UnitClass
¶ alias of
OmegaTUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit (TranslationUnit) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ parses the given file or file source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.omegat.
OmegaTUnit
(source=None)¶ An OmegaT glossary unit
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
dict
¶ Get the dictionary of values for an OmegaT line
-
getcontext
()¶ Get the message context.
-
getdict
()¶ Get the dictionary of values for an OmegaT line
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might rather want to move in that direction and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setdict
(newdict)¶ Set the dictionary of values for an OmegaT line
Parameters: newdict (Dict) – a new dictionary with OmegaT line elements
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
oo¶
Classes that hold units of .oo files (oounit) or entire files (oofile).
These are specific .oo files for localisation exported by OpenOffice.org - SDF format (previously known as GSI files).
The behaviour in terms of escaping is explained in detail in the programming comments.
-
translate.storage.oo.
escape_help_text
(text)¶ Escapes the help text as it would be in an SDF file.
<, >, ” are only escaped in <[[:lower:]]> tags. Some HTML tags make it in in lowercase so those are dealt with. Some OpenOffice.org help tags are not escaped.
-
translate.storage.oo.
escape_text
(text)¶ Escapes SDF text to be suitable for unit consumption.
-
translate.storage.oo.
makekey
(ookey, long_keys)¶ converts an oo key tuple into a unique identifier
Parameters: - ookey (tuple) – an oo key
- long_keys (Boolean) – Use long keys
Return type: str
Returns: unique ascii identifier
-
translate.storage.oo.
normalizefilename
(filename)¶ converts any non-alphanumeric (standard roman) characters to _
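One way to implement this kind of normalization is a str.translate() table whose missing entries default to an underscore. The sketch below shows that technique under stated assumptions: the class UnderscoreMap and the function normalize_filename are hypothetical names, and the set of characters kept (ASCII letters and digits) is a guess at what "standard roman" alphanumerics means, so the module's real behaviour may differ.

```python
import string

KEEP = string.ascii_letters + string.digits  # assumed set of allowed characters


class UnderscoreMap(dict):
    """Translation table: known code points map to themselves, others to '_'."""

    def __init__(self, keep):
        super().__init__((ord(char), char) for char in keep)

    def __missing__(self, codepoint):
        # str.translate() looks characters up by code point; anything that
        # is not in the table is replaced by an underscore.
        return "_"


def normalize_filename(filename):
    """Replace every character outside KEEP with an underscore."""
    return filename.translate(UnderscoreMap(KEEP))
```

Using a dict subclass with __missing__ avoids enumerating every disallowed character, which would be impractical for the full Unicode range.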
-
class
translate.storage.oo.
oofile
(input=None)¶ this represents an entire .oo file
-
addline
(thisline)¶ adds a parsed line to the file
-
getoutput
(skip_source=False, fallback_lang=None)¶ converts all the lines back to tab-delimited form
-
parse
(input)¶ parses lines and adds them to the file
-
serialize
(out, skip_source=False, fallback_lang=None)¶ convert to a string. double check that unicode is handled
-
-
class
translate.storage.oo.
ooline
(parts=None)¶ this represents one line, one translation in an .oo file
-
getkey
()¶ get the key that identifies the resource
-
getoutput
()¶ return a line in tab-delimited form
-
getparts
()¶ return a list of parts in this line
-
gettext
()¶ Obtains the text column and handle escaping.
-
setparts
(parts)¶ create a line from its tab-delimited parts
-
settext
(text)¶ Sets the text column and handle escaping.
-
text
¶ Obtains the text column and handle escaping.
-
-
class
translate.storage.oo.
oomultifile
(filename, mode=None, multifilestyle='single')¶ this takes a huge GSI file and represents it as multiple smaller files…
-
createsubfileindex
()¶ reads in all the lines and works out the subfiles
-
getoofile
(subfile)¶ returns an oofile built up from the given subfile’s lines
-
getsubfilename
(line)¶ looks up the subfile name for the line
-
getsubfilesrc
(subfile)¶ returns the list of lines matching the subfile
-
listsubfiles
()¶ returns a list of subfiles in the file
-
openinputfile
(subfile)¶ returns a pseudo-file object for the given subfile
-
openoutputfile
(subfile)¶ returns a pseudo-file object for the given subfile
-
-
class
translate.storage.oo.
oounit
¶ this represents a number of translations of a resource
-
addline
(line)¶ add a line to the oounit
-
getoutput
(skip_source=False, fallback_lang=None)¶ return the lines in tab-delimited form
-
-
translate.storage.oo.
unescape_help_text
(text)¶ Unescapes normal text to be suitable for writing to the SDF file.
-
translate.storage.oo.
unescape_text
(text)¶ Unescapes SDF text to be suitable for unit consumption.
-
class
translate.storage.oo.
unormalizechar
(normalchars)¶ -
clear
() → None. Remove all items from D.¶
-
copy
() → a shallow copy of D¶
-
fromkeys
()¶ Create a new dictionary with keys from iterable and values set to value.
-
get
()¶ Return the value for key if key is in the dictionary, else default.
-
items
() → a set-like object providing a view on D's items¶
-
keys
() → a set-like object providing a view on D's keys¶
-
pop
(k[, d]) → v, remove specified key and return the corresponding value.¶ If key is not found, d is returned if given, otherwise KeyError is raised
-
popitem
() → (k, v), remove and return some (key, value) pair as a¶ 2-tuple; but raise KeyError if D is empty.
-
setdefault
()¶ Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
-
update
([E, ]**F) → None. Update D from dict/iterable E and F.¶
If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v
In either case, this is followed by: for k in F: D[k] = F[k]
-
values
() → an object providing a view on D's values¶
-
placeables¶
This module implements basic functionality to support placeables.
A placeable is used to represent things like:
- Substitutions: For example, in ODF, footnotes appear in the ODF XML where they are defined; so if we extract a paragraph with some footnotes, the translator will have a lot of additional XML to deal with. We therefore separate the footnotes out into separate translation units and mark their positions in the original text with placeables.
- Hiding of inline formatting data: The translator doesn’t want to have to deal with all the weird formatting conventions of wherever the text came from.
- Marking variables: This is an old issue: translators translate variable names which should remain untranslated. We can wrap placeables around variable names to avoid this.
The placeables model follows the XLIFF standard’s list of placeables. Please refer to the XLIFF specification to get a better understanding.
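To make the idea of marking variables concrete, here is a minimal sketch that splits a string into plain text and variable segments. It does not use the placeables classes documented below; the regular expression (printf-style %s, %d and %(name)s only) and the name mark_variables are assumptions made for illustration.

```python
import re

# Assumed variable syntax: %s, %d, %(name)s, %(name)d.
VARIABLE = re.compile(r"%(?:\([A-Za-z_]\w*\))?[sd]")


def mark_variables(text):
    """Return (kind, string) pairs, where kind is "text" or "var"."""
    segments = []
    pos = 0
    for match in VARIABLE.finditer(text):
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        segments.append(("var", match.group(0)))
        pos = match.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments


segments = mark_variables("Copied %d files to %(dest)s.")
for kind, value in segments:
    print(kind, repr(value))
```

An editor built on such segments could lock the "var" pieces so that only the "text" pieces are offered for translation.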
base¶
Contains base placeable classes with names based on XLIFF placeables. See the XLIFF standard for more information about what the names mean.
-
class
translate.storage.placeables.base.
Bpt
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
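The undo property of the return value can be illustrated with a flat stand-in for the string tree. This is only a sketch of the contract under simplifying assumptions: the hypothetical FlatText class holds a single string rather than nested elements, and it ignores the editability rules mentioned above.

```python
class FlatText:
    """Flat, hypothetical stand-in for a string tree node."""

    def __init__(self, value):
        self.value = value

    def delete_range(self, start_index, end_index):
        """Remove value[start_index:end_index]; return (removed, parent, offset)."""
        removed = self.value[start_index:end_index]
        self.value = self.value[:start_index] + self.value[end_index:]
        return removed, self, start_index

    def insert(self, offset, text):
        """Insert text at the given string offset."""
        self.value = self.value[:offset] + text + self.value[offset:]


doc = FlatText("Hello, world")
removed, parent, offset = doc.delete_range(5, 12)
after_delete = doc.value

# Re-inserting the removed text at the reported offset restores the original.
if parent is not None and offset is not None:
    parent.insert(offset, removed)
print(after_delete, "->", doc.value)
```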
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
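The traversal methods are easier to follow on a toy tree. The sketch below is a simplified, hypothetical Node class, not the library's StringElem: children are either plain strings or further nodes, a leaf is taken to be any node whose children are all strings, and flatten returns the leaves in depth-first order.

```python
class Node:
    """Toy string tree; children are Node instances or plain strings."""

    def __init__(self, sub):
        self.sub = list(sub)

    def iter_depth_first(self, filter=None):
        """Yield this node, then its descendants, in depth-first order."""
        if filter is None or filter(self):
            yield self
        for child in self.sub:
            if isinstance(child, Node):
                yield from child.iter_depth_first(filter)

    def depth_first(self, filter=None):
        """Same traversal as iter_depth_first, returned as a list."""
        return list(self.iter_depth_first(filter))

    def isleaf(self):
        """Simplified leaf test: every child is a plain string."""
        return all(isinstance(child, str) for child in self.sub)

    def flatten(self):
        """Return only the leaf nodes, in depth-first order."""
        return self.depth_first(filter=Node.isleaf)

    def __str__(self):
        return "".join(str(child) for child in self.sub)


tree = Node(["Click ", Node([Node(["<b>"]), "here", Node(["</b>"])]), " now"])
print(str(tree))
print([str(node) for node in tree.flatten()])
```

Passing a filter only limits which nodes are reported; the traversal still descends through nodes that the filter rejects.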
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
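The shape of the returned dictionary can be shown with a flat list of string segments. This is a sketch only: the function name index_data_sketch is hypothetical, it works on a plain list rather than a tree, and it treats each segment as the half-open range [start, end), which is an assumption about how boundaries are resolved.

```python
def index_data_sketch(segments, index):
    """Locate the segment containing index and the offset inside it."""
    start = 0
    for elem in segments:
        end = start + len(elem)
        if start <= index < end:
            return {"elem": elem, "index": index, "offset": index - start}
        start = end
    return None  # index lies beyond the rendered string


segments = ["Copied ", "%d", " files"]
info = index_data_sketch(segments, 8)
print(info)
```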
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
Ept
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
Ph
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
It
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
G
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
Bx
(id=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
Ex
(id=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
X
(id=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.base.
Sub
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, as well as the offset at which it was deleted from. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as replacement for this one.
Returns: The string index where elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes of type ptype with base StringElems containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
general¶
Contains general placeable implementations, that is, placeables that do not fit into any other sub-category.
-
class
translate.storage.placeables.general.
AltAttrPlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ Placeable for the “alt=…” attributes inside XML tags.
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
classmethod
parse
(pstr)¶ A parser method to extract placeables from a string based on a regular expression. Use this function as the
@parse()
method of a placeable class.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.general.
XMLEntityPlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ Placeable handling XML entities (
&xxxxx;
-style entities).
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
classmethod
parse
(pstr)¶ A parser method to extract placeables from a string based on a regular expression. Use this function as the
@parse()
method of a placeable class.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.general.
XMLTagPlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ Placeable handling XML tags.
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
classmethod
parse
(pstr)¶ A parser method to extract placeables from a string based on a regular expression. Use this function as the
@parse()
method of a placeable class.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
interfaces¶
This file contains abstract (semantic) interfaces for placeable implementations.
-
class
translate.storage.placeables.interfaces.
BasePlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ Base class for all placeables.
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.interfaces.
InvisiblePlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.interfaces.
MaskingPlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.interfaces.
ReplacementPlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.interfaces.
SubflowPlaceable
(sub=None, id=None, rid=None, xid=None, **kwargs)¶
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into elem.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the
StringElem
tree. A node is a leaf node if it is a
StringElem
(not a sub-class) and contains only sub-elements of type str
or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply
f
to all nodes for which filter
(optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type
ptype
with base StringElem
instances containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
lisa¶
parse¶
Contains the parse
function that parses normal strings into StringElem-based “rich” string element trees.
-
translate.storage.placeables.parse.
parse
(tree, parse_funcs)¶ Parse placeables from the given string or sub-tree by using the parsing functions provided.
The output of this function is heavily dependent on the order of the parsing functions. This is because of the algorithm used.
An over-simplification of the algorithm: the leaves in the
StringElem
tree are expanded to the output of the first parsing function in parse_funcs.
The next level of recursion is then started on the new set of leaves, with the used parsing function removed from parse_funcs.
Parameters: tree (unicode|StringElem) – The string or string element sub-tree to parse.
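The effect of parser ordering can be illustrated with a small, self-contained sketch. It does not use the toolkit's classes: the tree is modelled as a flat list of leaves, each either a plain string or a `(tag, text)` tuple, and `make_regex_parser`, `toy_parse`, `parse_tags` and `parse_numbers` are hypothetical names invented for this example. It only mirrors the simplified description above, not the actual implementation.

```python
import re


def make_regex_parser(tag, pattern):
    """Build a hypothetical parsing function: str -> list of leaves."""
    regex = re.compile(pattern)

    def parser(text):
        leaves, pos = [], 0
        for match in regex.finditer(text):
            if match.start() > pos:
                leaves.append(text[pos:match.start()])  # plain text before the match
            leaves.append((tag, match.group()))  # recognised placeable
            pos = match.end()
        if pos < len(text):
            leaves.append(text[pos:])  # trailing plain text
        return leaves

    return parser


def toy_parse(leaves, parse_funcs):
    """Expand plain-string leaves with the first function, then recurse with the rest."""
    if not parse_funcs:
        return leaves
    first, rest = parse_funcs[0], parse_funcs[1:]
    expanded = []
    for leaf in leaves:
        if isinstance(leaf, str):
            expanded.extend(first(leaf))
        else:
            expanded.append(leaf)  # already a placeable; left untouched
    return toy_parse(expanded, rest)


parse_tags = make_regex_parser("tag", r"<[^>]+>")
parse_numbers = make_regex_parser("number", r"\d+")

text = 'Page <a href="p2">2</a>'
tags_first = toy_parse([text], [parse_tags, parse_numbers])
numbers_first = toy_parse([text], [parse_numbers, parse_tags])
```

With the tag parser first, the opening tag is captured whole before the number parser runs. With the number parser first, the `2` inside the attribute is claimed early, so the opening tag is split and can no longer be recognised.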
strelem¶
Contains the base StringElem
class that represents a node in a
parsed rich-string tree. It is the base class of all placeables.
-
exception
translate.storage.placeables.strelem.
ElementNotFoundError
¶ -
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
class
translate.storage.placeables.strelem.
StringElem
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ This class represents a sub-tree of a string parsed into a rich structure. It is also the base class of all placeables.
-
apply_to_strings
(f)¶ Apply
f
to all actual strings in the tree. Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes
start_index
and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem
representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted.
None is returned for the parent value if the root was deleted. If the parent and offset values are not None,
parent.insert(offset, deleted)
effectively undoes the delete.
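The return contract of delete_range can be sketched with a toy, single-node model. `ToyNode` is a hypothetical stand-in, not a toolkit class, and the half-open range (like Python slicing) is an assumption of this sketch. It only shows how the returned triple can be used to undo a delete.

```python
class ToyNode:
    """Hypothetical single-node stand-in for a string tree; not a toolkit class."""

    def __init__(self, text):
        self.text = text

    def insert(self, offset, text):
        """Insert text at the given string offset."""
        self.text = self.text[:offset] + text + self.text[offset:]

    def delete_range(self, start_index, end_index):
        """Remove text[start_index:end_index]; return (removed, parent, offset)."""
        removed = self.text[start_index:end_index]
        self.text = self.text[:start_index] + self.text[end_index:]
        return removed, self, start_index


node = ToyNode("Hello, world")
removed, parent, offset = node.delete_range(5, 12)
after_delete = node.text
if parent is not None and offset is not None:
    parent.insert(offset, removed)  # undo the delete
after_undo = node.text
```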
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of
elem
in the current tree. This cannot be reliably used if
self.renderer
is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal, the StringElemGUI.index()
method is used as a replacement for this one. Returns: The string index where element e
starts, or -1 if e
was not found.
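As a rough illustration of the "-1 if not found" convention, the sketch below computes an element's offset over a flat list of leaves. `ToyLeaf` and `toy_elem_offset` are hypothetical names; the real method works on a tree and is affected by `self.renderer`, which this sketch ignores.

```python
class ToyLeaf:
    """Hypothetical leaf holding a piece of text; not a toolkit class."""

    def __init__(self, text):
        self.text = text


def toy_elem_offset(leaves, elem):
    """Return the string index at which elem starts, or -1 if it is absent."""
    offset = 0
    for leaf in leaves:
        if leaf is elem:  # identity, not equality: two leaves may hold equal text
            return offset
        offset += len(leaf.text)
    return -1


first, second = ToyLeaf("ab"), ToyLeaf("ab")
leaves = [ToyLeaf("Hello "), first, ToyLeaf(" and "), second]
first_offset = toy_elem_offset(leaves, first)
second_offset = toy_elem_offset(leaves, second)
missing_offset = toy_elem_offset(leaves, ToyLeaf("ab"))
```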
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
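The relationship between depth-first iteration, filtering and flattening can be sketched with a toy tree. `ToyElem` is a hypothetical class, and its `isleaf` is simplified (it does not check for sub-classes as the real method is documented to do). Treat this as an assumption-laden sketch rather than the toolkit's code.

```python
class ToyElem:
    """Hypothetical tree node whose children are ToyElem instances or plain strings."""

    def __init__(self, sub):
        self.sub = list(sub)

    def isleaf(self):
        """A node is a leaf here if all of its children are plain strings."""
        return all(isinstance(child, str) for child in self.sub)

    def iter_depth_first(self, filter=None):
        """Yield this node, then its descendants, skipping nodes the filter rejects."""
        if filter is None or filter(self):
            yield self
        for child in self.sub:
            if isinstance(child, ToyElem):
                yield from child.iter_depth_first(filter)

    def flatten(self):
        """Return only the leaf nodes, in depth-first order."""
        return list(self.iter_depth_first(lambda node: node.isleaf()))


bold = ToyElem(["here"])
root = ToyElem([ToyElem(["Click "]), ToyElem([bold, ToyElem(["!"])])])
all_nodes = list(root.iter_depth_first())
leaf_text = ["".join(leaf.sub) for leaf in root.flatten()]
```

Here flattening is just depth-first iteration with a filter that keeps leaf nodes, which is one plausible reading of the description above.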
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
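The shape of the returned dictionary can be sketched with a hypothetical helper that works on a flat list of leaf strings rather than a real element tree. How the library treats boundary positions or out-of-range indexes is not stated here, so returning `None` in that case is an assumption of this sketch.

```python
def get_index_data(leaves, index):
    """Locate `index` in the concatenation of `leaves` (hypothetical helper)."""
    start = 0
    for leaf in leaves:
        if start <= index < start + len(leaf):
            return {"elem": leaf, "index": index, "offset": index - start}
        start += len(leaf)
    return None  # assumption: index lies outside the rendered string


leaves = ["Hello ", "brave ", "world"]
info = get_index_data(leaves, 8)
print(info)  # {'elem': 'brave ', 'index': 8, 'offset': 2}
```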
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
has_content
= True¶ Whether this string can have sub-elements.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
iseditable
= True¶ Whether this string should be changeable by the user. Not used at the moment.
-
isfragile
= False¶ Whether this element should be deleted in its entirety when partially deleted. Only checked when
iseditable = False
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
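The leaf rule above (exact base type, string-only children) might be sketched as follows. `Elem` and `Placeable` are hypothetical names, and the sketch targets Python 3, where `str` covers what the description calls str or unicode.

```python
class Elem:
    """Hypothetical stand-in for the base element class."""

    def __init__(self, sub=None):
        self.sub = sub or []

    def isleaf(self):
        # Exact type check: instances of sub-classes never count as leaves.
        return type(self) is Elem and all(isinstance(s, str) for s in self.sub)


class Placeable(Elem):
    """Hypothetical sub-class; never a leaf under the rule above."""


plain = Elem(["text"])
nested = Elem([Elem(["text"])])
special = Placeable(["text"])
print(plain.isleaf(), nested.isleaf(), special.isleaf())  # True False False
```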
-
istranslatable
= True¶ Whether this string is translatable into other languages.
-
isvisible
= True¶ Whether this string should be visible to the user. Not used at the moment.
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
classmethod
parse
(pstr)¶ Parse an instance of this class from the start of the given string. This method should be implemented by any sub-class that wants to be parseable by translate.storage.placeables.parse.
Parameters: pstr (unicode) – The string to parse into an instance of this class.
Returns: An instance of the current class, or None if the string is not parseable by this class.
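A sub-class following this contract could look roughly like the sketch below. Both classes are hypothetical stand-ins defined here for illustration (`NumberElem` is not part of the toolkit); the point is only that `parse` matches at the start of the string and returns either an instance or `None`.

```python
import re


class Elem:
    """Hypothetical base class standing in for StringElem."""

    def __init__(self, sub=None):
        self.sub = sub or []

    @classmethod
    def parse(cls, pstr):
        return None  # the base class recognises nothing


class NumberElem(Elem):
    """Hypothetical placeable that recognises a leading run of digits."""

    _pattern = re.compile(r"\d+")

    @classmethod
    def parse(cls, pstr):
        match = cls._pattern.match(pstr)  # match() anchors at the start of pstr
        if match is None:
            return None
        return cls([match.group()])


hit = NumberElem.parse("42 files copied")   # NumberElem with sub == ["42"]
miss = NumberElem.parse("copied 42 files")  # None: digits are not at the start
```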
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
renderer
= None¶ An optional function that returns the Unicode representation of the string.
-
sub
= []¶ The sub-elements that make up this string.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
terminology¶
Contains the placeable that represents a terminology term.
-
class
translate.storage.placeables.terminology.
TerminologyPlaceable
(*args, **kwargs)¶ Terminology distinguished from the rest of a string by being a placeable.
-
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
matchers
= []¶ A list of matcher objects to use to identify terminology.
-
classmethod
parse
(pstr)¶ Parse an instance of this class from the start of the given string. This method should be implemented by any sub-class that wants to be parseable by translate.storage.placeables.parse.
Parameters: pstr (unicode) – The string to parse into an instance of this class.
Returns: An instance of the current class, or None if the string is not parseable by this class.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
translations
= []¶ The available translations for this placeable.
-
xliff¶
Contains XLIFF-specific placeables.
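The class names in this module (Bpt, Ept, X, Bx, Ex, G, It, Sub) appear to mirror the inline elements of XLIFF 1.2. As a reminder of what such markup looks like in a file, the sketch below lists the inline elements of an invented `<source>` fragment using only the standard library; it does not use the toolkit's own parser, and the sample content is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample: a <g> group, a paired <bpt>/<ept>, and a standalone <x/>.
source_xml = (
    '<source>Click <g id="1">here</g> to open '
    '<bpt id="2">&lt;b&gt;</bpt>the file<ept id="2">&lt;/b&gt;</ept>'
    '<x id="3"/></source>'
)

source = ET.fromstring(source_xml)
# Iterating an Element yields its direct children in document order.
inline = [(child.tag, child.get("id")) for child in source]
print(inline)  # [('g', '1'), ('bpt', '2'), ('ept', '2'), ('x', '3')]
```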
-
class
translate.storage.placeables.xliff.
Bpt
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
Ept
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
X
(id=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
Bx
(id=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
Ex
(id=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
G
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
It
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree. A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter (optional) returns True.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElems, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
Sub
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string indexes start_index and end_index. Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree. This cannot be reliably used if self.renderer is used, and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElem instances, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
Ph
(sub=None, id=None, rid=None, xid=None, **kwargs)¶ -
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElem instances, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
-
class
translate.storage.placeables.xliff.
UnknownXML
(sub=None, id=None, rid=None, xid=None, xml_node=None, **kwargs)¶ Placeable for unrecognized or unimplemented XML nodes. Its main purpose is to preserve all associated XML data.
-
apply_to_strings
(f)¶ Apply f to all actual strings in the tree.
Parameters: f – Must take one (str or unicode) argument and return a string or unicode.
-
copy
()¶ Returns a copy of the sub-tree. This should be overridden in sub-classes with more data.
Note
self.renderer
is not copied.
-
delete_range
(start_index, end_index)¶ Delete the text in the range given by the string-indexes start_index and end_index.
Partial nodes will only be removed if they are editable.
Returns: A StringElem representing the removed sub-string, the parent node from which it was deleted, and the offset at which it was deleted. None is returned for the parent value if the root was deleted. If the parent and offset values are not None, parent.insert(offset, deleted) effectively undoes the delete.
-
depth_first
(filter=None)¶ Returns a list of the nodes in the tree in depth-first order.
-
elem_at_offset
(offset)¶ Get the
StringElem
in the tree that contains the string rendered at the given offset.
-
elem_offset
(elem)¶ Find the offset of elem in the current tree.
This cannot be reliably used if self.renderer is used and even less so if the rendering function renders the string differently upon different calls. In Virtaal the StringElemGUI.index() method is used as a replacement for this one.
Returns: The string index where element elem starts, or -1 if elem was not found.
-
encode
(encoding='utf-8')¶ More
unicode
class emulation.
-
find
(x)¶ Find sub-string
x
in this string tree and return the position at which it starts.
-
find_elems_with
(x)¶ Find all elements in the current sub-tree containing
x
.
-
flatten
(filter=None)¶ Flatten the tree by returning a depth-first search over the tree’s leaves.
-
get_index_data
(index)¶ Get info about the specified range in the tree.
Returns: A dictionary with the following items:
- elem: The element in which index resides.
- index: Copy of the index parameter.
- offset: The offset of index into 'elem'.
-
get_parent_elem
(child)¶ Searches the current sub-tree for and returns the parent of the
child
element.
-
insert
(offset, text, preferred_parent=None)¶ Insert the given text at the specified offset of this string-tree’s string (Unicode) representation.
-
insert_between
(left, right, text)¶ Insert the given text between the two parameter
StringElem
s.
-
isleaf
()¶ Whether or not this instance is a leaf node in the StringElem tree.
A node is a leaf node if it is a StringElem (not a sub-class) and contains only sub-elements of type str or unicode.
Return type: bool
-
iter_depth_first
(filter=None)¶ Iterate through the nodes in the tree in depth-first order.
-
map
(f, filter=None)¶ Apply f to all nodes for which filter returned True (optional).
-
classmethod
parse
(pstr)¶ Parse an instance of this class from the start of the given string. This method should be implemented by any sub-class that wants to be parseable by translate.storage.placeables.parse.
Parameters: pstr (unicode) – The string to parse into an instance of this class.
Returns: An instance of the current class, or None if the string is not parseable by this class.
-
print_tree
(indent=0, verbose=False)¶ Print the tree from the current instance’s point in an indented manner.
-
prune
()¶ Remove unnecessary nodes to make the tree optimal.
-
remove_type
(ptype)¶ Replace nodes with type ptype with base StringElem instances, containing the same sub-elements. This is only applicable to elements below the element tree root node.
-
translate
()¶ Transform the sub-tree according to some class-specific needs. This method should be either overridden in implementing sub-classes or dynamically replaced by specific applications.
Returns: The transformed Unicode string representing the sub-tree.
-
php¶
Classes that hold units of PHP localisation files (phpunit) or entire files (phpfile). These files are used in translating many PHP-based applications.
Only PHP files written with these conventions are supported:
<?php
$lang['item'] = "value"; # Array of values
$some_entity = "value"; # Named variables
define("ENTITY", "value");
$lang = array(
   'item1' => 'value1' , #Supports space before comma
   'item2' => 'value2',
);
$lang = array( # Nested arrays
   'item1' => 'value1',
   'item2' => array(
      'key' => 'value' , #Supports space before comma
      'key2' => 'value2',
   ),
);
Nested arrays where the inner array has no key are not supported:
<?php
$lang = array(array('key' => 'value'));
PHP string handling, and specifically the escaping conventions that differ between single-quoted (') and double-quoted (") strings, is implemented as outlined in the PHP documentation for the String type.
-
translate.storage.php.
phpdecode
(text, quotechar="'")¶ Convert PHP escaped string to a Python string.
-
translate.storage.php.
phpencode
(text, quotechar="'")¶ Convert Python string to PHP escaping.
The encoding is implemented for ‘single quote’ and “double quote” syntax.
heredoc and nowdoc are not implemented and it is not certain whether this would ever be needed for PHP localisation needs.
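As a rough illustration of the single-quote convention, the sketch below escapes and unescapes text the way the PHP documentation describes for single-quoted literals, where only the backslash and the single quote are special. The helper names are hypothetical and this is not the toolkit's implementation of phpencode() or phpdecode().

```python
def encode_single_quoted(text):
    # Hypothetical helper: escape text for a PHP single-quoted literal.
    # Backslashes are doubled first so the backslash added in front of
    # each quote is not doubled again.
    return text.replace("\\", "\\\\").replace("'", "\\'")


def decode_single_quoted(text):
    # Hypothetical helper: reverse encode_single_quoted().
    # Only backslash-backslash and backslash-quote are escape sequences;
    # any other backslash is kept literally.
    result = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in "\\'":
            result.append(text[i + 1])
            i += 2
        else:
            result.append(text[i])
            i += 1
    return "".join(result)
```

Note that a sequence such as backslash followed by n stays as two literal characters in this convention; it would only become a newline under the double-quote rules.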
-
class
translate.storage.php.
phpfile
(inputfile=None, **kwargs)¶ This class represents a PHP file, made up of phpunits.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit (TranslationUnit) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
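A minimal sketch of what BOM-based detection can look like, using only the standard library. The function name and the returned codec names are assumptions for illustration; the toolkit's own fallback may check a different set of marks.

```python
import codecs

# Longer marks come first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom_encoding(data, default=None):
    # Hypothetical helper: return a codec name if the bytes start with a
    # known byte order mark, otherwise the supplied default.
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding
    return default
```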
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(phpsrc)¶ Read in the source of a PHP file and include its entries as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Convert the units back to lines.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.php.
phpunit
(source='')¶ A unit of a PHP file: a name, a value, and any comments associated.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
-
getoutput
(indent='', name=None)¶ Convert the unit back into formatted lines for a php file.
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Return whether this is a blank element, containing only comments.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.php.
wrap_production
(func)¶ Decorator for production functions to store lexer positions.
pocommon¶
-
translate.storage.pocommon.
extract_msgid_comment
(text)¶ The one definitive way to extract a msgid comment out of an unescaped unicode string that might contain it.
Return type: unicode
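For orientation, here is a small sketch of extracting such a comment. It assumes the KDE-style convention in which the comment is stored at the very start of the msgid as `_: comment` followed by a newline; that convention, the regular expression, and the function name are assumptions of this example rather than a description of the toolkit's code.

```python
import re

# Assumed layout: "_: <comment>\n<actual msgid text>"
_MSGID_COMMENT = re.compile(r"_: (.*?)\n")


def sketch_extract_msgid_comment(text):
    # Return the embedded comment, or an empty string if there is none.
    match = _MSGID_COMMENT.match(text)
    return match.group(1) if match else ""
```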
-
class
translate.storage.pocommon.
pofile
(inputfile=None, **kwargs)¶ -
UnitClass
¶
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit (TranslationUnit) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getheaderplural
()¶ Returns the nplural and plural values from the header.
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Return the project based on information in the header.
- The project is determined in the following sequence:
- Use the ‘X-Project-Style’ entry in the header.
- Use the ‘Report-Msgid-Bugs-To’ entry
- Use the ‘X-Accelerator’ entry
- Use the Project ID
- Analyse the file itself (not yet implemented)
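The sequence above is a first-match-wins chain. A generic sketch of that pattern follows; the detector functions, the accelerator table, and the style names are purely illustrative and are not taken from the toolkit.

```python
def first_detected(headers, detectors):
    # Try each detector in priority order and return the first result
    # that is not empty; None means nothing could be determined.
    for detect in detectors:
        style = detect(headers)
        if style:
            return style
    return None


def from_project_style(headers):
    # Highest priority: an explicit X-Project-Style entry.
    return headers.get("X-Project-Style")


def from_accelerator(headers):
    # Illustrative mapping only; real accelerator-to-project tables differ.
    table = {"~": "example-style"}
    return table.get(headers.get("X-Accelerator"))
```

With this arrangement, adding another heuristic means appending one more function to the list passed as `detectors`.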
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Return the target language based on information in the header.
- The target language is determined in the following sequence:
- Use the ‘Language’ entry in the header.
- Poedit’s custom headers.
- Analysing the ‘Language-Team’ entry.
-
getunits
()¶ Return a list of all units in this store.
-
header
()¶ Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
-
init_headers
(charset='UTF-8', encoding='8bit', **kwargs)¶ sets default values for po headers
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeheader
(**kwargs)¶ Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
-
makeheaderdict
(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)¶ Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
Returns: Dictionary with the header items
Return type: dict of strings
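The four-way meaning of po_revision_date can be sketched as a small resolver. This is an illustration under assumptions: the placeholder string follows the usual gettext template convention, the function name is hypothetical, and the timezone suffix is left out for brevity.

```python
from datetime import datetime

# Assumed "form" value, as found in gettext template headers.
PLACEHOLDER = "YEAR-MO-DA HO:MI+ZONE"


def resolve_revision_date(po_revision_date, pot_creation_date, now=None):
    if po_revision_date is None:
        return PLACEHOLDER            # leave the form for a translator to fill in
    if po_revision_date is False:
        return pot_creation_date      # reuse the creation date
    if po_revision_date is True:
        now = now or datetime.now()   # stamp with the current time
        return now.strftime("%Y-%m-%d %H:%M")
    if isinstance(po_revision_date, datetime):
        return po_revision_date.strftime("%Y-%m-%d %H:%M")
    return po_revision_date           # already a string
```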
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
-
mergeheaders
(otherstore)¶ Merges another header with this header.
This header is assumed to be the template.
-
parse
(data)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
parseheader
()¶ Parses the PO header and returns the interpreted values as a dictionary.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project in the header.
Parameters: project_style (str) – the new project
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(lang)¶ Set the target language in the header.
This removes any custom Poedit headers if they exist.
Parameters: lang (str) – the new target language code
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
updatecontributor
(name, email=None)¶ Add contribution comments if necessary.
-
updateheader
(add=False, **kwargs)¶ Updates the fields in the PO style header.
This will create a header if add == True.
-
updateheaderplural
(nplurals, plural)¶ Update the Plural-Form PO header.
-
-
class
translate.storage.pocommon.
pounit
(source=None)¶ -
adderror
(errorname, errortext)¶ Adds an error message to this unit.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see getlocations()).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(present=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review. Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.pocommon.
quote_plus
(text)¶ Quote the query fragment of a URL; replacing ‘ ‘ with ‘+’
-
translate.storage.pocommon.
unquote_plus
(text)¶ unquote(‘%7e/abc+def’) -> ‘~/abc def’
poheader¶
Class that handles all header functions for a header in a PO file.
-
translate.storage.poheader.
parseheaderstring
(input)¶ Parses an input string with the definition of a PO header and returns the interpreted values as a dictionary.
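To make the idea concrete, the following simplified sketch turns "Key: value" lines into a dictionary. It is not the toolkit's parser: the function name is hypothetical, and treating lines without a key as continuations of the previous entry is an assumption of this example.

```python
def sketch_parse_header(text):
    headers = {}
    last_key = None
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank lines
        if ":" in line and not line[0].isspace():
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            last_key = key.strip()
            headers[last_key] = value.strip()
        elif last_key is not None:
            # Assumed behaviour: fold the line into the previous value.
            headers[last_key] += " " + line.strip()
    return headers
```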
-
class
translate.storage.poheader.
poheader
¶ This class implements functionality for manipulation of PO file headers. It is a mix-in class and useless on its own. It must be mixed in to every class that represents a PO file.
-
getheaderplural
()¶ Returns the nplural and plural values from the header.
-
getprojectstyle
()¶ Return the project based on information in the header.
- The project is determined in the following sequence:
- Use the ‘X-Project-Style’ entry in the header.
- Use the ‘Report-Msgid-Bugs-To’ entry
- Use the ‘X-Accelerator’ entry
- Use the Project ID
- Analyse the file itself (not yet implemented)
-
gettargetlanguage
()¶ Return the target language based on information in the header.
- The target language is determined in the following sequence:
- Use the ‘Language’ entry in the header.
- Poedit’s custom headers.
- Analysing the ‘Language-Team’ entry.
-
header
()¶ Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
-
init_headers
(charset='UTF-8', encoding='8bit', **kwargs)¶ sets default values for po headers
-
makeheader
(**kwargs)¶ Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
-
makeheaderdict
(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)¶ Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
Returns: Dictionary with the header items
Return type: dict of strings
-
mergeheaders
(otherstore)¶ Merges another header with this header.
This header is assumed to be the template.
-
parseheader
()¶ Parses the PO header and returns the interpreted values as a dictionary.
-
setprojectstyle
(project_style)¶ Set the project in the header.
Parameters: project_style (str) – the new project
-
settargetlanguage
(lang)¶ Set the target language in the header.
This removes any custom Poedit headers if they exist.
Parameters: lang (str) – the new target language code
-
updatecontributor
(name, email=None)¶ Add contribution comments if necessary.
-
updateheader
(add=False, **kwargs)¶ Updates the fields in the PO style header.
This will create a header if add == True.
-
updateheaderplural
(nplurals, plural)¶ Update the Plural-Form PO header.
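A Plural-Forms header value has the gettext shape `nplurals=N; plural=EXPR;`. The sketch below formats and parses such a value; the helper names and the regular expression are assumptions for illustration, and the toolkit's own methods may return the values in a different form.

```python
import re


def format_plural_forms(nplurals, plural):
    # Build the value that goes into the Plural-Forms header entry.
    return "nplurals=%d; plural=%s;" % (nplurals, plural)


_PLURAL_RE = re.compile(r"nplurals=\s*(\d+)\s*;\s*plural=\s*(.+?)\s*;?\s*$")


def parse_plural_forms(value):
    # Return (nplurals, plural expression), or (None, None) if the
    # value does not have the expected shape.
    match = _PLURAL_RE.search(value)
    if not match:
        return None, None
    return int(match.group(1)), match.group(2)
```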
-
-
translate.storage.poheader.
tzstring
()¶ Returns the timezone as a string in the format [+-]0000, e.g. +0200.
Return type: str
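The [+-]0000 format is a sign followed by two-digit hours and two-digit minutes. A self-contained sketch of that formatting, taking the offset in seconds as an explicit argument (the function name and the argument are assumptions; tzstring() itself takes no parameters):

```python
def format_utc_offset(offset_seconds):
    # Sign first, then zero-padded hours and minutes of the absolute offset.
    sign = "+" if offset_seconds >= 0 else "-"
    hours, remainder = divmod(abs(offset_seconds), 3600)
    minutes = remainder // 60
    return "%s%02d%02d" % (sign, hours, minutes)
```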
-
translate.storage.poheader.
update
(existing, add=False, **kwargs)¶ Update an existing header dictionary with the values in kwargs, adding new values only if add is true.
Returns: Updated dictionary of header entries Return type: dict of strings
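The effect of the add flag can be shown with a plain dictionary. This sketch only models the rule stated above; any normalisation of keyword names into header names is omitted, and the function name is hypothetical.

```python
def sketch_update(existing, add=False, **kwargs):
    updated = dict(existing)  # work on a copy in this sketch
    for key, value in kwargs.items():
        # Existing entries are always overwritten; unknown entries are
        # only added when add is true.
        if key in updated or add:
            updated[key] = value
    return updated
```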
poparser¶
-
translate.storage.poparser.
decode_header
(unit, decode)¶ The header has been arbitrarily decoded with a single-byte encoding. We re-encode it to decode values with the proper encoding defined in the header (using decode_list above).
-
translate.storage.poparser.
read_obsolete_lines
(parse_state)¶ Read all the lines belonging to the current unit if obsolete.
-
translate.storage.poparser.
read_prevmsgid_lines
(parse_state)¶ Read all the lines starting with #|. These lines contain the previous msgid and msgctxt info. We strip away the leading ‘#| ‘ and read until we stop seeing #|.
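The reading loop described here can be sketched over a plain list of lines. The real function works on a parser state object; the list-based signature and the function name below are assumptions made to keep the example self-contained.

```python
def sketch_read_prev_lines(lines, start=0):
    # Collect consecutive "#|" lines beginning at index start, with the
    # marker and one following space removed. Returns the collected
    # lines and the index of the first line that was not consumed.
    collected = []
    index = start
    while index < len(lines) and lines[index].startswith("#|"):
        content = lines[index][2:]
        if content.startswith(" "):
            content = content[1:]
        collected.append(content)
        index += 1
    return collected, index
```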
po¶
A class loader that will load C or Python implementations of the PO class depending on the USECPO variable.
Use the environment variable USECPO=2 (or USECPO=1) to choose the C implementation, which uses Gettext’s libgettextpo for high parsing speed. Otherwise the local Python-based parser is used (slower but very well tested).
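The selection rule can be pictured as a small function over the environment. This is only a sketch of the behaviour described above: it returns a label instead of importing a module, and the function name and labels are hypothetical.

```python
import os


def choose_po_backend(environ=None):
    # Use the real environment unless a mapping is supplied (handy for tests).
    environ = os.environ if environ is None else environ
    if environ.get("USECPO") in ("1", "2"):
        return "c"       # libgettextpo-backed implementation
    return "python"      # default pure-Python parser
```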
poxliff¶
XLIFF classes specifically suited for handling the PO representation in XLIFF.
This way the API supports plurals as if it was a PO file, for example.
-
class
translate.storage.poxliff.
PoXliffFile
(*args, **kwargs)¶ a file for the po variant of Xliff files
-
UnitClass
¶ alias of
PoXliffUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addheader
()¶ Initialise the file header.
-
addplural
(source, target, filename, createifmissing=False)¶ This method should now be unnecessary, but is left for reference
-
addsourceunit
(source, filename='NoName', createifmissing=False)¶ Adds the given trans-unit to the last used body node. If the filename has changed, it uses the slow method instead (and will create the required nodes if asked). Returns success.
-
addunit
(unit, new=True)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
createfilenode
(filename, sourcelanguage='en-US', datatype='po')¶ creates a filenode with the given filename. All parameters are needed for XLIFF compliance.
-
creategroup
(filename='NoName', createifmissing=False, restype=None)¶ adds a group tag into the specified file
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
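A BOM-based fallback of this kind might look like the sketch below. It is an assumption-laden illustration, not the method's real body: the function name `detect_bom_encoding` and the choice of returned codec names are hypothetical.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def detect_bom_encoding(data):
    """Return a codec name if the bytes start with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

The returned codec names ("utf-8-sig", "utf-16", "utf-32") are ones that consume the BOM while decoding, so the decoded text does not start with a stray U+FEFF.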
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getbodynode
(filenode, createifmissing=False)¶ finds the body node for the given filenode
-
getdatatype
(filename=None)¶ Returns the datatype of the stored file. If no filename is given, the datatype of the first file is given.
-
getdate
(filename=None)¶ Returns the date attribute for the file.
If no filename is given, the date of the first file is given. If the date attribute is not specified, None is returned.
Returns: Date attribute of file Return type: Date or None
-
getfilename
(filenode)¶ returns the name of the given file
-
getfilenames
()¶ returns all filenames in this XLIFF file
-
getfilenode
(filename, createifmissing=False)¶ finds the filenode with the given name
-
getheadernode
(filenode, createifmissing=False)¶ finds the header node for the given filenode
-
getheaderplural
()¶ Returns the nplural and plural values from the header.
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
header
()¶ Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
-
init_headers
(charset='UTF-8', encoding='8bit', **kwargs)¶ sets default values for po headers
-
initbody
()¶ Initialises self.body so it never needs to be retrieved from the XML again.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeheader
(**kwargs)¶ Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
-
makeheaderdict
(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)¶ Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
Returns: Dictionary with the header items Return type: dict of strings
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
mergeheaders
(otherstore)¶ Merges another header with this header.
This header is assumed to be the template.
-
namespaced
(name)¶ Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
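Clark notation simply wraps the namespace URI in braces in front of the local name. A minimal sketch, assuming a free-standing helper named `clark` (hypothetical; the real method takes only `name` and uses the store's own namespace), with the standard library's ElementTree standing in for lxml:

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def clark(namespace, name):
    """Return name in Clark notation, or the bare name without a namespace."""
    if namespace:
        return "{%s}%s" % (namespace, name)
    return name


# ElementTree reports tags of namespaced elements in the same notation.
doc = ET.fromstring('<xliff xmlns="%s"><source>Hello</source></xliff>' % XLIFF_NS)
source_node = doc.find(clark(XLIFF_NS, "source"))
```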
-
parse
(xml)¶ Populates this object from the given xml string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
parseheader
()¶ Parses the PO header and returns the interpreted values as a dictionary.
-
classmethod
parsestring
(storestring)¶ Parses the string to return the correct file object
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
removedefaultfile
()¶ We want to remove the default file-tag as soon as possible if we know it is still present and empty.
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a string containing the file’s XML
-
setfilename
(filenode, filename)¶ set the name of the given file
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(language)¶ Set the source language for this store.
-
settargetlanguage
(language)¶ Set the target language for this store.
-
switchfile
(filename, createifmissing=False)¶ Adds the given trans-unit (will create the nodes required if asked).
Returns: Success Return type: Boolean
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
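How a source index supports lookups such as findunit() and translate() can be sketched with a toy store. The classes `TinyUnit` and `TinyStore` are hypothetical stand-ins written for this illustration; they are not the toolkit's classes and skip everything except the index.

```python
class TinyUnit:
    """Hypothetical unit holding just a source and a target string."""

    def __init__(self, source, target=None):
        self.source = source
        self.target = target


class TinyStore:
    """Hypothetical store that keeps a source-string index up to date."""

    def __init__(self):
        self.units = []
        self.sourceindex = {}

    def addunit(self, unit):
        self.units.append(unit)
        # Several units may share one source string, so index a list.
        self.sourceindex.setdefault(unit.source, []).append(unit)

    def findunit(self, source):
        matches = self.sourceindex.get(source)
        return matches[0] if matches else None

    def translate(self, source):
        unit = self.findunit(source)
        if unit is not None and unit.target:
            return unit.target
        return None
```

Adding units only through `addunit()` is what keeps the index consistent, which is the reason the documentation discourages modifying the unit list by hand.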
-
unit_iter
()¶ Iterator over all the units in this store.
-
updatecontributor
(name, email=None)¶ Add contribution comments if necessary.
-
updateheader
(add=False, **kwargs)¶ Updates the fields in the PO style header.
This will create a header if add == True.
-
updateheaderplural
(nplurals, plural)¶ Update the Plural-Form PO header.
-
-
class
translate.storage.poxliff.
PoXliffUnit
(source=None, empty=False, **kwargs)¶ A class to specifically handle the plural units created from a po file.
-
addalttrans
(txt, origin=None, lang=None, sourcetxt=None, matchquality=None)¶ Adds an alt-trans tag and alt-trans components to the unit.
Parameters: txt (String) – Alternative translation of the source text.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Add a note specifically in a “note” tag
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
correctorigin
(node, origin)¶ Check against node tag’s origin (e.g note or alt-trans)
-
createcontextgroup
(name, contexts=None, purpose=None)¶ Add the context group to the trans-unit with contexts a list with (type, text) tuples describing each context.
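The shape of the resulting markup can be sketched with the standard library's ElementTree. This is an illustration under assumptions: `build_context_group` is a hypothetical helper, namespaces are omitted for brevity, and the real method attaches the group to the unit's trans-unit node via lxml.

```python
import xml.etree.ElementTree as ET


def build_context_group(name, contexts=None, purpose=None):
    """Build a <context-group> element from (type, text) tuples."""
    group = ET.Element("context-group", {"name": name})
    if purpose:
        group.set("purpose", purpose)
    for context_type, text in contexts or []:
        node = ET.SubElement(group, "context", {"context-type": context_type})
        node.text = text
    return group


group = build_context_group(
    "po-reference",
    contexts=[("sourcefile", "main.c"), ("linenumber", "12")],
    purpose="location",
)
```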
-
createlanguageNode
(lang, text, purpose)¶ Returns an xml Element setup with given parameters.
-
delalttrans
(alternative)¶ Removes the supplied alternative from the list of alt-trans tags
-
getNodeText
(languageNode, xml_space='preserve')¶ Retrieves the term from the given
languageNode
.
-
get_rich_target
(lang=None)¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
getalttrans
(origin=None)¶ Returns <alt-trans> for the given origin as a list of units. No origin means all alternatives.
-
getautomaticcomments
()¶ Returns the automatic comments (x-po-autocomment), which corresponds to the #. style po comments.
-
getcontext
()¶ Get the message context.
-
getcontextgroups
(name)¶ Returns the contexts in the context groups with the specified name
-
geterrors
()¶ Get all error messages.
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlanguageNode
(lang=None, index=None)¶ Retrieves a
languageNode
either by language or by index.
-
getlanguageNodes
()¶ We override this to get source and target nodes.
-
getlocations
()¶ Returns all the references (source locations)
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
getrestype
()¶ returns the restype attribute in the trans-unit tag
-
gettarget
(lang=None)¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
gettranslatorcomments
()¶ Returns the translator comments (x-po-trancomment), which corresponds to the # style po comments.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isapproved
()¶ States whether this unit is approved.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might rather want to move in that direction and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ States whether this unit needs to be reviewed
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markapproved
(value=True)¶ Mark this unit as approved.
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
namespaced
(name)¶ Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
removenotes
(origin='translator')¶ Remove all the translator notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(id)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
settarget
(target, lang='xx', append=False)¶ Sets the target string to the given value.
-
unit_iter
()¶ Iterator that only returns this unit.
-
project¶
-
class
translate.storage.project.
Project
(projstore=None)¶ Manages a project store as well as the processes involved in a project workflow.
-
add_source
(srcfile, src_fname=None)¶ Proxy for
self.store.append_sourcefile()
.
-
add_source_convert
(srcfile, src_fname=None, convert_options=None, extension=None)¶ Convenience method that calls add_source() and convert_forward() and returns the results from both.
-
close
()¶ Proxy for
self.store.close()
.
-
convert_forward
(input_fname, template=None, output_fname=None, **options)¶ Convert the given input file to the next type in the process:
Source document (eg. ODT) -> Translation file (eg. XLIFF) -> Translated document (eg. ODT).
Parameters: - input_fname (basestring) – The project name of the file to convert
- convert_options (Dictionary (optional)) – Passed as-is to
translate.convert.factory.convert()
.
Returns 2-tuple: the converted file object and its project name.
-
export_file
(fname, destfname)¶ Export the file with the specified filename to the given destination. This method will raise FileNotInProjectError via the call to get_file() if fname is not found in the project.
-
get_file
(fname)¶ Proxy for
self.store.get_file()
.
-
get_proj_filename
(realfname)¶ Proxy for
self.store.get_proj_filename()
.
-
get_real_filename
(projfname)¶ Try and find a real file name for the given project file name.
-
remove_file
(projfname, ftype=None)¶ Proxy for
self.store.remove_file()
.
-
save
(filename=None)¶ Proxy for
self.store.save()
.
-
update_file
(proj_fname, infile)¶ Proxy for
self.store.update_file()
.
-
projstore¶
-
exception
translate.storage.projstore.
FileExistsInProjectError
¶ -
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
translate.storage.projstore.
FileNotInProjectError
¶ -
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
class
translate.storage.projstore.
ProjectStore
¶ Basic project file container.
-
append_file
(afile, fname, ftype='trans', delete_orig=False)¶ Append the given file to the project with the given filename, marked to be of type ftype (‘src’, ‘trans’, ‘tgt’).
Parameters: delete_orig (bool) – Whether or not the original (given) file should be deleted after being appended. This is set to True by convert_forward(). Not used in this class.
-
get_file
(fname, mode='rb')¶ Retrieve the file with the given name from the project store.
The file is looked up in the self._files dictionary. The values in this dictionary may be None, to indicate that the file is not cacheable and needs to be retrieved in a special way. This special way must be defined in this method of sub-classes. The value may also be a string, which indicates that it is a real file accessible via open.
Parameters: mode (str) – The mode in which to re-open the file (if it is closed).
-
get_filename_type
(fname)¶ Get the type of file (‘src’, ‘trans’, ‘tgt’) with the given name.
-
get_proj_filename
(realfname)¶ Try and find a project file name for the given real file name.
-
load
(*args, **kwargs)¶ Load the project in some way. Undefined for this (base) class.
-
remove_file
(fname, ftype=None)¶ Remove the file with the given project name from the project. If the file type (‘src’, ‘trans’ or ‘tgt’) is not given, it is guessed.
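The bookkeeping behind the three file types, including the type guess made by remove_file(), can be sketched with a toy container. `TinyProjectStore` is hypothetical and tracks names only; it raises built-in exceptions where the real class has its own FileExistsInProjectError and FileNotInProjectError.

```python
class TinyProjectStore:
    """Hypothetical name-only model of a project store."""

    TYPES = ("src", "trans", "tgt")

    def __init__(self):
        self._files = {ftype: [] for ftype in self.TYPES}

    def append_file(self, fname, ftype="trans"):
        if ftype not in self.TYPES:
            raise ValueError("unknown file type: %r" % ftype)
        if any(fname in names for names in self._files.values()):
            raise ValueError("file already in project: %r" % fname)
        self._files[ftype].append(fname)

    def get_filename_type(self, fname):
        for ftype, names in self._files.items():
            if fname in names:
                return ftype
        raise KeyError(fname)

    def remove_file(self, fname, ftype=None):
        if ftype is None:
            # Guess the type from wherever the name is registered.
            ftype = self.get_filename_type(fname)
        self._files[ftype].remove(fname)
```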
-
save
(filename=None, *args, **kwargs)¶ Save the project in some way. Undefined for this (base) class.
-
sourcefiles
¶ Read-only access to
self._sourcefiles
.
-
targetfiles
¶ Read-only access to
self._targetfiles
.
-
transfiles
¶ Read-only access to
self._transfiles
.
-
update_file
(pfname, infile)¶ Remove the project file with name pfname and add the contents from infile to the project under the same file name.
Returns: the results from ProjectStore.append_file().
-
properties¶
Classes that hold units of .properties, and similar, files that are used in translating Java, Mozilla, MacOS and other software.
The propfile class is a monolingual class with propunit providing unit level access.
The .properties store has become a general key value pair class with Dialect providing the ability to change the behaviour of the parsing and handling of the various dialects.
Currently we support:
- Java .properties
- Mozilla .properties
- Adobe Flex files
- MacOS X .strings files
- Skype .lang files
The following provides references and descriptions of the various dialects supported:
- Java
Java .properties are supported completely except for the ability to drop pairs that are not translated.
The following .properties file description gives a good reference to the .properties specification.
Properties file may also hold Java MessageFormat messages. No special handling is provided in this storage class for MessageFormat, but this may be implemented in future.
All delimiter types, comments, line continuations and spaces handling in delimiters are supported.
- Mozilla
- Mozilla files use ‘=’ as a delimiter, are UTF-8 encoded and thus don’t need \u escaping. Any \U values will be converted to correct Unicode characters.
- Strings
- Mac OS X strings files are implemented using these two articles as references.
- Flex
- Adobe Flex files seem to be normal .properties files but in UTF-8 just like Mozilla files. This page provides the information used to implement the dialect.
- Skype
- Skype .lang files seem to be UTF-16 encoded .properties files.
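Converting escaped code points to real characters, as mentioned for the Mozilla dialect, can be sketched as below. This is a simplified illustration, not the toolkit's decoder: the helper name `decode_unicode_escapes` is hypothetical, only four-digit escapes are handled, and a backslash that is itself escaped is not treated specially.

```python
import re

# Matches \uXXXX or \UXXXX with exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\[uU]([0-9A-Fa-f]{4})")


def decode_unicode_escapes(value):
    """Replace each four-digit escape with the character it names."""
    return _ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), value)
```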
A simple summary of what is permissible follows.
Comments supported:
# a comment
// a comment (only at the beginning of a line)
# The following are # escaped to render in docs
# ! is standard but not widely supported
#! a comment
# /* is non-standard but used on some implementations
#/* a comment (not across multiple lines) */
Name and Value pairs:
# Delimiters
key = value
key : value
# Whitespace delimiter
# key[sp]value
# Space in key and around value
\ key\ = \ value
# Note that the b and c are escaped for reST rendering
b = a string with escape sequences \t \n \r \\ \" \' \ (space) ģ
c = a string with a continuation line \
continuation line
# Special cases
# key with no value
//key (escaped; doesn't render in docs)
# value no key (extractable in prop2po but not mergeable in po2prop)
=value
# .strings specific
"key" = "value";
-
class
translate.storage.properties.
Dialect
¶ Settings for the various behaviours in key=value files.
-
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
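The search for the first delimiter can be sketched as follows. This is a simplified reading of the description, not the library's code: `find_delimiter_sketch` is a hypothetical free function, the delimiters are passed in rather than taken from the dialect, and the rule that a space loses to an "=" or ":" separated from it only by whitespace is an assumption added so that `key = value` splits on "=".

```python
def find_delimiter_sketch(line, delimiters=("=", ":", " ")):
    """Return (delimiter, offset) for the first usable delimiter in line."""
    start = len(line) - len(line.lstrip())  # skip leading whitespace
    positions = {}
    for delim in delimiters:
        pos = line.find(delim, start)
        while pos != -1:
            # A delimiter preceded by a backslash is escaped; keep looking.
            if pos == 0 or line[pos - 1] != "\\":
                positions[delim] = pos
                break
            pos = line.find(delim, pos + 1)
    if not positions:
        return (None, -1)
    first = min(positions, key=positions.get)
    if first == " ":
        others = {d: p for d, p in positions.items() if d != " "}
        if others:
            nearest = min(others, key=others.get)
            # Only whitespace between the space and the next delimiter:
            # the space is padding, so the other delimiter wins.
            if not line[positions[" "]:others[nearest]].strip():
                return (nearest, others[nearest])
    return (first, positions[first])
```

For `key value` the space is the only candidate and is returned as the delimiter; for `key = value` the padding rule hands the result to "=".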
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectFlex
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectGaia
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectJava
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectJavaUtf8
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectJoomla
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectMozilla
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectSkype
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectStrings
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
class
translate.storage.properties.
DialectStringsUtf8
¶ -
classmethod
encode
(string, encoding=None)¶ Encode the string
-
classmethod
find_delimiter
(line)¶ Find the type and position of the delimiter in a property line.
Property files can be delimited by “=”, “:” or whitespace (space for now). We find the position of each delimiter, then find the one that appears first.
Parameters: - line (str) – A properties line
- delimiters (list) – valid delimiters
Returns: delimiter character and offset within line
Return type: Tuple (delimiter char, Offset Integer)
-
classmethod
key_strip
(key)¶ Strip unneeded characters from the key
-
classmethod
value_strip
(value)¶ Strip unneeded characters from the value
-
classmethod
-
translate.storage.properties.
accesskeysuffixes
= ('.accesskey', '.accessKey', '.akey')¶ Accesskey Suffixes: entries with this suffix may be combined with labels ending in
labelsuffixes
into accelerator notation
-
translate.storage.properties.
is_comment_end
(line)¶ Determine whether a line ends a new multi-line comment.
Parameters: line (unicode) – A properties line
Returns: True if line ends a new multi-line comment
Return type: bool
-
translate.storage.properties.
is_comment_one_line
(line)¶ Determine whether a line is a one-line comment.
Parameters: line (unicode) – A properties line
Returns: True if line is a one-line comment
Return type: bool
-
translate.storage.properties.
is_comment_start
(line)¶ Determine whether a line starts a new multi-line comment.
Parameters: line (unicode) – A properties line
Returns: True if line starts a new multi-line comment
Return type: bool
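The three comment checks can be sketched from the comment forms listed in the summary earlier in this section (`#`, `!`, `//` and `/* ... */`). The functions below are hypothetical re-implementations with a `_sketch` suffix; the library's own checks may recognise further markers.

```python
ONE_LINE_STARTERS = ("#", "!", "//")


def is_comment_one_line_sketch(line):
    """True for a comment that starts and finishes on this line."""
    stripped = line.strip()
    if stripped.startswith(ONE_LINE_STARTERS):
        return True
    return stripped.startswith("/*") and stripped.endswith("*/")


def is_comment_start_sketch(line):
    """True when a /* comment opens here and is not closed on this line."""
    stripped = line.strip()
    return stripped.startswith("/*") and not stripped.endswith("*/")


def is_comment_end_sketch(line):
    """True when the line closes a /* comment."""
    return line.strip().endswith("*/")
```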
-
translate.storage.properties.
is_line_continuation
(line)¶ Determine whether line has a line continuation marker.
Lines in .properties files can be terminated with a backslash (\) indicating that the ‘value’ continues on the next line. Continuation is only valid if there is an odd number of backslashes (an even number would result in a set of N/2 literal backslashes, not an escape).
Parameters: line (str) – A properties line
Returns: Does line end with a line continuation
Return type: Boolean
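The odd-or-even rule amounts to counting the backslashes at the end of the line. A minimal sketch, using the hypothetical name `is_line_continuation_sketch` and assuming the line has already had its newline removed:

```python
def is_line_continuation_sketch(line):
    """True if the line ends in an odd number of backslashes."""
    trailing = len(line) - len(line.rstrip("\\"))
    # Pairs of backslashes are escaped literals; one left over continues.
    return trailing % 2 == 1
```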
-
class
translate.storage.properties.
javafile
(*args, **kwargs)¶ -
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Add and return a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(propsrc)¶ Read the source of a properties file in and include them as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the units back to file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.properties.
javautf8file
(*args, **kwargs)¶ -
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Add and return a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(propsrc)¶ Read the source of a properties file in and include them as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the units back to file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.properties.
joomlafile
(*args, **kwargs)¶ -
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Add and return a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
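A BOM-based fallback of this kind can be sketched with the constants in Python's codecs module. The function name detect_bom_encoding and the table below are assumptions made for illustration, not the toolkit's implementation.

```python
import codecs
from typing import Optional

# Longer BOMs come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom_encoding(data: bytes) -> Optional[str]:
    """Return an encoding name if data starts with a known BOM, else None."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding
    return None
```

A caller would typically fall back to a list of default encodings when this returns None, since most files carry no BOM at all.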
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(propsrc)¶ Read in the source of a properties file and add its entries as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the units back to file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
translate.storage.properties.
labelsuffixes
= ('.label', '.title')¶ Label suffixes: entries with this suffix can be combined with accesskeys found in entries ending with
accesskeysuffixes
-
class
translate.storage.properties.
propfile
(inputfile=None, personality='java', encoding=None)¶ this class represents a .properties file, made up of propunits
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(propsrc)¶ Read in the source of a properties file and add its entries as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the units back to file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.properties.
propunit
(source='', personality='java')¶ An element of a properties file i.e. a name and value, and any comments associated.
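To make the "name and value" structure concrete, here is a minimal sketch that splits one Java-style properties line. The helper split_property_line is hypothetical and deliberately simplified: it ignores line continuations and whitespace-only delimiters, and it does not model the other personalities that propunit accepts.

```python
from typing import Optional, Tuple


def split_property_line(line: str) -> Optional[Tuple[str, str]]:
    """Split 'name = value' into (name, value); None for comments or blanks."""
    stripped = line.strip()
    if not stripped or stripped[0] in "#!":
        return None
    # Find the first '=' or ':' that is not preceded by a backslash.
    for pos, char in enumerate(stripped):
        if char in "=:" and (pos == 0 or stripped[pos - 1] != "\\"):
            return stripped[:pos].rstrip(), stripped[pos + 1:].lstrip()
    # No delimiter found: treat the whole line as a name with an empty value.
    return stripped, ""
```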
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
getoutput
()¶ Convert the element back into formatted lines for a .properties file
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ returns whether this is a blank element, containing only comments.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.properties.
register_dialect
(dialect)¶ Decorator that registers the dialect.
-
class
translate.storage.properties.
stringsfile
(*args, **kwargs)¶ -
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(propsrc)¶ Read in the source of a properties file and add its entries as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the units back to file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.properties.
stringsutf8file
(*args, **kwargs)¶ -
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(propsrc)¶ Read in the source of a properties file and add its entries as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the units back to file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
pypo¶
Classes that hold units of Gettext .po files (pounit) or entire files (pofile).
-
translate.storage.pypo.
escapeforpo
(line)¶ Escapes a line for PO format. Assumes no newline (\n) occurs in the line.
Parameters: line – unescaped text
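The kind of escaping a PO string needs can be sketched as follows. The names escape_po_line and quote_po_line are hypothetical and the set of characters handled is an assumption; the toolkit's escapeforpo() may cover more or fewer cases.

```python
def escape_po_line(line: str) -> str:
    """Escape backslashes, double quotes and tabs for a PO string."""
    # Backslash goes first so the escapes added below are not doubled.
    return (
        line.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\t", "\\t")
    )


def quote_po_line(line: str) -> str:
    """Wrap an escaped line in the double quotes used by msgid/msgstr."""
    return '"' + escape_po_line(line) + '"'
```

Like the documented function, this sketch assumes the input contains no newline; multi-line text would be split into lines before each one is escaped.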
-
translate.storage.pypo.
extractpoline
(line)¶ Remove quote and unescape line from po file.
Parameters: line – a quoted line from a po file (msgid or msgstr)
Deprecated since version 1.10: Replaced by unescape(). extractpoline() is kept to allow tests of correctness, and in case of external users.
-
translate.storage.pypo.
lsep
= '\n#: '¶ Separator for #: entries
-
class
translate.storage.pypo.
pofile
(inputfile=None, width=None, **kwargs)¶ A .po file containing various units
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
decode
(lines)¶ decode any non-unicode strings in lines with self.encoding
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
encode
(lines)¶ encode any unicode strings in lines in self.encoding
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getheaderplural
()¶ Returns the nplural and plural values from the header.
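For orientation, a Plural-Forms header value such as `nplurals=2; plural=(n != 1);` can be picked apart as in the sketch below. The function parse_plural_forms is hypothetical, and returning both values as strings is an assumption of this example rather than a statement about what getheaderplural() returns.

```python
import re
from typing import Optional, Tuple


def parse_plural_forms(value: str) -> Tuple[Optional[str], Optional[str]]:
    """Extract (nplurals, plural) from a Plural-Forms header value."""
    nplurals = re.search(r"\bnplurals=(\d+)", value)
    # \b keeps this from matching inside the word 'nplurals'.
    plural = re.search(r"\bplural=([^;]+)", value)
    return (
        nplurals.group(1) if nplurals else None,
        plural.group(1).strip() if plural else None,
    )
```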
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Return the project based on information in the header.
- The project is determined in the following sequence:
- Use the ‘X-Project-Style’ entry in the header.
- Use the ‘Report-Msgid-Bugs-To’ entry
- Use the ‘X-Accelerator’ entry
- Use the Project ID
- Analyse the file itself (not yet implemented)
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Return the target language based on information in the header.
- The target language is determined in the following sequence:
- Use the ‘Language’ entry in the header.
- Poedit’s custom headers.
- Analysing the ‘Language-Team’ entry.
-
getunits
()¶ Return a list of all units in this store.
-
header
()¶ Returns the header element, or None. Only the first element is allowed to be a header. Note that this could still return an empty header element, if present.
-
init_headers
(charset='UTF-8', encoding='8bit', **kwargs)¶ sets default values for po headers
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeheader
(**kwargs)¶ Create a header for the given filename.
Check .makeheaderdict() for information on parameters.
-
makeheaderdict
(charset='CHARSET', encoding='ENCODING', project_id_version=None, pot_creation_date=None, po_revision_date=None, last_translator=None, language_team=None, mime_version=None, plural_forms=None, report_msgid_bugs_to=None, **kwargs)¶ Create a header dictionary with useful defaults.
pot_creation_date can be None (current date) or a value (datetime or string).
po_revision_date can be None (form), False (=pot_creation_date), True (=now), or a value (datetime or string).
Returns: Dictionary with the header items Return type: dict of strings
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
-
mergeheaders
(otherstore)¶ Merges another header with this header.
This header is assumed to be the template.
-
parse
(input)¶ Parses the given file or file source string.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
parseheader
()¶ Parses the PO header and returns the interpreted values as a dictionary.
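A PO header is a block of `Key: value` lines, so turning it into a dictionary can be sketched in a few lines. parse_po_header is a hypothetical helper working on already-unescaped header text; it is not the toolkit's parseheader(), which operates on the store's header unit.

```python
from typing import Dict


def parse_po_header(header_text: str) -> Dict[str, str]:
    """Parse 'Key: value' lines from a PO header into a dict."""
    header: Dict[str, str] = {}
    for line in header_text.split("\n"):
        if ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only; values may contain colons.
        key, value = line.split(":", 1)
        header[key.strip()] = value.strip()
    return header
```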
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
removeduplicates
(duplicatestyle='merge')¶ Make sure each msgid is unique; merge comments etc. from duplicates into the original.
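The idea behind merging duplicates can be sketched with plain tuples. The Entry type and merge_duplicates function are hypothetical and cover only a 'merge'-like behaviour as assumed here (first entry wins, later comments are folded in); the toolkit's other duplicate styles and its exact merge rules are not modelled.

```python
from typing import Dict, List, Tuple

# Each entry is (msgid, msgstr, comments): a hypothetical stand-in for pounit.
Entry = Tuple[str, str, List[str]]


def merge_duplicates(entries: List[Entry]) -> List[Entry]:
    """Keep the first entry per msgid and fold later comments into it."""
    merged: Dict[str, Entry] = {}
    for msgid, msgstr, comments in entries:
        if msgid not in merged:
            merged[msgid] = (msgid, msgstr, list(comments))
            continue
        _, first_msgstr, first_comments = merged[msgid]
        # Append only comments the original does not already carry.
        for comment in comments:
            if comment not in first_comments:
                first_comments.append(comment)
        # Fill in a translation if the original had none.
        merged[msgid] = (msgid, first_msgstr or msgstr, first_comments)
    # Dicts preserve insertion order, so first-seen order is kept.
    return list(merged.values())
```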
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write to file
-
setprojectstyle
(project_style)¶ Set the project in the header.
Parameters: project_style (str) – the new project
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(lang)¶ Set the target language in the header.
This removes any custom Poedit headers if they exist.
Parameters: lang (str) – the new target language code
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
updatecontributor
(name, email=None)¶ Add contribution comments if necessary.
-
updateheader
(add=False, **kwargs)¶ Updates the fields in the PO style header.
This will create a header if add == True.
-
updateheaderplural
(nplurals, plural)¶ Update the Plural-Form PO header.
-
-
class
translate.storage.pypo.
pounit
(source=None, wrapper=None, **kwargs)¶ -
adderror
(errorname, errortext)¶ Adds an error message to this unit.
-
addlocation
(location)¶ Add a location to sourcecomments in the PO unit
Parameters: location (String) – Text location, e.g. ‘file.c:23’; does not include the ‘#:’ marker.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ This is modeled on the XLIFF method.
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getalttrans
()¶ Return a list of alternate units.
Previous msgid and current msgstr is combined to form a single alternative unit.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
-
getid
()¶ Returns a unique identifier for this unit.
-
getlocations
()¶ Get a list of locations from sourcecomments in the PO unit
Return type: List
Returns: A list of the locations with ‘#: ’ stripped
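Stripping the ‘#:’ marker from location comments can be sketched as below. extract_locations is a hypothetical helper that takes raw comment lines; it assumes locations are separated by whitespace, so file names containing spaces would be split incorrectly.

```python
from typing import List


def extract_locations(comment_lines: List[str]) -> List[str]:
    """Collect locations from '#:' comment lines, without the marker."""
    locations: List[str] = []
    for line in comment_lines:
        if not line.startswith("#:"):
            continue  # not a location comment
        # Several whitespace-separated locations may share one line.
        locations.extend(line[2:].split())
    return locations
```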
-
getnotes
(origin=None)¶ Return comments based on origin value.
Parameters: origin – programmer, developer, source code, translator or None
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasmarkedcomment
(commentmarker)¶ Check whether the given comment marker is present.
These should appear as:
# (commentmarker) ...
-
hasplural
()¶ returns whether this pounit contains plural strings…
-
hastypecomment
(typecomment)¶ Check whether the given type comment is present
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Makes this unit obsolete
-
markfuzzy
(present=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review. Adds an optional explanation as a note.
-
merge
(otherpo, overwrite=False, comments=True, authoritative=False)¶ Merges the otherpo (with the same msgid) into this one.
Overwrite non-blank self.msgstr only if overwrite is True. Merge comments only if comments is True.
-
msgidcomment
¶ Extract KDE style msgid comments from the unit.
Return type: String
Returns: the extracted msgidcomments found in this unit’s msgid.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
prev_source
¶ Returns the unescaped msgid
-
removenotes
()¶ Remove all the translator’s notes (other comments)
-
resurrect
()¶ Makes an obsolete unit normal
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties such as source or context.
-
settypecomment
(typecomment, present=True)¶ Alters whether a given typecomment is present
-
source
¶ Returns the unescaped msgid
-
target
¶ Returns the unescaped msgstr
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.pypo.
quoteforpo
(text, wrapper_obj=None)¶ Quotes the given text for a PO file, returning quoted and escaped lines
-
translate.storage.pypo.
splitlines
(text)¶ Split lines based on first newline char.
Cannot use universal newlines, as they match any newline-like character inside the text, and that breaks on files with Unix newlines and LF chars inside comments.
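One way to split on only the first newline style found in a text is sketched below. split_on_first_newline_style is a hypothetical function; unlike some splitters it drops the line endings, and whether the toolkit's splitlines() keeps them is not assumed here.

```python
from typing import List


def split_on_first_newline_style(text: str) -> List[str]:
    """Split text using only the newline sequence that occurs first."""
    candidates = [(text.find(nl), -len(nl), nl) for nl in ("\r\n", "\n", "\r")]
    found = [c for c in candidates if c[0] != -1]
    if not found:
        return [text]
    # Earliest position wins; on a tie ('\r\n' vs '\r') prefer the longer one.
    _, _, newline = min(found)
    return text.split(newline)
```

With this approach a stray carriage return inside a comment of a Unix-style file stays part of its line instead of starting a new one.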
-
translate.storage.pypo.
unescape
(line)¶ Unescape the given line.
Quotes on either side should already have been removed.
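Unescaping can be sketched as a single regular-expression pass over backslash pairs. unescape_po_line and its escape table are assumptions made for illustration; the toolkit's unescape() may recognise additional sequences.

```python
import re

# Escape letters mapped to the characters they stand for (assumed subset).
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r", '"': '"', "\\": "\\"}


def unescape_po_line(line: str) -> str:
    """Turn PO escape sequences back into the characters they stand for."""
    # Each backslash pair is handled once, left to right, so an escaped
    # backslash followed by 'n' stays a backslash and an 'n'.
    return re.sub(
        r"\\(.)",
        lambda match: _ESCAPES.get(match.group(1), match.group(0)),
        line,
    )
```

A chain of str.replace() calls would be simpler but can misread an escaped backslash followed by a letter, which is why this sketch handles each pair in one pass. Unknown sequences are left untouched.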
qm¶
Module for parsing Qt .qm files.
Note
Based on documentation from Gettext’s .qm implementation (see write-qt.c) and on observation of the output of lrelease.
Note
Certain deprecated section tags are not implemented. These will break and print out the missing tag. They are easy to implement and should follow the structure in 03 (Translation). We could find no examples that use these so we’d rather leave it unimplemented until we actually have test data.
Note
Many .qm files are unable to be parsed as they do not have the source text. We assume that since they use a hash table to lookup the data there is actually no need for the source text. It seems however that in Qt4’s lrelease all data is included in the resultant .qm file.
Note
We can only parse, not create, a .qm file. The main issue is that we need to implement the hashing algorithm (which seems to be identical to the Gettext hash algorithm). Unlike Gettext it seems that the hash is required, but that has not been validated.
Note
The code can parse files correctly. But it could be cleaned up to be more readable, especially the part that breaks the file into sections.
http://qt.gitorious.org/+kde-developers/qt/kde-qt/blobs/master/tools/linguist/shared/qm.cpp Plural information QLocale languages
-
class
translate.storage.qm.
qmfile
(inputfile=None, **kwargs)¶ A class representing a .qm file.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ Parses the given file or file source string.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Output a string representation of the .qm data file
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.qm.
qmunit
(source=None)¶ A class representing a .qm translation message.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.qm.
qmunpack
(file_='messages.qm')¶ Helper to unpack Qt .qm files into a Python string
qph¶
Module for handling Qt Linguist Phrase Book (.qph) files.
Extract from the Qt Linguist Manual: Translators: .qph Qt Phrase Book Files are human-readable XML files containing standard phrases and their translations. These files are created and updated by Qt Linguist and may be used by any number of projects and applications.
A DTD to define the format does not seem to exist, but the following code provides the reference implementation for the Qt Linguist product.
-
class
translate.storage.qph.
QphFile
(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)¶ Class representing a QPH file store.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addheader
()¶ Method to be overridden to initialise headers, etc.
-
addsourceunit
(source)¶ Adds and returns a new unit with the given string as first entry.
-
addunit
(unit, new=True)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
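A BOM-based fallback of this kind can be sketched with the standard library alone. This is only an illustration, not the toolkit's implementation: the function name `detect_bom_encoding` and the BOM table are assumptions.

```python
import codecs

# Assumed BOM table. UTF-32 entries come first because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(data, default=None):
    """Return a codec name based on a leading BOM in bytes, else default."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding
    return default


raw = "caf\u00e9".encode("utf-16")  # Python prepends a BOM for "utf-16"
encoding = detect_bom_encoding(raw, default="utf-8")
text = raw.decode(encoding)  # the BOM-aware codec strips the BOM
```

The returned codec names are the BOM-aware variants, so decoding with them does not leave a stray U+FEFF character at the start of the text.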
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this .qph file.
We don’t implement setsourcelanguage as users really shouldn’t be altering the source language in .qph files, it should be set correctly by the extraction tools.
Returns: ISO code e.g. af, fr, pt_BR
Return type: String
-
gettargetlanguage
()¶ Get the target language for this .qph file.
Returns: ISO code e.g. af, fr, pt_BR
Return type: String
-
getunits
()¶ Return a list of all units in this store.
-
initbody
()¶ Initialises self.body so it never needs to be retrieved from the XML again.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
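The idea of a source index can be sketched as a dictionary from source string to the units that carry it. The `TinyStore` class and its dictionary-based units below are hypothetical stand-ins, not the toolkit's classes; only the method names mirror the ones documented here.

```python
class TinyStore:
    """Hypothetical store keeping a source-string index over its units."""

    def __init__(self):
        self.units = []
        self.sourceindex = {}

    def addunit(self, source, target=None):
        unit = {"source": source, "target": target}
        self.units.append(unit)
        return unit

    def makeindex(self):
        # Rebuild the index from scratch: source string -> list of units.
        self.sourceindex = {}
        for unit in self.units:
            self.sourceindex.setdefault(unit["source"], []).append(unit)

    def findunits(self, source):
        # All units sharing this source string, or None if there are none.
        return self.sourceindex.get(source)

    def findunit(self, source):
        # The first unit with this source string, or None.
        units = self.findunits(source)
        return units[0] if units else None


store = TinyStore()
store.addunit("File", "Fichier")
store.addunit("Open", "Ouvrir")
store.addunit("Open", "Ouverture")
store.makeindex()
```

Keeping a list per source string lets one lookup return every duplicate source, which the single-result lookup then narrows to the first match.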
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
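Clark notation simply wraps the namespace URI in braces in front of the local name. A minimal sketch follows, using the standard library's ElementTree (which accepts the same notation as lxml); the helper name `clark_name` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def clark_name(namespace, name):
    """Return name qualified with namespace in Clark notation."""
    if namespace:
        return "{%s}%s" % (namespace, name)
    return name


document = '<xliff xmlns="%s"><source>Hello</source></xliff>' % XLIFF_NS
root = ET.fromstring(document)

# Namespaced elements are only found when the tag is given in Clark notation.
source = root.find(clark_name(XLIFF_NS, "source"))
```

Searching for the bare tag `source` would return nothing here, which is why every element lookup in a namespaced document goes through such a helper.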
-
parse
(xml)¶ Populates this object from the given xml string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the XML document to the file out.
- We have to override this to ensure that we mimic the Qt convention:
- no XML declaration
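Writing XML without a declaration can be sketched with the standard library. This is an illustration under assumptions, not the toolkit's code: the function name is hypothetical, and the element names only approximate a phrase book.

```python
import io
import xml.etree.ElementTree as ET


def serialize_without_declaration(root, out):
    """Write root to the binary file-like object out with no XML declaration."""
    # encoding="unicode" makes tostring() return a str without a declaration.
    text = ET.tostring(root, encoding="unicode")
    out.write(text.encode("utf-8"))


# Illustrative phrase book content.
root = ET.Element("QPH")
phrase = ET.SubElement(root, "phrase")
ET.SubElement(phrase, "source").text = "Open"
ET.SubElement(phrase, "target").text = "Ouvrir"

buffer = io.BytesIO()
serialize_without_declaration(root, buffer)
```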
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this .qph file to targetlanguage.
Parameters: targetlanguage (String) – ISO code e.g. af, fr, pt_BR
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.qph.
QphUnit
(source, empty=False, **kwargs)¶ A single term in the qph file.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
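The relationship between the two methods can be sketched as follows: the plural method only dispatches to the singular one, so a format-specific class needs to override just the singular method. The `SketchUnit` class is hypothetical and only mirrors the method names documented here.

```python
class SketchUnit:
    """Hypothetical unit that stores locations in a plain list."""

    def __init__(self):
        self.locations = []

    def addlocation(self, location):
        # Format-specific classes would override only this method.
        self.locations.append(location)

    def addlocations(self, location):
        # Accept either a single location or a list of locations.
        if isinstance(location, list):
            for item in location:
                self.addlocation(item)
        else:
            self.addlocation(location)


unit = SketchUnit()
unit.addlocations("main.cpp:10")
unit.addlocations(["dialog.cpp:42", "dialog.cpp:57"])
```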
-
addnote
(text, origin=None, position='append')¶ Add a note specifically in a “definition” tag
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
createlanguageNode
(lang, text, purpose)¶ Returns an xml Element setup with given parameters.
-
getNodeText
(languageNode, xml_space='preserve')¶ Retrieves the term from the given
languageNode
.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlanguageNode
(lang=None, index=None)¶ Retrieves a
languageNode
either by language or by index.
-
getlanguageNodes
()¶ We override this to get source and target nodes.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettarget
(lang=None)¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this method.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
removenotes
()¶ Remove all the translator notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
settarget
(target, lang='xx', append=False)¶ Sets the “target” string (second language), or alternatively appends to the list
-
unit_iter
()¶ Iterator that only returns this unit.
-
rc¶
Classes that hold units of .rc files (rcunit
) or entire files
(rcfile
) used in translating Windows Resources.
-
translate.storage.rc.
escape_to_python
(string)¶ Escape a given .rc string into a valid Python string.
-
translate.storage.rc.
escape_to_rc
(string)¶ Escape a given Python string into a valid .rc string.
-
class
translate.storage.rc.
rcfile
(inputfile=None, lang=None, sublang=None, **kwargs)¶ This class represents a .rc file, made up of rcunits.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(rcsrc)¶ Read the source of a .rc file in and include them as units.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the units back to file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.rc.
rcunit
(source='', **kwargs)¶ A unit of an rc file
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
getoutput
()¶ Convert the element back into formatted lines for a .rc file.
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Returns whether this is a blank element, containing only comments.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
statistics¶
Module to provide statistics and related functionality.
-
class
translate.storage.statistics.
Statistics
(sourcelanguage='en', targetlanguage='en', checkerstyle=None)¶ Manages statistics for storage objects.
-
classifyunit
(unit)¶ Returns a list of the classes that the unit belongs to.
Parameters: unit – the unit to classify
-
classifyunits
()¶ Makes a dictionary of which units fall into which classifications.
This method iterates over all units.
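The shape of such a classification dictionary can be sketched as below. The class names (`total`, `fuzzy`, `translated`, `blank`) and the dictionary-based units are assumptions for illustration; the real classifier may use different names and rules.

```python
def classify(unit):
    """Return the list of (hypothetical) class names a unit belongs to."""
    classes = ["total"]
    if unit["fuzzy"]:
        classes.append("fuzzy")
    elif unit["target"]:
        classes.append("translated")
    else:
        classes.append("blank")
    return classes


def classify_units(units):
    """Map each class name to the list of indices of units in that class."""
    classification = {}
    for index, unit in enumerate(units):
        for name in classify(unit):
            classification.setdefault(name, []).append(index)
    return classification


units = [
    {"source": "File", "target": "Fichier", "fuzzy": False},
    {"source": "Open", "target": "", "fuzzy": False},
    {"source": "Save", "target": "Sauver", "fuzzy": True},
]
classification = classify_units(units)
```

Storing indices rather than the units themselves means reclassifying one unit only requires moving a single integer between lists.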
-
countwords
()¶ Counts the source and target words in each of the units.
-
fuzzy_unitcount
()¶ Returns the number of fuzzy units.
-
fuzzy_units
()¶ Return a list of fuzzy units.
-
get_source_text
(units)¶ Joins the unit source strings in a single string of text.
-
getunits
()¶ Returns a list of all units in this object.
-
reclassifyunit
(item)¶ Updates the classification of a unit in self.classification.
Parameters: item – an integer that is an index in .getunits().
-
source_wordcount
()¶ Returns the number of words in the source text.
-
translated_unitcount
()¶ Returns the number of translated units.
-
translated_units
()¶ Return a list of translated units.
-
translated_wordcount
()¶ Returns the number of translated words in this object.
-
untranslated_unitcount
()¶ Returns the number of untranslated units.
-
untranslated_units
()¶ Return a list of untranslated units.
-
untranslated_wordcount
()¶ Returns the number of untranslated words in this object.
-
wordcount
(text)¶ Returns the number of words in the given text.
-
statsdb¶
Module to provide a cache of statistics in a database.
-
class
translate.storage.statsdb.
StatsCache
¶ An object instantiated as a singleton for each statsfile that provides access to the database cache from a pool of StatsCache objects.
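One way to get "one instance per stats file" is to keep a class-level pool keyed by file name and return the pooled object from `__new__`. The sketch below is an assumption about the general pattern, not the toolkit's implementation; `PooledCache` and its attributes are hypothetical.

```python
class PooledCache:
    """Hypothetical cache with exactly one instance per stats file name."""

    _caches = {}  # maps stats file name -> the single instance for it

    def __new__(cls, statsfile):
        cache = cls._caches.get(statsfile)
        if cache is None:
            cache = super().__new__(cls)
            # Initialise here, not in __init__, so that repeated calls do
            # not reset an instance that is already in the pool.
            cache.statsfile = statsfile
            cls._caches[statsfile] = cache
        return cache


first = PooledCache("stats-a.db")
second = PooledCache("stats-a.db")
other = PooledCache("stats-b.db")
```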
-
con
= None¶ This cache’s connection
-
cur
= None¶ The current cursor
-
filechecks
(filename, checker, store=None)¶ Retrieves the error statistics for the given file if possible, otherwise delegates to cachestorechecks().
-
filestatestats
(filename, store=None, extended=False)¶ Return a dictionary mapping unit states to the sets of indices of units in those states.
-
filestats
(filename, checker, store=None, extended=False)¶ Return a dictionary mapping property names to the sets of indices of units with those properties.
-
filetotals
(filename, store=None, extended=False)¶ Retrieves the statistics for the given file if possible, otherwise delegates to cachestore().
-
unitstats
(filename, _lang=None, store=None)¶ Return a dictionary of property names mapping to arrays which map unit indices to property values.
Please note that this is different from filestats, since filestats supplies sets of unit indices with a given property, whereas this method supplies arrays which map unit indices to given values.
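The difference between the two shapes can be illustrated with plain data. The property names and values below are made up, and `index_sets` is a hypothetical helper that derives the set-based shape from the array-based one.

```python
def index_sets(unit_values):
    """Turn {property: [value per unit]} into {property: {unit indices}}.

    A unit index is included in a property's set when its value is truthy.
    """
    return {
        name: {index for index, value in enumerate(values) if value}
        for name, values in unit_values.items()
    }


# Array shape: position i in each list holds the value for unit i.
unit_values = {
    "translated": [1, 0, 1],
    "fuzzy": [0, 1, 0],
}

# Set shape: each property maps to the indices of the units that have it.
file_sets = index_sets(unit_values)
```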
-
-
translate.storage.statsdb.
emptyfiletotals
()¶ Returns a dictionary with all statistics initialised to 0.
-
translate.storage.statsdb.
statefordb
(unit)¶ Returns the numeric database state for the unit.
-
translate.storage.statsdb.
transaction
(f)¶ Modifies f to commit database changes if it executes without exceptions. Otherwise it rolls back the database.
ALL publicly accessible methods in StatsCache MUST be decorated with this decorator.
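A commit-or-rollback decorator of this kind can be sketched with the standard `sqlite3` module. This is a minimal illustration, assuming the decorated object exposes its connection as `con` and its cursor as `cur`; the `TinyCache` class and its table are hypothetical.

```python
import functools
import sqlite3


def transaction(method):
    """Commit after method succeeds; roll back and re-raise if it fails."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        try:
            result = method(self, *args, **kwargs)
        except Exception:
            self.con.rollback()
            raise
        self.con.commit()
        return result

    return wrapper


class TinyCache:
    """Hypothetical cache backed by an in-memory SQLite database."""

    def __init__(self):
        self.con = sqlite3.connect(":memory:")
        self.cur = self.con.cursor()
        self.cur.execute("CREATE TABLE totals (name TEXT, words INTEGER)")
        self.con.commit()

    @transaction
    def record(self, name, words, fail=False):
        self.cur.execute("INSERT INTO totals VALUES (?, ?)", (name, words))
        if fail:
            raise ValueError("simulated failure after the insert")

    @transaction
    def count(self):
        return self.cur.execute("SELECT COUNT(*) FROM totals").fetchone()[0]


cache = TinyCache()
cache.record("a.po", 10)
try:
    cache.record("b.po", 5, fail=True)  # the insert is rolled back
except ValueError:
    pass
```

Decorating every public method keeps the database consistent even when a method fails halfway through several statements.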
-
translate.storage.statsdb.
wordsinunit
(unit)¶ Counts the words in the unit’s source and target, taking plurals into account. The target words are only counted if the unit is translated.
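The counting rule can be sketched as follows. This assumes that "taking plurals into account" means summing the words of every plural form, and it uses a naive whitespace split; both are assumptions, and the function names are hypothetical.

```python
def count_words(text):
    """Naive word count: number of whitespace-separated tokens."""
    return len(text.split())


def words_in_unit(source_forms, target_forms, translated):
    """Return (source words, target words) summed over all plural forms.

    Target words are counted only when the unit is translated.
    """
    source_words = sum(count_words(form) for form in source_forms)
    target_words = 0
    if translated:
        target_words = sum(count_words(form) for form in target_forms)
    return source_words, target_words


# A unit with singular and plural source forms and two target forms.
counts = words_in_unit(
    ["%d file deleted", "%d files deleted"],
    ["%d fichier supprim\u00e9", "%d fichiers supprim\u00e9s"],
    translated=True,
)
```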
subtitles¶
Class that manages subtitle files for translation.
This class makes use of the subtitle functionality of gaupol
.
See also
gaupol/agents/open.py::open_main
A patch to gaupol is required to open utf-8 files successfully.
-
class
translate.storage.subtitles.
AdvSubStationAlphaFile
(*args, **kwargs)¶ Specialized class for Advanced SubStation Alpha files only.
-
UnitClass
¶ alias of
SubtitleUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ parse the given file
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.subtitles.
MicroDVDFile
(*args, **kwargs)¶ Specialized class for MicroDVD files only.
-
UnitClass
¶ alias of
SubtitleUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ parse the given file
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.subtitles.
SubRipFile
(*args, **kwargs)¶ Specialized class for SubRip files only.
-
UnitClass
¶ alias of
SubtitleUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ parse the given file
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.subtitles.
SubStationAlphaFile
(*args, **kwargs)¶ Specialized class for SubStation Alpha files only.
-
UnitClass
¶ alias of
SubtitleUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ parse the given file
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
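How the lookup methods relate to the source index can be sketched with a toy store. `MiniUnit` and `MiniStore` are hypothetical stand-ins written for this illustration; the real TranslationStore may update its index incrementally rather than rebuilding it, and its internals are not shown in this reference.

```python
class MiniUnit:
    """Hypothetical unit holding only a source and a target string."""

    def __init__(self, source, target=None):
        self.source = source
        self.target = target


class MiniStore:
    """Toy store mirroring the method names listed above (an assumption)."""

    def __init__(self):
        self.units = []
        self.sourceindex = None  # built lazily by require_index()

    def addunit(self, unit):
        self.units.append(unit)
        self.sourceindex = None  # simplest option: rebuild on next lookup

    def makeindex(self):
        self.sourceindex = {}
        for unit in self.units:
            self.sourceindex.setdefault(unit.source, []).append(unit)

    def require_index(self):
        if self.sourceindex is None:
            self.makeindex()

    def findunits(self, source):
        self.require_index()
        return self.sourceindex.get(source)

    def findunit(self, source):
        units = self.findunits(source)
        return units[0] if units else None

    def translate(self, source):
        unit = self.findunit(source)
        return unit.target if unit is not None and unit.target else None


store = MiniStore()
store.addunit(MiniUnit("Hello", "Hallo"))
store.addunit(MiniUnit("Goodbye"))
print(store.translate("Hello"))    # Hallo
print(store.translate("Goodbye"))  # None, the unit has no target
```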
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.subtitles.
SubtitleFile
(inputfile=None, **kwargs)¶ A subtitle file
-
UnitClass
¶ alias of
SubtitleUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ parse the given file
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using parsestring(). out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.subtitles.
SubtitleUnit
(source=None, **kwargs)¶ A subtitle entry that is translatable
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
symbian¶
tbx¶
module for handling TBX glossary files
-
class
translate.storage.tbx.
tbxfile
(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)¶ Class representing a TBX file store.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addheader
()¶ Initialise headers with TBX specific things.
-
addsourceunit
(source)¶ Adds and returns a new unit with the given string as first entry.
-
addunit
(unit, new=True)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
initbody
()¶ Initialises self.body so it never needs to be retrieved from the XML again.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
namespaced
(name)¶ Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
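Clark notation can be tried out with the standard library's ElementTree, which uses the same `{namespace}name` convention as lxml. The helper `clark` and the sample document below are assumptions made for illustration; they are not the toolkit's `namespaced()` method.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def clark(namespace, name):
    """Build '{namespace}name', or the bare name when no namespace is set.

    Hypothetical helper; the real namespaced() reads the namespace
    from the object it is called on.
    """
    return "{%s}%s" % (namespace, name) if namespace else name


doc = ET.fromstring(
    '<xliff xmlns="%s"><file><body><trans-unit id="1">'
    "<source>Hello</source></trans-unit></body></file></xliff>" % XLIFF_NS
)

# Namespaced elements are only found when the tag is given in Clark notation.
source = doc.find(".//" + clark(XLIFF_NS, "source"))
print(source.tag)   # {urn:oasis:names:tc:xliff:document:1.1}source
print(source.text)  # Hello
```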
-
parse
(xml)¶ Populates this object from the given xml string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out=None)¶ Converts to a string containing the file’s XML
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.tbx.
tbxunit
(source, empty=False, **kwargs)¶ A single term in the TBX file. Provisional work is done to make several languages possible.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Add a note specifically in a “note” tag
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
createlanguageNode
(lang, text, purpose)¶ returns a langset xml Element setup with given parameters
-
getNodeText
(languageNode, xml_space='preserve')¶ Retrieves the term from the given
languageNode
.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlanguageNode
(lang=None, index=None)¶ Retrieves a
languageNode
either by language or by index.
-
getlanguageNodes
()¶ Returns a list of all nodes that contain per language information.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettarget
(lang=None)¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
namespaced
(name)¶ Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
removenotes
(origin='translator')¶ Remove all the translator notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
settarget
(target, lang='xx', append=False)¶ Sets the “target” string (second language), or alternatively appends to the list
-
unit_iter
()¶ Iterator that only returns this unit.
-
tiki¶
Class that manages TikiWiki files for translation. Tiki files are <strike>ugly and inconsistent</strike> formatted as a single large PHP array with several special sections identified by comments. Example current as of 2008-12-01:
<?php
// Many comments at the top
$lang=Array(
// ### Start of unused words
"aaa" => "zzz",
// ### end of unused words
// ### start of untranslated words
// "bbb" => "yyy",
// ### end of untranslated words
// ### start of possibly untranslated words
"ccc" => "xxx",
// ### end of possibly untranslated words
"ddd" => "www",
"###end###"=>"###end###");
?>
In addition there are several auto-generated //-style comments scattered through the page and array, some of which matter when being parsed.
This has all been gleaned from the TikiWiki source. As far as I know no detailed documentation exists for the tiki language.php files.
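The section layout described above can be walked with a short line-based parser. The sketch below is an assumption-laden illustration, not the TikiStore parser: `parse_tiki`, the two regular expressions, and the "translated" label for entries outside any marked section are all invented here, and escaped quotes inside strings are not handled.

```python
import re

SAMPLE = '''<?php
// Many comments at the top
$lang=Array(
// ### Start of unused words
"aaa" => "zzz",
// ### end of unused words
// ### start of untranslated words
// "bbb" => "yyy",
// ### end of untranslated words
// ### start of possibly untranslated words
"ccc" => "xxx",
// ### end of possibly untranslated words
"ddd" => "www",
"###end###"=>"###end###");
?>
'''

# Section markers such as "// ### start of unused words".
SECTION = re.compile(r"^//\s*###\s*(start|end) of (.+?)\s*$", re.IGNORECASE)
# An array entry, optionally commented out with a leading "//".
ENTRY = re.compile(r'^(?://\s*)?"(.*?)"\s*=>\s*"(.*?)",?\s*$')


def parse_tiki(text):
    """Yield (section, source, target) tuples from a language.php string."""
    section = "translated"  # label chosen for this sketch only
    for line in text.splitlines():
        line = line.strip()
        marker = SECTION.match(line)
        if marker:
            kind, name = marker.groups()
            section = name.lower() if kind.lower() == "start" else "translated"
            continue
        entry = ENTRY.match(line)
        if entry and entry.group(1) != "###end###":
            yield (section, entry.group(1), entry.group(2))


for item in parse_tiki(SAMPLE):
    print(item)
```

On the sample file this yields one entry per section, with the commented-out "bbb" entry reported under "untranslated words".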
-
class
translate.storage.tiki.
TikiStore
(inputfile=None)¶ Represents a tiki language.php file.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ Parse the given input into source units.
Parameters: input – the source, either a string or filehandle
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Will return a formatted tiki-style language.php file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.tiki.
TikiUnit
(source=None, **kwargs)¶ A tiki unit entry.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Location is defined by the comments in the file. This function will only set valid locations.
Parameters: location – Where the string is located in the file. Must be a valid location.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ Returns a list of the location(s) of the string.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
tmdb¶
Module to provide a translation memory database.
tmx¶
Module for parsing TMX translation memory files
-
class
translate.storage.tmx.
tmxfile
(inputfile=None, sourcelanguage='en', targetlanguage=None, **kwargs)¶ Class representing a TMX file store.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addheader
()¶ Method to be overridden to initialise headers, etc.
-
addsourceunit
(source)¶ Adds and returns a new unit with the given string as first entry.
-
addtranslation
(source, srclang, translation, translang, comment=None)¶ addtranslation method for testing old unit tests
-
addunit
(unit, new=True)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
initbody
()¶ Initialises self.body so it never needs to be retrieved from the XML again.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
namespaced
(name)¶ Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
parse
(xml)¶ Populates this object from the given xml string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out=None)¶ Converts to a string containing the file’s XML
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(sourcetext, sourcelang=None, targetlang=None)¶ method to test old unit tests
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.tmx.
tmxunit
(source, empty=False, **kwargs)¶ A single unit in the TMX file.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement TranslationUnit.addlocation().
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Add a note specifically in a “note” tag.
The origin parameter is ignored
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
copy
()¶ Make a copy of the translation unit.
We don’t want to make a deep copy - this could duplicate the whole XML tree. For now we just serialise and reparse the unit’s XML.
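The serialise-and-reparse approach can be illustrated with the standard library's ElementTree standing in for lxml. The helper `copy_element` and the sample `<tu>` element are assumptions for this sketch; the real method works on the unit's own XML element.

```python
import xml.etree.ElementTree as ET


def copy_element(elem):
    """Copy one element by serialising it and parsing the result again.

    Hypothetical helper: the round trip produces a detached element that
    shares nothing with the tree the original element lives in.
    """
    return ET.fromstring(ET.tostring(elem, encoding="unicode"))


tu = ET.fromstring(
    '<tu><tuv xml:lang="en"><seg>Hello</seg></tuv>'
    '<tuv xml:lang="de"><seg>Hallo</seg></tuv></tu>'
)

clone = copy_element(tu)
clone.find("tuv/seg").text = "Changed"

print(tu.find("tuv/seg").text)     # Hello, the original is untouched
print(clone.find("tuv/seg").text)  # Changed
```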
-
createlanguageNode
(lang, text, purpose)¶ returns a langset xml Element setup with given parameters
-
getNodeText
(languageNode, xml_space='preserve')¶ Retrieves the term from the given
languageNode
.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
-
getid
()¶ Returns the identifier for this unit. The optional tuid property is used if available, otherwise we inherit .getid(). Note that the tuid property is only mandated to be unique from TMX 2.0.
-
getlanguageNode
(lang=None, index=None)¶ Retrieves a
languageNode
either by language or by index.
-
getlanguageNodes
()¶ Returns a list of all nodes that contain per language information.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettarget
(lang=None)¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
namespaced
(name)¶ Returns name in Clark notation.
For example, namespaced("source") in an XLIFF document might return:
{urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
removenotes
()¶ Remove all the translator notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a multistring:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent of other unit properties like source or context.
-
settarget
(target, lang='xx', append=False)¶ Sets the “target” string (second language), or alternatively appends to the list
-
unit_iter
()¶ Iterator that only returns this unit.
-
trados¶
Manage the Trados .txt Translation Memory format
A Trados file looks like this:
<TrU>
<CrD>18012000, 13:18:35
<CrU>CAROL-ANN
<UsC>0
<Seg L=EN_GB>Association for Road Safety \endash Conference
<Seg L=DE_DE>Tagung der Gesellschaft für Verkehrssicherheit
</TrU>
<TrU>
<CrD>18012000, 13:19:14
<CrU>CAROL-ANN
<UsC>0
<Seg L=EN_GB>Road Safety Education in our Schools
<Seg L=DE_DE>Verkehrserziehung an Schulen
</TrU>
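The layout above can be read with a couple of regular expressions. The following is a minimal sketch that uses only the standard library; `read_units`, `UNIT_RE` and `SEG_RE` are hypothetical names for this example, and the toolkit's own parser (which is built on `TradosSoup`) works differently.

```python
import re

# One unit copied from the sample above; a raw string keeps "\endash" intact.
SAMPLE = r"""<TrU>
<CrD>18012000, 13:18:35
<CrU>CAROL-ANN
<UsC>0
<Seg L=EN_GB>Association for Road Safety \endash Conference
<Seg L=DE_DE>Tagung der Gesellschaft für Verkehrssicherheit
</TrU>
"""

UNIT_RE = re.compile(r"<TrU>(.*?)</TrU>", re.DOTALL)
SEG_RE = re.compile(r"^<Seg L=([^>]+)>(.*)$", re.MULTILINE)


def read_units(text):
    """Return one {language: segment} dict per <TrU> block."""
    units = []
    for body in UNIT_RE.findall(text):
        units.append({lang: seg.strip() for lang, seg in SEG_RE.findall(body)})
    return units


for unit in read_units(SAMPLE):
    print(unit["EN_GB"], "->", unit["DE_DE"])
```

The segments still contain RTF control words such as `\endash` at this point; converting them is a separate step.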
-
translate.storage.trados.
RTF_ESCAPES
= {'\\-': '\xad', '\\_': '‑', '\\bullet': '•', '\\emdash': '—', '\\emspace': '\u2003', '\\endash': '–', '\\enspace': '\u2002', '\\ldblquote': '“', '\\lquote': '‘', '\\rdblquote': '”', '\\rquote': '’', '\\~': '\xa0'}¶ RTF control to Unicode map. See http://msdn.microsoft.com/en-us/library/aa140283(v=office.10).aspx
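To show how such a map might be applied, here is a small sketch that replaces RTF control words with their Unicode characters. It copies only a subset of the table above, and `unescape_rtf` is a hypothetical name; the module's own `unescape()` may differ in detail.

```python
# Subset of the map above, written with explicit code points.
RTF_ESCAPES_SUBSET = {
    "\\emdash": "\u2014",
    "\\endash": "\u2013",
    "\\lquote": "\u2018",
    "\\rquote": "\u2019",
    "\\ldblquote": "\u201c",
    "\\rdblquote": "\u201d",
    "\\~": "\u00a0",
}


def unescape_rtf(text):
    """Replace each known RTF control word with its Unicode character."""
    # No key in this subset is a prefix of another, so order does not matter.
    for control, char in RTF_ESCAPES_SUBSET.items():
        text = text.replace(control, char)
    return text


print(unescape_rtf(r"Association for Road Safety \endash Conference"))
```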
-
translate.storage.trados.
TRADOS_TIMEFORMAT
= '%d%m%Y, %H:%M:%S'¶ Time format used by Trados .txt
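The format string is an ordinary `strftime`/`strptime` pattern, so a timestamp from a `<CrD>` line can be parsed with the standard `time` module. A minimal sketch, with the constant re-declared locally so the example is self-contained:

```python
import time

# Same pattern as the module constant, repeated here for a standalone example.
TRADOS_TIMEFORMAT = "%d%m%Y, %H:%M:%S"

# Timestamp taken from the sample file above.
parsed = time.strptime("18012000, 13:18:35", TRADOS_TIMEFORMAT)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)

# Formatting the struct_time again yields the original string.
roundtrip = time.strftime(TRADOS_TIMEFORMAT, parsed)
print(roundtrip)
```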
-
class
translate.storage.trados.
TradosSoup
(markup='', features=None, builder=None, parse_only=None, from_encoding=None, exclude_encodings=None, **kwargs)¶ -
append
(tag)¶ Appends the given tag to the contents of this tag.
-
clear
(decompose=False)¶ Extract all children. If decompose is True, decompose instead.
-
decode
(pretty_print=False, eventual_encoding='utf-8', formatter='minimal')¶ Returns a string or Unicode representation of this document. To get Unicode, pass None for encoding.
-
decode_contents
(indent_level=None, eventual_encoding='utf-8', formatter='minimal')¶ Renders the contents of this tag as a Unicode string.
Parameters: - indent_level – Each line of the rendering will be indented this many spaces.
- eventual_encoding – The tag is destined to be encoded into this encoding. This method is _not_ responsible for performing that encoding. This information is passed in so that it can be substituted in if the document contains a <META> tag that mentions the document’s encoding.
- formatter – The output formatter responsible for converting entities to Unicode characters.
-
decompose
()¶ Recursively destroys the contents of this tree.
-
encode_contents
(indent_level=None, encoding='utf-8', formatter='minimal')¶ Renders the contents of this tag as a bytestring.
Parameters: - indent_level – Each line of the rendering will be indented this many spaces.
- encoding – The bytestring will be in this encoding.
- formatter – The output formatter responsible for converting entities to Unicode characters.
-
extract
()¶ Destructively rips this element out of the tree.
-
fetchNextSiblings
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns the siblings of this Tag that match the given criteria and appear after this Tag in the document.
-
fetchParents
(name=None, attrs={}, limit=None, **kwargs)¶ Returns the parents of this Tag that match the given criteria.
-
fetchPrevious
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns all items that match the given criteria and appear before this Tag in the document.
-
fetchPreviousSiblings
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns the siblings of this Tag that match the given criteria and appear before this Tag in the document.
-
find
(name=None, attrs={}, recursive=True, text=None, **kwargs)¶ Return only the first child of this Tag matching the given criteria.
-
findAll
(name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)¶ Extracts a list of Tag objects that match the given criteria. You can specify the name of the Tag and any attributes you want the Tag to have.
The value of a key-value pair in the ‘attrs’ map can be a string, a list of strings, a regular expression object, or a callable that takes a string and returns whether or not the string matches for some custom definition of ‘matches’. The same is true of the tag name.
-
findAllNext
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns all items that match the given criteria and appear after this Tag in the document.
-
findAllPrevious
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns all items that match the given criteria and appear before this Tag in the document.
-
findChild
(name=None, attrs={}, recursive=True, text=None, **kwargs)¶ Return only the first child of this Tag matching the given criteria.
-
findChildren
(name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)¶ Extracts a list of Tag objects that match the given criteria. You can specify the name of the Tag and any attributes you want the Tag to have.
The value of a key-value pair in the ‘attrs’ map can be a string, a list of strings, a regular expression object, or a callable that takes a string and returns whether or not the string matches for some custom definition of ‘matches’. The same is true of the tag name.
-
findNext
(name=None, attrs={}, text=None, **kwargs)¶ Returns the first item that matches the given criteria and appears after this Tag in the document.
-
findNextSibling
(name=None, attrs={}, text=None, **kwargs)¶ Returns the closest sibling to this Tag that matches the given criteria and appears after this Tag in the document.
-
findNextSiblings
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns the siblings of this Tag that match the given criteria and appear after this Tag in the document.
-
findParent
(name=None, attrs={}, **kwargs)¶ Returns the closest parent of this Tag that matches the given criteria.
-
findParents
(name=None, attrs={}, limit=None, **kwargs)¶ Returns the parents of this Tag that match the given criteria.
-
findPrevious
(name=None, attrs={}, text=None, **kwargs)¶ Returns the first item that matches the given criteria and appears before this Tag in the document.
-
findPreviousSibling
(name=None, attrs={}, text=None, **kwargs)¶ Returns the closest sibling to this Tag that matches the given criteria and appears before this Tag in the document.
-
findPreviousSiblings
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns the siblings of this Tag that match the given criteria and appear before this Tag in the document.
-
find_all
(name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)¶ Extracts a list of Tag objects that match the given criteria. You can specify the name of the Tag and any attributes you want the Tag to have.
The value of a key-value pair in the ‘attrs’ map can be a string, a list of strings, a regular expression object, or a callable that takes a string and returns whether or not the string matches for some custom definition of ‘matches’. The same is true of the tag name.
-
find_all_next
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns all items that match the given criteria and appear after this Tag in the document.
-
find_all_previous
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns all items that match the given criteria and appear before this Tag in the document.
-
find_next
(name=None, attrs={}, text=None, **kwargs)¶ Returns the first item that matches the given criteria and appears after this Tag in the document.
-
find_next_sibling
(name=None, attrs={}, text=None, **kwargs)¶ Returns the closest sibling to this Tag that matches the given criteria and appears after this Tag in the document.
-
find_next_siblings
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns the siblings of this Tag that match the given criteria and appear after this Tag in the document.
-
find_parent
(name=None, attrs={}, **kwargs)¶ Returns the closest parent of this Tag that matches the given criteria.
-
find_parents
(name=None, attrs={}, limit=None, **kwargs)¶ Returns the parents of this Tag that match the given criteria.
-
find_previous
(name=None, attrs={}, text=None, **kwargs)¶ Returns the first item that matches the given criteria and appears before this Tag in the document.
-
find_previous_sibling
(name=None, attrs={}, text=None, **kwargs)¶ Returns the closest sibling to this Tag that matches the given criteria and appears before this Tag in the document.
-
find_previous_siblings
(name=None, attrs={}, text=None, limit=None, **kwargs)¶ Returns the siblings of this Tag that match the given criteria and appear before this Tag in the document.
-
format_string
(s, formatter='minimal')¶ Format the given string using the given formatter.
-
get
(key, default=None)¶ Returns the value of the ‘key’ attribute for the tag, or the value given for ‘default’ if it doesn’t have that attribute.
-
getText
(separator='', strip=False, types=(<class 'bs4.element.NavigableString'>, <class 'bs4.element.CData'>))¶ Get all child strings, concatenated using the given separator.
-
get_attribute_list
(key, default=None)¶ The same as get(), but always returns a list.
-
get_text
(separator='', strip=False, types=(<class 'bs4.element.NavigableString'>, <class 'bs4.element.CData'>))¶ Get all child strings, concatenated using the given separator.
-
handle_starttag
(name, namespace, nsprefix, attrs)¶ Push a start tag on to the stack.
If this method returns None, the tag was rejected by the SoupStrainer. You should proceed as if the tag had not occurred in the document. For instance, if this was a self-closing tag, don’t call handle_endtag.
-
has_key
(key)¶ This was kind of misleading because has_key() (attributes) was different from __in__ (contents). has_key() is gone in Python 3, anyway.
-
index
(element)¶ Find the index of a child by identity, not value. Avoids issues with tag.contents.index(element) getting the index of equal elements.
-
insert_after
(successor)¶ Makes the given element the immediate successor of this one.
The two elements will have the same parent, and the given element will be immediately after this one.
-
insert_before
(successor)¶ Makes the given element the immediate predecessor of this one.
The two elements will have the same parent, and the given element will be immediately before this one.
-
isSelfClosing
¶ Is this tag an empty-element tag? (aka a self-closing tag)
A tag that has contents is never an empty-element tag.
A tag that has no contents may or may not be an empty-element tag. It depends on the builder used to create the tag. If the builder has a designated list of empty-element tags, then only a tag whose name shows up in that list is considered an empty-element tag.
If the builder has no designated list of empty-element tags, then any tag with no contents is an empty-element tag.
-
is_empty_element
¶ Is this tag an empty-element tag? (aka a self-closing tag)
A tag that has contents is never an empty-element tag.
A tag that has no contents may or may not be an empty-element tag. It depends on the builder used to create the tag. If the builder has a designated list of empty-element tags, then only a tag whose name shows up in that list is considered an empty-element tag.
If the builder has no designated list of empty-element tags, then any tag with no contents is an empty-element tag.
-
new_string
(s, subclass=<class 'bs4.element.NavigableString'>)¶ Create a new NavigableString associated with this soup.
-
new_tag
(name, namespace=None, nsprefix=None, attrs={}, **kwattrs)¶ Create a new tag associated with this soup.
-
object_was_parsed
(o, parent=None, most_recent_element=None)¶ Add an object to the parse tree.
-
select
(selector, _candidate_generator=None, limit=None)¶ Perform a CSS selection operation on the current element.
-
select_one
(selector)¶ Perform a CSS selection operation on the current element.
-
setup
(parent=None, previous_element=None, next_element=None, previous_sibling=None, next_sibling=None)¶ Sets up the initial relations between this element and other elements.
-
string
¶ Convenience property to get the single string within this tag.
Return: If this tag has a single string child, return value is that string. If this tag has no children, or more than one child, return value is None. If this tag has one child tag, return value is the ‘string’ attribute of the child tag, recursively.
-
strings
¶ Yield all strings of certain classes, possibly stripping them.
By default, yields only NavigableString and CData objects. So no comments, processing instructions, etc.
-
text
¶ Get all child strings, concatenated using the given separator.
-
-
class
translate.storage.trados.
TradosTxtDate
(newtime=None)¶ Manages the timestamps in the Trados .txt format of DDMMYYYY, hh:mm:ss
-
get_time
()¶ Get the time_struct object
-
get_timestring
()¶ Get the time in the Trados time format
-
set_time
(newtime)¶ Set the time_struct object
Parameters: newtime (time.time_struct) – a new time object
-
set_timestring
(timestring)¶ Set the time_struct object using a Trados time formatted string
Parameters: timestring (String) – A Trados time string (DDMMYYYY, hh:mm:ss)
-
time
¶ Get the time_struct object
-
timestring
¶ Get the time in the Trados time format
-
-
class
translate.storage.trados.
TradosTxtTmFile
(inputfile=None, **kwargs)¶ A Trados translation memory file
-
UnitClass
¶ alias of
TradosUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
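BOM-based detection can be sketched with the constants in the standard `codecs` module. The function below is an illustration only: `detect_bom`, the `BOMS` table and the `default` parameter are assumptions for this example, not the signature or behaviour of `fallback_detection()`.

```python
import codecs

# UTF-32 must be checked before UTF-16: the UTF-32 LE BOM starts with the
# same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom(data, default="utf-8"):
    """Guess an encoding name from a leading byte order mark, if any."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding
    return default


raw = "Verkehrssicherheit".encode("utf-16")
print(detect_bom(raw))
print(raw.decode(detect_bom(raw)))
```

The returned codec names consume the BOM on decoding, so the decoded text does not start with a stray U+FEFF.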
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
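Conceptually, a source index maps each source string to the units that carry it, so that lookups such as `findunit()` avoid a linear scan. The sketch below shows the idea with plain dictionaries standing in for units; `build_source_index` and the dictionary layout are assumptions for illustration, not the store's actual data structures.

```python
from collections import defaultdict


def build_source_index(units):
    """Map each source string to the list of units that use it."""
    index = defaultdict(list)
    for unit in units:
        index[unit["source"]].append(unit)
    return index


units = [
    {"source": "Open", "target": "Öffnen"},
    {"source": "Save", "target": "Speichern"},
    {"source": "Open", "target": "Offen"},
]
index = build_source_index(units)
print(len(index["Open"]), index["Save"][0]["target"])
```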
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(input)¶ parser to process the given source string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.trados.
TradosUnit
(source=None)¶ -
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings: - ‘translator’ - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs that are independent of other unit properties such as source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
-
translate.storage.trados.
escape
(text)¶ Convert Unicode string to Trados escapes
-
translate.storage.trados.
unescape
(text)¶ Convert Trados text to normal Unicode string
ts2¶
Module for handling Qt linguist (.ts) files.
This will eventually replace the older ts.py which only supports the older format. While converters haven’t been updated to use this module, we retain both.
TS file format 4.3, 4.8, 5. Example.
Specification of the valid variable entries, 2
-
class
translate.storage.ts2.
tsfile
(*args, **kwargs)¶ Class representing a TS file store.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addheader
()¶ Method to be overridden to initialise headers, etc.
-
addsourceunit
(source)¶ Adds and returns a new unit with the given string as first entry.
-
addunit
(unit, new=True, contextname=None, createifmissing=True)¶ Adds the given unit to the last used body node (current context).
If the contextname is specified, switch to that context (creating it if allowed by createifmissing).
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this .ts file.
The ‘sourcelanguage’ attribute was only added to the TS format in Qt v4.5. We return ‘en’ if there is no sourcelanguage set.
We don’t implement setsourcelanguage because users really shouldn’t be altering the source language in .ts files; it should be set correctly by the extraction tools.
Returns: ISO code e.g. af, fr, pt_BR Return type: String
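The fallback to ‘en’ can be pictured as a plain attribute lookup on the root `<TS>` element. The sketch below uses the standard library's ElementTree rather than the toolkit; `source_language` is a hypothetical helper, and the two sample documents are made up for this example.

```python
import xml.etree.ElementTree as ET


def source_language(ts_xml, default="en"):
    """Return the root element's sourcelanguage attribute, or the default."""
    root = ET.fromstring(ts_xml)
    return root.get("sourcelanguage", default)


# A file written before the attribute existed, and one that sets it.
OLD_STYLE = '<TS version="2.0" language="fr"></TS>'
NEW_STYLE = '<TS version="2.0" language="fr" sourcelanguage="de"></TS>'

print(source_language(OLD_STYLE))
print(source_language(NEW_STYLE))
```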
-
gettargetlanguage
()¶ Get the target language for this .ts file.
Returns: ISO code e.g. af, fr, pt_BR Return type: String
-
getunits
()¶ Return a list of all units in this store.
-
initbody
()¶ Initialises self.body.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
parse
(xml)¶ Populates this object from the given xml string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Write the XML document to a file.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this .ts file to targetlanguage.
Parameters: targetlanguage (String) – ISO code e.g. af, fr, pt_BR
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.ts2.
tsunit
(source, empty=False, **kwargs)¶ A single term in the TS file.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Add a note specifically in the appropriate comment tag
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
createlanguageNode
(lang, text, purpose)¶ Returns an xml Element setup with given parameters.
-
getNodeText
(languageNode, xml_space='preserve')¶ Retrieves the term from the given
languageNode
.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlanguageNode
(lang=None, index=None)¶ Retrieves a
languageNode
either by language or by index.
-
getlanguageNodes
()¶ We override this to get source and target nodes.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettarget
()¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction rather and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ States whether this unit needs to be reviewed
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
removenotes
(origin=None)¶ Remove all the translator notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(value)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows IDs that are independent of other unit properties such as source or context.
-
settarget
(target, lang='xx', append=False)¶ Sets the “target” string (second language), or alternatively appends to the list
-
statemap
= {'obsolete': -100, 'unfinished': 30, '': 100, None: 100}¶ This maps the unit “type” attribute to state.
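The `None` and empty-string keys cover translations that carry no `type` attribute at all, or an empty one. As a rough sketch of the lookup, assuming the attribute sits on the `<translation>` element as in Qt Linguist files, the example below re-declares the map and reads the attribute with the standard library; `translation_state` is a hypothetical helper, and types missing from the map are not handled.

```python
import xml.etree.ElementTree as ET

# Copied from the class attribute above.
STATEMAP = {"obsolete": -100, "unfinished": 30, "": 100, None: 100}


def translation_state(message_xml):
    """Look up the state for the type attribute of a message's translation."""
    message = ET.fromstring(message_xml)
    translation = message.find("translation")
    # Element.get() returns None when the attribute is absent, which is
    # exactly the key the map provides for finished translations.
    unit_type = translation.get("type") if translation is not None else None
    return STATEMAP[unit_type]


UNFINISHED = '<message><source>Hi</source><translation type="unfinished"/></message>'
FINISHED = "<message><source>Hi</source><translation>Hallo</translation></message>"

print(translation_state(UNFINISHED), translation_state(FINISHED))
```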
-
unit_iter
()¶ Iterator that only returns this unit.
-
ts¶
Module for parsing Qt .ts files for translation.
Currently this module supports the old format of .ts files. Some applications use the newer .ts format, which is documented here: TS file format 4.3, Example
txt¶
This class implements the functionality for handling plain text files, or similar wiki type files.
Supported formats are:
- Plain text
- dokuwiki
- MediaWiki
-
class
translate.storage.txt.
TxtFile
(inputfile=None, flavour=None, no_segmentation=False, **kwargs)¶ This class represents a text file, made up of txtunits
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Adds and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses. Return type: string
-
parse
(lines)¶ Read in text lines and create txtunits from the blocks of text
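One way to picture "blocks of text" is paragraphs separated by blank lines. The sketch below splits an iterable of lines on that basis; `split_blocks` is a hypothetical helper, and the blank-line rule is an assumption for the plain text case that may not match how the wiki flavours are segmented.

```python
def split_blocks(lines):
    """Group consecutive non-blank lines into blocks of text."""
    blocks, current = [], []
    for line in lines:
        if line.strip():
            current.append(line.rstrip("\n"))
        elif current:
            # A blank line closes the block collected so far.
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))
    return blocks


text = ["First paragraph,\n", "second line.\n", "\n", "\n", "Next paragraph.\n"]
for block in split_blocks(text):
    print(repr(block))
```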
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.txt.
TxtUnit
(source='', **kwargs)¶ This class represents a block of text from a text file
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
getcontext
()¶ Get the message context.
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this method.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
target
¶ gets the unquoted target string
-
unit_iter
()¶ Iterator that only returns this unit.
-
utx¶
Manage the Universal Terminology eXchange (UTX) format
UTX is a format for terminology exchange, apparently designed with Machine Translation (MT) as its primary consumer. The format was created by the Asia-Pacific Association for Machine Translation (AAMT).
It is a bilingual base class derived format with UtxFile
and UtxUnit
providing file and unit level access.
The format can manage monolingual dictionaries but these classes don’t implement that.
- Specification
- The format is implemented according to UTX v1.0 (No longer available from their website. The current UTX version may be downloaded instead).
- Format Implementation
- The UTX format is a Tab Separated Value (TSV) file in UTF-8. The first two lines are headers, with each subsequent line containing a single source-target definition.
- Encoding
- The files are UTF-8 encoded with no BOM and CR+LF line terminators.
-
class
translate.storage.utx.
UtxDialect
¶ Describe the properties of a UTX-generated TAB-delimited dictionary file.
-
class
translate.storage.utx.
UtxFile
(inputfile=None, **kwargs)¶ A UTX dictionary file
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Add and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ Parses the given file or file source string.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.utx.
UtxHeader
¶ A UTX header entry
A UTX header is a single line that looks like this:
#UTX-S <version>; <source language>/<target language>; <date created>; <optional fields (creator, license, etc.)>
Where:
- UTX-S version is currently 1.00.
- Source language/target language: ISO 639, 3166 formats. In the case of a monolingual dictionary, the target language should be omitted.
- Date created: ISO 8601 format.
- Optional fields (creator, license, etc.)
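The header layout described above can be split mechanically on semicolons. The following is a minimal sketch using only the standard library; `parse_utx_header` and `UtxHeaderInfo` are hypothetical names chosen for illustration and are not part of translate.storage.utx, and the sample header values are invented.

```python
from typing import List, NamedTuple, Optional


class UtxHeaderInfo(NamedTuple):
    """Hypothetical container; not part of translate.storage.utx."""
    version: str
    source_lang: str
    target_lang: Optional[str]
    date_created: str
    optional_fields: List[str]


def parse_utx_header(line: str) -> UtxHeaderInfo:
    """Split a '#UTX-S ...' header line into its semicolon-separated parts."""
    prefix = "#UTX-S"
    if not line.startswith(prefix):
        raise ValueError("not a UTX-S header line")
    fields = [part.strip() for part in line[len(prefix):].split(";")]
    if len(fields) < 3:
        raise ValueError("expected version, languages and creation date")
    version, languages, date_created = fields[:3]
    # A monolingual dictionary omits the target language.
    source_lang, _, target_lang = languages.partition("/")
    return UtxHeaderInfo(
        version=version,
        source_lang=source_lang.strip(),
        target_lang=target_lang.strip() or None,
        date_created=date_created,
        optional_fields=[field for field in fields[3:] if field],
    )


# Invented sample header, for illustration only.
header = parse_utx_header("#UTX-S 1.00; en-US/ja-JP; 2010-03-01; creator: example")
print(header.source_lang, header.target_lang, header.date_created)
```

The sketch does not validate the language codes or the date; it only shows where each field sits in the line.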
-
class
translate.storage.utx.
UtxUnit
(source=None)¶ A UTX dictionary unit
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
dict
¶ Get the dictionary of values for a UTX line
-
getcontext
()¶ Get the message context.
-
getdict
()¶ Get the dictionary of values for a UTX line
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this method.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setdict
(newdict)¶ Set the dictionary of values for a UTX line
Parameters: newdict (Dict) – a new dictionary with UTX line elements
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
versioncontrol¶
This module manages interaction with version control systems.
To implement support for a new version control system, inherit from
GenericRevisionControlSystem
.
TODO:
- Add authentication handling
- commitdirectory() should do a single commit instead of one for each file
- Maybe implement some caching for get_versioned_object() - check profiler
-
translate.storage.versioncontrol.
DEFAULT_RCS
= ['svn', 'cvs', 'darcs', 'git', 'bzr', 'hg']¶ the names of all supported revision control systems
Modules of the same name, each containing a class of the same name, are expected to be defined below ‘translate.storage.versioncontrol’.
-
class
translate.storage.versioncontrol.
GenericRevisionControlSystem
(location, oldest_parent=None)¶ Bases:
object
The super class for all version control classes.
Always inherit from this class to implement another RC interface.
At least the two attributes RCS_METADIR and SCAN_PARENTS must be overridden by all implementations that derive from this class.
By default, all implementations can rely on the following attributes:
- root_dir: the parent of the metadata directory of the working copy
- location_abs: the absolute path of the RCS object
- location_rel: the path of the RCS object relative to root_dir
-
RCS_METADIR
= None¶ The name of the metadata directory of the RCS
e.g.: for Subversion -> “.svn”
-
SCAN_PARENTS
= None¶ Whether to check the parent directories for the metadata directory of the RCS working copy
Some revision control systems store their metadata directory only in the base of the working copy (e.g. bzr, Git and Darcs); use True for these RCS.
Other RCS store a metadata directory in every single directory of the working copy (e.g. Subversion and CVS); use False for these RCS.
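To illustrate what SCAN_PARENTS controls, here is a minimal sketch of a metadata-directory lookup. It is not the toolkit's implementation: `find_root_dir` is a hypothetical helper, and the temporary directory layout is invented for the example.

```python
import os
import tempfile


def find_root_dir(location, metadir, scan_parents):
    """Return the directory that holds `metadir` for `location`, or None.

    Hypothetical helper for illustration only.
    """
    current = os.path.dirname(os.path.abspath(location))
    while True:
        if os.path.isdir(os.path.join(current, metadir)):
            return current
        parent = os.path.dirname(current)
        # Give up if parents must not be scanned or the filesystem root is reached.
        if not scan_parents or parent == current:
            return None
        current = parent


# Throwaway working copy: <tmp>/.git and <tmp>/po/de/app.po
base = os.path.realpath(tempfile.mkdtemp())
os.mkdir(os.path.join(base, ".git"))
os.makedirs(os.path.join(base, "po", "de"))
po_file = os.path.join(base, "po", "de", "app.po")
with open(po_file, "w"):
    pass

print(find_root_dir(po_file, ".git", scan_parents=True))
print(find_root_dir(po_file, ".git", scan_parents=False))
```

With scanning enabled the lookup walks up from the file's directory until it meets the metadata directory; with scanning disabled only the file's own directory is checked.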
-
add
(files, message=None, author=None)¶ Dummy to be overridden by real implementations
-
commit
(message=None, author=None)¶ Dummy to be overridden by real implementations
-
getcleanfile
(revision=None)¶ Dummy to be overridden by real implementations
-
update
(revision=None, needs_revert=True)¶ Dummy to be overridden by real implementations
-
translate.storage.versioncontrol.
commitdirectory
(directory, message=None, author=None)¶ Commit all files below the given directory.
Files that are just symlinked into the directory are supported, too
-
translate.storage.versioncontrol.
get_available_version_control_systems
()¶ return the class objects of all locally available version control systems
-
translate.storage.versioncontrol.
get_versioned_object
(location, versioning_systems=None, follow_symlinks=True, oldest_parent=None)¶ return a versioned object for the given file
-
translate.storage.versioncontrol.
get_versioned_objects_recursive
(location, versioning_systems=None, follow_symlinks=True)¶ return a list of objects, each pointing to a file below this directory
-
translate.storage.versioncontrol.
run_command
(command, cwd=None)¶ Runs a command (array of program name and arguments) and returns the exitcode, the output and the error as a tuple.
Parameters: - command (list) – list of arguments to be joined for a program call
- cwd (str) – optional directory where the command should be executed
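The return convention described above (exit code, output, error) can be sketched with the standard subprocess module. This shows the general approach only and is not the toolkit's code: `run_command_sketch` is a hypothetical name, and decoding the output as text is an assumption made for this example.

```python
import subprocess
import sys


def run_command_sketch(command, cwd=None):
    """Run `command` (a list of program name and arguments).

    Returns a tuple of (exitcode, output, error).
    """
    proc = subprocess.run(
        command,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode both streams as text
    )
    return proc.returncode, proc.stdout, proc.stderr


# Use the running Python interpreter so the example works on any platform.
exitcode, output, error = run_command_sketch([sys.executable, "-c", "print('ok')"])
print(exitcode, output.strip())
```

Passing the command as a list avoids shell quoting issues, which matters when file names contain spaces.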
-
translate.storage.versioncontrol.
updatedirectory
(directory)¶ Update all files below the given directory.
Files that are just symlinked into the directory are supported, too
bzr¶
-
class
translate.storage.versioncontrol.bzr.
bzr
(location, oldest_parent=None)¶ Class to manage items under revision control of bzr.
-
add
(files, message=None, author=None)¶ Add and commit files.
-
commit
(message=None, author=None)¶ Commits the file and supplies the given commit message if present
-
getcleanfile
(revision=None)¶ Get a clean version of a file from the bzr repository
-
update
(revision=None, needs_revert=True)¶ Does a clean update of the given path
-
-
translate.storage.versioncontrol.bzr.
get_version
()¶ return a tuple of (major, minor) for the installed bazaar client
-
translate.storage.versioncontrol.bzr.
is_available
()¶ check if bzr is installed
cvs¶
-
class
translate.storage.versioncontrol.cvs.
cvs
(location, oldest_parent=None)¶ Class to manage items under revision control of CVS.
-
add
(files, message=None, author=None)¶ Add and commit the new files.
-
commit
(message=None, author=None)¶ Commits the file and supplies the given commit message if present
the ‘author’ parameter is not suitable for CVS, thus it is ignored
-
getcleanfile
(revision=None)¶ Get the content of the file for the given revision
-
update
(revision=None, needs_revert=True)¶ Does a clean update of the given path
-
-
translate.storage.versioncontrol.cvs.
is_available
()¶ check if cvs is installed
darcs¶
-
class
translate.storage.versioncontrol.darcs.
darcs
(location, oldest_parent=None)¶ Class to manage items under revision control of darcs.
-
add
(files, message=None, author=None)¶ Add and commit files.
-
commit
(message=None, author=None)¶ Commits the file and supplies the given commit message if present
-
getcleanfile
(revision=None)¶ Get a clean version of a file from the darcs repository
Parameters: revision – ignored for darcs
-
update
(revision=None, needs_revert=True)¶ Does a clean update of the given path
Parameters: revision – ignored for darcs
-
-
translate.storage.versioncontrol.darcs.
is_available
()¶ check if darcs is installed
git¶
-
class
translate.storage.versioncontrol.git.
git
(location, oldest_parent=None)¶ Class to manage items under revision control of git.
-
add
(files, message=None, author=None)¶ Add and commit the new files.
-
commit
(message=None, author=None, add=True)¶ Commits the file and supplies the given commit message if present
-
getcleanfile
(revision=None)¶ Get a clean version of a file from the git repository
-
update
(revision=None, needs_revert=True)¶ Does a clean update of the given path
-
-
translate.storage.versioncontrol.git.
is_available
()¶ check if git is installed
hg¶
-
translate.storage.versioncontrol.hg.
get_version
()¶ Return a tuple of (major, minor) for the installed mercurial client.
-
class
translate.storage.versioncontrol.hg.
hg
(location, oldest_parent=None)¶ Class to manage items under revision control of mercurial.
-
add
(files, message=None, author=None)¶ Add and commit the new files.
-
commit
(message=None, author=None)¶ Commits the file and supplies the given commit message if present
-
getcleanfile
(revision=None)¶ Get a clean version of a file from the hg repository
-
update
(revision=None, needs_revert=True)¶ Does a clean update of the given path
Parameters: revision – ignored for hg
-
-
translate.storage.versioncontrol.hg.
is_available
()¶ check if hg is installed
svn¶
-
translate.storage.versioncontrol.svn.
get_version
()¶ return a tuple of (major, minor) for the installed subversion client
-
translate.storage.versioncontrol.svn.
is_available
()¶ check if svn is installed
-
class
translate.storage.versioncontrol.svn.
svn
(location, oldest_parent=None)¶ Class to manage items under revision control of Subversion.
-
add
(files, message=None, author=None)¶ Add and commit the new files.
-
commit
(message=None, author=None)¶ commit the file and return the given message if present
the ‘author’ parameter is used for revision property ‘translate:author’
-
getcleanfile
(revision=None)¶ return the content of the ‘head’ revision of the file
-
update
(revision=None, needs_revert=True)¶ update the working copy - remove local modifications if necessary
-
wordfast¶
Manage the Wordfast Translation Memory format
Wordfast TM format is the Translation Memory format used by the Wordfast computer aided translation tool.
It is a bilingual base class derived format with WordfastTMFile
and WordfastUnit
providing file and unit level access.
Wordfast is a computer aided translation tool. It is an application built on top of Microsoft Word and is implemented as a rather sophisticated set of macros. Understanding that helps us understand many of the seemingly strange choices around this format including: encoding, escaping and file naming.
- Implementation
The implementation covers the full requirements of a Wordfast TM file. The files are simple Tab Separated Value (TSV) files that can be read by Microsoft Excel and other spreadsheet programs. They use the .txt extension which does make it more difficult to automatically identify such files.
The dialect of the TSV files is specified by
WordfastDialect
.
- Encoding
The files are UTF-16 or ISO-8859-1 (Latin1) encoded. These choices are most likely because Microsoft Word is the base editing tool for Wordfast.
The format is tab separated, so we are able to detect UTF-16 vs Latin-1 by searching for the occurrence of a UTF-16 tab character and then continuing with the parsing.
- Timestamps
WordfastTime
allows for the correct management of the Wordfast YYYYMMDD~HHMMSS timestamps. However, timestamps on individual units are not updated when edited.
- Header
WordfastHeader
provides header management support. The header functionality is fully implemented through observing the behaviour of the files in real use cases, input from the Wordfast programmers and public documentation.
- Escaping
Wordfast TM implements a form of escaping that covers two aspects:
- Placeable: bold, formatting, etc. These are left as is and ignored. It is up to the editor and future placeable implementation to manage these.
- Escapes: items that may confuse Excel or translators are escaped as
&'XX;
. These are fully implemented and are converted to and from Unicode. By observing behaviour and reading documentation we were able to identify all possible escapes. Unfortunately the escaping differs slightly between the Windows and Mac versions, which might cause errors in future. Functions allow for conversion to Unicode (<_wf_to_char>) and back to Wordfast escapes (<_char_to_wf>).
- Extended Attributes
- The last 4 columns allow users to define and manage extended attributes. These are left as is and are not directly managed by our implementation.
-
translate.storage.wordfast.
TAB_UTF16
= b'\x00\t'¶ The tab (\t) character as it would appear in UTF-16 encoding
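A sketch of how such a constant could drive the UTF-16 versus Latin-1 check described in the Encoding notes. `guess_wordfast_encoding` is a hypothetical helper, not the module's own detection code, and the sample row is invented. The byte pair also appears in little-endian UTF-16 whenever the tab follows a character below U+0100, because that character's zero high byte is immediately followed by 0x09.

```python
TAB_UTF16 = b"\x00\t"  # same value as the constant documented above


def guess_wordfast_encoding(data: bytes) -> str:
    """Hypothetical helper: choose 'utf-16' if a UTF-16 tab is present."""
    return "utf-16" if TAB_UTF16 in data else "iso-8859-1"


# Invented sample row; the fields are separated by tabs.
row = "20240131~235959\tUSER\t0\tEN-US\tHello"

for encoding in ("utf-16", "iso-8859-1"):
    raw = row.encode(encoding)
    guessed = guess_wordfast_encoding(raw)
    # Decoding with the guessed encoding should give back the original text.
    print(encoding, "->", guessed, raw.decode(guessed) == row)
```

Latin-1 text contains no NUL bytes in practice, so the absence of the byte pair is taken as the signal for the single-byte encoding.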
-
translate.storage.wordfast.
WF_ESCAPE_MAP
= (("&'26;", '&'), ("&'82;", '‚'), ("&'85;", '…'), ("&'91;", '‘'), ("&'92;", '’'), ("&'93;", '“'), ("&'94;", '”'), ("&'96;", '–'), ("&'97;", '—'), ("&'99;", '™'), ("&'A0;", '\xa0'), ("&'A9;", '©'), ("&'AE;", '®'), ("&'BC;", '¼'), ("&'BD;", '½'), ("&'BE;", '¾'), ("&'A8;", '®'), ("&'AA;", '™'), ("&'C7;", '«'), ("&'C8;", '»'), ("&'C9;", '…'), ("&'CA;", '\xa0'), ("&'D0;", '–'), ("&'D1;", '—'), ("&'D2;", '“'), ("&'D3;", '”'), ("&'D4;", '‘'), ("&'D5;", '’'), ("&'E2;", '‚'), ("&'E3;", '„'))¶ Mapping of Wordfast &’XX; escapes to correct Unicode characters
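As an illustration of how a mapping like this can be applied in both directions, the sketch below uses a four-entry subset copied from the tuple above. `unescape_wf` and `escape_wf` are hypothetical names and are not the module's private `_wf_to_char` / `_char_to_wf` functions, whose implementation is not shown here. Each direction is done in a single pass so that the ampersand entry cannot produce or consume other escapes by accident.

```python
import re

# Subset of WF_ESCAPE_MAP, copied from the listing above.
ESCAPE_SUBSET = (
    ("&'26;", "&"),
    ("&'85;", "\u2026"),  # horizontal ellipsis
    ("&'A0;", "\xa0"),    # no-break space
    ("&'A9;", "\xa9"),    # copyright sign
)

_TO_CHAR = dict(ESCAPE_SUBSET)
_TO_ESCAPE = {ord(char): escape for escape, char in ESCAPE_SUBSET}
_ESCAPE_RE = re.compile(r"&'[0-9A-F]{2};")


def unescape_wf(text):
    """Replace known &'XX; escapes in one pass; unknown ones are left alone."""
    return _ESCAPE_RE.sub(lambda m: _TO_CHAR.get(m.group(0), m.group(0)), text)


def escape_wf(text):
    """Replace each mapped character with its &'XX; escape in one pass."""
    return text.translate(_TO_ESCAPE)


escaped = escape_wf("Tom & Jerry\u2026")
print(escaped)
print(unescape_wf(escaped) == "Tom & Jerry\u2026")
```

Note that the full map lists some characters under two escapes (the Windows and Mac variants), so a complete reverse mapping has to choose one escape per character; the subset here avoids that question.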
-
translate.storage.wordfast.
WF_FIELDNAMES
= ['date', 'user', 'reuse', 'src-lang', 'source', 'target-lang', 'target', 'attr1', 'attr2', 'attr3', 'attr4']¶ Field names for a Wordfast TU
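To show how these field names line up with the columns of a TM line, here is a sketch that reads one tab-separated row with the standard csv module. The sample row is invented, and the delimiter and quoting settings are assumptions for this example, since the actual settings of WordfastDialect are not listed here.

```python
import csv
import io

# Same names as the WF_FIELDNAMES list documented above.
WF_FIELDNAMES = ["date", "user", "reuse", "src-lang", "source",
                 "target-lang", "target", "attr1", "attr2", "attr3", "attr4"]

# Invented sample line: eleven tab-separated columns, the last four empty.
line = "20240131~235959\tTT\t0\tEN-US\tHello\tDE-DE\tHallo\t\t\t\t\r\n"

reader = csv.DictReader(
    io.StringIO(line, newline=""),
    fieldnames=WF_FIELDNAMES,
    delimiter="\t",
    quoting=csv.QUOTE_NONE,  # assumption: treat quote characters literally
)
unit = next(reader)
print(unit["source"], "->", unit["target"])
```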
-
translate.storage.wordfast.
WF_FIELDNAMES_HEADER
= ['date', 'userlist', 'tucount', 'src-lang', 'version', 'target-lang', 'license', 'attr1list', 'attr2list', 'attr3list', 'attr4list', 'attr5list']¶ Field names for the Wordfast header
-
translate.storage.wordfast.
WF_FIELDNAMES_HEADER_DEFAULTS
= {'attr1list': '', 'attr2list': '', 'attr3list': '', 'attr4list': '', 'attr5list': '', 'date': '%19000101~121212', 'license': '%---00000001', 'src-lang': '%EN-US', 'target-lang': '', 'tucount': '%TU=00000001', 'userlist': '%User ID,TT,TT Translate-Toolkit', 'version': '%Wordfast TM v.5.51w9/00'}¶ Default or minimum header entries for a Wordfast file
-
translate.storage.wordfast.
WF_TIMEFORMAT
= '%Y%m%d~%H%M%S'¶ Time format used by Wordfast
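The format string can be used directly with the standard time module. The short sketch below parses an invented timestamp and formats it back; it does not use WordfastTime itself.

```python
import time

WF_TIMEFORMAT = "%Y%m%d~%H%M%S"  # same value as the constant above

# Parse a Wordfast-style timestamp into a time.struct_time ...
parsed = time.strptime("20240131~235959", WF_TIMEFORMAT)
# ... and format it back into the same string representation.
roundtrip = time.strftime(WF_TIMEFORMAT, parsed)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday, roundtrip)
```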
-
class
translate.storage.wordfast.
WordfastDialect
¶ Describe the properties of a Wordfast generated TAB-delimited file.
-
class
translate.storage.wordfast.
WordfastHeader
(header=None)¶ A Wordfast translation memory header
-
getheader
()¶ Get the header dictionary
-
header
¶ Get the header dictionary
-
-
class
translate.storage.wordfast.
WordfastTMFile
(inputfile=None, **kwargs)¶ A Wordfast translation memory file
-
UnitClass
¶ alias of
WordfastUnit
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addsourceunit
(source)¶ Add and returns a new unit with the given source string.
Return type: TranslationUnit
-
addunit
(unit)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
parse
(input)¶ Parses the given file or file source string.
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Convert the string representation back to an object.
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a bytes representation that can be parsed back using
parsestring()
. out should be an open file-like object to write to.
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(sourcelanguage)¶ Set the source language for this store.
-
settargetlanguage
(targetlanguage)¶ Set the target language for this store.
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.wordfast.
WordfastTime
(newtime=None)¶ Manages time stamps in the Wordfast format of YYYYMMDD~hhmmss
-
get_time
()¶ Get the struct_time object
-
get_timestring
()¶ Get the time in the Wordfast time format
-
set_time
(newtime)¶ Set the struct_time object
Parameters: newtime (time.struct_time) – a new time object
-
set_timestring
(timestring)¶ Set the struct_time object using a Wordfast time formatted string
Parameters: timestring (String) – A Wordfast time string (YYYYMMDD~hhmmss)
-
time
¶ Get the struct_time object
-
timestring
¶ Get the time in the Wordfast time format
-
-
class
translate.storage.wordfast.
WordfastUnit
(source=None)¶ A Wordfast translation memory unit
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
Parameters: - errorname (string) – A single word to id the error.
- errortext (string) – The text describing the error.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Adds a note (comment).
Parameters: - text (string) – Usually just a sentence or two.
- origin (string) – Specifies who/where the comment comes from. Origin can be one of the following text strings:
  - ‘translator’
  - ‘developer’, ‘programmer’, ‘source code’ (synonyms)
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
dict
¶ Get the dictionary of values for a Wordfast line
-
getcontext
()¶ Get the message context.
-
getdict
()¶ Get the dictionary of values for a Wordfast line
-
geterrors
()¶ Get all error messages.
Return type: Dictionary
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used mostly to find translatable units; we might want to move in that direction instead and get rid of this method.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ indicate whether a unit is obsolete
-
isreview
()¶ Indicates whether this unit needs review.
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Parameters: - needsreview – Defaults to True.
- explanation – Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
multistring_to_rich
(mulstring)¶ Convert a multistring to a list of “rich” string trees:
>>> target = multistring([u'foo', u'bar', u'baz'])
>>> TranslationUnit.multistring_to_rich(target)
[<StringElem([<StringElem([u'foo'])>])>,
 <StringElem([<StringElem([u'bar'])>])>,
 <StringElem([<StringElem([u'baz'])>])>]
-
removenotes
()¶ Remove all the translator’s notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Convert a “rich” string tree to a
multistring
:
>>> from translate.storage.placeables.interfaces import X
>>> rich = [StringElem(['foo', X(id='xxx', sub=[' ']), 'bar'])]
>>> TranslationUnit.rich_to_multistring(rich)
multistring(u'foo bar')
-
setcontext
(context)¶ Set the message context
-
setdict
(newdict)¶ Set the dictionary of values for a Wordfast line
Parameters: newdict (Dict) – a new dictionary with Wordfast line elements
-
setid
(value)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
unit_iter
()¶ Iterator that only returns this unit.
-
workflow¶
A workflow is defined by a set of states that a translation unit can be in and the (allowed) transitions between these states. A state is defined by a range between -128 and 127, indicating its level of “completeness”. The range is closed at the beginning and open at the end. That is, if a workflow contains states A, B and C where A < B < C, a unit with state number n is in state A if A <= n < B, state B if B <= n < C or state C if C <= n < MAX.
A value of 0 is typically the “empty” or “new” state with negative values reserved for states like “obsolete” or “do not use”.
Format specific workflows should be defined in such a way that the numeric state values correspond to similar states. For example state 0 should be “untranslated” in PO and “new” or “empty” in XLIFF, state 100 should be “translated” in PO and “final” in XLIFF. This allows formats to implicitly define similar states.
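The half-open range rule above can be sketched as a lookup from a state number to a state name. This is only an illustration: the state table, the names in it, the `state_for` helper and the value chosen for `MAX` are assumptions made for the example, not the constants or API of `translate.storage.workflow`.

```python
from bisect import bisect_right

# Hypothetical state table: (lower bound, name), sorted by lower bound.
# Names and numbers are illustrative, not the toolkit's own constants.
STATES = [
    (-128, "obsolete"),
    (0, "untranslated"),
    (50, "needs-review"),
    (100, "translated"),
]
MAX = 128  # assumed exclusive upper bound, one past the largest value 127


def state_for(n, states=STATES, upper=MAX):
    """Return the name of the state whose range [lower, next_lower) holds n."""
    lowers = [lower for lower, _ in states]
    if not (lowers[0] <= n < upper):
        raise ValueError("state number %d is outside the workflow" % n)
    # bisect_right gives the position of the first lower bound that is
    # strictly greater than n; the state just before it contains n.
    return states[bisect_right(lowers, n) - 1][1]
```

With this table, any number from 0 up to but not including 50 maps to "untranslated", which is the "A <= n < B" rule from the text applied to the first non-negative state.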
-
exception
translate.storage.workflow.
InvalidStateObjectError
(obj)¶ -
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
exception
translate.storage.workflow.
NoInitialStateError
¶ -
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
-
class
translate.storage.workflow.
StateEnum
¶ Only contains the constants for default states.
-
exception
translate.storage.workflow.
StateNotInWorkflowError
(state)¶ -
with_traceback
()¶ Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
-
xliff¶
Module for handling XLIFF files for translation.
The official recommendation is to use the extension .xlf for XLIFF files.
-
class
translate.storage.xliff.
xlifffile
(*args, **kwargs)¶ Class representing an XLIFF file store.
-
add_unit_to_index
(unit)¶ Add a unit to source and location indexes
-
addheader
()¶ Initialise the file header.
-
addsourceunit
(source, filename='NoName', createifmissing=False)¶ Adds the given trans-unit to the last used body node. If the filename has changed, it uses the slow method instead (and will create the nodes required if asked). Returns success.
-
addunit
(unit, new=True)¶ Append the given unit to the object’s list of units.
This method should always be used rather than trying to modify the list manually.
Parameters: unit ( TranslationUnit
) – The unit that will be added.
-
createfilenode
(filename, sourcelanguage=None, targetlanguage=None, datatype='plaintext')¶ creates a filenode with the given filename. All parameters are needed for XLIFF compliance.
-
creategroup
(filename='NoName', createifmissing=False, restype=None)¶ adds a group tag into the specified file
-
detect_encoding
(text, default_encodings=None)¶ Try to detect a file encoding from text, using either the chardet lib or by trying to decode the file.
-
fallback_detection
(text)¶ Simple detection based on BOM in case chardet is not available.
-
findid
(id)¶ find unit with matching id by checking id_index
-
findunit
(source)¶ Find the unit with the given source string.
Return type: TranslationUnit
or None
-
findunits
(source)¶ Find the units with the given source string.
Return type: TranslationUnit
or None
-
getbodynode
(filenode, createifmissing=False)¶ finds the body node for the given filenode
-
getdatatype
(filename=None)¶ Returns the datatype of the stored file. If no filename is given, the datatype of the first file is given.
-
getdate
(filename=None)¶ Returns the date attribute for the file.
If no filename is given, the date of the first file is given. If the date attribute is not specified, None is returned.
Returns: Date attribute of file
Return type: Date or None
-
getfilename
(filenode)¶ returns the name of the given file
-
getfilenames
()¶ returns all filenames in this XLIFF file
-
getfilenode
(filename, createifmissing=False)¶ finds the filenode with the given name
-
getheadernode
(filenode, createifmissing=False)¶ finds the header node for the given filenode
-
getids
(filename=None)¶ return a list of unit ids
-
getprojectstyle
()¶ Get the project type for this store.
-
getsourcelanguage
()¶ Get the source language for this store.
-
gettargetlanguage
()¶ Get the target language for this store.
-
getunits
()¶ Return a list of all units in this store.
-
initbody
()¶ Initialises self.body so it never needs to be retrieved from the XML again.
-
isempty
()¶ Return True if the object doesn’t contain any translation units.
-
makeindex
()¶ Indexes the items in this store. At least .sourceindex should be useful.
-
merge_on
¶ The matching criterion to use when merging on.
Returns: The default matching criterion for all the subclasses.
Return type: string
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
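As a rough illustration of Clark notation, the sketch below builds a qualified name by hand and uses it to look up an element. The `clark` helper is a hypothetical name invented for this example, and the standard library's `xml.etree.ElementTree` (which accepts the same `{namespace}tag` form as lxml) is used so the snippet has no third-party dependency.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"


def clark(namespace, name):
    """Return name in Clark notation, or the bare name without a namespace."""
    if namespace:
        return "{%s}%s" % (namespace, name)
    return name


# A tiny namespaced document; the default namespace applies to <source> too.
doc = ET.fromstring(
    '<xliff xmlns="%s"><source>Hello</source></xliff>' % XLIFF_NS
)

# Element lookups need the qualified name, not just "source".
source_node = doc.find(clark(XLIFF_NS, "source"))
```

Searching for plain `"source"` in the same document finds nothing, which is why a helper of this kind is convenient wherever elements are looked up by tag.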
-
parse
(xml)¶ Populates this object from the given xml string
-
classmethod
parsefile
(storefile)¶ Reads the given file (or opens the given filename) and parses back to an object.
-
classmethod
parsestring
(storestring)¶ Parses the string to return the correct file object
-
remove_unit_from_index
(unit)¶ Remove a unit from source and location indexes
-
removedefaultfile
()¶ We want to remove the default file-tag as soon as possible if we know it is still present and empty.
-
require_index
()¶ make sure source index exists
-
save
()¶ Save to the file that data was originally read from, if available.
-
savefile
(storefile)¶ Write the string representation to the given file (or filename).
-
serialize
(out)¶ Converts to a string containing the file’s XML
-
setfilename
(filenode, filename)¶ set the name of the given file
-
setprojectstyle
(project_style)¶ Set the project type for this store.
-
setsourcelanguage
(language)¶ Set the source language for this store.
-
settargetlanguage
(language)¶ Set the target language for this store.
-
suggestions_in_format
= True¶ xliff units have alttrans tags which can be used to store suggestions
-
switchfile
(filename, createifmissing=False)¶ Adds the given trans-unit (will create the nodes required if asked).
Returns: Success
Return type: Boolean
-
translate
(source)¶ Return the translated string for a given source string.
Return type: String or None
-
unit_iter
()¶ Iterator over all the units in this store.
-
-
class
translate.storage.xliff.
xliffunit
(source, empty=False, **kwargs)¶ A single term in the xliff file.
-
addalttrans
(txt, origin=None, lang=None, sourcetxt=None, matchquality=None)¶ Adds an alt-trans tag and alt-trans components to the unit.
Parameters: txt (String) – Alternative translation of the source text.
-
adderror
(errorname, errortext)¶ Adds an error message to this unit.
-
addlocation
(location)¶ Add one location to the list of locations.
Note
Shouldn’t be implemented if the format doesn’t support it.
-
addlocations
(location)¶ Add a location or a list of locations.
Note
Most classes shouldn’t need to implement this, but should rather implement
TranslationUnit.addlocation()
.
Warning
This method might be removed in future.
-
addnote
(text, origin=None, position='append')¶ Add a note specifically in a “note” tag
-
classmethod
buildfromunit
(unit)¶ Build a native unit from a foreign unit, preserving as much information as possible.
-
correctorigin
(node, origin)¶ Check against node tag’s origin (e.g. note or alt-trans)
-
createcontextgroup
(name, contexts=None, purpose=None)¶ Add the context group to the trans-unit with contexts a list with (type, text) tuples describing each context.
-
createlanguageNode
(lang, text, purpose)¶ Returns an xml Element setup with given parameters.
-
delalttrans
(alternative)¶ Removes the supplied alternative from the list of alt-trans tags
-
getNodeText
(languageNode, xml_space='preserve')¶ Retrieves the term from the given
languageNode
.
-
get_rich_target
(lang=None)¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
getalttrans
(origin=None)¶ Returns <alt-trans> for the given origin as a list of units. No origin means all alternatives.
-
getcontext
()¶ Get the message context.
-
getcontextgroups
(name)¶ Returns the contexts in the context groups with the specified name
-
geterrors
()¶ Get all error messages.
-
getid
()¶ A unique identifier for this unit.
Return type: string
Returns: an identifier for this unit that is unique in the store
Derived classes should override this in a way that guarantees a unique identifier for each unit in the store.
-
getlanguageNode
(lang=None, index=None)¶ Retrieves a
languageNode
either by language or by index.
-
getlanguageNodes
()¶ We override this to get source and target nodes.
-
getlocations
()¶ A list of source code locations.
Return type: List
Note
Shouldn’t be implemented if the format doesn’t support it.
-
getnotes
(origin=None)¶ Returns all notes about this unit.
It will probably be freeform text or something reasonable that can be synthesised by the format. It should not include location comments (see
getlocations()
).
-
getrestype
()¶ returns the restype attribute in the trans-unit tag
-
gettarget
(lang=None)¶ retrieves the “target” text (second entry), or the entry in the specified language, if it exists
-
gettargetlen
()¶ Returns the length of the target string.
Return type: Integer
Note
Plural forms might be combined.
-
getunits
()¶ This unit in a list.
-
hasplural
()¶ Tells whether or not this specific unit has plural strings.
-
infer_state
()¶ Empty method that should be overridden in sub-classes to infer the current state(_n) of the unit from its current state.
-
isapproved
()¶ States whether this unit is approved.
-
isblank
()¶ Used to see if this unit has no source or target string.
Note
This is probably used more to find translatable units, and we might want to move in that direction instead and get rid of this.
-
isfuzzy
()¶ Indicates whether this unit is fuzzy.
-
isheader
()¶ Indicates whether this unit is a header.
-
isobsolete
()¶ Indicates whether this unit is obsolete.
-
isreview
()¶ States whether this unit needs to be reviewed
-
istranslatable
()¶ Indicates whether this unit can be translated.
This should be used to distinguish real units for translation from header, obsolete, binary or other blank units.
-
istranslated
()¶ Indicates whether this unit is translated.
This should be used rather than deducing it from .target, to ensure that other classes can implement more functionality (as XLIFF does).
-
makeobsolete
()¶ Make a unit obsolete
-
markapproved
(value=True)¶ Mark this unit as approved.
-
markfuzzy
(value=True)¶ Marks the unit as fuzzy or not.
-
markreviewneeded
(needsreview=True, explanation=None)¶ Marks the unit to indicate whether it needs review.
Adds an optional explanation as a note.
-
merge
(otherunit, overwrite=False, comments=True, authoritative=False)¶ Do basic format agnostic merging.
-
classmethod
multistring_to_rich
(mstr)¶ Override
TranslationUnit.multistring_to_rich()
which is used by the rich_source
and rich_target
properties.
-
namespaced
(name)¶ Returns name in Clark notation.
For example
namespaced("source")
in an XLIFF document might return: {urn:oasis:names:tc:xliff:document:1.1}source
This is needed throughout lxml.
-
removenotes
(origin='translator')¶ Remove all the translator notes.
-
rich_source
¶ See also
-
rich_target
¶ See also
-
classmethod
rich_to_multistring
(elem_list)¶ Override
TranslationUnit.rich_to_multistring()
which is used by the rich_source
and rich_target
properties.
-
setcontext
(context)¶ Set the message context
-
setid
(id)¶ Sets the unique identifier for this unit.
Only implemented if the format allows ids independent from other unit properties like source or context.
-
settarget
(target, lang='xx', append=False)¶ Sets the target string to the given value.
-
unit_iter
()¶ Iterator that only returns this unit.
-
xml_extract¶
extract¶
-
class
translate.storage.xml_extract.extract.
ParseState
(no_translate_content_elements, inline_elements={}, nsmap={})¶ Maintain constants and variables used during the walking of a DOM tree (via the function apply).
-
class
translate.storage.xml_extract.extract.
Translatable
(placeable_name, xpath, dom_node, source, is_inline=False)¶ A node corresponds to a translatable element. A node may have children, which correspond to placeables.
-
has_translatable_text
¶ Check if it contains any chunk of text with more than whitespace.
If not, then there’s nothing to translate.
-
-
translate.storage.xml_extract.extract.
build_idml_store
(odf_file, store, parse_state, store_adder=None)¶ Build a store for the given IDML file.
-
translate.storage.xml_extract.extract.
build_store
(odf_file, store, parse_state, store_adder=None)¶ Build a store for the given XML file.
-
translate.storage.xml_extract.extract.
make_postore_adder
(store, id_maker, filename)¶ Return a function which, when called with a Translatable will add a unit to ‘store’. The placeables will be represented as strings according to ‘placeable_quoter’.
-
translate.storage.xml_extract.extract.
process_translatable
(dom_node, state)¶ Process a translatable DOM node.
Any translatable content present in a child node is treated as a placeable.
generate¶
-
translate.storage.xml_extract.generate.
find_dom_root
(parent_dom_node, dom_node)¶ See also
-
translate.storage.xml_extract.generate.
find_placeable_dom_tree_roots
(unit_node)¶ For an inline placeable, find the root DOM node for the placeable in its parent.
Consider the diagram. In this pseudo-ODF example, there is an inline span element. However, the span is contained in other tags (which we never process). When splicing the template DOM tree (that is, the DOM which comes from the XML document we’re using to generate a translated XML document), we’ll need to move DOM sub-trees around and we need the roots of these sub-trees:
<p> This is text
     \/                <- Paragraph containing an inline placeable
     <blah>            <- Inline placeable's root (which we want to find)
     ...               <- Any number of intermediate DOM nodes
     <span> bold text  <- The inline placeable's Translatable
                          holds a reference to this DOM node
-
translate.storage.xml_extract.generate.
get_xliff_source_target_doms
(unit)¶ Return a tuple with unit source and target DOM objects.
This method is meant to provide a way to retrieve the DOM objects for the unit source and target for XLIFF stores.
-
translate.storage.xml_extract.generate.
replace_dom_text
(make_parse_state, dom_retriever=<function get_xliff_source_target_doms>, process_translatable=<function process_translatable>)¶ Return a function:
action: etree_Element x base.TranslationUnit -> None
which takes a dom_node and a translation unit. The dom_node is rearranged according to rearrangement of placeables in unit.target (relative to their positions in unit.source).
misc¶
-
translate.storage.xml_extract.misc.
compose_mappings
(left, right)¶ Given two mappings left: A -> B and right: B -> C, create a hash result_map: A -> C. Only values in left (i.e. things from B) which have corresponding keys in right will have their keys mapped to values in right.
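The composition described above can be sketched with a dictionary comprehension. The function name `compose` and the sample data are assumptions made for this illustration; this is not the library's implementation.

```python
def compose(left, right):
    """Compose left: A -> B with right: B -> C into a dict A -> C.

    Keys of left whose value has no entry in right are dropped.
    """
    return {a: right[b] for a, b in left.items() if b in right}


left = {"x": 1, "y": 2, "z": 3}
right = {1: "one", 3: "three"}
result = compose(left, right)
```

Here `"y"` maps to 2 in `left`, but 2 is not a key of `right`, so `"y"` does not appear in the result.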
-
translate.storage.xml_extract.misc.
parse_tag
(full_tag)¶
>>> parse_tag('{urn:oasis:names:tc:opendocument:xmlns:office:1.0}document-content')
('urn:oasis:names:tc:opendocument:xmlns:office:1.0', 'document-content')
>>> parse_tag('document-content')
('', 'document-content')
-
translate.storage.xml_extract.misc.
reduce_tree
(f, parent_unit_node, unit_node, get_children, *state)¶ Enumerate a tree, applying f in a pre-order fashion to each node.
parent_unit_node contains the parent of unit_node. For the root of the tree, parent_unit_node == unit_node.
get_children is a single argument function applied to a unit_node to get a list/iterator to its children.
state is used by f to modify state information relating to whatever f does to the tree.
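A minimal sketch of such a pre-order reduction follows, assuming a tree of nested dictionaries and a callback that threads one piece of state through the walk. The names `reduce_tree_sketch`, `record` and the sample tree are hypothetical, and the tuple handling is one possible way to pass `*state` along rather than the library's exact behaviour.

```python
def reduce_tree_sketch(f, parent_node, node, get_children, *state):
    """Pre-order walk: apply f to node first, then recurse into its children.

    f receives (parent_node, node, *state) and returns the new state.
    A non-tuple return value is wrapped so it can be re-expanded with *.
    """
    state = f(parent_node, node, *state)
    if not isinstance(state, tuple):
        state = (state,)
    for child in get_children(node):
        # Each recursive call already returns a tuple.
        state = reduce_tree_sketch(f, node, child, get_children, *state)
    return state


tree = {"name": "root", "children": [
    {"name": "a", "children": [{"name": "a1", "children": []}]},
    {"name": "b", "children": []},
]}


def record(parent, node, visited):
    """Append the visited node's name to the running list."""
    return visited + [node["name"]]


# For the root, the parent argument is the root itself, as described above.
(order,) = reduce_tree_sketch(record, tree, tree, lambda n: n["children"], [])
```

Because `f` runs before the children are visited, `order` lists each parent ahead of its descendants. One caveat of this sketch: a callback whose single state value is itself a tuple would be unpacked unintentionally.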
unit_tree¶
-
translate.storage.xml_extract.unit_tree.
build_unit_tree
(store, filename=None)¶ Enumerate a translation store and build a tree with XPath components as nodes and where a node contains a unit if a path from the root of the tree to the node containing the unit, is equal to the XPath of the unit.
The tree looks something like this:
root
   `- ('document-content', 1)
      `- ('body', 2)
         |- ('text', 1)
         |  `- ('p', 1)
         |     `- <reference to a unit>
         |- ('text', 2)
         |  `- ('p', 1)
         |     `- <reference to a unit>
         `- ('text', 3)
            `- ('p', 1)
               `- <reference to a unit>
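To make the shape of such a tree concrete, here is a small sketch that builds one from XPath-like strings. It assumes plain tag names with numeric indices (no namespaces) and uses nested dictionaries; `split_xpath`, `build_tree` and the sample paths are hypothetical and do not reflect the node classes used by `unit_tree`.

```python
import re


def split_xpath(xpath):
    """Split 'a[1]/b[2]' into [('a', 1), ('b', 2)]."""
    pattern = r"([^/\[\]]+)\[(\d+)\]"
    return [(m.group(1), int(m.group(2))) for m in re.finditer(pattern, xpath)]


def build_tree(units_by_xpath):
    """Build a nested tree keyed by (tag, index) components.

    A node holds a unit only when the path from the root to that node
    equals the unit's XPath.
    """
    root = {"unit": None, "children": {}}
    for xpath, unit in units_by_xpath.items():
        node = root
        for component in split_xpath(xpath):
            node = node["children"].setdefault(
                component, {"unit": None, "children": {}}
            )
        node["unit"] = unit
    return root


tree = build_tree({
    "document-content[1]/body[2]/text[1]/p[1]": "unit-1",
    "document-content[1]/body[2]/text[2]/p[1]": "unit-2",
})
body = tree["children"][("document-content", 1)]["children"][("body", 2)]
```

Units that share a prefix of their XPath share the corresponding nodes, so both paragraphs above hang off the same `('body', 2)` node.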
xpath_breadcrumb¶
A class which is used to build XPath-like paths as a DOM tree is walked. It keeps track of the number of times which it has seen a certain tag, so that it will correctly create indices for tags.
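The counting behaviour described above can be sketched with one tag counter per open element. `BreadcrumbSketch` is a hypothetical stand-in written for illustration, with zero-based indices assumed; it is not the module's `XPathBreadcrumb` class.

```python
class BreadcrumbSketch:
    """Track an XPath-like path while walking a DOM tree."""

    def __init__(self):
        self._path = []      # (tag, index) pairs from the root down
        self._counts = [{}]  # one tag counter per open level

    def start_tag(self, tag):
        seen = self._counts[-1]
        index = seen.get(tag, 0)  # how many siblings with this tag came before
        seen[tag] = index + 1
        self._path.append((tag, index))
        self._counts.append({})   # fresh counter for this element's children

    def end_tag(self):
        self._path.pop()
        self._counts.pop()

    @property
    def xpath(self):
        return "/".join("%s[%d]" % (tag, i) for tag, i in self._path)


xb = BreadcrumbSketch()
xb.start_tag("foo")
xb.start_tag("bar")
first_bar = xb.xpath
xb.end_tag()
xb.start_tag("bar")
second_bar = xb.xpath
```

Keeping the counter of a level alive until its parent is closed is what lets the second `bar` receive the next index instead of starting again from zero.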
Initially, the path is empty. Thus
>>> xb = XPathBreadcrumb()
>>> xb.xpath
""
Suppose we walk down a DOM node for the tag <foo> and we want to record this, we simply do
>>> xb.start_tag('foo')
Now, the path is no longer empty. Thus
>>> xb.xpath
foo[0]
Now suppose there are two <bar> tags under the tag <foo> (that is <foo><bar></bar><bar></bar></foo>), then the breadcrumb will keep track of the number of times it sees <bar>. Thus
>>> xb.start_tag('bar')
>>> xb.xpath
foo[0]/bar[0]
>>> xb.end_tag()
>>> xb.xpath
foo[0]
>>> xb.start_tag('bar')
>>> xb.xpath
foo[0]/bar[1]
xml_name¶
-
class
translate.storage.xml_name.
XmlNamer
(dom_node)¶ Initialize me with a DOM node or a DOM document node (the toplevel node you get when parsing an XML file). Then use me to generate fully qualified XML names.
>>> xml = '<office:document-styles xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"></office:document-styles>'
>>> from lxml import etree
>>> namer = XmlNamer(etree.fromstring(xml))
>>> namer.name('office', 'blah')
{urn:oasis:names:tc:opendocument:xmlns:office:1.0}blah
>>> namer.name('office:blah')
{urn:oasis:names:tc:opendocument:xmlns:office:1.0}blah
I can also give you XmlNamespace objects if you give me the abbreviated namespace name. These are useful if you need to reference a namespace continuously.
>>> office_ns = namer.namespace('office')
>>> office_ns.name('foo')
{urn:oasis:names:tc:opendocument:xmlns:office:1.0}foo
zip¶
This module provides functionality to work with zip files.
-
class
translate.storage.zip.
ZIPFile
(filename=None)¶ This class represents a ZIP file like a directory.
-
file_iter
()¶ Iterator over (dir, filename) for all files in this directory.
-
getfiles
()¶ Returns a list of (dir, filename) tuples for all the file names in this directory.
-
getunits
()¶ List of all the units in all the files in this directory.
-
scanfiles
()¶ Populate the internal file data.
-
unit_iter
()¶ Iterator over all the units in all the files in this zip file.
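To illustrate the idea of treating an archive like a directory, the sketch below lists (dir, filename) pairs using only the standard library's `zipfile` module. The `iter_zip_files` helper and the in-memory archive are assumptions made for this example; it does not use or reproduce the `ZIPFile` class documented here.

```python
import io
import posixpath
import zipfile


def iter_zip_files(archive):
    """Yield (dir, filename) for every regular file in an open ZipFile."""
    for name in archive.namelist():
        if name.endswith("/"):
            continue  # directory entry, not a file
        # Member names in a zip archive always use forward slashes.
        yield posixpath.split(name)


# Build a small archive in memory so the example needs no files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("po/fr.po", 'msgid "Hello"\nmsgstr "Bonjour"\n')
    archive.writestr("readme.txt", "demo")

with zipfile.ZipFile(buffer) as archive:
    pairs = list(iter_zip_files(archive))
```

A store class built on top of such a listing would then open each member that has a recognised extension and collect its units.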
-