XML HyperText Scanner

Overview. The XML HyperText Scanner is intended for localizing complex XML-based files, such as DITA and XHTML files. It reads the original files and writes localized files without breaking the structure of the originals.

Localizable data can be in both nodes and attributes.

The XML HyperText Scanner options let the user specify how XML files are localized. The most important settings are the following:

  • File; the file to localize
  • Context attribute (Settings tab); a user-specified unique attribute that will be the context of the translation.
  • Elements (Elements tab); which tags are localized.


This scanner is available as a separate plug-in. To get an evaluation version, contact sales(at)multilizer.com.


XML HyperText Scanner is simple to use: the user specifies the context id (Settings tab) and the tags to localize (Elements tab).

Nested tags

If the elements that the user wants to localize contain nested tags, those tags are localized too. Nested tags are shown in the localizable text and can be modified.

Nested tags can be

  • removed
  • reordered
  • added
  • changed

When building the localized files, Multilizer gives warnings about all actions performed on nested tags.
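
As a rough illustration of the kind of check that could produce such warnings, the following Python sketch compares the nested tags in an original string with those in its translation. It is not Multilizer's implementation: the function names and the warning texts are hypothetical, and a changed (renamed) tag simply shows up here as one removal plus one addition.

```python
import re
from collections import Counter

# Matches opening and self-closing tags such as <b> or <br/>, but not </b>.
OPENING_TAG = re.compile(r"<([A-Za-z_][\w.-]*)")


def tag_names(text):
    """Return the names of the opening tags in order of appearance."""
    return OPENING_TAG.findall(text)


def nested_tag_warnings(original, translated):
    """List hypothetical warnings about nested tags that differ."""
    before, after = tag_names(original), tag_names(translated)
    removed = Counter(before) - Counter(after)
    added = Counter(after) - Counter(before)

    warnings = [f"removed: <{name}>" for name in sorted(removed)]
    warnings += [f"added: <{name}>" for name in sorted(added)]
    # Same set of tags but in a different order.
    if not removed and not added and before != after:
        warnings.append("reordered")
    return warnings


original = "Press <b>Save</b> or <i>Cancel</i>."
print(nested_tag_warnings(original, "Choose <i>Cancel</i> or <b>Save</b>."))
print(nested_tag_warnings(original, "Press Save or <i>Cancel</i>."))
```

The first call reports a reordering and the second reports that the b tag was removed.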


The XML HyperText Scanner options have the following four tabs:


Scan Preview Data. Specifies whether to scan the preview data. Preview data is used to visualize the original and localized XML files. Visualization requires the separate Visual Editor plug-in.

Context mode. XML HyperText scanner provides two methods for identifying localizable data:

  • Unique attribute value. XML HyperText Scanner assumes that all elements intended for localization have an attribute that contains a unique value. It uses this value to form the context of the translation. When Multilizer reads the original XML file, it checks that the values are unique.
  • Combination of node name + value hash. Localizable data is identified by a combination of the node name and a hash of its value.

The use of unique attribute values for identifying localizable data is strongly encouraged.
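
The two context modes can be sketched in Python as follows. This is only an illustration under stated assumptions, not the scanner's actual algorithm: the function names, the `id` attribute, the `tag:hash` context format and the use of SHA-1 are all hypothetical choices made for the example.

```python
import hashlib
import xml.etree.ElementTree as ET


def contexts_by_attribute(root, tags, attr):
    """Use a unique attribute value as the context; reject duplicates."""
    contexts = {}
    for elem in root.iter():
        if elem.tag not in tags:
            continue
        value = elem.get(attr)
        if value is None:
            raise ValueError(f"<{elem.tag}> has no '{attr}' attribute")
        if value in contexts:
            raise ValueError(f"duplicate context value: {value!r}")
        contexts[value] = (elem.text or "").strip()
    return contexts


def contexts_by_hash(root, tags):
    """Use node name plus a hash of the node value as the context."""
    contexts = {}
    for elem in root.iter():
        if elem.tag in tags:
            text = (elem.text or "").strip()
            digest = hashlib.sha1(text.encode("utf-8")).hexdigest()[:8]
            contexts[f"{elem.tag}:{digest}"] = text
    return contexts


doc = ET.fromstring(
    '<topic><title id="t1">Install</title>'
    '<p id="p1">Run setup.</p><p id="p2">Run setup.</p></topic>'
)
print(len(contexts_by_attribute(doc, {"title", "p"}, "id")))
print(len(contexts_by_hash(doc, {"title", "p"})))
```

In this sketch the attribute mode keeps the two identical paragraphs apart, while the hash mode gives them the same context because their node name and text are the same.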


Native language. Specifies the native language that is used in the target. Set this to match the language of the original document.

Default language. Specifies the default language, which is used in the build process if there is no translation in the build language. The translation search order is as follows: first the build language, then the country-neutral language (if the build language is country-specific), then the default language. If no translation is found, the native value is used. For example, if the project contains English, German and German (Austria), the build language is German (Austria) and the default language is English, the search order is German (Austria) - German - English - native.
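
The search order described above can be expressed as a short Python sketch. The language codes (`de-AT`, `de`, `en`) and the function names are assumptions made for this example; Multilizer's internal representation is not documented here.

```python
def search_order(build_language, default_language):
    """Return the languages to try, most specific first."""
    order = [build_language]
    # A country-specific language falls back to its country-neutral form.
    if "-" in build_language:
        order.append(build_language.split("-", 1)[0])
    if default_language not in order:
        order.append(default_language)
    return order


def resolve(translations, native_value, build_language, default_language):
    """Pick the first available translation, else the native value."""
    for language in search_order(build_language, default_language):
        value = translations.get(language)
        if value:
            return value
    return native_value


print(search_order("de-AT", "en"))  # ['de-AT', 'de', 'en']
print(resolve({"de": "Speichern", "en": "Save"}, "Save", "de-AT", "en"))  # Speichern
```

With no German (Austria) translation present, the German translation is chosen before the English default, and the native value is only used when all three lookups fail.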

Native encoding. Specifies the native encoding that is used in the target with each character set.

Encoding list. Contains the encodings to be used in the localized files. To change a value, right-click the line and select a new value.


Output directory. Specifies the directory where Multilizer creates the localized output files and subdirectories. You can control the names of the localized files and directories by setting Type, Coding, Locale separator and Country separator.

Output file name

Type. Specifies how and where Multilizer creates the localized file version(s). Possible options are:

  • Subdirectory; create the localized file(s) in subdirectories named by language and locale.
  • Bundle name; append the language and locale information to the output file name.
  • File extension; replace the original file extension with the language and locale information.
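
To make the three options concrete, here is a hedged Python sketch of how each one might shape an output path. The exact names Multilizer generates are not specified in this document, so the option keywords, the `separator` parameter, the `de_AT` locale string and the resulting paths are all assumptions for illustration.

```python
from pathlib import PurePosixPath


def localized_path(output_dir, file_name, locale, output_type, separator="_"):
    """Build a hypothetical localized output path for one locale."""
    original = PurePosixPath(file_name)
    out = PurePosixPath(output_dir)
    if output_type == "subdirectory":
        # Same file name, placed in a directory named after the locale.
        return out / locale / original.name
    if output_type == "bundle":
        # Locale appended to the base name, original extension kept.
        return out / f"{original.stem}{separator}{locale}{original.suffix}"
    if output_type == "extension":
        # Original extension replaced by the locale.
        return out / f"{original.stem}.{locale}"
    raise ValueError(f"unknown output type: {output_type}")


for kind in ("subdirectory", "bundle", "extension"):
    print(kind, localized_path("out", "guide.xml", "de_AT", kind))
```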

Coding. Selects the coding style used in the localized file names. Possible values are:

  • ISO; ISO standard coding style.
  • .NET; Microsoft .NET coding style.
  • Windows; Windows coding style.

Write BOM. Specifies whether to write byte order mark (BOM) characters to the localized XML file. By default, BOM characters are written when the output is Unicode and the input contains BOM characters as well. The user can also choose not to write BOM characters at all, or to write BOM characters whenever the output is Unicode.
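
The three Write BOM behaviours can be summarized as a small decision function. This Python sketch is illustrative only: the mode names (`default`, `never`, `unicode`) are hypothetical labels for the options described above, and the encoder handles UTF-8 output only.

```python
import codecs


def should_write_bom(mode, output_is_unicode, input_has_bom):
    """Decide whether a BOM is written, following the three options."""
    if mode == "never":
        return False
    if mode == "unicode":
        return output_is_unicode
    if mode == "default":
        return output_is_unicode and input_has_bom
    raise ValueError(f"unknown mode: {mode}")


def has_utf8_bom(data):
    """Check whether raw file bytes start with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)


def encode_output(text, mode, input_bytes):
    """Encode localized text as UTF-8, adding a BOM when required."""
    body = text.encode("utf-8")
    if should_write_bom(mode, True, has_utf8_bom(input_bytes)):
        return codecs.BOM_UTF8 + body
    return body


source = codecs.BOM_UTF8 + b"<p>Hello</p>"
print(encode_output("<p>Hallo</p>", "default", source)[:3] == codecs.BOM_UTF8)
```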


Not all nodes in XML files contain localizable data. The names of the tags that contain localizable text can be specified in the Elements tab. If the localizable tags contain child nodes, they are localized as well.


In addition to XML node contents, XML HyperText Scanner can also localize XML node attributes. Enter the attributes with localizable contents in the Attributes tab.
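
The combined effect of the Elements and Attributes settings can be pictured with a minimal Python sketch. It assumes a plain dictionary of translations and hypothetical tag and attribute names (`p`, `alt`); it handles only the direct text of each element and is not how the scanner itself processes nested content.

```python
import xml.etree.ElementTree as ET


def localize(root, elements, attributes, translate):
    """Translate the text of listed elements and the listed attributes."""
    for elem in root.iter():
        # Element text is localized only for tags named in `elements`.
        if elem.tag in elements and elem.text and elem.text.strip():
            elem.text = translate(elem.text)
        # Attribute values are localized for names listed in `attributes`.
        for name in attributes:
            if name in elem.attrib:
                elem.set(name, translate(elem.attrib[name]))
    return root


translations = {"Hello": "Hallo", "Home": "Startseite"}
page = ET.fromstring(
    '<page><img src="a.png" alt="Home"/><p>Hello</p><code>Hello</code></page>'
)
localize(page, {"p"}, {"alt"}, lambda s: translations.get(s, s))
print(ET.tostring(page, encoding="unicode"))
```

Only the p text and the alt attribute change; the code element and the src attribute are left as they were, so the structure of the file is preserved.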