Choosing an Attribute Encoder / Decoder Transformer

Introduction

FME offers a variety of encoder/decoder transformers. These include the AttributeEncoder, BinaryEncoder, BinaryDecoder, TextEncoder, and TextDecoder.

While these transformers all change how attribute values are encoded, they differ in their approaches. This article provides an overview of each FME encoding/decoding transformer to help you select the best one for your workspace. It assumes prior knowledge of character encoding systems. For more information on character encoding, please visit the related Wikipedia page.

AttributeEncoder

The AttributeEncoder changes the character encoding of input attributes in one of two ways: it either marks the attributes with the new encoding system without converting their characters, or it both marks the attributes and converts their characters to the new encoding system. The encoding systems most familiar to users are ASCII-based. ASCII uses 128 different character values (codes) to represent the most common English letters, numbers, and symbols. However, this transformer also supports non-ASCII encodings to allow the encoding of non-English characters.
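As a rough illustration of the ASCII limit described above, the following Python sketch (Python's standard library, not FME) checks which sample strings fit within the 128 ASCII codes and shows that the same non-ASCII text produces different bytes under different encodings. The sample strings are arbitrary.

```python
# Illustration only: Python's built-in codecs, not FME's implementation.
samples = ["FME", "café", "東京"]

# str.isascii() is True only when every character code is below 128.
ascii_flags = {s: s.isascii() for s in samples}

# Pure ASCII text encodes to one byte per character, each below 128.
ascii_bytes = "FME".encode("ascii")

# The same non-ASCII text yields different bytes in different encodings.
latin1_bytes = "café".encode("latin-1")  # 4 bytes, "é" is one byte
utf8_bytes = "café".encode("utf-8")      # 5 bytes, "é" is two bytes

print(ascii_flags)
print(latin1_bytes, utf8_bytes)
```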

This transformer can receive any type of data. It also includes more output encoding options than the other transformers, allowing you to easily convert between a variety of language encodings and even Unicode. If incoming attributes are Null, the attribute's encoding will still be modified to the specified output system; however, the attribute values will remain Null.

There are two ways that the encoding is handled by the transformer, specified by the Incoming Attribute parameter:

If Honor Encoding is chosen, the transformer will attempt to convert the input attributes from one encoding system to another. If a character from the input attribute cannot be found in the target encoding system, the transformer will fail with an error.
If Use Bytes is chosen, the transformer will change the attribute's encoding but will make no attempt to convert its characters to those represented by the new encoding. This option is best when input attributes contain characters not present in the target encoding system, as it allows translation to continue even when target characters are missing or unidentified.
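The difference between the two options can be sketched outside FME. The Python example below is an analogy only, under the assumption that converting characters behaves like decode-then-re-encode and that relabeling behaves like reading unchanged bytes under a new encoding name. The function names `convert_characters` and `relabel_bytes` are hypothetical and do not correspond to FME parameters or APIs.

```python
# Analogy only: these helpers are hypothetical and are not FME functions.

def convert_characters(raw: bytes, source: str, target: str) -> bytes:
    """Decode with the source encoding, then re-encode in the target.

    Raises UnicodeEncodeError when a character has no equivalent
    in the target encoding.
    """
    return raw.decode(source).encode(target)


def relabel_bytes(raw: bytes, target: str) -> str:
    """Leave the bytes untouched and simply read them as the target encoding.

    Bytes that are invalid in the target are shown as U+FFFD instead of failing.
    """
    return raw.decode(target, errors="replace")


latin1_bytes = "café".encode("latin-1")  # b'caf\xe9'

# Conversion: the characters are preserved, the bytes change.
converted = convert_characters(latin1_bytes, "latin-1", "utf-8")

# Relabeling: the bytes are preserved, the characters may change.
relabeled_text = relabel_bytes(latin1_bytes, "utf-8")

# Conversion fails when the target encoding lacks a character.
try:
    convert_characters(latin1_bytes, "latin-1", "ascii")
    conversion_failed = False
except UnicodeEncodeError:
    conversion_failed = True

print(converted, repr(relabeled_text), conversion_failed)
```

In this sketch, conversion keeps "é" intact but fails outright for an ASCII target, while relabeling never fails but can change what the bytes mean.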

Additionally, there are two included encoding systems to be aware of, which provide instructions to FME:

Binary (fme-binary) - this setting labels the attribute as binary data that should not be interpreted as characters. FME will display these attributes with the Hex equivalents of the byte values.
System Default (fme-system) - this setting tells FME to interpret characters using the default operating system encoding, which can differ across language versions of Windows.
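For a sense of what these two settings imply, the Python sketch below shows a hex rendering of raw bytes and looks up the operating system's preferred encoding. It uses Python's standard library as a stand-in; how FME itself formats hex output or resolves the system encoding is not shown here.

```python
import locale

# Raw bytes that should not be read as characters (sample values only).
raw = bytes([0x89, 0x50, 0x4E, 0x47])

# A hex view of the byte values, similar in spirit to a binary display.
hex_view = raw.hex().upper()

# The encoding the operating system prefers; this varies between machines
# and between language versions of the same operating system.
system_encoding = locale.getpreferredencoding(False)

print(hex_view)
print(system_encoding)
```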

BinaryEncoder and BinaryDecoder

The BinaryEncoder converts binary data into encoded text using either Base64 or Hexadecimal encoding, both of which produce plain ASCII text. This is useful when a binary file (such as an image or email attachment) needs to be embedded within a text file (e.g., an HTML document). It is also useful when transmitting data to or receiving data from web services, whose protocols often limit the data exchanged to text. The BinaryEncoder can accept any data type and will output a new attribute containing the encoded text values.
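The kind of output this produces can be sketched with Python's standard library. This is an illustration of Base64 and Hex encoding in general, not of the BinaryEncoder's internals; the sample bytes are the eight-byte PNG file signature, used here only as placeholder binary data.

```python
import base64

# Placeholder binary data: the 8-byte PNG file signature.
binary_data = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Base64: every 3 bytes become 4 ASCII characters.
b64_text = base64.b64encode(binary_data).decode("ascii")

# Hex: every byte becomes 2 ASCII characters.
hex_text = binary_data.hex()

# Encoded text can be embedded in a text document, for example as a data URI.
# (A real image would need the full file contents, not just the signature.)
data_uri = f"data:image/png;base64,{b64_text}"

print(b64_text)
print(hex_text)
print(data_uri)
```

Base64 is the more compact of the two, growing the data by roughly one third, while Hex doubles it but is easier to read byte by byte.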

The BinaryDecoder performs the inverse operation of the BinaryEncoder, decoding Base64- or Hexadecimal-encoded text attributes into binary data. This transformer offers output options similar to those of the AttributeEncoder; however, the input attributes must be encoded in Base64 or Hex for use with the BinaryDecoder.
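The inverse operation can be sketched the same way. The Python example below decodes Base64 and Hex text back into bytes and shows that input which is not valid Base64 is rejected; it illustrates the general technique and makes no claim about how the BinaryDecoder reports invalid input.

```python
import base64
import binascii

# Sample encoded text (the Base64 and Hex forms of the same 8 bytes).
b64_text = "iVBORw0KGgo="
hex_text = "89504e470d0a1a0a"

# Decode each text form back into the original bytes.
from_b64 = base64.b64decode(b64_text, validate=True)
from_hex = bytes.fromhex(hex_text)

# Input must actually be Base64; validate=True rejects stray characters.
try:
    base64.b64decode("not base64!", validate=True)
    rejected = False
except binascii.Error:
    rejected = True

print(from_b64 == from_hex, rejected)
```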

TextEncoder and TextDecoder

Web URLs, XML, and HTML have a number of characters with specific meanings within their text. For example, a ? within a web URL represents the end of the main page address and the beginning of a query. When these characters appear as literal data within a dataset's attributes, they must be encoded so that they are not misinterpreted as URL syntax or markup. The TextEncoder does just that, encoding text strings so they are properly interpreted when included in a URL, an HTML or XML document, and the like. For more information on these encoding systems, please see:

The TextEncoder also offers the same Base64 and Hex output options as the BinaryEncoder. However, the TextEncoder will convert the input attribute text to UTF-8 first, then encode the character bytes as Base64 or Hex-encoded text. Since the TextEncoder specializes in working with web-compatible encoded text formats, and UTF-8 is so commonly used on the web, it is advantageous for the TextEncoder to first encode to UTF-8. If this behavior is undesirable, consider using the BinaryEncoder for Base64 and Hex encoding instead. The TextEncoder produces a new attribute containing encoded text values.
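Both behaviors described in this section can be approximated with Python's standard library. The sketch below is an analogy, not the TextEncoder's implementation: it escapes reserved characters for URL, HTML, and XML contexts, then contrasts Base64 output when text is first converted to UTF-8 with Base64 output from bytes in another encoding (Latin-1 is an arbitrary choice for the comparison).

```python
import base64
import html
import urllib.parse
from xml.sax.saxutils import escape as xml_escape

# Reserved characters are replaced so they read as data, not syntax.
url_encoded = urllib.parse.quote("a b?c=d&e", safe="")
html_encoded = html.escape('<a href="x">T&C</a>')
xml_encoded = xml_escape("a < b & c")

text = "café"

# Text-style encoding: convert to UTF-8 first, then to Base64 or Hex.
utf8_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
utf8_hex = text.encode("utf-8").hex()

# Byte-style encoding: encode whatever bytes the value already has.
latin1_b64 = base64.b64encode(text.encode("latin-1")).decode("ascii")

print(url_encoded, html_encoded, xml_encoded)
print(utf8_b64, utf8_hex, latin1_b64)
```

The two Base64 strings differ even though the text is the same, which is why the choice between a UTF-8-first encoder and a raw-bytes encoder matters for non-ASCII values.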

The TextDecoder transformer performs the reverse operation of the TextEncoder. It decodes a string attribute from encoded text to plain text. This transformer supports a number of encoded text types as input, including URL, Unicode, XML, HTML, Base64, Hex, and Octal. As output, the TextDecoder produces a new attribute containing plain text values.
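A comparable round trip back to plain text can be sketched in Python. As before, this uses standard-library functions as stand-ins for the general decoding techniques; it assumes the Base64 and Hex input holds UTF-8 text, and it does not cover every input type the TextDecoder supports.

```python
import base64
import html
import urllib.parse

# URL-encoded text back to plain text (percent sequences are read as UTF-8).
url_decoded = urllib.parse.unquote("caf%C3%A9%3Fq%3D1")

# HTML entities back to the characters they represent.
html_decoded = html.unescape("T&amp;C &lt;b&gt;")

# Base64 and Hex back to text, assuming the bytes are UTF-8.
b64_decoded = base64.b64decode("Y2Fmw6k=").decode("utf-8")
hex_decoded = bytes.fromhex("636166c3a9").decode("utf-8")

print(url_decoded, html_decoded, b64_decoded, hex_decoded)
```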
