How to Fix XML Files with Bad Encoding

Liz Sanderson

Introduction

Sometimes an XML file cannot be processed because it contains incorrectly encoded characters or is marked with the wrong character encoding scheme. FME may not recognize the file as XML, and when viewing the file in a browser or other tool, an error message may appear, such as "Encoding error" or "invalid character was found in text content."

Fixing this problem means re-encoding the XML data with the proper encoding scheme. To do this, create an FME Workspace that converts the XML file using a Text File Reader/Writer and set the encoding parameters. This article walks through the steps to build the workspace from scratch and includes a completed version in the files section.

Note that FME's Text File Reader/Writer is used and not the built-in XML Reader/Writer because FME may not recognize a file with an invalid encoding as XML. This is therefore a helpful preliminary step in repairing an XML file before working with it in FME.

Step-by-Step Instructions

Fixing an XML file with an invalid encoding can be done by creating a basic FME Workspace that has a Text File Reader and a Text File Writer.

In this example, we are working with an XML file that is incorrectly encoded. When viewing it in Notepad, we can see that it is marked as UTF-8 but contains accented characters encoded in ISO-8859-1:

sourcefile.PNG

The file displays an error when we try to view it in a browser or work with it in other tools:

error.PNG
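
Outside FME, the same mismatch can be confirmed with a short script. The following is a minimal Python sketch, not part of the FME workflow; the function name find_invalid_utf8 and the sample bytes are assumptions made up for illustration. It tries to decode the raw bytes as UTF-8 and reports where decoding first fails.

```python
def find_invalid_utf8(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# Hypothetical sample: declared as UTF-8, but "é" is stored as the
# single ISO-8859-1 byte 0xE9, which is not valid UTF-8 here.
sample = b'<?xml version="1.0" encoding="UTF-8"?><name>Caf\xe9</name>'
offset = find_invalid_utf8(sample)

if offset is None:
    print("The data is valid UTF-8.")
else:
    print(f"Invalid UTF-8 at byte offset {offset}: 0x{sample[offset]:02X}")
```

A result of None only means the bytes are valid UTF-8; it does not prove that UTF-8 is the encoding the author intended.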

We will use FME to convert the file to a UTF-8 encoding.

1. Add a Text File Reader

Open FME Workbench. Click the Reader icon to open the Add Reader dialog and set the following parameters:

  • Format: Text File
  • Dataset: C:\<Path to file>\XMLEncodingError.xml
    • When choosing the dataset, you might need to change the file filter to "All Files", as the default is to search for .txt files only.

Click on the Parameters button.

reader.PNG

  • File Contents
    • Character Encoding: Latin-1 Western European (iso-8859-1)

Click OK twice to add the Reader to the workspace.
readerparams.PNG
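
The Reader encoding must match how the bytes were actually written, not what the file claims. As a rough illustration in Python (the sample bytes are hypothetical), decoding with ISO-8859-1 recovers the accented character that a UTF-8 decoder rejects:

```python
# Hypothetical bytes: "Café" with "é" stored as the ISO-8859-1 byte 0xE9.
raw = b"Caf\xe9"

# Decoding with the real source encoding recovers the intended text.
text = raw.decode("iso-8859-1")
print(text)

# Decoding the same bytes as UTF-8 fails, which is the error other tools report.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("Valid UTF-8:", utf8_ok)
```

Note that ISO-8859-1 maps every possible byte to a character, so decoding with it never raises an error. If the source is really in a different encoding, the result will be silently wrong, so it is worth checking a few accented characters in the output.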

2. Add a Text File Writer

Click the Writer icon to open the Add Writer dialog and set the following parameters:

  • Format: Text File
  • Dataset: C:\<Path to file>\FixedXMLFile.xml

Click the Parameters button.

writer.PNG

  • File Contents
    • Character Encoding: UTF-8
    • Write UTF-8 Byte Order Mark: Yes

Click OK twice to add the Writer to the workspace.
writerparams.PNG
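
To show what the byte order mark setting changes in the output, here is a small Python sketch. It uses Python's "utf-8-sig" codec as a stand-in for the Writer's BOM option; that mapping is an assumption for illustration, not a description of how FME implements it.

```python
import codecs

text = "Café"

# "utf-8-sig" prepends the three-byte UTF-8 byte order mark (EF BB BF).
with_bom = text.encode("utf-8-sig")
# Plain "utf-8" writes the same characters without the mark.
without_bom = text.encode("utf-8")

print(with_bom)
print(without_bom)
print("Starts with BOM:", with_bom.startswith(codecs.BOM_UTF8))
```

The two outputs differ only in the leading three bytes; the encoded text itself is identical.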

3. Connect the Reader and Writer

Drag a connection line from the Reader feature type to the Writer feature type. The finished workspace should look like this:

Screen Shot 2021-04-05 at 12.31.11 PM.png

4. Run the Workspace

Click the Run icon to run the workspace. The XML file will be converted to the UTF-8 encoding set in the Writer parameters.

Capture.PNG
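
For comparison, the Reader-to-Writer conversion above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the FME implementation: the function name reencode_file and the sample file content are hypothetical, and the file names simply reuse the ones from the steps above. It reads the file using its real encoding, writes it back as UTF-8 with a byte order mark, and then checks that the result parses as XML.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def reencode_file(src, dst, source_encoding="iso-8859-1", write_bom=True):
    """Read src using its real encoding and write dst as UTF-8."""
    # newline="" leaves the original line endings untouched.
    with open(src, "r", encoding=source_encoding, newline="") as reader:
        text = reader.read()
    target = "utf-8-sig" if write_bom else "utf-8"
    with open(dst, "w", encoding=target, newline="") as writer:
        writer.write(text)


# Demo on a throwaway sample file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "XMLEncodingError.xml"
    dst = Path(tmp) / "FixedXMLFile.xml"
    # Declared as UTF-8, but "é" is stored as the ISO-8859-1 byte 0xE9.
    src.write_bytes(
        b'<?xml version="1.0" encoding="UTF-8"?>\n<name>Caf\xe9</name>\n'
    )

    reencode_file(src, dst)

    # The repaired file should now parse as XML.
    root = ET.parse(str(dst)).getroot()
    fixed_bytes = dst.read_bytes()

print(root.tag, root.text)
```

This only works as-is when the XML declaration already says UTF-8, as in the example file. If the declaration names another encoding, it would also need to be updated to match the new bytes.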
