XML Reading with XfMaps: Flattening Example - Extracting all Attributes from an XML File

Liz Sanderson

FME Version

  • FME 2016.x

Easier approaches to reading XML are now available in FME. See XML Reader Configuration or Reading Complex XML or GML using the XMLFlattener.

Introduction

Many users have trouble reading complex XML or GML. The way to do this is with FME's generic XML reader. One option is Feature Paths. However, xfMaps are XML interpretation scripts that give you full flexibility to read the XML however you want, including geometry interpretation, which Feature Paths do not yet support. The basic idea is that you specify which node within the XML structure you want to turn into a feature type in the feature mapping section (<feature-map>). Then you specify what each of these features contains in the feature content map section (<feature-content-map>).

XfMap Structure Command

Listing out all your attributes can take a fair bit of time, so here is a technique for recursively extracting every attribute below a matched tag. This little trick lets any FME user read almost any XML dataset, so it is very powerful.

  <mapping match="Feature/*">
    <structure
      separator="_"
      cardinality="*/+" />
  </mapping>


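The mapping match="Feature/*" expression matches every child element of <Feature>. The <structure separator="_" cardinality="*/+" /> command then turns those elements, and anything nested below them, into attributes, joining nested element names with the separator. It flattens their names from: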

 <Feature>
   <attribute1>John</attribute1>
   <attribute2>Vancouver</attribute2>
   <activeDate>
     <from>11-22-99</from>
     <to>12-11-09</to>
   </activeDate>
 </Feature>

to:

attribute1 = John
attribute2 = Vancouver
activeDate_from = 11-22-99
activeDate_to = 12-11-09
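
To make the naming rule concrete, here is a rough emulation in Python using only the standard library. It is a sketch of the behaviour described above, not FME's implementation: the function name flatten and its parameters are hypothetical, and it only models leaf elements whose names are joined with the separator.

```python
import xml.etree.ElementTree as ET


def flatten(element, separator="_", prefix=""):
    """Return {flattened_name: text} for every leaf element below `element`.

    Hypothetical helper: it only imitates the naming shown in the article.
    """
    attributes = {}
    for child in element:
        name = prefix + child.tag
        if len(child):
            # The child has nested elements: recurse with a longer prefix.
            attributes.update(flatten(child, separator, name + separator))
        else:
            attributes[name] = (child.text or "").strip()
    return attributes


SAMPLE = """
<Feature>
  <attribute1>John</attribute1>
  <attribute2>Vancouver</attribute2>
  <activeDate>
    <from>11-22-99</from>
    <to>12-11-09</to>
  </activeDate>
</Feature>
"""

flat = flatten(ET.fromstring(SAMPLE))
for name, value in flat.items():
    print(f"{name} = {value}")
```

Run against the sample <Feature>, this yields the same four attribute names and values listed above.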

 

Considerations

The only caveat is that it builds attribute names of the form parent_childAttribute for everything below the matched tag. Also, where childAttribute occurs multiple times, you get a list in FME: childAttribute{0}, childAttribute{1}, and so on. You could then decide to match the tag at the childAttribute level rather than the parent level, or you could use a ListExploder within FME to create an individual feature for every list element.
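
The list naming can be illustrated with another small Python sketch. It is an assumption-laden emulation rather than FME code: the helper leaf_attributes and the sample element names are hypothetical, and it only covers repeated leaf elements directly below the matched tag. How FME names anything nested beneath a repeated element is not modelled here.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def leaf_attributes(element):
    """Name direct leaf children, indexing repeated tags as tag{0}, tag{1}, ..."""
    totals = Counter(child.tag for child in element)  # occurrences per tag
    seen = Counter()                                  # next index per tag
    attributes = {}
    for child in element:
        if totals[child.tag] > 1:
            # Repeated tag: append a list index in braces.
            name = f"{child.tag}{{{seen[child.tag]}}}"
            seen[child.tag] += 1
        else:
            name = child.tag
        attributes[name] = (child.text or "").strip()
    return attributes


SAMPLE = """
<Feature>
  <attribute1>John</attribute1>
  <childAttribute>first</childAttribute>
  <childAttribute>second</childAttribute>
</Feature>
"""

attrs = leaf_attributes(ET.fromstring(SAMPLE))
for name, value in attrs.items():
    print(f"{name} = {value}")
```

In this sketch the single attribute1 keeps its plain name, while the two childAttribute elements become childAttribute{0} and childAttribute{1}.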


The point is, this is a quick way to read XML data into FME, particularly if it is attribute only. You can mix literal definitions with these recursive definitions, so long as they don't overlap. If you try to match a tag twice, the first match the interpreter hits wins. To avoid conflicts, list the tags you don't want recursively deconstructed as exceptions, using the except attribute of the mapping.

 

Source XML example

Suppose we want to read the source XML below. The structure command above lets us read all the attributes below the <Feature> tag. But how do we read the geometry as well?

<?xml version="1.0" encoding="UTF-8"?>
<FeatureCollection>
  <Feature>
    <attribute1>John</attribute1>
    <attribute2>Vancouver</attribute2>
    <activeDate>
      <from>11-22-99</from>
      <to>12-11-09</to>
    </activeDate>
    <Coordinate_BOX id="101">
      <coords>-123.1,49.25 -122.9,49.15</coords>
    </Coordinate_BOX>
  </Feature>
  <Feature>
    <attribute1>June</attribute1>
    <attribute2>Surrey</attribute2>
    <activeDate>
      <from>02-25-05</from>
      <to>9-15-10</to>
    </activeDate>
    <Coordinate_BOX id="102">
      <coords>-122.8,49.12 -122.5,49.0</coords>
    </Coordinate_BOX>
  </Feature>
</FeatureCollection>

 

Adding Geometry to Structure XfMaps

The trick is to define an exception so that we don't try to render the geometry element 'Coordinate_BOX' automatically into an attribute. We do this by qualifying the mapping expression with an except attribute:

<mapping match="Feature/*" except ="Coordinate_BOX">
  <structure ...


This means that the structure command will go through all the elements below Feature/* (hence the wildcard) but skip over the Coordinate_BOX element. We can then deal with the box geometry as we did in the basic example:

  <mapping match="Coordinate_BOX">
    <geometry activate="xml-box">
      <data name="data-string">
        <extract expr="./coords"/>
      </data>
    </geometry>
  </mapping>
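
As a way to picture what the two mapping rules do to one <Feature>, the following Python sketch splits the element the same way: everything except Coordinate_BOX becomes a plain attribute, and the box element is handled separately. This is only an illustration, not what FME runs internally. The helper parse_box is hypothetical, and it assumes the coords string holds two opposite corners as "x,y" pairs separated by a space, as in the sample data. The sketch also reads the id attribute, which the fragment above does not.

```python
import xml.etree.ElementTree as ET

FEATURE = """
<Feature>
  <attribute1>John</attribute1>
  <attribute2>Vancouver</attribute2>
  <Coordinate_BOX id="101">
    <coords>-123.1,49.25 -122.9,49.15</coords>
  </Coordinate_BOX>
</Feature>
"""


def parse_box(coords_text):
    """Turn 'x1,y1 x2,y2' into (min_x, min_y, max_x, max_y).

    Assumes exactly two comma-separated corner pairs (hypothetical helper).
    """
    (x1, y1), (x2, y2) = [
        tuple(float(v) for v in pair.split(",")) for pair in coords_text.split()
    ]
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


feature = ET.fromstring(FEATURE)

# Children other than the excepted element become plain attributes.
attributes = {
    child.tag: (child.text or "").strip()
    for child in feature
    if child.tag != "Coordinate_BOX"
}

# The excepted element is handled on its own, like the second mapping rule.
box_element = feature.find("Coordinate_BOX")
box = parse_box(box_element.findtext("coords"))
box_id = box_element.get("id")

print(attributes, box, box_id)
```

Keeping the geometry element out of the generic pass is the design point: it prevents the coordinate string from also showing up as an ordinary text attribute.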

Putting it all together, we now have an xfMap that creates a new feature type for <Feature>, dynamically captures all the elements under <Feature> as attributes, and adds box geometry to the feature. The completed xfMap below also contains a <trait> block that extracts the id attribute (@id) of <Coordinate_BOX> under the name id. This xfMap syntax may look a little strange at first, but once you have played with a few examples like this one, you will see that it isn't as scary as you may have thought.
 

Completed 'Flattening Example' XfMap

<?xml version="1.0"?>
<xfMap>
  <feature-map>
    <mapping match="Feature">
      <feature-type> <literal expr="Feature"/> </feature-type>
    </mapping>
  </feature-map>
  <feature-content-map>
    <mapping match="Feature/*" except="Coordinate_BOX">
      <structure
        separator="_"
        cardinality="*/+"/>
    </mapping>
    <mapping match="Coordinate_BOX">
      <trait>
        <name>
          <literal expr="id"/>
        </name>
        <value>
          <extract expr="@id"/>
        </value>
      </trait>
      <geometry activate="xml-box">
        <data name="data-string">
          <extract expr="./coords"/>
        </data>
      </geometry>
    </mapping>
  </feature-content-map>
</xfMap>
