Easier approaches to reading XML are now available in FME. See "XML Reader Configuration" or "Reading Complex XML or GML using the XMLFlattener".
Introduction
Many users have trouble reading complex XML or GML. One way to do this is with FME's Generic XML reader, which requires you to write an xfmap. The basic idea is that, in the feature map section, you specify which node within the XML structure becomes a feature type. Then, in the feature content map section, you specify what each of those features contains. Below is a very basic example to help you get started. You can then extend it and make it as complex as you need.
Source XML example
Suppose we want to read the source XML below:
<?xml version="1.0" encoding="UTF-8"?>
<FeatureCollection>
  <Feature>
    <attribute1>John</attribute1>
    <attribute2>Vancouver</attribute2>
    <activeDate>
      <from>11-22-99</from>
      <to>12-11-09</to>
    </activeDate>
    <Coordinate_BOX id="101">
      <coords>-123.1,49.25 -122.9,49.15</coords>
    </Coordinate_BOX>
  </Feature>
  <Feature>
    <attribute1>June</attribute1>
    <attribute2>Surrey</attribute2>
    <activeDate>
      <from>02-25-05</from>
      <to>9-15-10</to>
    </activeDate>
    <Coordinate_BOX id="102">
      <coords>-122.8,49.12 -122.5,49.0</coords>
    </Coordinate_BOX>
  </Feature>
</FeatureCollection>
XFmap Feature Type
As you can see, this has a nested structure, so the first thing we have to do is decide which node will represent our feature type in FME. We could choose FeatureCollection, but then we would get one huge record. We could choose activeDate, but then we would miss a lot of other information. In this case, the best place to define the feature type is probably the <Feature> node. We can already see that this should yield 2 features, since there are 2 Feature blocks in the source XML.
In xfmaps, we use a feature map to define our feature types. In this case it looks like:
<?xml version="1.0"?>
<xfMap>
  <feature-map multi-feature-construction="true">
    <mapping match="Feature">
      <feature-type>
        <literal expr="Feature"/>
      </feature-type>
    </mapping>
  </feature-map>
</xfMap>
This tells FME to construct a feature whenever it reads a 'Feature' element, and the <literal expr="Feature"/> value gives the feature type its name. Note that we have not defined any content yet, so this is just a container that we will fill later.
XFmap Attributes
So let's define some content for our new feature type. The trick here is that we can read as much or as little of the XML as we want. The one limitation is that the xfmap only ever processes a match once, so if an element matches more than one mapping, only the first match is used. Let's start with something simple.
<?xml version="1.0"?>
<xfMap>
  <feature-map multi-feature-construction="true">
    <mapping match="Feature">
      <feature-type>
        <literal expr="Feature"/>
      </feature-type>
    </mapping>
  </feature-map>
  <feature-content-map>
    <mapping match="attribute1">
      <attributes>
        <attribute>
          <name> <literal expr="attr1"/> </name>
          <value> <extract expr="."/> </value>
        </attribute>
      </attributes>
    </mapping>
  </feature-content-map>
</xfMap>
The mapping match="attribute1" tells FME which node we want to match on. The <attribute> tag defines the attribute we want to create, and <name> and <value> specify its field name and value. <extract expr="."/> defines the content of the field: the "." means we take the value from the currently matched element.
Using your XFmap
To actually read this data, paste the source into a source.xml file, or download it from the bottom of this page. Then drag and drop it into Workbench, choose the Generic XML reader, click the 'parameters' button, select xfmap as the configuration type, and browse to your xfmap file. If your xfmap is configured correctly, you should get a new feature type called Feature with one field called attr1. Note that it has no geometry, so to view the content you could connect it to a visualizer, run the workspace, and then select 'view no geometry' in the Viewer. Congratulations, you have just created your first xfmap to parse an XML source file.
XFmap Multiple Attributes
How about adding more fields? Well, we could have a separate mapping for each field, but it's easier to just list all the fields we want in one match expression. Replace the previous <mapping match="attribute1"> section with:
<mapping match="attribute1 attribute2">
  <attributes>
    <attribute>
      <name> <matched expr="local-name"/> </name>
      <value> <extract expr="."/> </value>
    </attribute>
  </attributes>
</mapping>
This will create fields for attribute1 and attribute2. local-name will keep the name the same as the matched tag, and "." will just extract the matched value. This is shorter to set up but doesn't allow you to rename the fields.
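If you do need different field names, the separate-mappings approach mentioned earlier can be sketched as follows. This is only an illustration built from the syntax already shown in this article; the field names "name" and "city" are hypothetical.

```xml
<!-- Sketch: one mapping per element, so each field can be renamed.
     The field names "name" and "city" are made up for illustration. -->
<mapping match="attribute1">
  <attributes>
    <attribute>
      <name> <literal expr="name"/> </name>
      <value> <extract expr="."/> </value>
    </attribute>
  </attributes>
</mapping>
<mapping match="attribute2">
  <attributes>
    <attribute>
      <name> <literal expr="city"/> </name>
      <value> <extract expr="."/> </value>
    </attribute>
  </attributes>
</mapping>
```

These two mappings would take the place of the combined <mapping match="attribute1 attribute2"> block inside <feature-content-map>.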
XFmap Nested Property Extraction
What about the nested date structure? This is typically what causes grief for traditional relational or simple-features XML or GML readers. Often these kinds of fields aren't read at all, are read as some list structure, or just come in as an XML blob.
To read the date fields, we can do the following:
<mapping match="activeDate">
  <attributes>
    <attribute>
      <name> <literal expr="date_start"/> </name>
      <value> <extract expr="./from"/> </value>
    </attribute>
    <attribute>
      <name> <literal expr="date_end"/> </name>
      <value> <extract expr="./to"/> </value>
    </attribute>
  </attributes>
</mapping>
Because we match on "activeDate", that element becomes our relative location within the XML document at the moment the match occurs. We could create a single date field and give it the value ".", but then we would have a field that contains XML. The better option is to drill into the structure and create two new fields, date_start and date_end. We do this with the extract expressions "./from" and "./to", which tell FME to pull out the values of the from and to child elements, respectively.
Remember, if you list activeDate along with attribute1 and attribute2, then that match expression will capture activeDate as a field with nested XML content, and the second activeDate mapping will be ignored.
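As a sketch of the pitfall just described, the fragment below lists activeDate alongside the simple attributes. It is shown only as something to avoid, and the comments restate the first-match behaviour described in this article rather than anything verified separately.

```xml
<!-- Anti-pattern (illustration only): activeDate is matched here first,
     so it would be captured as a single field named activeDate. -->
<mapping match="attribute1 attribute2 activeDate">
  <attributes>
    <attribute>
      <name> <matched expr="local-name"/> </name>
      <value> <extract expr="."/> </value>
    </attribute>
  </attributes>
</mapping>
<!-- Per the first-match rule, this later mapping would then be ignored. -->
<mapping match="activeDate">
  <attributes>
    <attribute>
      <name> <literal expr="date_start"/> </name>
      <value> <extract expr="./from"/> </value>
    </attribute>
  </attributes>
</mapping>
```

Leaving activeDate out of the first match list lets the dedicated mapping produce date_start and date_end as intended.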
Another limitation of xfmaps is that the error reporting isn't always very descriptive and usually tells you only about the first problem in your xfmap. Start simple and work your way up in order to learn the ins and outs of the xfmap syntax.
XFmap Geometry
Last but not least, let's add some geometry to our features. To do this we need to find an element within our feature that defines the geometry and then we need to choose the appropriate xfmap object type to interpret that geometry. Let's match on Coordinate_BOX and use it to create an xml-box.
<mapping match="Coordinate_BOX">
  <geometry activate="xml-box">
    <data name="data-string">
      <extract expr="./coords"/>
    </data>
  </geometry>
</mapping>
So we create a mapping in the feature content map as normal. This time, however, we use a <geometry> element whose activate attribute tells FME that we want to create an xml-box geometry. There are also xml-point, xml-line, xml-area and many other geometry types we could use. These are all defined in the XML - Xfmaps section of the Readers and Writers manual.
Remember, every time you add to the xfmap you will need to reimport your dataset or at least delete and re-add the feature type in order for the new schema information (attribute or geometry definitions) to be read. If your workspace doesn't know about your schema changes, then it will ignore them even if you have added them to your xfmap.
XFmap Geometry Traits
Finally, let's add some traits to our geometry. Traits are tags or attributes that are associated with the individual geometries on a feature. These become particularly important when we have more than one geometry associated with one feature. Often there are geometry ids such as gml_id that we need to define.
To add traits we use the <trait> tag alongside the geometry definition, inside the same mapping, as follows:
<mapping match="Coordinate_BOX">
  <trait>
    <name> <literal expr="id"/> </name>
    <value> <extract expr="@id"/> </value>
  </trait>
  ...
The @ sign tells FME to read an XML attribute of the matched element. Here it reads the id attribute, which has the value "101" in <Coordinate_BOX id="101">.
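The same @ syntax should also work for an ordinary feature attribute rather than a geometry trait. The sketch below assumes that a single mapping can hold an <attributes> block next to <geometry>, which this article does not demonstrate, and it uses a hypothetical field name, box_id.

```xml
<!-- Sketch under assumptions: stores the id XML attribute as a regular
     feature attribute. The field name "box_id" is made up, and combining
     <attributes> with <geometry> in one mapping is assumed here. -->
<mapping match="Coordinate_BOX">
  <attributes>
    <attribute>
      <name> <literal expr="box_id"/> </name>
      <value> <extract expr="@id"/> </value>
    </attribute>
  </attributes>
  <geometry activate="xml-box">
    <data name="data-string">
      <extract expr="./coords"/>
    </data>
  </geometry>
</mapping>
```

Both pieces are kept in one Coordinate_BOX mapping because, as noted earlier, only the first mapping that matches an element is used.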
That concludes our 'basic' example. I will also add a few shortcuts and examples of other geometries, but it is hoped that this will help you get started with reading your xml whatever the source or structure may be.
Completed XFmap
Putting it all together, here is the completed xfmap file which reads all the data from the source xml:
<?xml version="1.0"?>
<xfMap>
  <feature-map multi-feature-construction="true">
    <note> construct an FME feature when we read the 'Feature' element. </note>
    <mapping match="Feature">
      <feature-type>
        <literal expr="Feature"/>
      </feature-type>
    </mapping>
  </feature-map>
  <feature-content-map>
    <mapping match="attribute1 attribute2">
      <attributes>
        <attribute>
          <name> <matched expr="local-name"/> </name>
          <value> <extract expr="."/> </value>
        </attribute>
      </attributes>
    </mapping>
    <mapping match="activeDate">
      <attributes>
        <attribute>
          <name> <literal expr="date_start"/> </name>
          <value> <extract expr="./from"/> </value>
        </attribute>
        <attribute>
          <name> <literal expr="date_end"/> </name>
          <value> <extract expr="./to"/> </value>
        </attribute>
      </attributes>
    </mapping>
    <mapping match="Coordinate_BOX">
      <trait>
        <name> <literal expr="id"/> </name>
        <value> <extract expr="@id"/> </value>
      </trait>
      <geometry activate="xml-box">
        <data name="data-string">
          <extract expr="./coords"/>
        </data>
      </geometry>
    </mapping>
  </feature-content-map>
</xfMap>