How to Expose Feature Attributes from KML Tags or HTML Tables

Liz Sanderson

Introduction

FME can read in Google Earth KMLs, but it takes a few more steps to expose the attributes contained within the description balloons seen within Google Earth. This tutorial will show you two different methods of exposing those attributes based on how your data is structured. 

 

Source Data

Viewing the source data in Google Earth, you can see the data is contained within a table for each point feature. 
GoogleEarth.png


If we open the data in a text editor, we can see that the attribute table is embedded within a <description> tag and that the balloon content is written in HTML. 
TextEditor.png


Step-by-step Instructions

Part 1: Convert to XHTML

There are two methods you can try to expose your attributes. Both methods involve converting the HTML contained within the KML balloon to XHTML. 

1. Add an OGC/Google KML Reader
Open FME Workbench and start a blank workspace. Add an OGC/Google KML reader to the canvas and browse to the doc.kml dataset that can be downloaded from the Files section of this article. Open the reader parameters. 
KMLReader.png

In the reader parameters, expand the Schema Attributes section, then click the ellipsis next to Additional Attribute to Expose. Because the list of attributes is long, type kml_description into the Filter. Select kml_description, then click OK three times to add the reader. 
ReaderParams.png

The kml_description contains all of the feature attributes, which we will parse out. 

In the Select Feature Type dialog, deselect all of the feature types except Placemark and click OK. 
SelectFT.png

2. Convert HTML to XHTML
Next, add an HTMLToXHTMLConverter transformer to the canvas and connect it to the Placemark reader feature type. This transformer ensures that the description contains valid XHTML with properly nested elements. In the parameters, set Attributes with HTML Text to kml_description, then change the Input Encoding to Unicode 8-bit (utf-8). Finally, set the Output Attribute to kml_description so that it overwrites the input. 
HTMLToXHTML.png
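
To illustrate why this conversion matters, here is a minimal Python sketch, run outside FME, that checks whether a fragment parses as XML. Both fragments and the value "Concrete" are hypothetical; the second fragment is a hand-written stand-in for the kind of output the converter is expected to produce, not its actual output.

```python
import xml.etree.ElementTree as ET

# Hypothetical balloon content: HTML that a browser tolerates but an XML
# parser rejects (the <br> is never closed and the second <td> is left open).
html_fragment = "<table><tr><td>CH_LINING<br></td><td>Concrete</tr></table>"

# Hand-written well-formed equivalent, standing in for converted XHTML.
xhtml_fragment = "<table><tr><td>CH_LINING<br/></td><td>Concrete</td></tr></table>"


def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(html_fragment))
print(is_well_formed(xhtml_fragment))
```

XML-based tools such as XQuery need the well-formed variant, which is why both methods below start from the converted attribute.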
 

 

Part 2: Extract Attribute Information

Once the attribute table has been converted to XHTML, you can continue processing the XHTML to extract attribute information using one of the two methods listed below.
 

Method 1: XQuery

1. Use an XQuery Expression
Add an XMLXQueryExtractor to the canvas and connect it to the HTMLToXHTMLConverter Passed output port. In the parameters, set the XQuery Expression to:

declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in /html/body/table/tr/td/table/tr
return fme:set-attribute($x/td[1]/text(),$x/td[2]/text())


Next, set the XML Input to Attribute Specifying XML, then select kml_description for XML Attribute. Finally, expand the Expose Attributes section and type in CH_LINING as an Attribute to Expose. To learn more about XQuery, please see the w3schools tutorial.  
XMLQueryExtractor.png
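
For readers less familiar with XQuery, the following Python sketch mirrors the logic of the expression above: walk the nested table rows and treat the first cell as the attribute name and the second as its value. It is an illustration only, not how FME evaluates the query. The sample document, the CH_WIDTH attribute, and all values are hypothetical, and the length check is an added guard that the XQuery expression does not have.

```python
import xml.etree.ElementTree as ET

# Same namespace that the XQuery expression declares as its default.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical converted balloon: an outer table wrapping the attribute table.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<table><tr><td>
  <table>
    <tr><td>CH_LINING</td><td>Concrete</td></tr>
    <tr><td>CH_WIDTH</td><td>12</td></tr>
  </table>
</td></tr></table>
</body></html>"""


def extract_attributes(xhtml):
    """Map the first cell of each inner row to the second cell."""
    root = ET.fromstring(xhtml)  # root is the <html> element
    attributes = {}
    # Equivalent of /html/body/table/tr/td/table/tr, relative to <html>.
    for row in root.findall("x:body/x:table/x:tr/x:td/x:table/x:tr", XHTML_NS):
        cells = row.findall("x:td", XHTML_NS)
        if len(cells) >= 2:  # added guard: skip rows without a name/value pair
            attributes[cells[0].text] = cells[1].text
    return attributes


print(extract_attributes(SAMPLE))
```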

If the above XQuery Expression does not work for you, you can try:

declare default element namespace "http://www.w3.org/1999/xhtml";
for $x in //tr where (exists($x/td[1]) and compare($x/td[2]/text(),"&lt;Null&gt;"))
return fme:set-attribute($x/td[1]/text(),$x/td[2]/text())



This was suggested by user Marcp with the following explanation: 
“The //tr extracts rows, regardless of what comes before, which is very handy so that you don't really need to figure out the structure. The where(exists()) part is necessary for unknown reasons. Anything I tried without a where clause returned no results. The compare clause removed NULL parameters, which were formatted as <Null> in my data.”
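
The filtering in this alternative expression can be sketched in Python as follows: visit every tr at any depth, keep only rows that have both a name cell and a value cell, and drop rows whose value is the literal text <Null>. This is an approximation of the query's behavior under those assumptions, not FME's implementation. The sample document, the CH_NOTES attribute, and the values are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
XHTML_NS = {"x": XHTML}

# Hypothetical converted balloon with a header row and a <Null> value.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<table>
  <tr><th>Attributes</th></tr>
  <tr><td>CH_LINING</td><td>Concrete</td></tr>
  <tr><td>CH_NOTES</td><td>&lt;Null&gt;</td></tr>
</table>
</body></html>"""


def extract_non_null(xhtml):
    """Collect name/value pairs from every row, skipping <Null> values."""
    root = ET.fromstring(xhtml)
    attributes = {}
    # Equivalent of //tr: every row, regardless of nesting depth.
    for row in root.iter("{%s}tr" % XHTML):
        cells = row.findall("x:td", XHTML_NS)
        if len(cells) < 2:
            continue  # header rows or rows without a value cell
        name, value = cells[0].text, cells[1].text
        if not name or not value or value == "<Null>":
            continue  # empty cells and <Null> placeholders are dropped
        attributes[name] = value
    return attributes


print(extract_non_null(SAMPLE))
```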
 
2. Run Workspace and Inspect Output
Either run the workspace with feature caching enabled or add an Inspector to the QueryResults output port on XMLXQueryExtractor, then run the workspace. View the output data in Visual Preview. 
VPXQuery.png

The CH_LINING attribute shows that an attribute from the table was properly exposed, but if you click on a single feature and open the Feature Information window, you will see all of the available attributes. 
FeatureInfo.png
If you would like to use your attributes later in the workspace or write them out, you will need to expose them by either adding them into the Attributes to Expose field in the XMLXQueryExtractor or by using an AttributeExposer transformer. 
 

Method 2: XML

Instead of using the XMLXQueryExtractor transformer, it might be easier to use an XMLFragmenter, depending on your knowledge of XQuery and how your data is structured. 

1. Create a Feature Id
First, we will need to create a Feature Id, which we will use to aggregate features at the end. Add a Counter to the canvas and connect it to the HTMLToXHTMLConverter Passed output port. We can accept the defaults. 

2. Fragment XML
Next, add an XMLFragmenter to the canvas and connect it to the Counter. In the parameters, set the XML Source Type to Attribute with XML Document, then select kml_description as the XML Attribute. For Elements to Match, type in the following:

html/body/table/tr

Note: Adjust this path if your document is structured differently. 

Now set Merge Attributes from Input Feature to Yes, then click on the Flattening Options button, then enable Flattening. 
XMLFragmenter.png


3. Expose Attributes
Now add an AttributeExposer to the canvas and expose the following attribute:

td.table.tr{}.td{}

AttributeExposer.png

Note that this attribute could also be exposed in the XMLFragmenter. 

4. Explode List
With the td.table.tr{}.td{} attribute exposed, we need to explode the list into its individual elements. Add a ListExploder to the canvas and connect it to the AttributeExposer. In the parameters, set the List Attribute to td.table.tr{}.
ListExploder.png

5. Create an Attribute with the List{0} Item
If we inspect the data, we will notice that every feature has a different td{0} value, which holds the table header (the attribute name). Let’s create attributes with these values. 
Td0.png

Add an AttributeCreator to the canvas and create a new attribute called @Value(td{0}) with the Attribute Value of td{1}.
When creating the new attribute, be sure to type @Value before td{0}, as this will pull in the value of each list item instead of the list itself. 
AttributeCreator.png

6. Aggregate Features 
Now that we have the attribute name and the corresponding value, we can build the feature again. Add an Aggregator to the canvas and connect it to the AttributeCreator. In the parameters, enable Group Processing, then select _count as the Group By. Then set the Accumulation Mode to Merge Incoming Attributes. 
Aggregator.png
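
Steps 4 to 6 amount to a pivot: each exploded row carries one name/value pair, and grouping by _count merges those pairs back into one record per original feature. The Python sketch below illustrates that idea outside FME. The input records, the CH_WIDTH attribute, and all values are hypothetical, and the function is not a model of how the Aggregator works internally.

```python
from collections import defaultdict

# Hypothetical exploded features: one per table row, each carrying the
# Counter's _count plus the name cell (td{0}) and the value cell (td{1}).
exploded = [
    {"_count": 0, "td{0}": "CH_LINING", "td{1}": "Concrete"},
    {"_count": 0, "td{0}": "CH_WIDTH", "td{1}": "12"},
    {"_count": 1, "td{0}": "CH_LINING", "td{1}": "Earth"},
    {"_count": 1, "td{0}": "CH_WIDTH", "td{1}": "8"},
]


def rebuild_features(rows):
    """Group rows by _count and turn each td{0}/td{1} pair into an attribute."""
    merged = defaultdict(dict)
    for row in rows:
        # Same idea as naming the new attribute @Value(td{0}) with value td{1}.
        merged[row["_count"]][row["td{0}"]] = row["td{1}"]
    return dict(merged)


print(rebuild_features(exploded))
```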

7. Expose Additional Attributes
From here, you should have the feature with its corresponding attributes and values. Add another AttributeExposer and attach it to the Aggregator. Expose the CH_LINING attribute.  
AttributeExposer2.png

8. Run Workspace and Inspect Output
Either run the workspace with feature caching enabled or add an Inspector to the AttributeExposer_2, then run the workspace. View the output data in Visual Preview. 
ExposerVp.png

The CH_LINING attribute shows that an attribute from the table was properly exposed, but if you click on a single feature and open the Feature Information window, you will see all of the available attributes. 
ExposerFI.png


Additional Resources

How to Use XQuery Expressions to Query XML Data Within FME
Using the XQueryExtractor Transformer to Extract XML Text Using XQuery Expressions
Tutorial: Getting Started with XML
