Harvesting and writing ISO19115 XML Metadata

Liz Sanderson

FME Version

  • FME 2015.x

Introduction

This demo automatically harvests metadata from source MapInfo TAB files and writes one ISO 19115 XML metadata document for each input file. It is only a demo, so the generated metadata is only partially populated.

 

Background

When writing custom XML such as metadata with FME, the best approach is to use the XMLTemplater with the text file writer rather than the XML writer. This requires an XML template containing functions that insert FME field values at the appropriate places in the XML structure. The template can be derived from either an XSD or sample data records.

The easiest way to start is to find sample metadata XML that looks like the result you want. Use it as the starting point for the XML template and insert the appropriate FME field functions. For ISO 19115 it should start with an <MD_Metadata> root element. If you do not have any valid sample output, you can generate the template directly from the ISO 19115 XSDs. The only drawback is that the standard is very comprehensive and has a lot of optional elements.

Note that some information may be constant for all datasets, such as organization name and contact info. This information does not need to be merged in your workspace; it can be entered directly into the XML template.

Once the XMLTemplater is generating XML output, use the XMLValidator transformer to validate the results against the ISO 19115 schemas and confirm that they are compliant.

If you want to generate repeating elements in your XML document, you will need to use a sub template in your XMLTemplater. A sub template defines the document structure for those features and creates a separate input port.

If you want to produce multiple documents, you will need to set up a dataset fanout on the text file writer, as in the attached example, which generates one metadata document per input TAB file.
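As a rough sketch of the sub template mechanism described above, the root template calls fme:process-features() at the point where the repeating elements belong, and the sub template describes a single repeated element. The sub template name KEYWORD and the attribute name keyword are hypothetical placeholders and are not part of the attached workspace; check the XMLTemplater documentation for your FME version before relying on the exact syntax.

```xml
<!-- ROOT template (fragment) -->
<gmd:descriptiveKeywords>
  <gmd:MD_Keywords>
    <!-- Evaluates the hypothetical KEYWORD sub template for the
         features that entered its input port -->
    {fme:process-features("KEYWORD")}
  </gmd:MD_Keywords>
</gmd:descriptiveKeywords>
```

```xml
<!-- KEYWORD sub template (hypothetical name): one element per feature.
     The namespace declarations are repeated here on the assumption that
     each template is evaluated on its own and needs its prefixes declared. -->
<gmd:keyword xmlns:gmd="http://www.isotc211.org/2005/gmd"
             xmlns:gco="http://www.isotc211.org/2005/gco">
  <gco:CharacterString>{fme:get-attribute("keyword")}</gco:CharacterString>
</gmd:keyword>
```

With this layout, features carrying a keyword attribute would be routed to the KEYWORD port, while the feature that represents the dataset itself goes to the root port.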

Note that this demo is based on ISO 19115 but actually uses the ISO 19139 schemas. The reason is that when ISO 19115 was developed, no equivalent XML schema (DTD or XSD) was released with it. ISO 19139 was developed later to provide an XSD encoding of ISO 19115, which makes schema validation of XML metadata documents possible. It's all the same standard, just different implementations or encodings.

For more info see:

https://trac.osgeo.org/geonetwork/wiki/115and139Confusion
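
Because validation relies on the ISO 19139 schemas, it may help to see how a metadata document can point at them. The xsi:schemaLocation attribute takes whitespace-separated pairs of namespace URI and schema location. The sketch below pairs the gmd namespace with an OGC-hosted copy of gmd.xsd; that URL is an assumption and should be checked against whichever copy of the schemas your XMLValidator is configured to use.

```xml
<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/iso/19139/20060504/gmd/gmd.xsd">
  <!-- First URI: the namespace. Second URI: where its XSD can be found
       (assumed location; verify before use). -->
</gmd:MD_Metadata>
```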

 

Metadata Harvest Demo Description

Here is how the attached example works:

  1. Read source datasets (MapInfo TAB reader)
  2. Expose dataset name (AttributeExposer)
  3. Capture dataset extents (BoundsAccumulator, BoundsExtractor)
  4. Generate a unique ID and time stamp (UUIDGenerator, TimeStamper)
  5. Apply each dataset's metadata to the XML template (XMLTemplater)
  6. Format and validate the XML result (XMLFormatter, XMLValidator)
  7. Write to XML using the schema-free text file writer, generating one XML document per input dataset (Text File Writer with dataset fanout on fme_basename).

Key elements of XMLTemplater's template are shown below. Note that the fme:get-attribute() function is used to merge FME field values within the appropriate XML elements. For example, {fme:get-attribute("fme_basename")} inserts the value of fme_basename within the <gco:CharacterString> element.

XML template example - fragments showing where FME fields are merged:

 

<?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:srv="http://www.isotc211.org/2005/srv" xmlns:gml="http://www.opengis.net/gml" xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.isotc211.org/2005/gmd">
  <!-- Part B 10.3 Metadata Language -->
  <gmd:language>
  ...
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification id="dataId">
      <gmd:citation>
        <gmd:CI_Citation>
          <!-- FME FIELD fme_basename Part B 1.1 Resource title -->
          <gmd:title>
            <gco:CharacterString>{fme:get-attribute("fme_basename")}</gco:CharacterString>
          </gmd:title>
          ...
          <!-- FME FIELD mod_date Part B 5.3 Date of revision -->
          <gmd:date>
            <gmd:CI_Date>
              <gmd:date>
                <gco:Date>{fme:get-attribute("mod_date")}</gco:Date>
              </gmd:date>
              <gmd:dateType>
                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/ittf/PubliclyAvailableStandards/ISO_19139_Schemas/resources/Codelist/ML_gmxCodelists.xml#CI_DateTypeCode" codeListValue="revision">revision</gmd:CI_DateTypeCode>
              </gmd:dateType>
            </gmd:CI_Date>
          </gmd:date>
          ...
          <!-- FME FUID Part B 1.5 Resource unique identifier -->
          <gmd:identifier>
            <gmd:RS_Identifier>
              <gmd:code>
                <gco:CharacterString>FR.natmap.HYDRO.{fme:get-attribute("fuid")}</gco:CharacterString>
              </gmd:code>
              <gmd:codeSpace>
                <gco:CharacterString>INSPIRE</gco:CharacterString>
              </gmd:codeSpace>
            </gmd:RS_Identifier>
          </gmd:identifier>
          <!-- Part B 1.5 Resource unique identifier -->
          <gmd:identifier>
            <gmd:RS_Identifier>
              <gmd:code>
                <gco:CharacterString>{fme:get-attribute("fuid")}</gco:CharacterString>
              </gmd:code>
              <gmd:codeSpace>
                <gco:CharacterString>http://www.natmap.fr</gco:CharacterString>
              </gmd:codeSpace>
            </gmd:RS_Identifier>
          </gmd:identifier>
          ...
      <!-- Part B 2.1 Topic Category -->
      <gmd:topicCategory>
        <gmd:MD_TopicCategoryCode>inlandWaters</gmd:MD_TopicCategoryCode>
      </gmd:topicCategory>
      <!-- FME FIELDS X/Y MIN/MAX Part B 4.1 Geographic Bounding Box -->
      <gmd:extent>
        <gmd:EX_Extent>
          <gmd:geographicElement>
            <gmd:EX_GeographicBoundingBox>
              <gmd:westBoundLongitude>
                <gco:Decimal>{fme:get-attribute("_xmin")}</gco:Decimal>
              </gmd:westBoundLongitude>
              <gmd:eastBoundLongitude>
                <gco:Decimal>{fme:get-attribute("_xmax")}</gco:Decimal>
              </gmd:eastBoundLongitude>
              <gmd:southBoundLatitude>
                <gco:Decimal>{fme:get-attribute("_ymin")}</gco:Decimal>
              </gmd:southBoundLatitude>
              <gmd:northBoundLatitude>
                <gco:Decimal>{fme:get-attribute("_ymax")}</gco:Decimal>
              </gmd:northBoundLatitude>
            </gmd:EX_GeographicBoundingBox>
          </gmd:geographicElement>
          ...
          </gmd:statement>
        </gmd:LI_Lineage>
      </gmd:lineage>
    </gmd:DQ_DataQuality>
  </gmd:dataQualityInfo>
</gmd:MD_Metadata>

Note: to see the fields that FME has merged into your result, open the output XML and search for 'FME'. You should see a comment such as:

 <!--         FME FIELD fme_basename Part B 1.1 Resource title       -->


Followed by the content that has been updated:

 <gco:CharacterString>BoundaryArea</gco:CharacterString>
