The FeatureWriter Transformer

Liz Sanderson

FME Version

  • FME 2016.x

Introduction

In a conventional workflow, FME reads the data and writes it using the writer feature types. Any processing after all the data has been written is limited to what a Python or Tcl shutdown script can do. The FeatureWriter expands the possibilities by writing data mid-workflow and then continuing with further data translation and transformation in the same workspace. This is useful whenever something needs to be done with the data after it is written.

Here are a few possibilities for the FeatureWriter, all accomplished within one workspace:

  1. Simple procedures previously accomplished with scripts or manually:
    1. Copy or move data after writing
    2. FTP upload after writing
    3. Email after job complete
    4. Load the file into S3, Dropbox (FME 2016.1) or other Cloud storage after writing
  2. Complex tasks that required chained workspaces and FMEServerJobSubmitters before:
    1. Quality Check > Quality Report with Notification > Database Insert
  3. Notifications to FME Server
    1. After writing data with the FeatureWriter, it is easier to prepare notification messages for FME Server, such as email, right in the same workspace
  4. Integrations with third party tools for data transformation in FME without waiting for a new reader:
    1. Examples are LAStools, Orfeo ToolBox, and ImageMagick
    2. Basic workflow is FeatureWriter - SystemCaller - FeatureReader - Cleanup
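
The write, call, read, cleanup pattern in the last item can be sketched outside FME in plain Python. This is only an illustration of the idea, not how FME implements these transformers: the function names, the CSV format, and the stand-in "tool" (a child Python process that copies a file) are all assumptions made for the example.

```python
import csv
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def round_trip_through_tool(rows, build_command):
    """Write rows to disk, run an external command on them, read its output back."""
    workdir = Path(tempfile.mkdtemp(prefix="featurewriter_demo_"))
    src, dst = workdir / "input.csv", workdir / "output.csv"
    try:
        # FeatureWriter step: persist the features so another program can read them.
        with src.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)

        # SystemCaller step: run the tool and raise if it exits with an error.
        subprocess.run(build_command(src, dst), check=True)

        # FeatureReader step: bring the tool's output back into the workflow.
        with dst.open(newline="") as handle:
            return list(csv.DictReader(handle))
    finally:
        # Cleanup step: remove the temporary files whether or not the tool succeeded.
        shutil.rmtree(workdir, ignore_errors=True)


def copy_command(src, dst):
    """Hypothetical stand-in for a real tool: copies src to dst unchanged."""
    script = "import shutil, sys; shutil.copyfile(sys.argv[1], sys.argv[2])"
    return [sys.executable, "-c", script, str(src), str(dst)]


parks = [{"ParkName": "Example Park", "Washrooms": "Y"}]
result = round_trip_through_tool(parks, copy_command)
```

Swapping `copy_command` for a function that builds the command line of a real third-party tool is the only change the pattern needs; the cleanup runs in a `finally` block so temporary files do not pile up when the tool fails.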

This basic FeatureWriter demo validates a dataset and emails a validation report to the data validation manager after all the features have been written.

 

Source Data

Parks.tab data is read with the MapInfo TAB reader. The dataset contains a number of attributes related to information about the parks, such as park name, whether the park has washrooms, a dog park, or other special features.

source-parks-mitab.jpg

Parks.tab in the Data Inspector

 

Step-by-step Instructions

1. Validate park attribute values

  • Validate the Parks dataset against a number of tests with the AttributeValidator. Features that fail one or more tests are routed through the Failed port, and a list of the failed tests is added to each of them. Later on, this list is used to write the failing parks to an Excel spreadsheet for data validation and quality assurance.
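
As a rough illustration of what this step produces, the sketch below applies three rule types to a park record and attaches a list of failure messages to anything that fails. It is plain Python, not the AttributeValidator itself. Which rule applies to which attribute and the wording of the messages are assumptions; only the list name `_fme_validation_message_list` and the rule descriptions come from this article.

```python
def failed_tests(park):
    """Return one message per failed test (message wording is hypothetical)."""
    failures = []
    name = park.get("ParkName")
    if not isinstance(name, str) or name == "":
        failures.append("ParkName: Type is 'STRING'")
    elif len(name) > 20:
        failures.append("ParkName: Maximum Length =20")
    if park.get("Washrooms") not in ("Y", "N"):
        failures.append("Washrooms: in 'Y,N'")
    return failures


def route(parks):
    """Split parks into passed and failed, like the Passed and Failed ports."""
    passed, failed = [], []
    for park in parks:
        failures = failed_tests(park)
        if failures:
            # Failed features carry the list of tests they did not pass.
            failed.append({**park, "_fme_validation_message_list": failures})
        else:
            passed.append(park)
    return passed, failed


parks = [
    {"ParkName": "Example Park", "Washrooms": "Y"},
    {"ParkName": None, "Washrooms": ""},
]
passed, failed = route(parks)
```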

2. Extract error messages and validation tests of failed parks

  • The list from the AttributeValidator, _fme_validation_message_list{}, is exploded into individual list items, so that a Park feature is obtained for each test failed. For example, there is a park in the Mount Pleasant neighborhood where the ParkName, SpecialFeatures, and Washrooms attributes are missing values, so three features are output for this park, each one containing a different value for _fme_validation_message_list{}.
  • The StringSearcher creates an attribute, _first_match, which contains the rule and its configuration. In this case, the possible values are "Type is 'STRING'", "in 'Y,N'", and "Maximum Length =20". This attribute is used to fan out features in the FeatureWriter.
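
The two bullets above amount to "one feature per failed test, each tagged with the rule it broke". A minimal Python sketch of that idea follows. The message wording, the attribute name `_validation_message`, and the regular expression are assumptions for illustration and are not the settings used in the workspace.

```python
import re

# Matches the three rule descriptions mentioned in this article.
RULE_PATTERN = re.compile(r"Type is '[^']*'|\bin '[^']*'|Maximum Length\s*=\s*\d+")


def explode(feature, list_name="_fme_validation_message_list"):
    """Yield one copy of the feature per list item, tagged with the matched rule."""
    base = {key: value for key, value in feature.items() if key != list_name}
    for message in feature.get(list_name, []):
        match = RULE_PATTERN.search(message)
        yield {
            **base,
            "_validation_message": message,  # hypothetical attribute name
            "_first_match": match.group(0) if match else None,
        }


park = {
    "Neighbourhood": "Mount Pleasant",
    "_fme_validation_message_list": [
        "ParkName: Type is 'STRING'",
        "SpecialFeatures: Type is 'STRING'",
        "Washrooms: in 'Y,N'",
    ],
}
exploded = list(explode(park))
rules = [feature["_first_match"] for feature in exploded]
```

A park that fails three tests yields three features here, which mirrors the Mount Pleasant example above.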

3. Write failed parks into a spreadsheet

  • The FeatureWriter writes the failed Park features into an Excel spreadsheet, FailedParks.xlsx, with a separate Excel tab for each _first_match value. The Summary port outputs a list which summarizes the features written, including the total number of features written, the name and number of each feature type, and the output dataset path. The FeatureWriter allows us to write data mid-transformation and then continue with additional processing and tasks.
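
To make the fanout and the summary concrete, here is a small Python sketch that groups features by `_first_match` (one group per Excel tab) and builds a summary with the same kinds of information the Summary port is described as providing. The dictionary keys in the summary are hypothetical names chosen for this example, not FME attribute names, and nothing is actually written to disk.

```python
from collections import defaultdict


def fanout(features, key="_first_match"):
    """Group features by the fanout attribute; each group would become one tab."""
    sheets = defaultdict(list)
    for feature in features:
        sheets[feature[key]].append(feature)
    return dict(sheets)


def summarize(sheets, dataset_path):
    """Build a summary: dataset path, total count, and count per feature type."""
    return {
        "dataset": dataset_path,
        "total_features_written": sum(len(rows) for rows in sheets.values()),
        "feature_types": [
            {"name": name, "count": len(rows)} for name, rows in sheets.items()
        ],
    }


features = [
    {"ParkName": None, "_first_match": "Type is 'STRING'"},
    {"ParkName": "Example Park", "_first_match": "in 'Y,N'"},
    {"ParkName": "Another Park", "_first_match": "in 'Y,N'"},
]
sheets = fanout(features)
summary = summarize(sheets, "FailedParks.xlsx")
```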

4. Email the failed parks report, FailedParks.xlsx

  • Email the failed features report, FailedParks.xlsx, to the Data Validation Manager using the Emailer transformer. The Emailer must be configured with your own From and To email addresses.
    • Attachments: path to the FailedParks.xlsx error report spreadsheet
    • SMTP Connection: configure this section for Gmail if Gmail is used
    • Sender Authentication: If your Gmail / Google account uses 2-step verification, you will need to generate an app-specific password for the Emailer. If you don’t use 2-step verification, you may need to allow less secure apps to access your Gmail account instead
  • Once the email is received, the data validation manager can perform data validation and quality assurance on the parks dataset.
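
For readers who want to see what this step involves outside FME, the following Python sketch builds an email with the report attached and, separately, shows how it could be sent through Gmail's SMTP server. It uses only the standard library. The addresses, subject, and body text are placeholders, the function names are made up for this example, and the usage lines attach a dummy file rather than a real workbook. The `send_via_gmail` function is defined but not called here.

```python
import mimetypes
import smtplib
import tempfile
from email.message import EmailMessage
from pathlib import Path


def build_report_email(sender, recipient, attachment_path):
    """Create a message with the report spreadsheet attached."""
    path = Path(attachment_path)
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Failed parks validation report"
    message.set_content("The attached spreadsheet lists the parks that failed validation.")

    # Fall back to a generic binary type if the extension is not recognized.
    content_type, _ = mimetypes.guess_type(path.name)
    maintype, subtype = (content_type or "application/octet-stream").split("/", 1)
    message.add_attachment(
        path.read_bytes(), maintype=maintype, subtype=subtype, filename=path.name
    )
    return message


def send_via_gmail(message, username, app_password):
    """Send through Gmail over STARTTLS, using an app-specific password."""
    with smtplib.SMTP("smtp.gmail.com", 587) as smtp:
        smtp.starttls()
        smtp.login(username, app_password)
        smtp.send_message(message)


# Usage: build (but do not send) a message around a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "FailedParks.xlsx"
    report.write_bytes(b"placeholder, not a real workbook")
    message = build_report_email("sender@example.com", "manager@example.com", report)

attachment_names = [part.get_filename() for part in message.iter_attachments()]
```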

Result

emaileremail.jpg

Email sent by the Emailer viewed in Gmail

 

output-parks-excel.jpg

FailedParks.xlsx opened in Excel
