Dynamic Workflows with the SchemaScanner

Liz Sanderson

FME Version

  • FME 2022.0

Introduction

As we’ve seen, dynamic workflows can obtain their schema from multiple locations.

One of those locations is in the workspace itself, and the SchemaScanner transformer is a key tool in making that happen.

The SchemaScanner, as the name suggests, scans incoming features and produces a schema from them. That schema is stored in a special kind of FME feature, called a schema feature, which can then be used by a dynamic writer.

The schema produced by this transformer may differ from the reader schema because of processes in the workspace, such as attribute renaming, removal, or addition.

In this workspace, attributes are added to and removed from the reader schema:

SchemaScannerA1.png

The SchemaScanner generates a new schema for the output, assigns it a name, and passes it to a dynamic writer. Notice how the schema feature passes from the SchemaScanner:Schema output port to the same writer input port as incoming data.

The dynamic writer is set up to recognize incoming schema features and will make use of them:

SchemaScannerA2.png

Notice how the Schema Sources parameter is set to “Schema From Schema Feature” to tell the writer where to obtain the schema. Also, notice the parameter that defines the name of the schema. This handles the situation where the same writer is fed multiple schema features.

Now the data will be written to the output CSV dataset with the schema as modified by the AttributeManager.

Definition: A Schema Feature stores information about a schema that can be passed to a writer. The information is held as a pair of list attributes called attribute{}.name and attribute{}.fme_data_type.
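
To make the definition more concrete, here is a minimal Python sketch that models a schema feature as a plain dictionary. It is only an illustration of the list-attribute layout, not how FME stores features internally. The fme_feature_type_name, attribute{}.name, and attribute{}.fme_data_type keys come from this article; the helper function, the schema name, and the attribute names and types are made up for the example.

```python
# Hypothetical illustration: a schema feature modelled as a plain dict.
# The attribute names and data types below are invented for the example.

def make_schema_feature(schema_name, attributes):
    """Flatten (name, data type) pairs into FME-style list attribute keys."""
    feature = {"fme_feature_type_name": schema_name}
    for i, (name, data_type) in enumerate(attributes):
        feature[f"attribute{{{i}}}.name"] = name
        feature[f"attribute{{{i}}}.fme_data_type"] = data_type
    return feature

schema_feature = make_schema_feature(
    "CedarCottage",
    [("Name", "fme_varchar(50)"), ("Hours", "fme_varchar(20)")],
)

# Show every key/value pair held by the schema feature
for key, value in schema_feature.items():
    print(key, "=", value)
```

Each entry in the list contributes one name and one data type, so a schema with two attributes ends up with keys attribute{0}.name, attribute{0}.fme_data_type, attribute{1}.name, and attribute{1}.fme_data_type.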


SchemaScannerA10.png

Schema features are generated by the SchemaScanner transformer, but also by the FeatureReader transformer, Schema (Any Format) reader, and even the AttributePivoter transformer!

Going back to the SchemaScanner, let’s look at the parameters:

SchemaScannerA3.png

The important things to note here are:

  • The transformer is set up to output the schema feature before any data features. This is vital. The writer must receive the schema feature before any data features that make use of it.
  • The transformer is set up to exclude from the schema any attributes that contain fme_, csv_, or multi_. These are format attributes that we don’t really need in the output.
  • Attributes with empty (or null) values will be ignored; that is, they are excluded from the output schema, just as if they’d been removed in the AttributeManager.
  • The transformer has been instructed to detect dates in the incoming data. If this had not been set, values such as 20220812 would be interpreted as a number, not a date.
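
The last two points can be sketched in plain Python. This is a simplified illustration of the idea, not the SchemaScanner’s actual algorithm: the function names, the type labels ("date", "number", "string"), the rule that the first non-empty value decides the type, and the sample features are all assumptions made for the example.

```python
from datetime import datetime

def looks_like_date(value):
    """Return True if value is an 8-digit string forming a valid YYYYMMDD date."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
        return True
    except ValueError:
        return False

def infer_type(value, detect_dates=True):
    """Classify a string value as a date, a number, or a string."""
    if detect_dates and looks_like_date(value):
        return "date"
    try:
        float(value)
        return "number"
    except ValueError:
        return "string"

def scan(features, detect_dates=True):
    """Build {attribute name: type}, skipping empty or null values."""
    schema = {}
    for feature in features:
        for name, value in feature.items():
            if value is None or value == "":
                continue  # empty or null values never add an attribute
            # Simplification: the first non-empty value decides the type
            schema.setdefault(name, infer_type(str(value), detect_dates))
    return schema

# Made-up sample features; Notes is always empty or null
features = [
    {"Name": "Library", "Opened": "20220812", "Notes": ""},
    {"Name": "Pool", "Opened": "20210105", "Notes": None},
]

with_dates = scan(features)
without_dates = scan(features, detect_dates=False)
print(with_dates)
print(without_dates)
```

In this sketch, Notes never reaches the schema because it has no content in any feature, and Opened is classed as a date only when date detection is switched on; otherwise 20220812 is treated as a number.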


Step-by-step Instructions

The goal here is to transform some CSV data, but with a dynamic schema to take account of changes made to the data within the workspace itself.

1. Generate Workspace
Generate a workspace to translate from CSV to CSV. Set the Reader Dataset to Cedar Cottage.csv, which can be found in the downloaded files. Set the Writer Dataset to an output location. Finally, set the Workflow Options to Dynamic Schema.


SchemaScannerA4.png

Run the workspace to load the data and inspect it to see what we have.

2. Create New Attributes
Add an AttributeManager transformer and connect it between the reader and writer feature types. We'll use it to create a new attribute called Hours, which is a combination of the Open and Close attributes. Open the Text Editor for the Hours Value and set it to:

@Value(Open) - @Value(Close)

Delete the Open and Close attributes, since we no longer need them.

SchemaScannerA5.png

Be sure to create “Hours” above “Open” and “Close”; otherwise the deletion occurs first and Hours has no content. You can use the Up/Down arrows to move attributes.
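
The effect of row order can be sketched in Python by applying a list of actions from top to bottom. This is only an analogy for the behaviour described above, not the AttributeManager’s implementation; the function names, the action tuples, and the sample values are assumptions for the example.

```python
def apply_actions(feature, actions):
    """Apply (action, name, value_fn) rows in order, top to bottom."""
    result = dict(feature)
    for action, name, value_fn in actions:
        if action == "set":
            result[name] = value_fn(result)
        elif action == "remove":
            result.pop(name, None)
    return result

def hours(f):
    # Attributes that are already gone contribute an empty string
    return f"{f.get('Open', '')} - {f.get('Close', '')}"

# Made-up sample feature
feature = {"Name": "Library", "Open": "09:00", "Close": "17:00"}

create_first = [
    ("set", "Hours", hours),
    ("remove", "Open", None),
    ("remove", "Close", None),
]
remove_first = [
    ("remove", "Open", None),
    ("remove", "Close", None),
    ("set", "Hours", hours),
]

good = apply_actions(feature, create_first)
bad = apply_actions(feature, remove_first)
print(good)
print(bad)
```

With create_first, Hours is built while Open and Close still exist. With remove_first, both source values have already been removed by the time Hours is evaluated, so it contains only the separator.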

You can run the translation again at this point, but the output will still have the same schema. That’s because the schema is still coming from the reader, which shows why we need the SchemaScanner.

3. Add a SchemaScanner Transformer
Place a SchemaScanner transformer. Ensure it is after the AttributeManager and connect both the Output and <Schema> ports to the writer feature type.

Check the transformer parameters. Ensure that Output Schema Before Data Features is set to Yes.
For Ignore Attributes Containing, enter:

fme_|csv_|multi_
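
The vertical bars separate alternatives, so an attribute is ignored if its name contains any one of the three strings. As a rough illustration, the same pattern can be tried with Python’s re module. The attribute names in the list are examples only, and this sketch assumes a plain “contains” match as described earlier in the article.

```python
import re

# Same expression as entered in the transformer parameter
IGNORE = re.compile(r"fme_|csv_|multi_")

def keep_attribute(name):
    """Keep an attribute only if no part of its name matches the pattern."""
    return IGNORE.search(name) is None

# Example attribute names (illustrative, not taken from the dataset)
names = ["Name", "Hours", "fme_feature_type", "csv_line_number", "multi_reader_id"]
kept = [n for n in names if keep_attribute(n)]
print(kept)
```

Only Name and Hours survive the filter; the three format attributes are dropped.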


SchemaScannerA6.png

4. Modify Writer Parameters
Now open the parameters for the <dynamic> CSV writer feature type.

Click the ellipsis [...] button to the right of the Schema Sources parameter. In the dialog, uncheck the box for Cedar Cottage and check the box for Schema From Schema Feature:

SchemaScannerA7.png

Now set the Schema Definition Name parameter to the attribute created by the SchemaScanner: fme_feature_type_name.

5. Run Workspace
Your workspace should now look like this:

SchemaScannerA8.png

Run the workspace and inspect the output. It will look like this:

SchemaScannerA9.png

Note that the Open and Close attributes are not there, but there is an attribute called Hours.

6. Save Workspace
Save the workspace. We'll use it as the starting point for the next exercise.

 

Data Attribution

The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government Licence - Vancouver.
 
