Files
-
- 70 KB
- Download
Introduction
Sometimes in a dynamic workflow, the desired schema only exists within the workspace (e.g., after pivoting data). Building a schema feature works, but often the schema needs to be carefully constructed with attribute transformers, like in the last example. There must be a better way!
In this example, a custom transformer called the SchemaScanner will be used to build a schema feature, which will then be manipulated using other transformers. This is a common pattern and should be used whenever you need to build a schema and edit its contents.
The Pattern
SchemaScanner > ListExploder > (Manipulate the Data) > ListBuilder
Use the SchemaScanner to create a list{} feature in FME using the current feature’s schema. Next, use the ListExploder to split the list into its individual features, where each feature contains an attribute name from the list. The third step is to manipulate the schema as needed (i.e., sort). Finally, bring everything back together into a new version of the schema feature using the ListBuilder.
Here are a few examples of why you would want to do this:
- Renaming Attribute Names
- Sorting (e.g., Dates)
- Removing or Adding New Attributes
The Workspace
In the attached workspace, we are creating a pivot table using a dataset of sales figures. The current data looks like:
| Quarter | Sales | Region |
| 2016Q1 | 1936 | New England |
| 2016Q2 | 167 | New England |
| 2016Q1 | 1235 | Mid-Atlantic |
| 2016Q2 | 169 | Mid-Atlantic |
The output we want is the following:
| Region | 2016Q1 | 2016Q2 |
| New England | 1936 | 167 |
| Mid-Atlantic | 1235 | 169 |
This is a simple pivot around the Region field. Note that we are not doing any grand totals of the data. How can we do this with FME?
Step-by-Step Instructions
1. Pivot the Data
Use an AttributeCreator to build attribute names out of the values of ‘Quarter’. The values of the new attribute will contain the Sales data.
-
New Attribute:
@Value(Quarter) - Attribute Value: Sales
Click OK to accept the parameters.
Next, use the Aggregator to summarize each Region. Set the following parameters:
-
Group Processing: Enabled
- Group By: Region
Click OK to accept the parameters.
After this step, you have pivoted your data, and the actual data looks like:
| 2016Q1 | 2016Q2 | Region |
| 1936 | 167 | New England |
| 1235 | 169 | Mid-Atlantic |
You'll see something like this in the Record Information window:
To see the data in Table View you'd have to add an AttributeExposer to add the new attributes to the workbench schema
You should notice that this data is structured in a completely new way, as it was created directly inside the workspace. This is important! Dynamic workspaces always need to be able to reference a schema definition in order to work, and in this case, we do not have one defined just yet (Step 3 below will address this).
2. Branching the Data
We only want to manipulate one schema feature (instead of doing this for every feature). We need to split the data into two streams. The first stream should connect to a brand new FeatureMerger in the Requestor Port, while the second stream should connect to a Sampler. Use the sampler to pass one single feature into the next step. See below:
3. Using the Pattern
At this stage, we need to build the schema feature and then manipulate it. We will start with the SchemaScanner transformer. Set the following parameters:
- Ignore Attributes Containing: multi|xlsx|fme
Click OK to accept the parameters.
If you were to inspect after using the SchemaScanner, you would see that the attribute list is now stored on the feature.
At this stage, the Schema object has been created, but isn’t organized in the sort order that you need. Let’s fix that by using a ListExploder. Set the following parameters:
- List Attribute: attribute{}
- Accumulation Mode: Only Use List Attributes
The Schema has been divided into individual attributes and stored inside an FME Feature. It is time to manipulate each attribute. In the attached example, we used a Sorter transformer, but since we are talking about a pattern, this is where your particular needs are addressed. Modify the name, fme_data_type, and list order as you see fit.
The final step is to bring the data back together into a list. Use the ListBuilder to accomplish this. Set the following parameters:
- List Name: attribute
You now have a completed Schema Feature that has been designed and edited. Fantastic!
4. Merge the Data and Write Dynamically
Connect the output from step 3 above to the Supplier port on the FeatureMerger. Do a 1:1 match to ensure the Schema is present on each feature. This ensures that the writer will receive the schema on the first feature it encounters. To do this, set the following parameters:
- Requestor: 1
- Supplier: 1
- Comparison Mode: Automatic
Click OK to accept the parameters.
Finally, ensure the writer is set up in dynamic mode and the source is your schema feature. Set the following parameters:
- Dynamic Schema Definition: enabled
When you run the workspace, you will get your desired output with your new schema defined.
Final Workspace:
Notes
Avoiding Dropped Features - FeatureReader:
If using the FeatureReader to create a schema feature, please note that the transformer adds a special attribute and value: fme_schema_handling = ‘schema_only’
This attribute tells the writer to use that feature as a schema and then remove it from the output. This can be an issue if you merge data and schema in the workspace, as the data merged with this schema feature would be dropped when written. For example, if you merge a schema feature, it gets added to all the features in the workspace.
To ensure this doesn’t occur, remove the fme_schema_handling attribute entirely so that the first merged data feature is not dropped by the writer.