Extracting a Schema Subset for Dynamic Schemas

Liz Sanderson

FME Version

  • FME 2022.1

Introduction

With FME, you can extract the schema of a dataset and then reuse a subset of that schema in other workflows. This tutorial walks you through creating a custom format and then uses Python to extract a schema list. It is only an example; the PythonCaller can be replaced with list transformers. See Tutorial: Dynamic Workflows and Tutorial: Getting Started with List Attributes for additional ideas and information.
 

Step-by-step Instructions

Part 1: Create a New Schema Subset

1. Create Custom Format Framework
Open a new workspace in FME Workbench, then click on the Add Reader button. In the Add Reader dialog, click on the drop-down for Format and select More Formats. 
MoreFomats.png

In the Reader Gallery dialog, click New under Custom Formats. This opens the Custom Format Wizard.
NewFormat.png

In the Custom Format Wizard, click Next to progress to the Select Format section. Set the Format to Schema (Any Format).
 Wizard1.png


Next, specify a source dataset: select the Parks.tab file, which can be downloaded from the Files section of this article. Open the Parameters and expand Schema Attributes. Click on the ellipsis next to Additional Attributes to Expose and select the following:

  • attribute{}.fme_data_type
  • attribute{}.name
  • attribute{}.native_data_type
  • fme_basename
  • fme_format_long_name
  • fme_format_short_name


Then click OK twice, and then click Next to proceed through the wizard. 
SchemaAttributes.png
Wizard2.png
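To make the exposed list attributes more concrete, here is a minimal sketch of how a schema feature might look if its attributes were flattened into a plain Python dictionary. Only the naming pattern (attribute{N}.name and so on) and the field names come from this tutorial; the data type strings are placeholders, not values read from Parks.tab, and list_attribute_names is a hypothetical helper.

```python
# Hypothetical snapshot of a schema feature, flattened to a dict.
# The type strings are placeholders, not real values from Parks.tab.
schema_feature = {
    "fme_basename": "Parks",
    "attribute{0}.name": "ParkId",
    "attribute{0}.fme_data_type": "<fme type>",
    "attribute{0}.native_data_type": "<native type>",
    "attribute{1}.name": "ParkName",
    "attribute{1}.fme_data_type": "<fme type>",
    "attribute{1}.native_data_type": "<native type>",
}


def list_attribute_names(feature):
    """Collect the attribute{N}.name values in index order."""
    names = []
    index = 0
    # List elements are numbered from 0 with no gaps, so stop at the
    # first index that is missing.
    while f"attribute{{{index}}}.name" in feature:
        names.append(feature[f"attribute{{{index}}}.name"])
        index += 1
    return names


print(list_attribute_names(schema_feature))
```

Each element of the attribute{} list carries three parallel values (name, FME data type, native data type), which is why all three are exposed together.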


The next section of the Custom Format Wizard is for additional parameters to expose. Since we want to create a dynamic schema reader, we need to expose the Input Format parameter. Select Input Format, then click Next.
Wizard4.png

Now enter the Short Name of the custom format. This is the name that will appear in the Navigator pane. For our example, type in DYNAMICSUBSET. The Description is the long name that appears in the Formats section of a Reader/Writer. For that, type in Dynamic Subset Schema.
Wizard3.png

Finally, click Finish. A new instance of FME Workbench will open with an editable version of the newly created custom format. There will be a Schema reader as well as a Custom Format Output writer in the Navigator pane. 
Workspace.png



2. Inspect Schema
We only want to use a subset of the schema attributes, so let’s inspect the schema to figure out which ones we want to keep. Click on the Schema reader feature type to open the popup menu, then click View Source Data to view the data in Visual Preview.
ViewSource.png

In Visual Preview, take note of the attributes (or fields) that you want to use as the schema. For this example, we’ll use ParkId, ParkName, and NeighborhoodName. 
VP.png

3. Create Published Parameters
Now that we know which schema attributes we would like to keep as a subset schema, let’s create a published parameter to capture those. Right-click on User Parameters in the Navigator pane and select Manage User Parameters. 
ManageUser.png

In the User Parameters dialog, click on the green plus sign (+) and select Text as the parameter type. 
SelectText.png

On the right-hand side of the dialog fill in the following parameter properties:

  • Parameter Identifier: to_keep
  • Prompt: Fields to keep (comma separated)
  • Published: Enabled
  • Required: Enabled
  • Disable Attribute Assignment: Disabled
  • Editor Syntax: Plain Text (Uniline)
  • Trim Whitespace: Enabled
  • Default Value: ParkId, ParkName, NeighborhoodName


Click OK to close the User Parameter Manager. 
Parameter.png
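The default value above puts a space after each comma, so splitting the string on commas alone would leave leading spaces on the second and third names (for example ' ParkName'). A minimal sketch of parsing the value defensively is shown here; parse_keep_list is a hypothetical helper name, not part of FME.

```python
def parse_keep_list(raw_value):
    """Split a comma-separated string into clean field names."""
    if not raw_value:
        # Covers None and the empty string.
        return []
    # Strip whitespace around each name and drop empty entries
    # produced by stray commas.
    return [name.strip() for name in raw_value.split(",") if name.strip()]


fields = parse_keep_list("ParkId, ParkName, NeighborhoodName")
print(fields)
```

With the names stripped this way, an exact comparison against the schema's attribute names works whether or not the user types spaces after the commas.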

One final parameter to create is for the input dataset. Expand the Parks [SCHEMA] reader in the Navigator pane. Right-click on the Source Dataset, then select Create User Parameter. 
SourceParm.png

In the Add/Edit User Parameter dialog, click OK to accept the defaults. 
AddEdit.png


4. Fetch Parameter
Add a ParameterFetcher to the canvas and connect it between the Schema reader and writer feature types. 
ParamFetchConect.png

In the parameters, select the $(to_keep) parameter, then set the Target Attribute to _to_keep. 
ParameterFetcher.png

5. Python to Select Attributes 
Next, we’ll use a PythonCaller to extract the attributes selected in the to_keep User Parameter. Add a PythonCaller to the canvas and connect it between the ParameterFetcher and the Schema writer feature type. In the parameters, copy and paste in the following Python script:

import fme
import fmeobjects


class FeatureProcessor(object):
    def __init__(self):
        pass

    def input(self, feature):
        att_name_list = feature.getAttribute('attribute{}.name')
        att_ntype_list = feature.getAttribute('attribute{}.native_data_type')
        att_ftype_list = feature.getAttribute('attribute{}.fme_data_type')

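        # Split the comma-separated parameter value and strip the spaces
        # around each name so 'ParkId, ParkName' matches exactly.
        keep_values = feature.getAttribute('_to_keep') or ''
        keep_list = [name.strip() for name in keep_values.split(',')]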

        if att_name_list is not None:
            feature.removeAttribute('attribute{}.name')
            feature.removeAttribute('attribute{}.native_data_type')
            feature.removeAttribute('attribute{}.fme_data_type')
        
            count = 0
            for i in range(len(att_name_list)):
                if (att_name_list[i] in keep_list):
                    feature.setAttribute(('attribute{'+str(count)+'}.name'),att_name_list[i])
                    feature.setAttribute(('attribute{'+str(count)+'}.native_data_type'),att_ntype_list[i])
                    feature.setAttribute(('attribute{'+str(count)+'}.fme_data_type'),att_ftype_list[i])
                    count = count + 1

                if (att_name_list[i] == 'fme_geometry{0}'):
                    feature.setAttribute('fme_geometry{0}',att_ftype_list[i])
        self.pyoutput(feature)

    def close(self):
        pass

    def process_group(self):
        pass

    def has_support_for(self, support_type):
        if support_type == fmeobjects.FME_SUPPORT_FEATURE_TABLE_SHIM:
            return False
 
        return False


Next, set the Attributes to Hide to _to_keep, and click OK. 
PythonCaller.png
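The filtering logic in the script can be tried outside FME. The following is a minimal sketch that mirrors the loop using plain Python lists and a dictionary instead of an FME feature; filter_schema, the OtherField name, and the type strings are hypothetical and exist only for illustration.

```python
def filter_schema(names, native_types, fme_types, keep):
    """Return re-indexed list attributes for the names found in keep."""
    keep_set = {name.strip() for name in keep}
    result = {}
    count = 0  # index of the next element in the output list
    for name, native, fme_type in zip(names, native_types, fme_types):
        if name in keep_set:
            # Re-number the kept elements from 0 so the output list
            # has no gaps.
            result[f"attribute{{{count}}}.name"] = name
            result[f"attribute{{{count}}}.native_data_type"] = native
            result[f"attribute{{{count}}}.fme_data_type"] = fme_type
            count += 1
    return result


# Placeholder inputs: three attributes, of which two are kept.
subset = filter_schema(
    ["ParkId", "OtherField", "ParkName"],
    ["n0", "n1", "n2"],
    ["f0", "f1", "f2"],
    ["ParkId", "ParkName"],
)
print(subset)
```

The key point is the separate counter: kept attributes are written at consecutive indexes rather than at their original positions, so the resulting list has no holes where attributes were dropped.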

6. Inspect Python Output
Run the workspace with Feature Caching enabled to the PythonCaller, or connect an Inspector to the PythonCaller. In Visual Preview, click on the feature in the Table View and then open the Feature Information window. You should only see the attributes that were listed in the to_keep parameter.
PythonOutput.png

7. Save the Custom Format
Because we used the Custom Format Wizard, the workspace (which is the custom format) already points to the correct location for FME to find it: \Documents\FME\Formats. Save the workspace, then close FME.
FileExplore.png


Part 2: Use the Custom Format

1. Make DYNAMICSUBSET Accessible For FME
Before we can use the new custom format DYNAMICSUBSET in FME, we need to close all open instances of FME, including FME Data Inspector. 

If you are adding a custom format that was not created on your computer, move its .fds file to \Documents\FME\Formats.
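If you need to place a custom format on several machines, the file step can be scripted. This is a minimal sketch using only the Python standard library; install_custom_format is a hypothetical helper, it copies the file rather than moving it so the original is left in place, and the destination folder is passed in because the exact location of the Documents folder depends on the user account.

```python
import shutil
from pathlib import Path


def install_custom_format(fds_file, formats_dir):
    """Copy a custom format (.fds) file into the given formats folder."""
    fds_file = Path(fds_file)
    formats_dir = Path(formats_dir)
    if fds_file.suffix.lower() != ".fds":
        raise ValueError(f"Expected a .fds file, got: {fds_file.name}")
    # Create the folder if it does not exist yet.
    formats_dir.mkdir(parents=True, exist_ok=True)
    # Copy rather than move, so the source file is untouched.
    return Path(shutil.copy2(fds_file, formats_dir / fds_file.name))


# Example call (file name and folder are assumptions for illustration):
# install_custom_format("dynamicsubset.fds",
#                       Path.home() / "Documents" / "FME" / "Formats")
```

As noted above, FME should be closed before the file is added.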


2. Use DYNAMICSUBSET
Open FME Workbench and start a new workspace. Click on the Add Reader button, then for Format use the Description that was entered when creating the custom format. For this example, it is Dynamic Subset Schema. Then browse to a dataset.
NewSchemaReader.png

If you inspect the DYNAMICSUBSET schema reader, only the attributes selected will be available. You can now use this schema in your workflow. 
 FinalVP.png
