Introduction
The Unexpected Input Remover carries out a very useful function, but is the source of much FME-related confusion.
Basically it checks to make sure the layers in your data are defined in the workspace. If they are not defined then the data on those layers (the unexpected input) is discarded (removed).
When the Unexpected Input Remover is Good
An important part of FME is being able to choose not to read certain source feature types; this can be done by removing them from the workspace. For example, I don't want to read the Schools layer from my data, so I delete the Schools feature type.
The Unexpected Input Remover is the function that discards such data. It checks the input data against the structure (or schema) defined in the Workbench or mapping file. If a feature in the data doesn't have a matching feature type, it is discarded.
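The check described above can be pictured as a simple membership test. The following is only a conceptual sketch in Python, not FME's actual implementation; the function name, the feature dictionaries, and the sample layer names other than Schools are hypothetical.

```python
from collections import Counter


def remove_unexpected_input(features, workspace_feature_types):
    """Split features into those whose feature type is defined in the
    workspace (passed) and those whose feature type is not (discarded)."""
    passed, discarded = [], []
    for feature in features:
        if feature["feature_type"] in workspace_feature_types:
            passed.append(feature)
        else:
            discarded.append(feature)
    return passed, discarded


# Hypothetical workspace: the Schools feature type was deleted on purpose.
workspace_feature_types = {"Roads", "Rivers"}

# Hypothetical source data that still contains Schools features.
features = [
    {"feature_type": "Roads", "id": 1},
    {"feature_type": "Schools", "id": 2},
    {"feature_type": "Rivers", "id": 3},
    {"feature_type": "Schools", "id": 4},
]

passed, discarded = remove_unexpected_input(features, workspace_feature_types)

# Count discarded features per feature type, similar in spirit to the log summary.
summary = Counter(f["feature_type"] for f in discarded)
print(f"tested={len(features)} passed={len(passed)} failed={len(discarded)}")
print(dict(summary))
```

The point of the sketch is that the decision depends only on the feature type name: nothing about the geometry or attributes of a Schools feature matters once its feature type is absent from the workspace.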
The results of the Unexpected Input Remover are noted in the log pane, in two separate places...
Here the log reports how many features were tested, and how many passed and failed:
And at the foot of the log FME highlights the fact that some features did fail:
And just in case you don't check the log, there is an (optional) dialog that confirms the Unexpected Input Remover actions:
When the Unexpected Input Remover is Not So Good
Simple, eh? The problem is that, in many cases, these source feature types are not intentionally missing.
The most common reason for this isn't that they have been accidentally deleted, but rather that they never existed at the time the workspace was created: the data has changed under you and new feature types have been added. For example, in the above screenshots, Schools were deliberately left out, but the dataset has changed recently and the feature type 'Buildings' has been added. Without a 'Buildings' feature type in the workspace, the Buildings data will not be read.
Another common reason is that, for folder-based datasets such as Esri shapefile or MapInfo MIF/MID, you have chosen a source dataset with a different name or structure than the dataset you originally wrote the workspace for.
To explain, some formats of data don't have the capability to store multiple Feature Types within a single file. Esri shapefile is one example of this, MapInfo MIF/MID is another. In such a format it isn't possible to distinguish between, say, roads and railways within a single file, therefore a separate file is created for each; for example roads.shp and railways.shp.
In this case FME takes the name of the file as the name of the Feature Type it is processing.
Here a user has a set of MapInfo mif/mid files to process:
When parcel_P29 is used to create a workspace, the feature type takes the name parcel_P29:
This makes it difficult to process multiple shapefiles with the same workspace because any workspace you create for a single shapefile (roads.shp, Feature Type=roads) will be incompatible with any differently named shapefile (railways.shp, Feature Type=railways).
It also means that files holding the same type of data won't be processed if they are named differently; for example, a workspace created using roads-south.shp will not be compatible with roads-north.shp even though the features in both have the same schema.
For example, in the above workspace, if the source is changed to parcel_P27, the data no longer matches the source feature type in the workspace (parcel_P29) and is discarded by the Unexpected Input Remover.
Of course, this issue arises most often when working in batch mode: the first dataset works because it is the one used to create the workspace, but all subsequent datasets fail the Unexpected Input Remover check and are discarded.
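The batch-mode problem can be illustrated with a short Python sketch. It assumes, as described above, that the feature type is simply the file name without its extension; the function names are hypothetical and this is not how FME itself is implemented.

```python
from pathlib import Path


def feature_type_from_file(path):
    """Assumed rule: the feature type is the file name without its extension."""
    return Path(path).stem


def batch_check(workspace_source, batch_files):
    """Report, per file, whether its feature type matches the single
    feature type defined when the workspace was created."""
    expected = feature_type_from_file(workspace_source)
    return {f: feature_type_from_file(f) == expected for f in batch_files}


# The workspace was created from roads-south.shp, then run over a batch of files.
result = batch_check(
    "roads-south.shp",
    ["roads-south.shp", "roads-north.shp", "railways.shp"],
)

for file_name, matches in result.items():
    print(file_name, "read" if matches else "discarded as unexpected input")
```

Only the file used to build the workspace matches; roads-north.shp is rejected purely because of its name, even though its schema is the same.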
FAQ
Q) Why does FME have the Unexpected Input Remover - why not just read all data in the source dataset?
A) Because not every user wishes to read all data from the dataset. By using Feature Types in this way FME permits users to selectively choose which features they wish to process.
Q) Is there a way to bypass the Unexpected Input Remover and read all the feature types?
A) Of course; there are two methods. The one to choose depends on your format type.
1) If there are missing Feature Types the logical solution is to add them to the workspace. Use Source Data > Import Feature Type definitions from the menubar. You are prompted to select a source dataset – when you do so FME makes a list of all of the Feature Types in that dataset and adds them to the workspace, in much the same way as when the workspace was originally created. You will need to make sure you select all the datasets that have different Feature Types.
This solution is better suited to file-based datasets which have the ability to assign Feature Type; for example DGN, DXF, geodatabase, Smallworld.
Note that the Feature Type definitions can be imported from any dataset, not necessarily the one you want to read (e.g. you could store definitions for your DGN feature types inside an Oracle database schema, or add destination definitions by importing them from an existing source).
2) If missing Feature Types are being thrown away, another logical solution is to tell FME to keep them regardless. Open up the properties dialog box for a single source Feature Type. You will see a ‘Merge Feature Type’ option – click it to turn on the option. You then need to enter a value for the ‘Merge Filter’. The merge filter is a standard expression that defines which Feature Types are allowed to pass. Set it to "*" (asterisk) and ALL feature types will be permitted to pass. Set it to "st*" and only Feature Types beginning with st will be permitted to pass.
This solution is better suited to folder-based datasets which don't have the ability to assign Feature Type; for example shapefile, MIF/MID, CSV.
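To get a feel for how the two Merge Filter values mentioned above behave, here is a minimal Python sketch. It assumes the filter acts like a glob-style wildcard pattern and that matching is case-sensitive; both are assumptions for illustration, and the function name and sample feature type names are hypothetical.

```python
from fnmatch import fnmatchcase


def passes_merge_filter(feature_type, merge_filter):
    """Assumed behaviour: glob-style, case-sensitive match of the
    feature type name against the merge filter."""
    return fnmatchcase(feature_type, merge_filter)


# Hypothetical feature type names arriving from a folder of files.
names = ["streets", "streams", "roads", "parcel_P27"]

# "*" lets every feature type through.
all_pass = [n for n in names if passes_merge_filter(n, "*")]

# "st*" lets through only feature types beginning with "st".
st_only = [n for n in names if passes_merge_filter(n, "st*")]

print(all_pass)
print(st_only)
```

Note that every name that passes ends up on the same merged feature type, which is the loss of individual control discussed in this answer.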
The advantage of the second method is that you don’t need to worry about whether new datasets contain Feature Types you haven’t encountered before; they will be passed regardless. The first method always requires you to pre-define your Feature Types. However, the disadvantage of the second method is that, since all features pass through the same input channel, you lose the individual control over them that the first method allows for.
Q) How is this different to the "Feature Types to Read" parameter?
A) The Feature Types to Read parameter lets you define which feature types (layers) should be read from the source data. However, the list of layers is those defined in the workspace, not those in whatever dataset you are reading.
In other words, Feature Types to Read is a great way to turn layers on/off, but they must already be defined in the workspace. The main advantage is that it saves having to disable/delete unneeded feature types.
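The distinction can be sketched in Python as a set intersection. This is a conceptual model only, with hypothetical function and layer names: it assumes that a name requested in Feature Types to Read has no effect unless that feature type is already defined in the workspace.

```python
def feature_types_actually_read(workspace_types, feature_types_to_read):
    """Return (read, ignored): requested types that are defined in the
    workspace, and requested types that are not and so cannot be read."""
    selectable = set(workspace_types)
    requested = set(feature_types_to_read)
    return requested & selectable, requested - selectable


# Hypothetical workspace with three feature types defined.
workspace_types = {"Roads", "Rivers", "Parks"}

# Roads is switched on; Buildings exists only in the newer dataset.
read, ignored = feature_types_actually_read(workspace_types, ["Roads", "Buildings"])

print(sorted(read))
print(sorted(ignored))
```

In this model, Rivers and Parks are switched off without being deleted, while Buildings stays out of reach until its feature type is added to the workspace or a Merge Filter lets it through.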
Q) Why did the Unexpected Input Remover refuse to read my MicroStation DGN data?
A) The Unexpected Input Remover will filter out any features that do not match the input Feature Type attributes such as name or level.
In a MicroStation file this effectively filters out features on a different level to that which is expected.
BEWARE - The main catch here is when you create a workspace for one file and try to run a different file through it. If the second file has different levels its data is filtered out!
Complex Chains are another case to watch - in a V7 file the default level is 1, in a V8 file the default level is 'default' or 64.
This means the same workspace will not necessarily work for both versions of data! If you are missing features check their level name and number against your workspace.