Reading AutoCAD Map 3D Object Data

Files

StateCountyBoundaries.dwg (3 MB)
autocad_od2shapefile.fmw (100 KB)
autocad_od2spatialite.fmw (200 KB)

Introduction

This article explains how to read AutoCAD Map 3D Object Data so that it can be translated within FME.

AutoCAD Map 3D Object Data Reader

There is no single standard way to set up a translation; it all depends on the schema of the source data and the structure that is required in the output datasets.
Object Data allows attributes to be stored in data tables, but is particularly liberal in its approach to data structures, and any particular spatial feature (or entity) within a dataset can possess any number of attribute records in any number of data tables (or none at all).
Given this, the approach that we take in FME is to provide a number of different reading modes so that the user has a wider choice of workspace layouts to view the data schema in Workbench.

Description of the Sample Dataset

A number of example workspaces follow. The sample dataset for these workspaces is a map of the state of Texas. The map includes state and county boundaries plus a number of features representing the state road network.

An illustration of the sample dataset. The county boundaries are red and roads are blue. The data's coordinate system is Lat/Long NAD 83.
Boundary features are stored on a layer called Boundaries and have links to data tables called County and State.
Road features are stored on a layer called Roads. The road network is spatial data only, with no attributes or object data tables.
The important thing about the source data is that all features are line geometries, and that each boundary between two counties (or one county and the state outline) is a single feature with two data records - one for each county (or one for the county and one for the state).

A close-up view. This particularly twisty section is a single line with two records in the county data table; one for San Saba County and one for Lampasas County.

Step-By-Step Instructions

There are three different Reader Modes of Operation. They are "Group by Entity", "Raw Relational", and "Group by Object Data".
The choice of mode is made in the parameters dialog when a source dataset is added to a workspace.

Generating a Workspace. Click the Parameters button to get the Object Data parameters dialog.

In the Object Data parameters dialog, notice the choice of the three different modes at the top of the dialog.

Because these parameters have an immediate effect on the layout of a new workspace, it isn't possible to change the reading mode after the workspace has been created, or to change the layout of a reader once it has been added to an existing workspace. To change the parameters, the reader must be deleted and re-added.

1. "Group By Entity" Mode

The Group By Entity mode sets up a workspace with one feature type for each layer of data within the dataset.
The data table attributes relating to each feature are attached to the feature, which makes this a very useful mode for writing data to a GIS-related format such as MIF/MID or shapefile. It isn't a good mode for preserving the original schema structure, so it is NOT the way to go if you want to write back to the same format. However, because of the great flexibility of the Object Data format, some considerations need to be made.

A: Where an entity (feature) possesses more than one record - particularly more than one record in the same table - a list structure is created.
For example, given this setup of a single feature with two records in the same table:

Feature #   Data Table   Fields
1           TableA       FieldA1, FieldA2, FieldA3
1           TableA       FieldA1, FieldA2, FieldA3

...the output feature will include a list attribute on its schema:

TableAData{0}.FieldA1
TableAData{0}.FieldA2
TableAData{0}.FieldA3
TableAData{1}.FieldA1
TableAData{1}.FieldA2
TableAData{1}.FieldA3
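
As a rough illustration of how such a list attribute comes about, here is a minimal Python sketch. It is not the reader's implementation: the group_by_entity function, the tuple layout of the records, and the field values are all assumptions made for the example. Only the attribute naming pattern is taken from the listing above.

```python
from collections import defaultdict


def group_by_entity(records):
    """Flatten Object Data records onto their entities as list attributes.

    `records` is a list of (entity_id, table_name, fields) tuples. This is a
    hypothetical in-memory stand-in, not the reader's real data model.
    """
    next_index = defaultdict(int)   # next list index per (entity, table)
    features = defaultdict(dict)    # entity_id -> flattened attributes
    for entity_id, table, fields in records:
        index = next_index[(entity_id, table)]
        next_index[(entity_id, table)] += 1
        for name, value in fields.items():
            # Builds names such as TableAData{0}.FieldA1
            features[entity_id][f"{table}Data{{{index}}}.{name}"] = value
    return dict(features)


# One feature with two records in the same table (values are made up).
records = [
    (1, "TableA", {"FieldA1": "a", "FieldA2": "b", "FieldA3": "c"}),
    (1, "TableA", {"FieldA1": "d", "FieldA2": "e", "FieldA3": "f"}),
]
flattened = group_by_entity(records)
for name in sorted(flattened[1]):
    print(name, "=", flattened[1][name])
```

Each additional record in the same table simply takes the next list index, so the number of list elements on a feature equals the number of records it holds in that table.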

B: Because it is possible for different features to possess records in different data tables, the list of attributes for source feature types will be verbose and include all possible attributes.

For example, given this setup:
Layer    Feature #   Data Table   Fields
LayerA   1           TableA       FieldA1, FieldA2, FieldA3
LayerA   2           TableB       FieldB1, FieldB2, FieldB3

...the source feature type for LayerA will include the attributes:

FieldA1
FieldA2
FieldA3
FieldB1
FieldB2
FieldB3

i.e. all features exiting this feature type will have ALL possible attributes attached, whether or not they contain a value.
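
The effect can be pictured with a small Python sketch. It assumes features are plain dictionaries and that an attribute without a value can be modelled as None; the layer_schema and conform helpers are hypothetical names, not part of FME.

```python
def layer_schema(features):
    """Return the union of attribute names on a layer, in first-seen order."""
    schema = []
    for attributes in features:
        for name in attributes:
            if name not in schema:
                schema.append(name)
    return schema


def conform(attributes, schema):
    """Give a feature every attribute in the schema; missing ones become None."""
    return {name: attributes.get(name) for name in schema}


# Two features on LayerA with records in different tables (values are made up).
layer_a = [
    {"FieldA1": 1, "FieldA2": 2, "FieldA3": 3},
    {"FieldB1": 4, "FieldB2": 5, "FieldB3": 6},
]
schema = layer_schema(layer_a)
conformed = [conform(feature, schema) for feature in layer_a]
print(schema)
print(conformed[0])
```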

C: Because it is possible for different data tables to possess the same field names, there is an option to prepend all attributes with the table name to distinguish them.

For example, given this setup:
Layer    Feature #   Data Table   Fields
LayerA   1           TableA       Field1, Field2, Field3
LayerA   2           TableB       Field1, Field2, Field3

...the prepend option will ensure non-conflicting field naming by giving:

TableA_Field1
TableA_Field2
TableA_Field3
TableB_Field1
TableB_Field2
TableB_Field3
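
To see why the prefix matters, the following Python sketch compares the attribute names with and without it. The attribute_names helper and its arguments are hypothetical; the only thing taken from the example above is the TableName_FieldName pattern.

```python
def attribute_names(tables, prepend_table_name):
    """List the distinct attribute names a layer schema would need.

    `tables` maps a table name to its field names (a hypothetical structure).
    """
    names = []
    for table, fields in tables.items():
        for field in fields:
            name = f"{table}_{field}" if prepend_table_name else field
            if name not in names:
                names.append(name)
    return names


tables = {
    "TableA": ["Field1", "Field2", "Field3"],
    "TableB": ["Field1", "Field2", "Field3"],
}
print(attribute_names(tables, prepend_table_name=False))  # names collide
print(attribute_names(tables, prepend_table_name=True))   # names stay distinct
```

Without the prefix, the six fields collapse to three names, so it is no longer possible to tell which table a value came from.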

In our example dataset, there are no naming clashes between the tables. However, the schema will have both state and county fields (even though not all features have state records), and features that form the boundary between two counties will have a list attribute to store the names of the two counties.

As you might expect, road features have a feature type, but no attribute data since they do not have a related data table.

The data in the Feature Information window shows the list structure and contents of that list for a specific feature.

Because shapefile datasets won't accept data that is in a list structure, our final workspace is adjusted to turn the list into a comma-delimited attribute (using a ListConcatenator transformer). Roads bypass this step since they do not contain a list.
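
Conceptually, the list-to-string step works like the Python sketch below. It is only an approximation of what the transformer does in the workspace: the concatenate_list function, the list name CountyData, and the field name Name are assumptions for illustration.

```python
def concatenate_list(attributes, list_name, field, separator=","):
    """Join the values of one list field into a single delimited string."""
    values = []
    index = 0
    # Walk list elements {0}, {1}, ... until the next index is missing.
    while f"{list_name}{{{index}}}.{field}" in attributes:
        values.append(str(attributes[f"{list_name}{{{index}}}.{field}"]))
        index += 1
    return separator.join(values)


# A boundary feature with two county records (attribute names are assumed).
boundary = {
    "CountyData{0}.Name": "San Saba",
    "CountyData{1}.Name": "Lampasas",
}
road = {}  # roads carry no list, so there is nothing to join

print(concatenate_list(boundary, "CountyData", "Name"))
print(repr(concatenate_list(road, "CountyData", "Name")))
```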

2. "Raw Relational" Mode

The "Raw Relational" mode essentially spits out the spatial entities and database records as separate features, each of which has an attribute link to mark them as related.
One feature type is provided for each layer of data, and another feature type for each object data table.

In effect, this is the raw data for the user to make use of in whichever way suits them best. This mode is particularly useful for writing to a database format where spatial data and attribute data are kept in separate tables. That is why we call it "relational" mode: like a relational database, it is a table representation related by a primary key. It is also useful for writing data to certain CAD formats where attributes are traditionally held separately from the spatial features.

In this mode, we don't have to worry about any of the considerations of the "Group By Entity" mode, because multiple records assigned to a single feature simply result in multiple records through the database table feature type. Similarly, each table gets a separate feature type so there is no problem with an over-verbose schema, and there is no clash of fields even where different tables have the same field names. However, these problems would, of course, re-emerge if the user subsequently tried to merge the attribute records back onto the features within the workspace.

In this example workspace we are choosing to write the data to a SpatiaLite database.

In "Raw Relational" mode the initial workspace looks like this. Notice how there is a feature type for each layer (Boundaries, Roads) and one for each table (CountyData, StateData). Notice also the format attribute autocad_od_entity_key which is the attribute that acts as a lookup key between the spatial data and attribute tables. NB: Layer0 is a default layer that is present in all AutoCAD databases.

Since we are using a SpatiaLite database, you will be able to view the output as well as inspect the visual preview. This will show a feature that has no user attributes, but has a list of tables (autocad_map_odtable{}) in which this entity has records, plus an entity key to be able to match the two.

The table records are read as FME non-geometry features. Here's a StateData table record that relates to the spatial feature.
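
The relationship between the two kinds of feature can be sketched in Python as a simple key lookup. The autocad_od_entity_key attribute name comes from the workspace described above; the match_records function, the key values E1 and E2, and the Name field are hypothetical.

```python
from collections import defaultdict

KEY = "autocad_od_entity_key"  # format attribute linking entities and records


def match_records(entities, records):
    """Pair each spatial entity with the table records sharing its key."""
    by_key = defaultdict(list)
    for record in records:
        by_key[record[KEY]].append(record)
    return [(entity, by_key.get(entity[KEY], [])) for entity in entities]


# Spatial features and non-geometry table records (values are made up).
entities = [
    {KEY: "E1", "layer": "Boundaries"},
    {KEY: "E2", "layer": "Roads"},
]
records = [
    {KEY: "E1", "table": "CountyData", "Name": "San Saba"},
    {KEY: "E1", "table": "CountyData", "Name": "Lampasas"},
]
matched = match_records(entities, records)
for entity, related in matched:
    print(entity["layer"], "->", [r["Name"] for r in related])
```

An entity with no records, such as a road, simply matches an empty list, while a shared county boundary matches two records.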

This is also the mode you would want to use to read Object Data and write it back to a dataset of the same format. The table (non-geometry) records would get written back as a table, and the entity (geometry) features would get written back as a layer. Because the format attribute (autocad_od_entity_key) is the same one that the Object Data writer looks for, the links would be made automatically. One example of such a task would be in reprojecting the data from one coordinate system to another without wanting to change format, or a user who wants to join or split tables.

3. "Group By Object Data" Mode

The "Group By Object Data" mode is almost the opposite of the "Entity" mode, in that instead of getting one feature type per layer, you get one feature type per data table, and the data coming into the workspace is one feature for every record in each table.

Because of this, each AutoCAD entity that is attached to multiple records will be present multiple times in the data.

For example, given this setup, where one feature is linked to two records:
Feature #   Data Table   Fields
1           TableA       FieldA1, FieldA2, FieldA3
1           TableB       FieldB1, FieldB2, FieldB3

...the reader will output two features:
Feature1 (Attributes = FieldA1, FieldA2, FieldA3)
Feature1 (Attributes = FieldB1, FieldB2, FieldB3)
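
A minimal Python sketch of this behaviour follows. The group_by_object_data function and the data structures are assumptions for illustration, not the reader's code; the point is only that the same geometry is emitted once per record.

```python
from collections import defaultdict


def group_by_object_data(geometries, records):
    """Emit one feature per record, grouped by data table.

    `geometries` maps entity_id -> geometry; `records` is a list of
    (entity_id, table_name, fields) tuples. Both are hypothetical structures.
    """
    feature_types = defaultdict(list)
    for entity_id, table, fields in records:
        feature = {"entity_id": entity_id, "geometry": geometries[entity_id]}
        feature.update(fields)
        feature_types[table].append(feature)
    return dict(feature_types)


geometries = {1: "LINESTRING (0 0, 1 1)"}
records = [
    (1, "TableA", {"FieldA1": "a"}),
    (1, "TableB", {"FieldB1": "b"}),
]
output = group_by_object_data(geometries, records)
for table, features in output.items():
    print(table, features)
```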

As you can see this is going to be inefficient in FME, because you can end up processing many times the number of actual entities in the source data. On the other hand, this is a useful mode to use when (like in our sample dataset) a single entity is representing more than one geographical feature (in our case both county and state boundary).
However, the data table feature types are only half of the picture. Entities that don't have associated records (i.e. no object data) need to be output also, so a new workspace also has a feature type for each layer that contains non-object-data entities. Furthermore, because it could be confusing to a user to have some entities on a layer read (the ones without object data) and some ignored (the ones with), all features on that layer will be output, whether or not they have already been output through an object data feature type.

This can obviously cause even more duplicate features and inefficiency, but as our developer tells me, this choice was made for clarity not efficiency, and to ensure backwards compatibility with the previous AUTODESK_MAP object data reader.
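
To get a feel for the amount of duplication, the Python sketch below counts how many features each feature type would receive. It is a back-of-the-envelope model under the behaviour described above; the feature_counts helper and the tiny dataset are made up.

```python
from collections import Counter


def feature_counts(entity_layers, records):
    """Count the features read per feature type in this mode.

    `entity_layers` maps entity_id -> layer name; `records` is a list of
    (entity_id, table_name) pairs. Both are hypothetical structures.
    """
    counts = Counter()
    for _entity_id, table in records:
        counts[table] += 1          # one feature per table record
    for layer in entity_layers.values():
        counts[layer] += 1          # every entity is also output on its layer
    return dict(counts)


entity_layers = {1: "Boundaries", 2: "Boundaries", 3: "Roads"}
records = [
    (1, "CountyData"), (1, "CountyData"),   # county/county boundary
    (2, "CountyData"), (2, "StateData"),    # county/state boundary
]
counts = feature_counts(entity_layers, records)
print(counts)
print("entities:", len(entity_layers), "features read:", sum(counts.values()))
```

In this made-up case three source entities produce seven features, which is the overhead the text above refers to.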

Our sample data was added to a workspace in "Group by Object Data" mode. Note the feature type for each table (CountyData, StateData), which includes attributes, and the feature types for each layer (Roads, Boundaries). The layer feature types don't include attributes - even though some of the entities might have related object data - to differentiate them from the object data table features.

Since in this example we do not wish to get the boundary data twice over, we can simply disable that feature type.

The output shows two features at this location: a state boundary and a county boundary. But that's OK, because they are divided in the workspace and are being written to two different feature types.

Add an AreaBuilder to your workspace. This provides separate area features for each feature type, which is only possible because the boundary features are duplicated.

This reading mode could also be used if you want to read Object Data and write it back to a dataset of the same format (for example to carry out a coordinate system reprojection). Duplicated features are not a problem because, when writing Object Data, spatial features with duplicate entity keys are discarded. The outcome would be one entity, but multiple records - meeting the rule that each row only matches one entity and giving you just what you started with! We still recommend "Raw Relational" as the mode to use for this scenario though.
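
The write-back behaviour described above can be modelled in Python as follows. This is a conceptual sketch, not the writer's implementation: the split_for_writing function, the key value, and the Name field are hypothetical, and only the autocad_od_entity_key attribute name comes from the article.

```python
KEY = "autocad_od_entity_key"


def split_for_writing(features):
    """Keep one entity per key while keeping every attribute record."""
    seen = set()
    entities, records = [], []
    for feature in features:
        key = feature[KEY]
        if key not in seen:
            # First time this key appears: keep the spatial feature.
            seen.add(key)
            entities.append({KEY: key, "geometry": feature["geometry"]})
        # Every feature still contributes its attribute record.
        records.append({k: v for k, v in feature.items() if k != "geometry"})
    return entities, records


# The same entity read twice, once per record (values are made up).
features = [
    {KEY: "E1", "geometry": "LINESTRING (0 0, 1 1)", "Name": "San Saba"},
    {KEY: "E1", "geometry": "LINESTRING (0 0, 1 1)", "Name": "Lampasas"},
]
entities, records = split_for_writing(features)
print(len(entities), "entity,", len(records), "records")
```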

FAQ

Q) Should I use this format to read non-object DWG/DXF datasets?
A) You could, but unless you have a specific reason the recommendation is to continue using the AutoCAD DWG/DXF (aka ACAD) reader/writer for these files.

Q) What happens to block features with attribute data when I explode them?
A) Blocks can have one record for the whole block or one record for each entity (part) in the block - or both! If block references have associated object data it will be attached to the insert point for the block. If block references have parts that have associated object data, and blocks are exploded, the object data associated with the block parts will be attached to all the features created for the block parts. Usually you're unlikely to want to explode blocks when you are doing an Object Data > Object Data translation, since there could be a conflict of entity/record keys.

Data Attribution

The data used here originates from open data made available by the Texas Open Data Portal.
