Pivot Tables and FME

Files

AttributePivoter.fmw
- 70 KB
- Download
pivot-source.csv
- 297 Bytes
- Download
StatisticsCalculator.fmw
- 80 KB
- Download

Introduction

A pivot table, also known as a crosstab, allows a user to change the structure of a spreadsheet to perform data summarization. Below is an example of a Microsoft Excel pivot table, which demonstrates the summarization of ‘observed’ data by the ‘region’ and ‘potential’ attributes. This article demonstrates two of the FME transformers that can be used to construct a pivot table.

Video

Note that this video was created using FME 2016. The interface might look different, but the content and workflow are the same.

Source Data

The source data is a CSV file containing statistics about areas in British Columbia.

Step-by-step Instructions

Example 1: Using the AttributePivoter Transformer

If you are only analyzing one attribute, you can use the AttributePivoter transformer to create a pivot table or crosstab in FME. This transformer allows you to create pivot tables that are similar to Excel, and this approach has distinct advantages over other approaches, including:

Adding dynamic attributes to the results makes it easier to write data to the pivot table. This is useful because it is sometimes difficult to predict the names used in the output data schema after the pivot has been performed.
Summary statistics are created automatically by the AttributePivoter.
Row order is preserved when using the AttributePivoter transformer.

Please download the pivot-source.csv from the files section on the right sidebar to follow along with the tutorial

1. Create a New Workspace

Open FME Workbench and select New to create a blank workspace.

2. Add a CSV Reader

Add a CSV (Comma Separated Value) reader to the canvas by clicking the Add Data Source square on the canvas (2025.0+) or typing CSV on the canvas to open the quick add menu. In the Add Reader window, set the following parameters:

Format: CSV (Comma Separated Value)
Dataset: pivot-source.csv
- Click on the ellipsis to navigate to the file location on your computer

3. Inspect Data

Before we pivot the data, let’s inspect it to see what values we should use. Click on the CSV reader feature type to open the mini toolbar, then click on the View Source Data button to open the data in Data Preview.

In Data Preview, you will see that we have a small table containing city names, their region, a potential, and an observed number. These numbers could be any statistical data, such as rainfall. We will pivot based on the observed attribute.

4. Pivot Data

Add an AttributePivoter to the canvas by typing “AttributePivoter” to bring up the list of FME Transformers in the Quick Add Search. Double-click the AttributePivoter from the list of Transformers to add the transformer to your canvas.

Double-click on the AttributePivoter to open the parameter. In the parameters, click on the ellipsis next to Group Rows By and select both potential and region. Then on the right-hand side of the Select Group Rows By Attributes dialog, click on region and move it to the top so that it is first. Click Ok to close the dialog.

Next, set the Attribute to Analyze to observed, and then for Pivot Summary Statistic Types, select Average, Count, and Sum. Click OK to close the AttributePivoter.

5. Write to Microsoft Excel

Add a Microsoft Excel writer to the canvas the same way a reader is added. Set the following parameters:

Format: Microsoft Excel
Dataset: AttributePivoter.xlsx
- Click on the ellipsis to navigate to a location on your computer
Sheet Definition: Dynamic

Click OK to add the writer to your canvas.

6. Run the Workspace

Connect the <Dynamic> writer to both of the Data and Summary output ports on the AttributePivoter.

Run the workspace by clicking on the Run button on the top toolbar or by using Run > Run Workspace on the menu bar.

After running the workspace, the table will be pivoted by region and potential, with observed having the statistics applied. You can view the table by selecting the writer feature type to open the mini toolbar and clicking on the View Written Data button. In Data Preview, switch to the fme_blank_sheet table.

Example 2: Using the StatisticsCalculator Transformer

If you have multiple attributes to analyze, or you want more control over which statistics are calculated using the StatisticsCalculator transformer, you can create a pivot table. The StatisticsCalculator can generate statistics for groups (or categories) of features.

1. Create a New Workspace

Open FME Workbench and select New to create a blank workspace.

2. Add a CSV Reader

Format: CSV (Comma Separated Value)
Dataset: pivot-source.csv
- Click on the ellipsis to navigate to the file location on your computer

3. Inspect Data

4. Pivot Data with StatisticsCalculator

Add a StatisticsCalculator to the canvas, and connect it to the CSV reader. Double-click on the StatisticsCalculator to open the parameters. In the parameters, enable Group Processing, and set the following parameters:

Group Processing: Enabled
- Group By: potential region
Statistics To Calculate:
- Total Count
- Sum
- Mean

Click OK to accept the parameters

5. Calculate Totals

Duplicate the StatisticsCalculator by clicking on it and hitting Ctrl-D (Cmd-D) on your keyboard. Connect it to the Complete output port on the first StatisticsCalculator.

Disable Group Processing in the StatisticsCalculator_2 parameters, but the other parameters remain the same. This StatisticsCalculator will calculate the total for all regions.

6. Rename Attributes and Set Order

Add an AttributeManager to the canvas and connect it to the Summary output port on the StatisticsCalculator_2. In the parameters, rename and create the following attributes:

Rename Attribute:
- Input Attribute: observed.total_count
  - Output Attribute: count_of_observed
  - Action: Rename
- Input Attribute: observed.sum
  - Output Attribute: sum_of_observed
  - Action: Rename
- Input Attribute: observed.mean
  - Output Attribute: mean_of_observed
  - Action: Rename
Add Attribute:
- Output Attribute: region
  - Attribute Value: Grand Total
  - Action: Set Value
- Output Attribute: potential
  - Attribute Value: <leave blank>
  - Action: Set Value
- Output Attribute: _FeatureOrder
  - Attribute Value: 1
  - Action: Set Value

We are creating a new value for the region attribute called Grand Total. This will be a label for all of the total statistics calculated in the StatisticsCalculator_2. The FeatureOrder attribute will ensure this feature is last.

Next, duplicate the AttributeManager, and connect the AttributeManager_2 to the Summary output port on the first StatisticsCalculator.

In the AttributeManager_2 parameters, change the FeatureOrder attribute value to 0

Output Attribute: FeatureOrder
- AttributeValue: 0
- Action: Set Value

7. Sort Features

Add a Sorter to the canvas, and connect it to both AttributeManagers. In the Sorter parameters, set the following:

Attribute: FeatureOrder
- Alpha/Num: Numeric
- Order: Ascending
Attribute: region
- Alpha/Num: Alphabetic
- Order: Ascending
Attribute: potential
- Alpha/Num: Alphabetic
- Order: Ascending

8. Write to Microsoft Excel

Add a Microsoft Excel writer to the canvas the same way a reader is added. Set the following parameters:

Format: Microsoft Excel
Dataset: AttributePivoter.xlsx
- Click on the ellipsis to navigate to a location on your computer
Sheet Definition: Dynamic

Click OK to add the writer to your canvas.

In the Feature Type dialog, click OK to accept the defaults for now, as we need to connect to the workflow to access the attributes. Connect the writer feature type to the Sorter, then double-click on the writer feature type to reopen the parameters.

In the parameters, change the Sheet Name to StatsPivotTable, then switch to the User Attributes Tab.

In the User Attributes tab, change the Attribute Definition to Manual, then remove the FeatureOrder attribute by selecting it and clicking on the minus ( - ) sign at the bottom. Next, move the region and potential attributes up, so they appear first. To do this, click on the attribute name and then on the up arrows below the table. Click OK to finish modifying the writer.

Your workspace should look similar to the image below

9. Run the Workspace

You can run the workspace by clicking the Run button on the top toolbar or by using Run > Run Workspace on the top menu bar.

After running the workspace, the table will be pivoted by region and potential, with observed having the statistics applied and a Grand Total at the bottom. You can view the table by selecting the writer feature type to open the mini toolbar and clicking on the View Written Data button.

Search