Pivot Tables and FME

Liz Sanderson
Liz Sanderson
  • Updated

FME Version

Introduction

A pivot table, also known as a crosstab, allows a user to change the structure of a spreadsheet to perform data summarization. An example of a Microsoft Excel pivot table is found below, which demonstrates the summarization of ‘observed’ data by the ‘region’ and ‘potential’ attributes. This article demonstrates the FME transformers that can be used to construct a pivot table.

pivottable-1.png

Initial table (orange) and the pivot table created in Microsoft Excel (blue)

Video

Note that this video was created using FME 2016. The interface might look different, but the content and workflow are the same.

 

Step-by-step Instructions

Example 1: Using the AttributePivoter Transformer

If you are only analyzing one attribute, you can use the AttributePivoter transformer to create a pivot table or crosstab in FME. This transformer allows you to create pivot tables that are similar to Excel, and this approach has distinct advantages over other approaches, including:

  • Adding dynamic attributes to the results makes it easier to write data to the pivot table. This is useful because it is sometimes difficult to predict the names used in the output data schema after the pivot has been performed.
  • Summary statistics are created automatically by the AttributePivoter.
  • Row order is preserved when using the AttributePivoter transformer.


Source Data

The source data is a CSV file containing statistics about areas in British Columbia. 
SourceData.png

 
1. Create a New Workspace
Open FME Workbench and create a blank workspace. 
NewWorkspace.png

 
2. Add a CSV Reader
Add a CSV (Comma Separated Value) reader to the canvas by clicking on the Reader button on the top menu bar or by going to Readers > Add Reader. In the Add Reader dialog, select CSV as the Format, then for Dataset, browse to the pivot-source.csv dataset, which is available for download from the Files section on this article. Then click OK to finish adding the reader. 
CSVReader.png

3. Inspect Data
Before we pivot the data, let’s inspect the data to see what values we should use to pivot. Click on the CSV reader feature type to open the popup menu, then click on the View Source Data button to open the data in Visual Preview. 
ViewSource.png
 
In Visual Preview, you will see that we have a small table containing city names, which region they are in, a potential, and observed number. These numbers could be any statistical data such as rainfall. We will pivot based on the observed attribute.
SourceData.png

4. Pivot Data
Click on the CSV reader feature type to select it.  Then add an AttributePivoter to the canvas by typing “AttributePivoter” to bring up the list of FME Transformers in the Quick Add Search. Select the AttributePivoter from the list of Transformers by double-clicking or by using the arrow keys and the Enter key to add it. 
QuickAdd.png
Connection.png

Double-click on the AttributePivoter to open the parameter. In the parameters, click on the ellipsis next to Group Rows By and select both potential and region. Then on the right-hand side of the Select Group Rows By Attributes dialog, click on region and move it to the top so that it is first.  Click Ok to close the dialog. 
AttributePivot1.png


Next, set the Attribute to Analyze to observed, and then for  Pivot Summary Statistic Types, select Average, Count, and Sum. Click OK to close the AttributePivoter. 
AttributePivot2.png

5. Write to Microsoft Excel
Add a Microsoft Excel writer to the canvas the same way a reader is added. Browse to the Output folder, and name the file AttributePivoter.xlsx. Change the Sheet Definition to Dynamic (Advanced) and click OK to finish adding the writer. 
Writer.png

6. Run the Workspace
Connect the <Dynamic> writer to both of the Data and Summary output ports on the AttributePivoter. 
Workspace.png

Run the workspace by clicking on the Run button on the top toolbar or by using Run > Run Workspace on the top menu bar. 
Run.png
 
After running the workspace, the table will be pivoted by region and potential, with observed having the statistics applied. You can view the table by selecting the writer feature type to open the popup menu and clicking on the View Written Data button. In Visual Preview, switch to the fme_blank_sheet table. 
VPOutput.png
 

ex1output.png

Output data viewed and manually stylized in Microsoft Excel

 

Example 2: Using the StatisticsCalculator Transformer

If you have multiple attributes to analyze or you want more control over which statistics are calculated using the StatisticsCalculator transformer to create a pivot table. The StatisticsCalculator can generate statistics for groups (or categories) of features.
 
1. Create a New Workspace
Open FME Workbench and create a blank workspace. 
NewWorkspace.png
 
2. Add a CSV Reader
Add a CSV (Comma Separated Value) reader to the canvas by clicking on the Reader button on the top menu bar or by going to Readers > Add Reader. In the Add Reader dialog, select CSV as the Format, then for Dataset, browse to the pivot-source.csv dataset, which is available for download from the Files section on this article. Then click OK to finish adding the reader. 
CSVReader.png

3. Inspect Data
Before we pivot the data, let’s inspect the data to see what values we should use to pivot. Click on the CSV reader feature type to open the popup menu, then click on the View Source Data button to open the data in Visual Preview. 
ViewSource.png
 
In Visual Preview, you will see that we have a small table containing city names, which region they are in, a potential, and observed number. These numbers could be any statistical data such as rainfall. We will pivot based on the observed attribute.
SourceData.png
 

4. Pivot Data with StatisticsCalculator
Click on the CSV reader feature type to select it.  Then add a StatisticsCalculator to the canvas by typing “StatisticsCalculator” to bring up the list of FME Transformers in the Quick Add Search. Select the StatisticsCalculator from the list of Transformers by double-clicking or by using the arrow keys and the Enter key to add it. 
QuickAdd2.png
StatConnect.png

Double-click on the StatisticsCalculator to open the parameters. In the parameters, enable Group Processing, then click on the ellipsis next to Group By and select potential and region. Next, select the observe attribute for Statistics to Calculate, then enable Total Count, Sum, and Mean, then click OK. 
StatsCalc1.png

5. Calculate Totals
Duplicate the StatisticsCalculator by clicking on it and hitting ctrl-D (cmd-D) on your keyboard. Connect it to the Complete output port on the first StatisticsCalculator. 
Stats2Connect.png

In the StatisticsCalculator_2 parameters, disable Group Processing, but the other parameters can remain the same. This StatisticsCalculator will calculate the total for all of the regions. 

6. Rename Attributes and Set Order
Add an AttributeManager to the canvas and connect it to the Summary output port on the StatisticsCalculator_2. In the parameters, rename and create the following attributes: 

Input Attribute

Output Attribute

Attribute Value

Action

observed.total_count

count_of_observed

<leave blank>

Rename

observed.sum

sum_of_observed

<leave blank>

Rename

observed.mean

average_of_observed

<leave blank>

Rename

 

region

Grand Total

Set Value

 

potential

<leave blank>

Set Value

 

_FeatureOrder

1

Set Value


AttributeManager.png


We are creating a new value for the region attribute called Grand Total, this will be a label for all of the total statistics calculated in the StatisticsCalculator_2. The FeatureOrder attribute will ensure this feature is last. 
 

Next, duplicate the AttributeManager, and connect the AttributeManager_2 to the Summary output port on the first StatisticsCalculator. 
AttManConnect.png

In the AttributeManager_2 parameters, remove the two region and potential attributes that were created; they will have an empty value in the Input Attribute column, then change the value for FeatureOrder to 0. 
AttMan2.png

7. Sort Features
Add a Sorter to the canvas, and connect it to both AttributeManagers. In the Sorter parameters, select FeatureOrder as the attribute to sort by, then set the Alpha/Num to Numeric. Next, add region and potential, then sort those Alphabetic Ascending. 
Sorter.png

8. Write to Microsoft Excel 
Add a Microsoft Excel writer to the canvas and browse to the Output folder. Name the file StatisticsCalculator.xlsx, then set the Sheet Definition to Automatic, and click OK. 
Writer.png

In the Feature Type dialog, click OK to accept the defaults for now as we need to connect to the workflow to access the attributes. Connect the writer feature type to the Sorter, then double-click on the writer feature type to reopen the parameters. 
In the parameters, change the Sheet Name to StatsPivotTable, then switch to the User Attributes Tab. 
WriterParams1.png
In the User Attributes tab, change the Attribute Definition to Manual, then remove the FeatureOrder attribute by selecting it and clicking on the minus ( - ) sign at the bottom. Next, move region and potential up, so they appear first. Click OK to finish modifying the writer. 
WriterParams2.png
Workspace2.png
8. Run the Workspace
Run the workspace by clicking on the Run button on the top toolbar or by using Run > Run Workspace on the top menu bar. 
 Run.png

 
After running the workspace, the table will be pivoted by region and potential, with observed having the statistics applied and a Grand Total at the bottom. You can view the table by selecting the writer feature type to open the popup menu and clicking on the View Written Data button. 
VisualPreviw1.png

ex2output.png

Output data viewed and manually stylized in Microsoft Excel

 

Was this article helpful?

Comments

0 comments

Please sign in to leave a comment.