How to Prepare Data for Tableau with FME (Merging Multiple Spreadsheets)

Liz Sanderson
Liz Sanderson
  • Updated

FME Version

Introduction

In this exercise, we'll show how to process multiple CSV files in FME. We'll modify an existing FME workspace translating data from a single CSV file to Tableau. The workspace has been set up to process business license data from a single CSV file. Data validation and cleanup is performed before the data is written to Tableau format. But, there are also a number of other CSV files containing business license data from previous years. We will set up FME to handle all of these files. Then, we will demonstrate 2 options for writing out to Tableau:

  • Option 1: Write all data to a single Tableau file. 
  • Option 2: Write data to multiple Tableau files

 

Video

 

Step-by-Step Instructions

Read in Data

1. Open up the previous exercise and remove DataCleanup.xlsx writers

We will be using the workspace from the previous exercise How to prepare Data for Tableau with FME (Processing Spreadsheets). If you haven't completed the previous exercise and would like to, click on the link to go to the exercise. If you are only interested in this exercise, download the starting template. In the previous exercise, we already created an excel file to store all our values that need to be cleaned up, delete both DataCleanup.xlsx writers.
 

2. Set up CSV reader to read all of the files from a folder

In the Navigator pane, expand the CSV reader. Double-click on the source CSV file parameter, then open the advanced browser. Click on "Select Multiple Folders/Files..." Navigate to the folder containing the four CSV files, and select it. Read the files with a .csv extension from that folder. Remove the .gz file and the .txt file, and the previous .csv file, we are only interested in .csv files from the PastYears folder.

Datasets within the PastYears folder:

  • 2012business_licences.csv
  • 2013business_licences.csv
  • 2014business_licences.csv
  • 2015business_licences.csv

Change the Source CSV file(s) in the the Navigator

 

ex2b-s2b.png

Select Multiple Folders/Files... then click on the PastYears folder to add it
 

3. Modify source feature type

Now that we've set up the reader, the next step is to set up the existing source feature type on the canvas to handle all the files read by the CSV Reader. Open the Writer Feature Type properties. It was originally set up to read the single business license file. Checking the Merge Feature Type option allows this feature type to process all of the CSV files that are read. We'll use the default wildcard option the Merge Filter and Filter Type. It's important to note that when we turn the Merge Feature Type, FME automatically exposes an attribute called "fme_feature_type". Each feature read is tagged with this attribute, which holds the name of the file each feature was read from. We'll make use of this attribute shortly. The name of the reader will have changed from business_licenses to <All>

ex2b-s3.png

Open up the properties of the writer, enable Merge Feature Type and accept the defaults

 

4. Inspect data

Confirm that FME has been set up to process all of the CSV files within that folder. Right-click on the source feature type and inspect the data. Confirm that all 4 files in that folder were indeed read. By further inspecting a single feature, we see that the name of the file (or feature type) is stored with the feature.

 

5. Create geometry points with the VertexCreator transformer

Connect a VertexCreator transformer to the Reader. This will create points with our Latitude and Longitude attributes. For mode, ensure that "Add Point" is selected and then change your X Value to read the Longitude attribute and your Y Value to read the Latitude attribute, then click ok.

ex2b-s5.png

Add a VertexCreator and set the X and Y Value


Option 1: Write to Single Tableau File

If we ran this workspace now, all of the data would be written to a single Tableau file. In that case, we should create a new attribute to store the year the business license data was collected using a SubStringExtractor - similar to a feature type fanout.

 

1. Add SubStringExtractor

Place a SubstringExtractor between the AttributeManager and the writer feature type. Set it up to extract the first 4 characters of the fme_feature_type attribute, which is in effect the year. Store the year value in a new attribute called YearCollected.

The destination Tableau schema should be updated with the new attribute.

ex2b-p1-s1.png

We only want the year from the fme_feature_type, set the Start Index to 0 and End to 3

 

2. Run the workspace

A single Tableau file is created, with a new attribute YearCollected. View this file in Tableau to ensure the YearCollected field is populated

ex2b-p1-s2.png

View the BusinessLicenses.hyper in Tableau to ensure the YearCollected field is populated

 

Option 2: Write Data to Multiple Tableau Files

It is easy to create a separate Tableau file (dataset fanout) for every year of business license data.

 

1. Disable SubstringExtractor

If you added the SubstringExtractor in Option 1, disable it for Option 2, by right clicking on the transformer and clicking "Disable"

 

2. Modify destination feature type

Open the properties of the destination schema (Writer). Click on the drop-down next to Table Name, and select fme_feature_type. We are instructing FME to use the value of this attribute for the output table name, which means, that for every unique value found, a separate file will be created. We know that fme_feature_type holds the name of the file each feature was read from, and since we are reading 4 CSV files, we expect 4 tableau files to be created.

ex2b-p2-s2.png

Open up the .Hyper writer and change the Table Name to the attribute fme_feature_type to name the tables.

3. Modify destination file parameters

In the Navigator Window, right click on the .Hyper reader and choose Edit Parameters. In the writer parameters, enable the Fanout Dataset option and set the Fanout Expression to @Value(YearCollected)_@Value(fme_feature_type).hyper to create a new .hyper file for each year. 
Fanout.png

 

4. Run the workspace

Navigate to the Output folder to confirm the 4 tableau files were created. If running this translation multiple times to completion. In the Writer properties, under Table Settings > General, for Table Handling: Drop and Create.

DatasetFanout.png

4 Tableau Files in Windows Explorer

Was this article helpful?

Comments

0 comments

Please sign in to leave a comment.