Converting from CSV to PostgreSQL and PostGIS

Liz Sanderson
Liz Sanderson
  • Updated

FME Version

  • FME 2020.1

Introduction

You have been assigned the task to update a PostgreSQL database with census data and library locations for different neighborhoods in the City of Vancouver. The census data needs to be split up by age range as well as neighborhood. The library dataset is missing neighborhood information, so using a spatial overlay you will add in the neighborhood information before writing out to the database.
 

Requirements

A PostgreSQL database with PostGIS:
Before you begin, please follow Creating a PostgreSQL/PostGIS Training Database with FME to create a local Postgres database that can be written to. 
 

Step-by-step Instructions

Part 1: CSV to Tabular PostgreSQL

1. Open a Blank FME Workspace and Add a CSV Reader
In a blank workspace, add a CSV Reader and read in the Census2016.csv. Open the parameters, in this CSV dataset, the field names start on line 5. For the Field Names Line parameter, enter 5, then confirm the Preview is showing the correct data, and then click OK.


2. Filter Data
Now we need to filter the census data based on the different age range of potential visitors to the library. This census data is a bit messy, but if you inspect it using either Visual Preview or FME Data Inspector, you can see the following:

  • ID 2 = Ages 0 to 14
  • ID 7 = Ages 15 to 19
  • ID 8 = Ages 20 to 24
  • ID 9 = Ages 25 to 29
  • ID 10 = Ages 30 to 34
  • ID 11 = Ages 35 to 39
  • ID 12 = Ages 40 to 44
  • ID 13 = Ages 45 to 49
  • ID 14 = Ages 50 to 54
  • ID 15 = Ages 55 to 59
  • ID 16 = Ages 60 to 64
  • ID 17 = Ages 65 and over

These are the IDs we want to keep. Add a Tester to the canvas and connect it to the CSV reader feature type. In the parameters, create a new test ID = 2, then copy the test and paste it 11 times. Change the Right Value to match the ID number listed above. Then change the Logic for the 11 rows to OR.


After completing the Tester, run the workspace with Feature Caching Enabled, and inspect the output of the Tester. You should have 12 features that match the list above.
 
3. Condense Ages
We don’t need our data to be this specific with most of the age ranges spanning only 5 years. Instead, we can condense them to be under 14, 15 to 29, 30 to 59, and 60 and over. To do this, we will use an AttributeRangeFilter. Add an AttributeRangeFilter to the canvas and connect it to the Passed output port on the Tester. In the parameters, set the following:

From

To

Output Port

7

9

Age 15 to 29

10

15

Age 30 to 59

16

17

Age 60 and over





4. Rename Age Range
Before we go any further, we need to update the Variable attribute values to reflect our new age range. Add an AttributeCreator to the canvas and connect it to the AttributeRangeFilter Age 15 to 29 output port. In the parameters, select Variable as the New Attribute and then set the Attribute Value to Age 15 to 29, then click OK.
Duplicate the AttributeCreator and connect it to the Age 30 to 59 output port, and change the Attribute Value to Age 30 to 59. Repeat with the Age 60 and over output port. For the Unfiltered output port change the Attribute Value to Age 14 and under.


5. Aggregate Values
With the age ranges redefined, now we need to add them together (aggregate them). Add an Aggregator after the first AttributeCreator. In the parameters, select all of the Neighborhood Names (all attributes except ID and Variable) for Attributes to Sum, then click OK. Duplicate the Aggregator three times and connect each one to the AttributeCreators.


6. Sort the Age Ranges
Before we write out, we will need to sort the age ranges so that they are in numerical order. Add a Sorter transformer to the canvas and connect it to all four Aggregators. In the parameters, set the Attribute to Variable, Alphabetic, in Ascending order. 


7. Write to PostgreSQL
Now we are ready to write out our census data. Add a PostgreSQL Writer to the canvas. For the Dataset, connect to the FME Training database created in Creating a PostgreSQL/PostGIS Training Database with FME or your own database. 

Change the Table Definition to Automatic, then click OK. In the Feature Type parameters, set the Table Name to CensusByNeighborhood, then for Table Qualifier type in public, if using the training database.  For the Feature Operation, select Insert, and then change the Table Handling to Drop and Create. Once the parameters have been set, connect the writer feature type to the Sorter.

Once connected, go back into the feature type parameters and switch to the User Attributes tab. Change the Attribute Definition to Manual, then remove the ID attribute and move variable to the top of the list.
 
One last step before we can write out the data, we need to map the attributes because PostgreSQL uses lowercase attributes with underscores instead of spaces. 
 
Right-click on the connection line from the Sorter and select Auto Connect Attributes. The attributes will now be mapped from the Sorter to the writer regardless of case or spaces. 


8. Run the Workspace
Run the workspace and view the output PostgreSQL table. Your output should look like the following image:

 

Part 2: CSV to Spatial PostGIS

1. Continue in the Workspace and Add Another CSV Reader

Continuing in the same workspace as Part 1, add another CSV Reader. We’ll use libraries.csv as the dataset and LL84 as the coordinate system. Open the parameters and change the Field Names Line back to 1. FME will remember the CSV parameters from the last reader, so this will need to be updated. Click OK to add the reader.
 
2. Add the Neighborhood Boundaries
Now we need to add in the neighborhood boundaries. Add a Shapefile Reader to the canvas and browse to the local_area_boundary.shp dataset.
 
3. Reproject the Neighborhood Boundaries
If you inspect the local_area_boundary shapefile, you will notice that it is in the UTM83-10 coordinate system. We will need to reproject this into LL84 to match the libraries CSV file. Add a Reprojector transformer to the canvas and connect it to the local_area_boundary feature type. In the parameters, change the Destination Coordinate System to LL84.
 
4. Retrieve the Attributes From Neighborhoods and Add Them to the Libraries
Each library falls within a Vancouver neighborhood, to add this neighborhood attribute to the libraries, we will use a PointOnAreaOverlayer transformer. Add a PointOnAreaOverlayer transformer to the canvas, then connect the Point input port to the libraries reader feature type, and the Area input port to the local_area_boundaries reader feature type. In the parameters, enable Merge Attributes.  


5. Clean up the Attributes
We need to clean up the attributes before we write out to the database. Add an AttributeManager transformer to the canvas and connect it to the Points output port on the PointOnAreaOverlayer.
In the parameters, remove both MAPID and _overlap attributes since we are only interested in the neighborhood name. Then rename NAME to NEIGHBORHOOD. You can inspect the data once more before writing to the PostGIS database if you wish.

6. Write to PostGIS
Add a PostGIS Writer to the canvas and connect it to the AttributeManager. Connect to the same Database Connection as in Part 1. We can accept the default parameters but set the Coord. System to LL84 and the Table Definition to Automatic. Click OK.
In the Feature Type parameters, set the Table Name to Libraries, and then for the Table Qualifier type in public. For the Feature Operation, select Insert, and then change the Table Handling to Drop and Create. Click OK and then run the translation.

If your database does not have PostGIS installed, please see Creating a PostgreSQL/PostGIS Training Database with FME.


Data Attribution

The data used here originates from open data made available by the City of Vancouver, British Columbia. It contains information licensed under the Open Government License - Vancouver.
 

Was this article helpful?

Comments

0 comments

Please sign in to leave a comment.