FME & Hadoop FAQ

Liz Sanderson

FME Version

  • FME 2018.x

Introduction

Hadoop clusters can be amazingly powerful, but often lack the precision and spatial processing ability that FME has. This question and answer article will look at how you can empower your Hadoop instances with the transformative power of FME.

 

Can FME help me automate my flow of data into Hadoop?

Yes, FME has an HDFSConnector which can upload a file to your cluster. You can also make Hive tables out of these HDFS files by executing a ‘create table’ query against an Apache Hive Reader connection in a SQLExecutor or SQLCreator.
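As a minimal sketch of that second step, the snippet below builds the kind of HiveQL ‘create table’ statement you might issue through a SQLExecutor to expose an uploaded HDFS file as a Hive table. The table name, column list, and HDFS path are hypothetical examples; adjust the delimiter and storage clauses to match your actual file.

```python
# Sketch: build the HiveQL statement you might paste into a SQLExecutor
# to expose a comma-delimited file already uploaded to HDFS as an
# external Hive table. Table name, columns, and path are hypothetical.

def make_create_table_sql(table, columns, hdfs_dir):
    """Return a CREATE EXTERNAL TABLE statement for a CSV-style file."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {cols}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_dir}'"
    )

sql = make_create_table_sql(
    "parcels",
    [("parcel_id", "STRING"), ("area_m2", "DOUBLE")],
    "/data/parcels",
)
print(sql)
```

Because the table is declared EXTERNAL with a LOCATION clause, Hive reads the file in place rather than copying it, so the HDFSConnector upload and the Hive table stay in sync.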

 

What formats can I upload to the HDFS?

Any format — you can upload anything you want to your HDFS. If you're looking for an example, this video demonstrates how it can be done.

 

How can I read data from a Hadoop Cluster?

You can either use the HDFSConnector to download the data from the cluster or use an Apache Hive Reader to read a table.

 

Does the HDFS connector support Kerberos Authentication?

Yes, it does support Kerberos Authentication.

 

What do I need to do to use the HDFS Connector?

Make sure the name node's hostname resolves from the machine running FME, and that the name node's port is open to that machine.
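Both prerequisites can be checked up front with a short standard-library script. This is a sketch, not part of FME: the hostname `namenode.example.com` and port `8020` are placeholders for your own cluster's values.

```python
# Sketch: verify that the name node's hostname resolves and that its port
# accepts TCP connections before configuring the HDFSConnector.
# "namenode.example.com" and 8020 below are placeholder values.
import socket

def namenode_reachable(host, port, timeout=5.0):
    """Return True if `host` resolves and `port` accepts a TCP connection."""
    try:
        addr = socket.gethostbyname(host)  # DNS resolution check
    except socket.gaierror:
        return False
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True  # port is open and accepting connections
    except OSError:
        return False

if __name__ == "__main__":
    print(namenode_reachable("namenode.example.com", 8020))
```

If this returns False, fix DNS (or add a hosts-file entry) and firewall rules before troubleshooting the HDFSConnector itself.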

 

What do I need to do to use the Apache Hive Reader?

You will need to copy the Hive JDBC client driver (a .jar file) from your Hadoop installation to a location that FME can use. Please see the Getting Started with JDBC documentation for the list of locations specific to your operating system. Once the driver is in place, restart FME, and you are ready to use the Apache Hive Reader.

 

Can we integrate FME solutions inside Hadoop in the Map-Reduce process?

None of this configuration is available out of the box, but you can call FME Server via the FME Server REST API from a Map-Reduce process. Additionally, most Hadoop systems will allow you to host FME Server on the same machine as a Hadoop node. However, if you want to distribute the processing across multiple nodes, you will need a method of breaking the work up into multiple workspace runs.
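To make the REST API route concrete, here is a hedged sketch of the HTTP request a Map-Reduce task could send to submit an FME Server workspace job. The host, repository, workspace, token, and published parameter names are all hypothetical; check the REST API documentation for your FME Server version for the exact endpoint and authentication scheme it supports.

```python
# Sketch: build the HTTP request a Map-Reduce task could send to kick off
# an FME Server workspace via the REST API. Host, repository, workspace,
# token, and parameter names below are hypothetical placeholders.
import json
import urllib.request

def build_submit_request(host, repository, workspace, token, params=None):
    """Return a urllib Request that submits a transformation job."""
    url = f"{host}/fmerest/v3/transformations/submit/{repository}/{workspace}"
    body = json.dumps({
        "publishedParameters": [
            {"name": k, "value": v} for k, v in (params or {}).items()
        ]
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"fmetoken token={token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_submit_request(
    "https://fmeserver.example.com", "Hadoop", "process_block.fmw",
    "my-token", params={"SourceDataset": "/data/block_0042.csv"},
)
print(req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```

Each mapper or reducer can pass its own input split path as a published parameter, which is one simple way of breaking the work up across multiple workspace runs.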
