Introduction
The power of FME is being able to take data from multiple sources and manipulate it efficiently. So why not use FME for data science?
We’ve recently added a series of transformers to the FME Hub that perform a few basic statistical tests using the RCaller or the PythonCaller.
If you don't see the statistical test you are looking for in this list, you can create your own and upload it to the FME Hub to share with other users. Alternatively, create a new Idea; if it gets enough votes, we will add it to the list.
Learning
Perform a Shapiro-Wilks Statistical Test using R or Python
Learn how to create a custom transformer using either R or Python to perform the Shapiro-Wilk test, which checks whether a sample is consistent with a normal distribution. This workflow can be adapted for any statistical test using R or Python.
Transformers
Each transformer listed has a link to its FME Hub page as well as a test workspace download. Due to the external software requirements for R, these test workspaces could not be uploaded to the FME Hub. Each of the R transformers requires R to be installed on the user's machine, along with the sqldf R package. The Python transformers require the SciPy Python package to be installed.
Correlation
A correlation test measures the strength and direction of the association between two variables.
RCorrelationCalculator
Uses R to calculate if there is an association between two variables.
RCorrelation-TestWorkspace.fmwt
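To make the idea of association concrete, here is a minimal Python sketch of the Pearson correlation coefficient, which ranges from -1 (perfect negative association) to 1 (perfect positive association). It is not the RCorrelationCalculator's source; the function name `pearson_r` and the sample values are hypothetical, and the transformer may offer other correlation methods.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical attribute values, for illustration only.
elevation = [120.0, 340.0, 560.0, 810.0, 1020.0]
temperature = [18.2, 16.9, 15.1, 13.4, 11.8]

r = pearson_r(elevation, temperature)
print(f"Pearson r = {r:.3f}")  # a value near -1 indicates a strong negative association
```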
Cluster Analysis
Cluster analysis is a method for dividing data into groups so that the items in each group are more similar to each other than to items in other groups.
RClusterCalculator
Uses R to group similar data using one of three algorithms. This transformer requires FME 2018.0 or later.
RClusterCalculator-TestWorkspace.fmwt
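As an illustration of what a clustering algorithm does, the sketch below implements k-means (Lloyd's algorithm) in plain Python. Choosing k-means is an assumption made for the example: this article does not say which three algorithms RClusterCalculator offers. The function name `kmeans`, the sample points, and the starting centroids are all hypothetical.

```python
def kmeans(points, centroids, max_iter=100):
    """Cluster 2D points with Lloyd's k-means, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2 + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical coordinates forming two well-separated groups.
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]
labels, centers = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

Starting from fixed centroids keeps the sketch deterministic; real implementations usually pick the starting centroids at random and run several restarts.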
Shapiro-Wilk Test
The Shapiro-Wilk test checks the null hypothesis that a random sample of data comes from a normal distribution. A small p-value is evidence that the data are not normally distributed.
RShapiroWilksCalculator
Using R and the RCaller, this transformer tests whether a random sample of data comes from a normal distribution using the Shapiro-Wilk test.
RShapiroWilks-TestWorkspace.fmwt
PyShapiroWilksCalculator
Using SciPy and the PythonCaller, this transformer tests whether a random sample of data comes from a normal distribution using the Shapiro-Wilk test.
PyShapiroWilks-TestWorkspace.fmwt
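Outside FME, the same kind of check can be tried directly with SciPy's `scipy.stats.shapiro` function. This is a minimal sketch rather than the transformer's actual code; the sample values and the 0.05 significance level are assumptions chosen for illustration, and SciPy must be installed.

```python
from scipy import stats

# Hypothetical measurements, for illustration only.
sample = [4.9, 5.1, 5.0, 4.8, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9, 5.4, 4.6]

# shapiro() returns the W statistic and the p-value for the null
# hypothesis that the sample was drawn from a normal distribution.
w_statistic, p_value = stats.shapiro(sample)

alpha = 0.05  # assumed significance level
if p_value > alpha:
    print(f"W = {w_statistic:.3f}, p = {p_value:.3f}: no evidence against normality")
else:
    print(f"W = {w_statistic:.3f}, p = {p_value:.3f}: data are unlikely to be normal")
```

A p-value above the threshold does not prove the data are normal; it only means the test found no evidence against normality.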
T-Test
A t-test is a statistical test of whether a sample mean differs from a specified value, or whether the means of two samples differ from each other, by more than would be expected from random chance.
ROneSampleTTestCalculator
The one-sample t-test tests the null hypothesis that the population mean is equal to a specified value. In other words, it tells you whether the mean of your sample differs from a certain number by enough to be statistically significant. This test outputs the t-value, p-value, confidence interval and the estimate.
ROneSampleTTest-TestWorkspace.fmwt
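The t-value reported by a one-sample t-test is the distance between the sample mean and the specified value, measured in standard errors. The sketch below computes it with Python's standard library only. The function name `one_sample_t` and the sample values are hypothetical. The p-value and confidence interval additionally need the Student's t distribution, which the standard library does not provide, so they are left out here.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t-value, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least 2 observations")
    # Standard error of the mean, using the sample standard deviation (n - 1).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("t-value is undefined when all observations are equal")
    t_value = (statistics.mean(sample) - mu0) / std_err
    return t_value, n - 1


# Hypothetical measurements tested against a specified value of 5.0.
t_value, dof = one_sample_t([5.1, 4.9, 5.3, 5.0, 5.2], mu0=5.0)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```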
RTwoSampleTTestCalculator
The two-sample t-test compares the means of two groups to determine whether the difference between them is statistically significant or could be due to random chance. This test outputs the t-value, p-value, confidence interval and the estimate.
RTwoTTest-TestWorkspace.fmwt
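For two groups, the t-value is the difference between the group means divided by the standard error of that difference. The sketch below uses Welch's form, which does not assume equal variances and is the default of R's `t.test`. Whether RTwoSampleTTestCalculator uses that default is an assumption. The function name `welch_t` and the sample values are hypothetical, and the p-value is omitted because it needs the Student's t distribution.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t-value, degrees of freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 observations")
    # Squared standard error contributed by each group.
    se2_a = statistics.variance(a) / n_a
    se2_b = statistics.variance(b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("t-value is undefined when both groups are constant")
    t_value = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_value, dof


# Hypothetical measurements for two groups.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2]
group_b = [4.6, 4.8, 4.5, 4.7, 4.4]
t_value, dof = welch_t(group_a, group_b)
print(f"t = {t_value:.3f}, df = {dof:.1f}")
```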