Getting Started with AI in FME: Reasoning

Files

AI_Reasoning_in_FME.fmw (400 KB)
sample_sewage_pipes.csv (629 Bytes)
schema.json (2 KB)

Introduction

AI reasoning refers to AI models specifically trained to solve problems, draw conclusions, and make decisions by processing information and applying logical rules. Unlike traditional models, reasoning models work through a chain of internal thought before they generate a response. As a result, they can act on high-level guidance and instructions, whereas traditional models typically require very precise instructions.

You can think of a reasoning model as a senior co-worker: you can give them a goal to achieve and trust them to do it. A traditional model is more like a junior co-worker: they'll need more specific instructions to achieve the goal you desire.

Depending on the AI service you use, you may need to choose a specific model built for reasoning. In the example below, we will demonstrate using GPT-5, a well-rounded model with reasoning capabilities built in.

Model Chaining

Multiple AI models can be chained together to manage token usage, costs, and performance.

For example, in a data validation workflow, a low-cost model could be used to detect invalid data early on. If the model suspects data validation issues based on the user prompt provided, then the output data can be sent to a more thorough yet more expensive reasoning model that is likely to provide better results.

This hybrid method of chaining models can provide the best results while decreasing the cost of the overall workflow.
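The chaining idea above can be sketched in a few lines of Python. This is a minimal illustration, not part of the attached workspace: the two model calls are replaced by hypothetical stand-in functions (cheap_screen and expensive_reasoner), and the sample records are made up.

```python
from typing import Callable, Dict


def validate_with_chain(
    record: str,
    screen: Callable[[str], bool],
    deep_check: Callable[[str], Dict],
) -> Dict:
    """Run the low-cost screen first; escalate only suspicious records."""
    if not screen(record):
        # The cheap model saw nothing suspicious, so skip the costly call.
        return {"decision": "accept", "escalated": False}
    result = dict(deep_check(record))
    result["escalated"] = True
    return result


# Hypothetical stand-ins for real model calls (assumptions for illustration).
def cheap_screen(record: str) -> bool:
    # Flag any record that contains an empty field.
    return any(field.strip() == "" for field in record.split(","))


def expensive_reasoner(record: str) -> Dict:
    return {"decision": "reject", "reason": "blank field found"}


# Made-up sample records.
clean = validate_with_chain("P-001,PVC,300", cheap_screen, expensive_reasoner)
suspect = validate_with_chain("P-002,,300", cheap_screen, expensive_reasoner)
```

Only the second record reaches the expensive function, which is where the cost saving comes from.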

Using Reasoning for Data Validation

A strong use case for reasoning models is data validation due to the flexibility that AI models provide.

Typically, transformers in FME can check for one validation rule at a time or a set of rules. However, this may require setting up a long set of conditional statements in an AttributeManager or TestFilter.

Reasoning models break this traditional approach by weighing multiple checks simultaneously using natural language. They can also provide more than just a Boolean yes/no, valid/invalid response: the model can give a human-readable explanation of why the data failed a validation check.
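For comparison, here is a rough Python sketch of the rule-by-rule approach that a chain of conditional statements represents. It covers only two of the rules used later in this article (the material domain and the diameter range); the function name check_record and the message wording are assumptions made for illustration.

```python
from typing import Dict, List

VALID_MATERIALS = {"PVC", "Concrete", "HDPE", "Ductile Iron", "Clay"}


def check_record(record: Dict[str, str]) -> List[str]:
    """Return one message per failed rule; an empty list means the record passed."""
    failures = []
    if record.get("material") not in VALID_MATERIALS:
        failures.append("material not in allowed domain")
    try:
        diameter = float(record.get("diameter_mm", ""))
    except (TypeError, ValueError):
        failures.append("diameter_mm is not numeric")
    else:
        if not 100 <= diameter <= 3000:
            failures.append("diameter_mm outside [100, 3000]")
    return failures


# Made-up records for illustration.
passed = check_record({"material": "PVC", "diameter_mm": "300"})
failed = check_record({"material": "Plastic", "diameter_mm": "-5"})
```

Every additional rule needs another explicit branch, and the messages are fixed strings. A reasoning model can instead explain each failure in context and suggest a likely correction.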

Step-By-Step Instructions

In this section, you’ll build an FME workspace that reads a CSV file containing a list of sewage pipes. This data has been provided by a vendor and must be validated before loading the data into a production environment. Instead of using transformers for the data validation process, you will learn how to use AI reasoning models to perform multiple validation checks and get a structured JSON response. The response can then be easily parsed and transformed into an Excel Report to share with relevant stakeholders.

Part 1: Data preparation

Before sending the raw data to the OpenAIConnector, we must prepare our data. To do so, we will concatenate each record's field values into a single comma-separated string. The matching list of field names (the record schema) is written directly into the prompt in Part 2.

1. Open a New Workspace

Open FME Workbench and create a new blank workspace. Add a Creator to the workspace and then add a FeatureReader and connect it to the Creator. Set the FeatureReader parameters as follows:

Format: CSV (Comma Separated Value)
Dataset: sample_sewage_pipes.csv

Click OK to accept the new parameters.

2. Add an AttributeCreator

Add an AttributeCreator to the workspace and connect it to the FeatureReader’s CSV output port. Open the AttributeCreator parameters and set them as follows:

Output Attribute: record_data
Value: @Value(pipe_id),@Value(material),@Value(diameter_mm),@Value(condition),@Value(install_date),@Value(status),@Value(start_manhole),@Value(end_manhole)

Your workspace should now contain a Creator, a FeatureReader, and an AttributeCreator connected in sequence.

3. Run the Workspace and Inspect Your Data

With data caching enabled, run the workspace. Inspect the AttributeCreator’s cached data. In the Data Preview window, a new record_data attribute is present, containing each field’s value separated by a comma. The record_data attribute is what will be sent to the OpenAIConnector.
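If it helps to see the concatenation outside of FME, the following Python sketch builds an equivalent record_data string for each row. The field order matches the AttributeCreator value above; the two sample rows are made up and are not the contents of sample_sewage_pipes.csv.

```python
import csv
import io

FIELDS = ["pipe_id", "material", "diameter_mm", "condition",
          "install_date", "status", "start_manhole", "end_manhole"]

# Made-up rows for illustration only.
SAMPLE_CSV = """pipe_id,material,diameter_mm,condition,install_date,status,start_manhole,end_manhole
P-001,PVC,300,Good,2015-06-01,Active,MH-10,MH-11
P-002,Plastic,-50,,2031-01-01,Decomissioned,MH-11,
"""


def build_record_data(row):
    """Join the field values in schema order; missing values become empty strings."""
    return ",".join(row.get(name) or "" for name in FIELDS)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
record_data = [build_record_data(row) for row in rows]
```

Note that a plain join does not quote values that themselves contain commas. That is fine for data like this sample, but would need handling for free-text fields.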

Part 2: Configuring the OpenAIConnector for Data Validation

The most important step in this workflow is setting up your prompt to ensure proper validation. The quality and directness of your prompt will dictate what results the AI is likely to return. Reasoning models perform better with fewer instructions, so we will keep that in mind. For tips on prompt engineering, you can check out our article on Prompt Engineering and Text Generation.

1. Add an OpenAIConnector

Add an OpenAIConnector to your workspace. Connect it to your AttributeCreator. Open the OpenAIConnector parameters and set them as follows:

API Key: enter your OpenAI API key
Action: Reasoning
Effort: low
Model: GPT-5 (or any other reasoning model)
Instructions:

You are a meticulous data quality validator for a municipal sewage pipe asset register. You MUST return valid JSON only. Be concise, actionable, and classify issues clearly.

User Prompt:

Validate the following sewage pipe record.


Rules:
- Required fields: pipe_id, material, diameter_mm, install_date, status, start_manhole, end_manhole
- Domains:
  - material ∈ {PVC, Concrete, HDPE, Ductile Iron, Clay}
  - condition ∈ {Good, Fair, Poor}
  - status ∈ {Active, Inactive, Abandoned}
- Ranges:
  - diameter_mm ∈ [100, 3000]
  - install_date ≤ today (no future dates)
- Uniqueness: pipe_id must be unique
- Required: start_manhole and end_manhole cannot be blank


Soft rules:
- If a field is close to a valid value (e.g., "Decomissioned"), suggest the closest valid option.
- If material is invalid, suggest the most likely valid material based on context.
- If ranges are violated (negative or extreme diameters), propose plausible corrections if possible.


Record Schema:
pipe_id,material,diameter_mm,condition,install_date,status,start_manhole,end_manhole


Record:
@Value(record_data)

Structured Output: Copy and paste the content from the schema.json file attached to this article

Click OK to accept the parameters.

2. Extract JSON & Validation Results

Add a JSONFlattener to your workspace. Connect it to the OpenAIConnector Output port. Open the JSONFlattener parameters and set them as follows:

JSON Document: click the drop-down arrow and select the Response attribute
Attributes to Expose:
- decision
- issues{}.code
- issues{}.evidence
- issues{}.message
- issues{}.severity
- quality_tags{}
- record_id
- suggested_fixes{}.confidence
- suggested_fixes{}.field
- suggested_fixes{}.proposed_value
- suggested_fixes{}.reason

Click OK to accept the parameters.
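To picture what the JSONFlattener does with the response, here is a small Python sketch that flattens a nested JSON document into list-style attribute names such as issues{0}.code. The field names come from the attribute list above, but the sample values are invented; the real response shape is defined by schema.json, and the real flattening is done by the transformer, not by this code.

```python
import json

# Illustrative response only; values are made up.
response = """{
  "record_id": "P-002",
  "decision": "reject",
  "issues": [
    {"code": "INVALID_MATERIAL", "severity": "error",
     "message": "material is not in the allowed domain",
     "evidence": "material=Plastic"}
  ],
  "quality_tags": ["domain_violation"],
  "suggested_fixes": [
    {"field": "material", "proposed_value": "PVC",
     "confidence": 0.6, "reason": "closest valid material"}
  ]
}"""


def flatten(value, prefix=""):
    """Flatten nested JSON into names like issues{0}.code."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, name))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{{{index}}}"))
    else:
        flat[prefix] = value
    return flat


attributes = flatten(json.loads(response))
```

Each element of a JSON array becomes an indexed list entry, which is why the issues and suggested_fixes attributes are exposed with the {} list notation.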

3. Add a Tester

Add a Tester to the workspace and connect it to the JSONFlattener Output port. Open the Tester parameters and set them as follows:

Left Value: decision
Operator: !=
Right Value: reject

Click OK to accept the parameters.

4. Add a ListExploder

To expose the remaining attributes from the JSON, we will use a ListExploder. Add a ListExploder and connect it to the Tester "Failed" output port, which carries the records the model rejected (decision = reject). Open the ListExploder parameters and set them as follows:

List Attribute: issues{}
Accumulation Mode: Prefix List Attributes
Prefix: issue_

Click OK to accept the transformer parameters.
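The Tester and ListExploder steps can be approximated in Python as follows. This is a simplified sketch: the feature is represented as a nested dictionary rather than FME list attributes, the function names are hypothetical, and the sample values are made up.

```python
def route(feature):
    """Mimic the Tester: 'Passed' when decision != reject, otherwise 'Failed'."""
    return "Passed" if feature.get("decision") != "reject" else "Failed"


def explode_issues(feature, prefix="issue_"):
    """Emit one row per issue, copying parent attributes and prefixing issue fields."""
    parent = {key: value for key, value in feature.items() if key != "issues"}
    return [
        {**parent, **{prefix + key: value for key, value in issue.items()}}
        for issue in feature.get("issues", [])
    ]


# Made-up rejected record with two issues.
feature = {
    "pipe_id": "P-002",
    "decision": "reject",
    "issues": [
        {"code": "INVALID_MATERIAL", "severity": "error"},
        {"code": "FUTURE_INSTALL_DATE", "severity": "warning"},
    ],
}

port = route(feature)
rows = explode_issues(feature)
```

A record with two issues produces two rows, each carrying the same pipe_id and decision alongside its own issue_ attributes.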

5. Run the Workspace and Inspect the Data

With data caching enabled, run the workspace and inspect the output from the ListExploder in the Data Preview window. You should see the results from the AI reasoning model in columns such as issue_evidence, issue_message, and issue_severity. Each row describes one issue and explains why the record failed validation.

Part 3: Create a Microsoft Excel Report

1. Add an AttributeKeeper

Add an AttributeKeeper and connect it to the ListExploder. Open the AttributeKeeper and select the following Attributes to Keep:

decision
issue_code
issue_evidence
issue_message
issue_severity
pipe_id

2. Add an Excel Writer

Add a Writer to the workspace and set the parameters as follows:

Format: Microsoft Excel
Dataset: specify an output dataset
- Click on the ellipsis to select a location on your computer

Click OK to add the writer to the workspace. Connect it to your AttributeKeeper output port.

Run the workspace to output the Excel file and view the results in Microsoft Excel. To increase visibility, the report could be emailed or shared with stakeholders.

Sample Validation Report in Excel
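As a rough stand-in for the AttributeKeeper and writer steps, the sketch below keeps only the six report attributes and writes them out as rows. It uses Python's built-in csv module instead of Excel so that it needs no extra libraries; the sample row is made up.

```python
import csv
import io

# Same attributes selected in the AttributeKeeper.
KEEP = ["decision", "issue_code", "issue_evidence",
        "issue_message", "issue_severity", "pipe_id"]


def write_report(rows, handle):
    """Write only the KEEP columns; any other attribute is silently dropped."""
    writer = csv.DictWriter(handle, fieldnames=KEEP, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Made-up exploded row for illustration.
rows = [{
    "pipe_id": "P-002",
    "decision": "reject",
    "issue_code": "INVALID_MATERIAL",
    "issue_severity": "error",
    "issue_message": "material is not in the allowed domain",
    "issue_evidence": "material=Plastic",
    "record_data": "dropped because it is not in KEEP",
}]

buffer = io.StringIO()
write_report(rows, buffer)
report_text = buffer.getvalue()
```

Passing an open file instead of the in-memory buffer would write the same rows to disk.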

Conclusion

By combining AI reasoning with FME, data validation becomes not only easier but also more insightful. FME takes care of transforming and preparing your data, ensuring it’s ready for analysis, while AI reasoning models evaluate it against complex rules and provide clear, human-readable explanations.

This reduces manual effort, uncovers issues traditional checks might miss, and delivers richer insights to the end-user.

The result is faster, more reliable decision-making and a scalable approach to maintaining high data quality in the long run.

Resources

If you’d like to learn more about using reasoning models for data validation, OpenAI has a great guide on the same topic: https://cookbook.openai.com/examples/o1/using_reasoning_for_data_validation
