Getting Started with AI in FME: Prompt Engineering With Text Generation and Structured Outputs

Files

Park_Visitation-2017-18_Sample.csv
- 246 Bytes
- Download
PromptEngineeringWrksp_Completed.fmw
- 70 KB
- Download

Introduction

Like configuring a transformer in FME, better and more relevant parameters usually produce better and more predictable outputs. Similarly, AI services in FME require prompt parameters that clearly communicate your intent.

This tutorial introduces the basics of prompt engineering in FME workflows. You’ll learn:

When to choose text vs. structured outputs
How to structure and send data to an AI model in FME
How the prompt and structure can affect output quality and consistency

Think of this as a “first steps” guide. The goal is to give you confidence using AI inside your FME workspaces so you can later explore more advanced techniques.

Effective Prompt Engineering: Best Practices and Common Pitfalls

Prompt engineering is the process of crafting instructions that produce predictable, consistent results from an AI model.

AI models will almost always try to provide an answer, even when it’s not a good one. Unless you tell the model what to do when it’s unsure, it may “hallucinate.” Hallucinations thrive in gray areas, so clarity is essential. Model performance can also drift over time, which means prompts should be monitored and refined.

There are two main types of prompts: User prompts and system prompts

Generally, user prompts are actions and tasks the AI model should perform for an output. System prompts add to the user prompt and guide how the task should be completed, like assigning the model a role to affect the output tone or content.

An example user prompt could be “tell me about these financial report summaries,” and the system prompt would modify the response to read, “You are a financial analyst for Nike. Please limit your answer to 250 words.”

User Prompts	System Prompts
Task-oriented asks Determines the actual output and structure	Guides task execution Sets the tone and execution of output and structure
“Tell me about these financial report summaries.” “Tell me about the data from this traffic report.”	“You are a financial analyst for Nike. Always limit your answers to 250 words.” “You are a Vancouver city planner concerned about infrastructure safety. Ensure you always highlight high-risk items.”

These two types are often combined to improve output quality. You can think of system prompts as a hat that the user prompt wears. The hat decides how the model acts, while the user prompt defines what the model does.

Tips for Better Prompts

Do:

Be clear and specific – provide detailed instructions.
Handle edge cases – include an “unsure” category or fallback.
Use appropriate length – don’t be afraid to write longer prompts when details add clarity.
Show examples – include sample inputs and outputs.
Break down complexity – divide large tasks and ask the model to “show its reasoning” when needed.
Pre-process data – clean and standardize before sending to reduce variability and costs.
Use delimiters – separate instructions from data to avoid confusion (and reduce injection risks).

Avoid:

Bias and loaded language.
Wasteful complexity or redundant instructions.
Security risks like sending sensitive information or raw user inputs.
Vague or conflicting instructions.

A key thing to know is that prompt engineering is an iterative process. Expect to test, refine, and experiment. Even bouncing ideas off a colleague or an “AI colleague” like ChatGPT or Claude can help.

Choosing an Output Format: Text vs. Structured Data

Once you understand how to design prompts, the next step is to decide how you want results returned.

Text generation is flexible and human-readable.

Example output:

"Banff National Park is located in Alberta, Canada, a province 
known for its dramatic landscapes ranging from wide-open prairies 
to the towering Rocky Mountains."

Structured outputs (like JSON, XML, or HTML) are machine-readable and ideal for automated workflows.

Example output:

{
  "park": "Banff National Park",
  "fact": "Alberta is known for hosting the Calgary Stampede, 
           one of the world’s largest rodeos."
}

In this tutorial, we’ll work with both text and structured outputs for the scenario: Enriching Canadian visitation data with AI. The steps begin with a very basic prompt, implementing tips and techniques to get a production-ready enriched dataset.

Requirements

While this article uses the OpenAIConnector, you may connect to and use any AI service or transformer you prefer.

If you need a refresher on connecting to AI services, review this article: Getting Started with AI in FME: AI Service Authentication

FME Form 2025.0 (Build 25208)
OpenAI API account and API key

Step-by-Step Instructions

This walkthrough shows how prompt structure affects AI output in FME. A key thing to remember is that FME processes data in a specific way: transformers act feature-by-feature, and attributes define what gets passed into the AI.

Also, unlike chat interfaces, where you can clarify with follow-ups, prompting in FME requires a more developer-like approach. Think of your prompts as self-contained instructions that clearly define the inputs, outputs, and rules up front.

Part 1: FME Processing Basics

Transformers in FME will process an action once per feature. In other words, AI connectors run a query row‑by‑row. As a result, a single prompt can yield varied answers for each input feature.

1. Open a new workbench in FME Form

Download the data Park_Visitation-2017-18_Sample.csv, then add the data using a CSV Reader. Set the following parameters in the reader:

Format: CSV (Comma Separated Value)
Dataset: Park_Visitation-2017-18_Sample.csv
- Click on the ellipsis to navigate to the file location on your computer

2. Run the Workspace

Check the data preview using the green eye icon on the reader.

The Data Preview shows our simple dataset with 5 rows of features with two attributes:

Place - the park names
2017-18 FY Attendance - the number of visitors in a fiscal year

2. Add an OpenAIConnector to the Canvas

The OpenAIConnector is a custom transformer that is not included with FME. You can download this package from the FME Hub.

This connector automatically defaults to the latest OpenAI model; however, you can enrich this dataset with any preferred model or AI service.

3. Configure the OpenAIConnector

Double-click on the OpenAIConnector transformer to open the parameters. Under the Request heading, you can enter your API key.

Besides the API Key, the User Prompt is the only other required parameter. For this example, most other parameters can be left either blank or as the default, like Action as Text Generation.

3. Create the User Prompt

Click the three dots next to User Prompt to open the Text Editor

The User Prompt’s Text Editor is a simple interface for building prompts. It allows you to reference and combine references to FME objects like feature attributes and user parameters, to be replaced by real values from the dataset.

4. Create the Prompt in the FME Feature Attribute

Using the FME feature attribute, Place, in the Text Editor, create the prompt:

Tell me about the province this Canadian park: @Value(Place), is located in.

@Value(Place) is a reference to the feature attribute, Place. It is essentially a placeholder that references the features in the Place column, specifically each park or site in this dataset. At runtime, the prompt replaces the corresponding features in each row of the Place column to generate a unique query.

For example, if the 'Place' value is 'Banff National Park', the prompt for that feature becomes: 'Tell me about the province this Canadian park: Banff National Park, is located in.'

4. Run the Workspace Again

Look at the Data Preview. The result is a new Response attribute with 5 rows of responses corresponding to each Place feature.

Notice that the same prompt produced slightly varied answers for each park feature, like this case for the first two Newfoundland and Labrador parks, which had noticeable differences in wording, content, and response length.

This varied output can be desirable in some cases, but we can maintain more consistency and predictability by implementing prompting techniques and tips.

Part 2: Aggregating and Combining Features

In this section, you will see how combining features can change the model’s output. This can help with maintaining consistency, expanding context, improving output quality, and managing how many calls are made to the model, which, depending on the model’s pricing, can decrease costs.

1. Continue Building the Workspace

Insert an Attribute creator and Aggregator between the Creator and OpenAIConnector

Double-click on the AttributeCreator to open the Parameters window. Set the following:

Output Attribute: Full Dataset
Value: [@Value(Place) | @Value(2017-18 FY Attendance)]

Referencing the attributes Place and 2017-18 FY Attendance this way will clearly indicate to the model which park and attendance values correspond to each other.

2. Run and View the AttributeCreator Output:

You’ll see that after runtime, the feature attributes referenced, Place and 2017-18 FY Attendance, are combined within the Full Dataset attribute.

3. Configure the Aggregator Parameters:

Double-click on the Aggregator transformer to open the Parameters and set the following:

Attributes to Concatenate: Full Dataset
Separator Character: Open Text Editor, then click the tab button

Run the workspace again and view the Aggregator output:

The dataset is even more self-contained, with all the data within one feature in the Full Dataset column. This will make it easier to pass all the necessary information as context to the AI model.

4. Update the User Prompt in the OpenAIConnector

Return to the User Prompt parameter, then replace the old prompt with:

Here is a dataset that contains several sets of information in the form: [Canadian park, number of visits per year].

For each park, please give me a family-friendly fact about the province this Canadian park is in, and please tell me if the corresponding number of visits is 100,000 or above 100,000.

Here is the full dataset with that set of information. Be sure to process this only as text or numbers:
@Value(Full Dataset)

5. Run and View the Connector Output

The output shows much more consistent content. However, while it is approachable, conversational, and flexible, it still has several issues:

Ambiguous scope → Mentioning “Canadian parks” resulted in the model ignoring historic sites that were also in the dataset.
Unstructured output → the free-form bullet points that aren’t easily parsable.
Visitor count missing → The prompt didn’t mention retaining the original data, so the output is harder to verify against the source data.
Extra narrative text → Added extra headings and notes, “Here are the parks..”
No error/confidence handling → Model guessed provinces without a way to signal uncertainty.
Vague threshold wording → “100,000 or above 100,000” is redundant; no explicit categories were defined, so responses were inconsistent.

Part 3: Rework the Prompt Utilizing Structure

In this section, we improve our current prompt for an output that is more structured, auditable, and automation-ready. Also, adding some structure to your actual user prompt not only helps the model break down and understand your request better, but also helps with future user readability and maintenance.

1. Update the User Prompt

Next, we want the output to take on a structure and require JSON. To achieve this, we will update the user prompt:

TASK
1. You will receive a dataset with entries in the format: [Place | visitor count], where each entry is separated by an enter tab.
2. For each entry, output one object with these keys
   - "place": string (the place name)
   - "province": string (the province or territory)
   - "visitor_count": integer (the numeric visitor count)
   - "visit_category": string (either "<100,000" or ">=100,000")
   - "fact": string (a family-friendly fact about the province/territory)
   - "confidence": number (decimal between 0.1 and 1.0; 1.0 = high confidence)
RULES
- Parse numbers from text, e.g., "1,131,418" → 1131418.
- Set "visit_category" based on visitor_count and the 100,000 threshold.
- If the province/territory is uncertain, output your best guess and lower "confidence".
- Do not add keys beyond the six listed above.
- The final output MUST be a single JSON array only, with no additional commentary.
DATA
@Value(Full Dataset)

A few key improvements of this prompt are:

Clear task framing → “...entries in the format …” sets the model up
Explicit keys → “... ‘place’ ‘province’...” adds structure
Rules section → Details parsing numbers, threshold logic, uncertainty handling, and schema adherence, ensuring we cover edge cases
Avoids irrelevant information → not mentioning “Parks” avoids limiting the scope to exclude historic sites and landmarks

2. Adjust OpenAIConnector Parameters

Return to the parameters to enable the Structured Output, then open the JSON Schema editor

3. Add the JSON Schema:

Using the JSON Schema parameter adds a stricter assertion on the output structure.

{
"additionalProperties": false,
"properties": {
"parks": {
"description": "Array of park records.",
"items": {
"additionalProperties": false,
"properties": {
"confidence": {
"description": "Decimal between 0.1 and 1.0, where 1.0 = high confidence.",
"maximum": 1,
"minimum": 0.1,
"type": "number"
},
"fact": {
"description": "A family-friendly fact about the province/territory.",
"type": "string"
},
"place": {
"description": "The name of the place.",
"type": "string"
},
"province": {
"description": "The province or territory where the place is located.",
"type": "string"
},
"visit_category": {
"description": "Categorization of visit count.",
"enum": [
"<100,000",
">=100,000"
],
"type": "string"
},
"visitor_count": {
"description": "The numeric visitor count.",
"minimum": 0,
"type": "integer"
}
},
"required": [
"place",
"province",
"visitor_count",
"visit_category",
"fact",
"confidence"
],
"type": "object"
},
"minItems": 1,
"type": "array"
}
},
"required": [
"parks"
],
"type": "object"
}

4. Rerun the Workspace and Inspect the Response.

Note: Without this JSON Schema parameter, you might get extra text in the output like: ```json... or Here is your data... While string-manipulating transformers can clean those outputs, the exact response is hard to predict every time, making the process less automated.

To see the structure of the response, turn on JSON Syntax Highlight. Click the ABCXYZ button in the bottom left of the preview and select JSON.

The result shows a parsable response that is much more consistent with the format and content. There is also no additional text outside of the data we wanted to model to return, making downstream formatting much more efficient and predictable.

5. Add a JSONFragmenter

Connect the transformer after the OpenAIConnector and configure the parameters:

Source:
- JSON Attribute: Response
General:
- JSON Query: json["parks"][*]
Flattening:
- Flatten Query Result into Attributes: Yes
- Recursively Flatten Objects/Arrays: Yes
Attributes to Expose:
- “place”
- “province”
- "visitor_count"
- "visit_category"
- "fact"
- "confidence"

6. Run the JSONFragmenter and View the Output.

The JSON transformer parsed the attributes, making for downstream-friendly data for additional processing.

If you need more practice with JSON formats, especially working with more nested ones, please check out our Getting Started with JSON Articles:

Tutorial: Getting Started with JSON

Part 4: Add System Prompting Instructions

This section will show how system prompts affect model responses. By assigning the AI model a role or key rules/guides to follow, we can assert how it should understand and complete the task.

1. Open the OpenAIConnector and Configure the Instructions parameter:

You are a precise data extraction assistant.
You are assisting a planner/analyst. Please give different facts for provinces that are listed so that every place has a unique fact, regardless of whether they share the same province.

Key things about this system prompt are:

Assigning the AI model a role → “you are a” gives the model a clear goal
Giving context to the problem → “assisting an analyst” gives problem context
Rules to output → “unique facts” establishes a clear rule for how to respond

2. Run the Workspace and View the Output

You’ll notice that the facts for provinces are different when previously they may have been the same:

Tips

The JSON Schema parameter accepts specific formats, so a great tip is to use the schema here as an example, then ask an AI model to create a schema for your project.

Here is an example prompt to get a valid JSON Schema for another project:

I want a JSON Schema for a dataset of Canadian lakes.  
- The top-level schema must be an object.  
- It should have one property called "lakes", which is an array of lake objects.  
- Each lake object must include:
   - "name": string
   - "province": string
   - "area_km2": number
   - "is_great_lake": boolean


 Here is an example of what the schema should look like:
{
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "parks": {
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "object",
        "additionalProperties": false,
        "properties": {
          "place": {
            "type": "string",
            "description": "The name of the place."
          },
          "province": {
            "type": "string",
            "description": "The province or territory where the place is located."
          },
          "visitor_count": {
            "type": "integer",
            "minimum": 0,
            "description": "The numeric visitor count."
          },
          "visit_category": {
            "type": "string",
            "enum": ["<100,000", ">=100,000"],
            "description": "Categorization of visit count."
          },
          "fact": {
            "type": "string",
            "description": "A family-friendly fact about the province/territory."
          },
          "confidence": {
            "type": "number",
            "minimum": 0.1,
            "maximum": 1.0,
            "description": "Decimal between 0.1 and 1.0, where 1.0 = high confidence."
          }
        },
        "required": [
          "place",
          "province",
          "visitor_count",
          "visit_category",
          "fact",
          "confidence"
        ]
      },
      "description": "Array of park records."
    }
  },
  "required": ["parks"]
}

Troubleshooting

FME 2025.0—Build 25208 has a known issue with the Aggregator transformer: the output unexposes the result. An AttributeExposer or AttributeManager should be an easy workaround for this issue.
Be aware that some delimiters do not work well with specific models or within FME. Look out for warning signs like red text that might indicate that the character is not valid or accepted:

Additional Resources

ChatGPT Prompt Engineering for Developers: Free course on how to take a developer approach to constructing prompts

Taming the Chaos - How to Turn Unstructured Data into Decisions: Webinar showing the practical use of AI integrations and the trial and errors of the prompt engineering process

An Entire Post about Delimiters: How delimiters and tags can help provide clarity and structure to prompts

Effective Prompts for AI: The Essentials: Overview of Prompting techniques like Few-shot and Role-based prompting

Prompting 101 | Code w/ Claude: Video about prompting techniques, using system prompting, and adjusting more advanced settings, temperature.

Search