Getting Started with AI in FME: Web Searching

Dan Minney

Introduction

Web searching refers to the process of using artificial intelligence (AI) models to query the internet for recent and relevant information. The AI then reads and understands the content of web pages or articles, and summarizes, extracts, or interprets the information into a structured and human-readable form.

With FME and the OpenAIConnector transformer (or any of our other AI Transformers that support web searching), you can apply the power of large language models to automate web queries and retrieve relevant web content.

In this tutorial, you'll learn how an environmental consulting company could use web search functionality to automatically generate summaries of regulatory updates, environmental impacts, and community feedback related to ongoing solar farm projects.

The results generated by the OpenAIConnector are based on predictions from a large language model and may contain inaccuracies, misinterpretations, or omissions. Always review AI-generated outputs before relying on them for decision-making or reporting. For critical use cases, validate insights against source data or consult a subject matter expert.

Requirements

  • FME Workbench 2025.0.0.0 or later (Build 25208+)
  • Access to the OpenAI API via an API Key

Step-by-Step Instructions

In this section, you’ll build an FME workspace that reads a CSV file containing a list of current solar farm sites and sends the data to an OpenAIConnector to generate structured summaries of each site based on recent online information.

Part 1: Reading & preparing the data prior to the OpenAIConnector

Before inputting our CSV file into the OpenAIConnector to perform a web search, we will need to prepare the data and generate addresses for each site using the Geocoder transformer.

1. Open a New Workspace

Open FME Workbench and create a new blank workspace. Add a Reader to the workspace and set the parameters as follows:

  • Format: CSV (Comma Separated Value)
  • Dataset: solar_farm_sites.csv

Click OK to add the reader to the workspace.

2. Add a Geocoder Transformer 

Connect it to the CSV Reader. In the Geocoder, set the parameters as follows:

  • Geocoding Service: OpenStreetMap (Nominatim)
  • Mode: Reverse
  • Latitude: select the Latitude attribute from the drop-down
  • Longitude: select the Longitude attribute from the drop-down

Click OK to accept the parameters.
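
For readers curious what a reverse-geocoding request looks like outside FME, the following is a minimal Python sketch that builds a request URL for the public Nominatim reverse endpoint. It only constructs the URL and does not send it. The build_reverse_url helper and the sample coordinates are hypothetical, and the request the Geocoder transformer makes internally may differ.

```python
from urllib.parse import urlencode

NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"

def build_reverse_url(latitude: float, longitude: float) -> str:
    """Build a Nominatim reverse-geocoding URL for one site (hypothetical helper)."""
    params = {"lat": latitude, "lon": longitude, "format": "jsonv2"}
    return f"{NOMINATIM_REVERSE}?{urlencode(params)}"

# Sample coordinates are illustrative only, not taken from solar_farm_sites.csv.
url = build_reverse_url(49.2827, -123.1207)
print(url)
```

In the workspace itself, the Geocoder handles this step for every record and stores the result in the _address attribute.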

3. Run Workspace

With feature caching enabled, run the workspace and inspect the results of the Geocoder in the Visual Preview table view. Each record should now contain a new attribute called _address that contains the address of each site. You will use this address in the prompt to ensure that the AI service gathers information about the correct site.

Part 2: Configuring the OpenAIConnector

Configuring the OpenAIConnector is a relatively simple process. The most important component is a prompt that is clear and specific.

Tips for prompt generation:

• Be clear and specific in your prompt. The more context you provide, the better the model performs.

• Role-based prompts (e.g., “You are a document classification assistant…”) help set expectations for the model.

• Use a structured output whenever possible. This makes downstream processing much easier, especially in FME, where structured JSON can be parsed into attributes using the JSONFlattener.

• Enumerate options. If you want consistent classification (e.g., fixed categories), provide a defined list the model can choose from.

• Ask for justification. Including fields like explanation or confidence_score can help with auditing and quality assurance.
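
As an illustration of the "Enumerate options" tip, a JSON Schema property can restrict a field to a fixed list with the enum keyword. The Python sketch below builds such a fragment and checks values against it. The risk_level field and its categories are hypothetical and are not part of this tutorial's schema, and whether a given model or connector version honours enum in structured output is an assumption you should verify.

```python
import json

# Hypothetical schema fragment: the field name and categories are for illustration only.
risk_level_schema = {
    "type": "string",
    "description": "Overall risk category for the site",
    "enum": ["low", "medium", "high"],
}

def is_allowed(value: str, schema: dict) -> bool:
    """Return True when the value is one of the schema's enumerated options."""
    return value in schema["enum"]

print(json.dumps(risk_level_schema, indent=2))
print(is_allowed("medium", risk_level_schema))
print(is_allowed("severe", risk_level_schema))
```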

1. Add an OpenAIConnector

Connect it to the Geocoder Output port.

2. Configure the OpenAIConnector Parameters

Open the transformer parameters and configure them as follows:

  • API Key: <provide your OpenAI API Key>
  • Action: Web Search
  • Model: gpt-4o
  • Instructions: You are an assistant that searches the web for information about solar farm sites. You return your results using the JSON Schema provided.
  • User Prompt: Search the web for recent news articles or credible updates related to solar energy development or activity at "@Value(LocationQuery)". The specific address is: @Value(_address).
    • Limit your search to information published within the past 12 months.
    • Provide a structured summary with key points categorized under the following sections:
    • Environmental Concerns: Any reported or potential impacts on land use, ecosystems, biodiversity, or other environmental factors.
    • Community Sentiment: Public opinion, local feedback, community engagement, or any documented support or opposition.
    • Policy or Regulatory Changes: Legislative updates, zoning rules, incentive programs, or new regulations related to solar energy in the area.
    • If no relevant or credible information is found for a section, explicitly state “No recent information found” under that heading.
    • For each of these topics, please also provide a rating out of 5, where a 5 would be excellent and a 1 would be considered a poor rating.
  • Structured Output:
{
	"additionalProperties": false,
	"properties": {
		"community_sentiment": {
			"additionalProperties": false,
			"properties": {
				"content": {
					"description": "Information about local community opinions, support, or opposition",
					"type": "string"
				},
				"rating": {
					"description": "The overall rating of the community sentiment from a value of 1 to 5",
					"type": "number"
				},
				"source": {
					"description": "Name of the source publication or organization",
					"type": "string"
				},
				"url": {
					"description": "URL where the information was retrieved",
					"type": "string"
				}
			},
			"required": [
				"content",
				"source",
				"url",
				"rating"
			],
			"type": "object"
		},
		"environmental_concerns": {
			"additionalProperties": false,
			"properties": {
				"content": {
					"description": "Information about environmental impacts, wildlife effects, or ecological concerns",
					"type": "string"
				},
				"rating": {
					"description": "The overall rating of the environmental concern from a value of 1 to 5",
					"type": "number"
				},
				"source": {
					"description": "Name of the source publication or organization",
					"type": "string"
				},
				"url": {
					"description": "URL where the information was retrieved",
					"type": "string"
				}
			},
			"required": [
				"content",
				"source",
				"url",
				"rating"
			],
			"type": "object"
		},
		"facility_name": {
			"description": "Name of the solar farm facility",
			"type": "string"
		},
		"recent_policy_changes": {
			"additionalProperties": false,
			"properties": {
				"content": {
					"description": "Information about recent regulatory changes, policy updates, or legislative impacts affecting the site",
					"type": "string"
				},
				"rating": {
					"description": "The overall rating of the policy affecting the project from a value of 1 to 5",
					"type": "number"
				},
				"source": {
					"description": "Name of the source publication or organization",
					"type": "string"
				},
				"url": {
					"description": "URL where the information was retrieved",
					"type": "string"
				}
			},
			"required": [
				"content",
				"source",
				"url",
				"rating"
			],
			"type": "object"
		}
	},
	"required": [
		"facility_name",
		"environmental_concerns",
		"community_sentiment",
		"recent_policy_changes"
	],
	"type": "object"
}
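
The @Value(...) references in the User Prompt are replaced with each feature's attribute values before the prompt is sent. The Python sketch below imitates that substitution for a single record so you can see what an expanded prompt looks like. The expand_prompt helper and the sample attribute values are hypothetical, and FME's own evaluation handles more cases than this simple pattern match does.

```python
import re

def expand_prompt(template: str, attributes: dict) -> str:
    """Replace each @Value(name) with the matching attribute value (simplified imitation)."""
    def lookup(match: re.Match) -> str:
        # Unknown attributes become empty strings in this sketch.
        return str(attributes.get(match.group(1), ""))
    return re.sub(r"@Value\(([^)]+)\)", lookup, template)

template = (
    'Search the web for recent news related to solar energy development at '
    '"@Value(LocationQuery)". The specific address is: @Value(_address).'
)
# Sample values are illustrative only.
feature = {"LocationQuery": "Example Solar Farm", "_address": "123 Example Road, Exampleville"}
print(expand_prompt(template, feature))
```

Printing the expanded prompt for one or two records is a quick way to confirm that the site name and address read naturally before running the full dataset.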

Your OpenAIConnector parameters should look like the following screenshot.

 

Click OK to accept the parameters.

3. Run the Workspace

Inspect the result from the OpenAIConnector. The Response attribute in the Visual Preview window contains a JSON-formatted response with the results from the OpenAIConnector Web Search.
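
Before passing the Response downstream, it can be useful to confirm that it contains the sections the schema marks as required. The sketch below does a lightweight check in Python. The check_response helper and the sample response are hypothetical, and this is not a full JSON Schema validation.

```python
import json

REQUIRED_SECTIONS = ["environmental_concerns", "community_sentiment", "recent_policy_changes"]
REQUIRED_FIELDS = ["content", "source", "url", "rating"]

def check_response(response_text: str) -> list:
    """Return a list of problems found in the response; an empty list means it passed."""
    data = json.loads(response_text)
    problems = []
    if "facility_name" not in data:
        problems.append("missing facility_name")
    for section in REQUIRED_SECTIONS:
        body = data.get(section)
        if not isinstance(body, dict):
            problems.append(f"missing section {section}")
            continue
        for field in REQUIRED_FIELDS:
            if field not in body:
                problems.append(f"{section} is missing {field}")
        rating = body.get("rating")
        if isinstance(rating, (int, float)) and not 1 <= rating <= 5:
            problems.append(f"{section} rating {rating} is outside 1 to 5")
    return problems

# Hypothetical response with one section left out and one rating out of range.
sample = json.dumps({
    "facility_name": "Example Solar Farm",
    "environmental_concerns": {"content": "No recent information found", "source": "", "url": "", "rating": 3},
    "community_sentiment": {"content": "Mixed feedback", "source": "Example News", "url": "https://example.com", "rating": 7},
})
print(check_response(sample))
```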

Part 3: Extract Attributes from the JSON Response

Enabling Structured Outputs ensures the OpenAIConnector produces reliable, uniform results. The output is formatted as JSON, which can be easily processed using transformers such as the JSONFlattener or the JSONFragmenter to extract attributes.

1. Format the JSON Response

Add a JSONFlattener to your workspace and connect it to the OpenAIConnector Output port. Open the JSONFlattener parameters and configure them as follows:

  • JSON Document: Response

Click OK to accept the parameters.
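
The attribute names used later in this tutorial (for example, community_sentiment.rating) come from flattening nested JSON objects into dot-separated keys. The following Python sketch shows the idea on a hypothetical response fragment. It illustrates the naming pattern only and is not the JSONFlattener's implementation.

```python
import json

def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dictionaries into a single level with dot-separated keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat

# Hypothetical response fragment for illustration.
response = json.loads(
    '{"facility_name": "Example Solar Farm",'
    ' "community_sentiment": {"content": "Mixed feedback", "rating": 3}}'
)
attributes = flatten(response)
for name, value in attributes.items():
    print(name, "=", value)
```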

2. Run the Workspace

With feature caching enabled, run the workspace up to the JSONFlattener. This caches the flattened JSON response so that its attributes can be imported into the transformer you will add next.

3. Add an AttributeExposer

Connect it to the JSONFlattener. Open the AttributeExposer parameters and select the Import > From Feature Cache button. Click the Select All button to deselect all the attributes. We only want to import the following attributes:

  • community_sentiment.content
  • community_sentiment.rating
  • community_sentiment.source
  • community_sentiment.url
  • environmental_concerns.content
  • environmental_concerns.rating
  • environmental_concerns.source
  • environmental_concerns.url
  • recent_policy_changes.content
  • recent_policy_changes.rating
  • recent_policy_changes.source
  • recent_policy_changes.url

4. Run the Workspace

Inspect the AttributeExposer output. In the Visual Preview table view, you can see that the community sentiment, environmental concerns, and recent policy changes attributes have been exposed, including the source and URL used to generate each piece of content. Each category also has a rating to help you gauge whether it is a concern for the solar farm site.
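
One way to act on the ratings is to list the categories that score at or below a chosen threshold for each site. The Python sketch below assumes the 1 to 5 scale from the prompt, where 1 is poor. The low_rated_categories helper, the threshold of 2, and the sample values are hypothetical.

```python
# Flattened attributes for one hypothetical site; values are illustrative only.
site = {
    "community_sentiment.rating": 2,
    "environmental_concerns.rating": 4,
    "recent_policy_changes.rating": 1,
}

def low_rated_categories(attributes: dict, threshold: float = 2) -> list:
    """Return category names whose rating is at or below the threshold."""
    flagged = []
    for name, value in attributes.items():
        if name.endswith(".rating") and value <= threshold:
            flagged.append(name[: -len(".rating")])
    return sorted(flagged)

print(low_rated_categories(site))
```

Inside FME, the same comparison could be expressed with a Tester on each rating attribute.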

Next Steps

You’ve successfully configured the OpenAIConnector to use web search functionality and return consistent, structured output using the provided JSON schema. From here, you can expand the workflow in several ways, depending on your goals:

  • Generate Reports
    • Create HTML or PDF reports to share insights with stakeholders using transformers like HTMLReportGenerator or Document PDF Writer.
  • Add Keyword or Phrase Detection
    • Incorporate logic to flag key phrases (e.g., “permit denied” or “community opposition”) for easier filtering or prioritization using a Tester or TestFilter transformer.
  • Automate on a Schedule
    • Deploy the workflow to FME Flow and schedule it to run regularly, enabling automatic detection of new policies, updates, or news about each solar farm site.
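
The keyword detection idea above can be prototyped in a few lines of Python before building it with a Tester or TestFilter. This is a minimal sketch: the find_key_phrases helper and the sample summary are hypothetical, and the phrase list reuses the two examples given above.

```python
KEY_PHRASES = ("permit denied", "community opposition")

def find_key_phrases(text: str, phrases: tuple = KEY_PHRASES) -> list:
    """Return the phrases that occur in the text, ignoring case."""
    lowered = text.lower()
    return [phrase for phrase in phrases if phrase in lowered]

# Hypothetical summary text for illustration.
summary = "Local reports describe growing Community Opposition to the expansion."
print(find_key_phrases(summary))
```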
