Next Steps with AI: Accessing Models Deployed from Hugging Face

Files

Ollama_Example.fmw
- 400 KB
- Download
LMStudio_Example.fmw
- 400 KB
- Download
city_inspections.csv
- 1 KB
- Download
AzureAIFoundry_Example.fmw
- 300 KB
- Download

Introduction

Large Language Models (LLMs) are increasingly being deployed outside of traditional, public cloud AI services. The increasing accessibility of publicly available models on platforms like Hugging Face allows organizations to choose where and how their models are hosted - whether locally, on private secure infrastructure, or within a managed cloud environment.

In this article, we will explore how LLMs deployed from various platforms can be integrated into FME. We’ll focus on three different approaches: local model deployments using LM Studio and Ollama, as well as managed cloud deployments using Azure AI Foundry. Beyond these approaches, FME also supports models deployed across other managed cloud platforms such as Amazon SageMaker, Google Cloud, and more.

Step-by-Step Instructions

1. Using an AI Model Deployed to LM Studio

2. Using an AI Model Deployed to Ollama

3. Using an AI Model Deployed to Azure AI Foundry

Using an AI Model deployed to LM Studio

In this section, you will learn how to use the LMStudioConnector to connect FME to a locally running LLM in LM Studio. LM Studio exposes locally deployed models through an API that FME can interact with using its AI connectors.

Requirements:

LM Studio is downloaded and installed
A model is downloaded and deployed using LM Studio; instructions can be found here.
Read through the LM Studio getting started guide before beginning

1. Open LM Studio and Load a Model

Open LM Studio on your machine. Make sure you’re in Developer mode by clicking on the Developer button at the bottom-left of the window.

Go to the Developer tab by clicking on the green button on the left side of the LM Studio window. Next, click the "Select a model to load" button at the top of the window to choose the model you want to use.

Once ready, click on the toggle next to the Status button. This will begin running LM Studio as a server.

LM Studio can also be run via command line interface (CLI). If you prefer this option, check out the LM Studio CLI Documentation.

2. Open FME Workbench

Open FME Workbench and create a new workspace.

3. Add and Configure a CSV Reader

In this example, we will be using a dataset of public works service requests. We will utilize AI to assess the severity of these issues and categorize them based on their descriptions.

Add a Reader to the workspace and set the parameters as follows:

Format: CSV (Comma Separated Value)
Dataset: download the CSV file from this article and provide its file path.

Click OK to add the reader to the workspace.

4. Add and Configure a LMStudioConnector

Next, add an LMStudioConnector to the workspace. When prompted, download the transformer from the FME Hub. Alternatively, download the transformer from the FME Hub and install it with your preferred version of FME.

Open the LMStudioConnector and configure the parameters as follows:

LM Studio Base URL: enter your base URL. This can be found from within LM Studio in the Developer tab.
- If running LM Studio locally, you can use http://localhost:1234
Action: Chat Completion
Model: enter the name of the model you want to use
- In this tutorial, we are using a lightweight model - mistralai/ministral-3-3b
User Prompt:

You are a dispatcher for the City of Vancouver. Categorize this service request: '@Value(description)'. Return the following information:
- department (Transportation, Parks & Rec, or Engineering)
- priority (1 for Emergency, 2 for Urgent, 3 for Routine)
- equipment (what specific tool, if any, is needed for the task?)

Click OK to accept the new parameters.

5. Run the LMStudioConnector and Inspect the Output

With data caching enabled, run the workspace. In LM Studio, you should see that it receives the API request from FME and is processing the results.

When the workspace has finished running, click the LMStudioConnector Output port to open the Data Inspector window.

In the Table view, you can see the LMStudioResponse attribute that includes the response from the LLM. The Output also includes the raw_response attribute, which contains the raw JSON response from the LLM.

6. Using a JSON Schema

To obtain a more structured response, we can provide a JSON Schema to the LLM, informing it of our expected output structure. This allows for predictable responses and is a best practice when using an AI connector in FME.

Open the LMStudioConnector parameters, enable Structured Output, and then provide the following value for the JSON Schema:

{
"name": "service_request_classification",
"schema": {
"additionalProperties": false,
"properties": {
"department": {
"enum": [
"Transportation",
"Parks & Rec",
"Engineering"
],
"type": "string"
},
"equipment": {
"type": "string"
},
"priority": {
"enum": [
1,
2,
3
],
"type": "integer"
}
},
"required": [
"department",
"priority",
"equipment"
],
"type": "object"
}
}

Click OK to accept the new parameters.

Again, with data caching enabled, run the workspace and inspect the output. This time, you will see that the LMStudioResponse attribute contains a structured JSON output.

To extract information from the response, you can use one of FME’s many JSON parsing transformers, such as the JSONExtractor, JSONFlattener, or JSONFragmenter. For more information, check out the following tutorial: Transforming JSON using the JSONExtractor, JSONFlattener, and JSONFragmenter

Using an AI Model deployed to Ollama

In this section, we will focus on integrating FME with an LLM running in Ollama using the OllamaConnector.

Ollama provides a lightweight way to expose local models through an API interface that FME can interact with. Ollama does not have a UI available. Instead, we will use the command line interface to interact with Ollama.

Requirements

Ollama is downloaded and installed
A model is downloaded and deployed with Ollama - instructions here
Check the Ollama CLI Reference before getting started

1. Start Ollama as a Server

Open a Command Prompt window on your machine. Enter the following command to start the Ollama Server.

ollama serve

You should see some log messages as Ollama initializes. You’re now ready to send requests to any of the LLMs you’ve downloaded in Ollama.

2. Open FME Workbench

Open FME Workbench and create a new workspace.

3. Add and Configure a CSV Reader

In this example, we will be using a dataset of public works service requests. We will utilize AI to assess the severity of these issues and categorize them based on their descriptions.

Add a Reader to the workspace and set the parameters as follows:

Format: CSV (Comma Separated Value)
Dataset: download the CSV file from this article and provide its file path.

Click OK to add the reader to the workspace.

4. Add and Configure an OllamaConnector

Next, add an OllamaConnector to the workspace. When prompted, download the workspace from the FME Hub. Alternatively, download the transformer from the FME Hub and install it with your preferred version of FME.

Open the OllamaConnector and configure the parameters as follows:

Ollama Base URL: enter your base URL.
- This can be found in the Command Prompt window after entering the ollama serve command.
- Note: the default value is http://localhost/11434
Action: Generate Completion
Model: enter the name of the model you want to use
- In this tutorial, we are using a lightweight model by Google called gemma3
User Prompt:

You are a dispatcher for the City of Vancouver. Categorize this service request: '@Value(description)'. Return the following information:
- department (Transportation, Parks & Rec, or Engineering)
- priority (1 for Emergency, 2 for Urgent, 3 for Routine)
- equipment (what specific tool, if any, is needed for the task?)

Click OK to accept the new parameters.

5. Run the OllamaConnector and Inspect the Output

With data caching enabled, run the workspace in FME Workbench.

In the Command Prompt window, you will see that Ollama has received the API request and is processing the results.

Once all records have been processed, the OllamaConnector will output the response in FME Workbench.

Click on the OllamaConnector Output port to open the data inspector window and view the results.

In the Table view, you can see the OllamaResponse attribute that includes the response from the LLM. The Output also includes the raw_response attribute, which contains the raw JSON response from the LLM.

6. Using a JSON Schema

To obtain a structured response, we can provide a JSON Schema to the LLM, which will inform it of the expected output format. This allows for predictable responses and is generally a best practice when using any AI connector in FME.

Open the OllamaConnector parameters, enable Structured Output, and then enter the following value for the JSON Schema parameter:

{
"additionalProperties": false,
"properties": {
"department": {
"enum": [
"Transportation",
"Parks & Rec",
"Engineering"
],
"type": "string"
},
"equipment": {
"type": "string"
},
"priority": {
"enum": [
1,
2,
3
],
"type": "integer"
}
},
"required": [
"department",
"priority",
"equipment"
],
"type": "object"
}

Click OK to accept the new parameters.

Again, with data caching enabled, run the workspace and inspect the output. This time, you will see that the OllamaResponse attribute contains a structured JSON response.

Using an AI Model deployed in Azure AI Foundry

In this last section, we will cover how to integrate FME with an AI model that is deployed and managed through Azure AI Foundry using the AzureAIFoundryConnector. Azure AI Foundry exposes models through secure, scalable endpoints suitable for enterprise environments.

Requirements:

Access to Azure AI Foundry (ai.azure.com)
Have a model deployed and ready to use with Azure AI Foundry

1. Retrieve the Model API Key from Azure AI Foundry

In Azure AI Foundry, go to the “Models + endpoints” page to view your model deployments. Click on the model you want to use. Keep this window open as we will need the API Key and the Target URI later.

In this tutorial, we will be using DeepSeek-R1-0528; however, any model that supports the Chat Completion endpoint will work.

2. Open FME Workbench

Open FME Workbench and create a new workspace.

3. Add and Configure a CSV Reader

In this example, we will be using a dataset of public works service requests. We will utilize AI to assess the severity of these issues and categorize them based on their descriptions.

Add a Reader to the workspace and set the parameters as follows:

Format: CSV (Comma Separated Value)
Dataset: download the CSV file from this article and provide its file path.

Click OK to add the reader to the workspace.

4. Add and Configure an AzureAIFoundryConnector

Next, add an AzureAIFoundryConnector to the workspace. When prompted, download the workspace from the FME Hub. Alternatively, download the transformer from the FME Hub and install it with your preferred version of FME .

Open the AzureAIFoundryConnector and configure the parameters as follows:

API Key: enter the API Key for the model that you selected in Step 1.
Base URL: enter your base URL. For example: https://yourorganization.services.ai.azure.com/
- This can be found in the Target URI of the model in Azure AI Foundry. Example below:
Action: Chat Completion
Model: enter the name of the model you want to use
- In this tutorial, we are using DeepSeek-R1-0528
User Prompt:

You are a dispatcher for the City of Vancouver. Categorize this service request: '@Value(description)'. Return the following information:
- department (Transportation, Parks & Rec, or Engineering)
- priority (1 for Emergency, 2 for Urgent, 3 for Routine)
- equipment (what specific tool, if any, is needed for the task?)

Click OK to accept the new parameters.

5. Run the Workspace and Inspect the Output

With data caching enabled, run the workspace.

When the workspace has finished running, click on the AzureAIFoundryConnector Output port to open the Data Inspector window.

In the Table view, you can see the Response attribute that includes the response from the LLM.

Conclusion

As large language models continue to be deployed beyond traditional public cloud AI services, organizations are gaining greater flexibility in how and where they host their models.

By supporting integrations with platforms such as LM Studio, Ollama, Azure AI Foundry, and others, FME enables an “Any AI” approach, allowing organizations to integrate LLMs into their workflows regardless of where those models are deployed.

Search

Next Steps with AI: Accessing Models Deployed from Hugging Face

Files

Introduction

Step-by-Step Instructions

Using an AI Model deployed to LM Studio

Requirements:

Using an AI Model deployed to Ollama

Requirements

Using an AI Model deployed in Azure AI Foundry

Requirements:

Conclusion

Was this article helpful?