Breaking Language Barriers in FME Workspaces with AI

Siobhan Ryan
It's been a busy few weeks since I started at Tensing, with visits to see Dutch and French colleagues. They have shared with me a range of impressive Workspaces which I need to adapt for some UK projects I am working on.
Everyone loves a well-documented and annotated workspace. Unfortunately, my Dutch and French language skills aren't quite up to scratch. The team in the Netherlands gave me a bunch of workspaces, but since my Dutch only stretches as far as goedemorgen and dankje, they weren't very easy to follow. That's when I thought there must be a way to use FME to translate the bookmarks and annotations in the workspaces so I could understand what each part of the process was doing. I've detailed my approach in this article. I hope it gives you some ideas!
The solution: combine the power of FME and Generative AI to translate bookmarks and annotations in a workflow.
At Tensing, since the introduction of AI Assist tools in 2023 we have been exploring the use of Generative AI to enhance our workflows and improve efficiency through AI-driven automation. Over the past year we have released a few transformers in the FME Hub that make Generative AI accessible for everyday tasks.
At a quick glance, here are the main steps in the workspace translation:
- Read in the original workspace and extract the workspace parameters;
- Create a prompt for AI to read and make a request to the API;
- Extract the JSON response;
- Find and replace the translated segments in the txt file of the original workspace.
The original workspace in Dutch.
Step 1: Reading in the workspace
It makes a lot of sense that FME is pretty good at reading in its own files. The FeatureReader proves itself very helpful when doing so. In this case, we are only exposing the bookmarks, annotations and workspace properties.
The first step is getting the data ready to make a prompt. We use the AttributeManager to remove the attributes we don't need and to create the attribute (to_translate) that holds the information to be translated from each of the feature types. Once the data is ready, we can have a look at creating a prompt for the AI.
Step 2: Creating the AI prompt
Simplicity and clarity are key when creating a prompt for the AI model to use. In this case, we wanted a response in JSON and told it exactly how we would like the output formatted.
Translate the JSON input into English if it is not already in English, return JSON only follow the output structure.
####Input { "original_value": "@Value(to_translate)" }
####Example Output when a translation is needed { "original_value": "gis_etl_probleem", "translated_value":"gis_etl_problem", }
####Example Output when the original value is already in English { "original_value": "Problem", "translated_value":"Problem", }
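As a rough illustration outside FME, the prompt above could be assembled in Python like this. The prompt text is copied from the article; the function name build_prompt and the use of json.dumps to escape the input are assumptions for this sketch, not part of the original workspace.

```python
import json

# Prompt text as given in the article; only the input block is templated.
PROMPT_TEMPLATE = (
    "Translate the JSON input into English if it is not already in English, "
    "return JSON only follow the output structure.\n"
    "####Input {input_json}\n"
    "####Example Output when a translation is needed "
    '{{ "original_value": "gis_etl_probleem", "translated_value":"gis_etl_problem", }}\n'
    "####Example Output when the original value is already in English "
    '{{ "original_value": "Problem", "translated_value":"Problem", }}'
)


def build_prompt(to_translate: str) -> str:
    """Hypothetical helper: fill the prompt with one value to translate."""
    # json.dumps escapes quotes, backslashes and newlines, so the input block
    # stays valid JSON whatever the bookmark or annotation text contains.
    input_json = json.dumps({"original_value": to_translate}, ensure_ascii=False)
    return PROMPT_TEMPLATE.format(input_json=input_json)


if __name__ == "__main__":
    print(build_prompt('Samenvoegen "Export" & AP'))
```

Escaping the value before it goes into the prompt matters most for annotations, which can contain quotes and line breaks that would otherwise break the JSON input.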
After creating the prompt, there was still a bit of work required to tidy and clean the data for the API submission and then create the upload body for the API request. There are a number of APIs available for generative AI; this time we are using the Google AI Gemini API for the translation. In the future, we may look at switching to a locally deployed large language model.
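To give an idea of what the upload body might look like, here is a minimal sketch in Python. It assumes the Gemini REST generateContent endpoint; the model name, the generationConfig settings and the helper name build_request are illustrative assumptions rather than what the workspace actually sends, so check the current Gemini API documentation before relying on them.

```python
import json

# Assumption: v1beta REST endpoint; the model is chosen by the caller.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"
)


def build_request(prompt: str, model: str = "gemini-1.5-flash") -> tuple[str, str]:
    """Hypothetical helper: return the URL and JSON body for one prompt."""
    body = {
        # One user turn containing the full prompt text.
        "contents": [{"parts": [{"text": prompt}]}],
        # Assumption: ask for a JSON response and low randomness.
        "generationConfig": {
            "responseMimeType": "application/json",
            "temperature": 0,
        },
    }
    # json.dumps takes care of escaping the prompt for the request body.
    return GEMINI_URL.format(model=model), json.dumps(body, ensure_ascii=False)


if __name__ == "__main__":
    url, body = build_request("Translate ...")
    # The API key would be sent separately, for example in an
    # x-goog-api-key request header.
    print(url)
    print(body)
```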
Step 3: Extracting the response
A great thing about this process is the JSON response. Since we are able to specify to the AI exactly what we want in the prompt, it makes extracting the response nice and easy. From there, the AttributeTrimmer and StringReplacers are used to tidy the data and we are left with our original and translated bookmarks and annotations.
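The same extraction can be sketched in Python. This assumes the response follows the Gemini candidates → content → parts → text layout and that the model may wrap its answer in a Markdown code fence or copy the trailing comma from the example output in the prompt; the function names are hypothetical.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence marker


def strip_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence (with optional 'json' tag)."""
    text = text.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)].strip()
        if text.lower().startswith("json"):
            text = text[4:].strip()
    return text


def extract_translation(response_text: str) -> dict:
    """Return the dict with original_value and translated_value."""
    response = json.loads(response_text)
    # Assumed response shape: first candidate, first text part.
    text = response["candidates"][0]["content"]["parts"][0]["text"]
    text = strip_fence(text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: drop a trailing comma before the final closing brace,
        # which the model may copy from the example output in the prompt.
        return json.loads(re.sub(r",\s*}\s*$", "}", text))
```

Trying a strict parse first and only then repairing the trailing comma keeps well-formed responses untouched.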
Step 4: Editing the workspace file
With the translation done, the next part of the workspace focuses on replacing the original text with the translated text, in this case the Dutch bookmarks with the English ones. Reading the original fmw file in as a .txt file this time allows us to find the relevant text line data we want to replace.
The FeatureJoiner performs a join between the translated data and the text file when the current and previous text line data match. Something that we noticed here was that the format of the data from the text file included extra HTML-encoded characters that were not present in the translated data. For example, an unjoined feature was #! NAME="Samenvoegen Export & AP". This is because the text file read this as #! NAME="Samenvoegen Export &amp; AP".
This is especially prevalent in how annotations are read in, and requires a bit of tinkering with the XMLFlattener or StringReplacer to get these to join successfully.
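Outside FME, the matching problem can be sketched in Python as follows. This is a simplified stand-in for the FeatureJoiner approach, not the workspace's actual logic: it assumes the text to replace sits in NAME="..." attributes on single lines and that special characters are stored as entities such as &amp;, and the function name translate_lines is hypothetical.

```python
import html
import re
from xml.sax.saxutils import escape

# Assumption: translatable text sits in NAME="..." attributes on one line.
NAME_RE = re.compile(r'\bNAME="([^"]*)"')


def translate_lines(lines: list[str], translations: dict[str, str]) -> list[str]:
    """Replace NAME values using a map of decoded original -> translated text."""
    out = []
    for line in lines:
        match = NAME_RE.search(line)
        if match:
            # Decode entities so the key matches the text sent for translation.
            key = html.unescape(match.group(1))
            if key in translations:
                # Re-encode before writing back so the file stays well-formed.
                new_value = escape(translations[key], {'"': "&quot;"})
                line = line[:match.start(1)] + new_value + line[match.end(1):]
        out.append(line)
    return out


if __name__ == "__main__":
    source = ['#! NAME="Samenvoegen Export &amp; AP"', '#! OTHER="x"']
    mapping = {"Samenvoegen Export & AP": "Merge Export & AP"}
    for translated in translate_lines(source, mapping):
        print(translated)
```

Because the lines are processed in their original order, this sketch needs no sorting step afterwards, unlike a join-based approach.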
From there it's just a matter of sorting the text file back into the correct order and writing to a new FME workspace. Ultimately, this is what we got at the end of the translation:
Translated workspace in English
Of course, a lot of care is needed when editing the mapping files of a workspace, but we think this is a pretty cool use of AI and FME that lets us understand our teams' work across different countries and languages. It's exciting to explore these different use cases of Generative AI. As these models continue to improve, I look forward to seeing new and innovative ways to integrate them into our work.