PB52 - Not Every Pipeline Needs to Use an LLM
One of the most enabling features of Foundry as a platform is the ability to use LLMs within the ecosystem. You are not calling an external API, managing credentials, or stitching together separate tools. The model is there, inside the pipeline, ready to do work. But that accessibility is exactly why it is worth slowing down. Just because the LLM is one step away does not mean it is always the right step.
First, some setup. We need more data: the previously loaded UFO sightings data contains only a short summary, not the full text of what was reported.
Head back to Kaggle, then search for and download the Enhanced UFO Sighting Dataset.
The .zip file contains three files. Drag and drop ufo_sightings_enhanced.csv into the data\raw folder in Foundry.
Add it to the Clean UFO Sightings Data pipeline that was created previously.
For the sake of the exercise, let's eliminate duplicates from ufo_sightings_enhanced. Add a Transform, then add a select and choose datetime, city, and description. Next, add a DROP DUPLICATES and choose datetime and city. Rename the transform to Drop Enhanced Duplicates.
Click ufo_sightings_raw and choose Join.
Click Drop Enhanced Duplicates (this becomes the right dataset) and then Start.
Choose datetime to match datetime
city to match city
Then click Deselect all and choose only description (this is the new data we are adding).
Connect the Join to the Drop Duplicates
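For readers who think in code, the select, drop duplicates, and join steps above can be sketched in plain Python. This is only an illustration of the logic, not what Pipeline Builder runs internally. The sample rows, the function names, the choice to keep the first row per key, and the left-join behavior are all assumptions made for the example.

```python
# Hypothetical sample rows; the real datasets have many more rows and columns.
raw = [
    {"datetime": "1999-10-01 21:00", "city": "phoenix", "shape": "light"},
    {"datetime": "2003-07-04 22:30", "city": "seattle", "shape": "disk"},
]
enhanced = [
    {"datetime": "1999-10-01 21:00", "city": "phoenix",
     "description": "Orange orb over the hills.", "extra": 1},
    {"datetime": "1999-10-01 21:00", "city": "phoenix",
     "description": "Duplicate report.", "extra": 2},
]


def drop_enhanced_duplicates(rows):
    """Select datetime, city, description; keep the first row per (datetime, city)."""
    by_key = {}
    for row in rows:
        key = (row["datetime"], row["city"])
        if key not in by_key:  # assumption: the first occurrence wins
            by_key[key] = {
                "datetime": row["datetime"],
                "city": row["city"],
                "description": row["description"],
            }
    return by_key


def left_join_description(left_rows, right_by_key):
    """Add only the description column from the right side (assumed left join)."""
    joined = []
    for row in left_rows:
        match = right_by_key.get((row["datetime"], row["city"]))
        joined.append({**row, "description": match["description"] if match else None})
    return joined


joined = left_join_description(raw, drop_enhanced_duplicates(enhanced))
for row in joined:
    print(row)
```

Dropping duplicates before the join matters: if the right side had two rows for the same datetime and city, the join would multiply the matching rows on the left.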
Extract the UFO color: let's try regex
We joined the descriptions. Now suppose we want to run a color analysis: what color did witnesses report most often? To answer that, we need color as a column. You could open the original spreadsheet, read each description, and type in the reported color, but with thousands of rows that would take a long time. We need the pipeline to do it for us. The first instinct is pattern matching: write a set of rules that scan each description for known color words. If a rule finds "orange," it returns orange. If it finds "red," it returns red. The usual tool for this is a regular expression, or regex. Let us try it first.
After the Join, add a Transform
First, we want to lowercase the description so that variations in capitalization don't break the pattern match.
Search and choose Lowercase
Then choose the description column
Next, type regex and choose Extract all regex matches.
Then choose description, Value, and paste in:
\b(red|orange|yellow|green|blue|purple|violet|white|black|silver|gold|pink|brown|gray|grey)\b
We can see that the regex successfully pulls some colors out of the description.
But if you scroll through the results, you will notice some interesting things. The regex blindly matches color words, so it cannot tell us which color actually described the UFO. And if a color is missing from our list, we get no match at all. You could spend an afternoon making the list longer and it would still miss entries. The descriptions are written in natural language by people who were not thinking about your color column when they filed their report.
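The same lowercase-then-extract logic can be reproduced outside the pipeline with Python's re module, which makes both limitations easy to see. This is a minimal sketch: the pattern is the one pasted above, while the function name and the sample sentence are made up for illustration.

```python
import re

# Same color list as the pattern used in the transform above.
COLOR_PATTERN = re.compile(
    r"\b(red|orange|yellow|green|blue|purple|violet|white|black|"
    r"silver|gold|pink|brown|gray|grey)\b"
)


def extract_colors_regex(description):
    """Lowercase the text, then return every color word the pattern finds."""
    return COLOR_PATTERN.findall(description.lower())


# Hypothetical description, not taken from the dataset.
sample = "Under GRAY skies we saw a silvery spaceship with a Red light."
colors_regex = extract_colors_regex(sample)
print(colors_regex)  # ['gray', 'red']
```

The result contains "gray", which describes the sky rather than the object, and misses "silvery" because the word boundary after "silver" is not satisfied. Neither outcome is a bug in the pattern; it is simply what a rule that has no notion of context will do.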
This is where the LLM earns its place in the pipeline.
The Use LLM node in Pipeline Builder offers a convenient method for executing large language models on your data at scale, allowing you to seamlessly incorporate LLM processing logic between data transformations with no coding required. Instead of a list of rules, you write a prompt.
There has been a lot written about prompt engineering, and we can't cover all of it here. But there is one concept worth understanding before you start writing prompts against thousands of rows of data: tokens.
Every time the LLM reads your description and generates a response, it costs tokens. Tokens are roughly the unit of text the model processes; think of them as chunks of words. The longer your description, the more tokens it consumes. Run that across thousands of rows and it adds up fast.
Let's add a Use LLM node.
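To get a feel for how quickly this adds up, here is a back-of-the-envelope estimator. It assumes roughly four characters per token, a common rule of thumb for English text and not a Foundry figure, and it assumes the instructions are resent with every row. The function names and the default output size are hypothetical, and real usage reported by the platform may differ substantially.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb; an assumption, not a measured value


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_run_tokens(instructions, descriptions, output_tokens_per_row=20):
    """Estimate total tokens when the instructions are sent once per row."""
    fixed_per_row = estimate_tokens(instructions) + output_tokens_per_row
    return sum(fixed_per_row + estimate_tokens(d) for d in descriptions)


# Hypothetical inputs: a 400-character prompt and three descriptions.
instructions = "x" * 400
descriptions = ["a" * 200, "b" * 1000, "c" * 40]
print(estimate_run_tokens(instructions, descriptions))
```

The useful part is the shape of the formula rather than the exact numbers: the fixed prompt cost is paid on every row, so a long prompt multiplied by thousands of rows can outweigh the descriptions themselves.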
Choose Empty prompt
The Use LLM node has two main sections to configure.
The first is "Describe the role the model will play and outline the task it will perform." This is your system prompt. It tells the model what it is and what you need from it before it reads a single row of data. For our color extraction, fill it in like this:
You are a data extraction assistant. Read the following UFO sighting description and extract any colors used to describe the observed object. Return only the color words as a list. If no color is mentioned, return an empty list. Do not explain your answer.
Paste it into the Instructions field. Note that to provide input data, you press the forward slash (/).
Press / and choose the description column.
We want to capture multiple colors per row, so change the output type to an array.
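Conceptually, the node combines the instructions with each row's description and turns the reply into an array. The sketch below illustrates that idea in plain Python; it is not how Use LLM is implemented. The function names are hypothetical, and the assumption that the model replies with a JSON array is ours, not something the node guarantees.

```python
import json

INSTRUCTIONS = (
    "You are a data extraction assistant. Read the following UFO sighting "
    "description and extract any colors used to describe the observed object. "
    "Return only the color words as a list. If no color is mentioned, return "
    "an empty list. Do not explain your answer."
)


def build_prompt(description):
    """Combine the fixed instructions with one row's description."""
    return f"{INSTRUCTIONS}\n\nDescription: {description}"


def parse_color_array(raw_reply):
    """Turn a reply such as '["silver", "Red"]' into a clean list.

    Anything that is not a JSON array of strings yields an empty list.
    """
    try:
        value = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(value, list):
        return []
    return [item.strip().lower() for item in value
            if isinstance(item, str) and item.strip()]


print(build_prompt("A silvery spaceship under gray skies."))
print(parse_color_array('["silver", " Red "]'))
```

Asking for an array rather than free text is what makes the output usable downstream: each row carries zero, one, or several colors in a consistent shape.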
Keep the model at GPT-5 nano. Model choice matters. AIP supports a wide range of LLMs from providers like OpenAI, Anthropic, Meta, and Google. That range exists for a reason. Extracting color words from a sentence is not a complex reasoning task. A lightweight, fast model like GPT-5 nano is the right tool. Reaching for a larger model here is like sending a freight train to deliver a postcard. It will work, but it is wildly more machinery than the job requires. Save the heavy models for tasks that actually need them.
Rename the Output column to colors_llm
Before you commit to running the LLM against thousands of rows, use the trial feature. The size of the text dictates the amount of compute the backing model uses to serve the response, so every description you send costs tokens, and longer descriptions cost more. Running a bad prompt against your entire dataset and then fixing it is an expensive way to learn. The trial lets you test against a small sample in seconds, see exactly what the model returns, and adjust your instructions before you scale. Get it right on ten rows first. Then run it on ten thousand.
At the bottom, click Trial run.
Then click Select from input table.
I will select the row we were looking at previously, the one with gray skies mixed in with a silvery spaceship.
Press Run.
We can then see that it took about 1.2K LLM tokens and 5 seconds.
By clicking on the Use LLM node, we can get a preview of 10 rows.
Test run 100 rows
Next, we want to run a test on 100 rows. To do so, we sort the rows by date and take the top 100.
Insert a transform in front of the Regex Extract.
Add a TOP ROWS, enter 100, choose datetime and Ascending, then click Apply and close.
Rename the node to Limit 100
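The Limit 100 step amounts to a sort followed by a slice. A minimal Python sketch of that logic follows, assuming datetime values are strings in the format shown; the format, the function name, and the sample rows are assumptions made for the example.

```python
from datetime import datetime

DATETIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed format for this illustration


def top_rows(rows, n, column="datetime"):
    """Sort rows by the given datetime column, ascending, and keep the first n."""
    ordered = sorted(rows, key=lambda r: datetime.strptime(r[column], DATETIME_FORMAT))
    return ordered[:n]


# Hypothetical rows, deliberately out of order.
rows = [
    {"datetime": "2003-07-04 22:30", "city": "seattle"},
    {"datetime": "1999-10-01 21:00", "city": "phoenix"},
    {"datetime": "2001-01-15 03:10", "city": "austin"},
]
limited = top_rows(rows, 2)
print([r["city"] for r in limited])  # ['phoenix', 'austin']
```

Sorting before limiting keeps the sample deterministic, so the same 100 rows are sent to the LLM every time the pipeline runs and results stay comparable between prompt revisions.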
After the Use LLM node, add an output.
Choose New dataset
Name it 100 Test
Click Deploy, and then Deploy Pipeline.
After it is deployed, right-click and choose Open (using the arrow method to open it in a new tab).
It is worth profiling the data to see how the colors the LLM extracts compare with the ones the regex found.
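If you export the 100 Test dataset, the comparison can also be done in a few lines of Python. The sketch below assumes each row carries two array columns, colors_regex and colors_llm; colors_regex is a hypothetical name for the regex output column, and the three sample rows are invented for illustration, not actual pipeline results.

```python
from collections import Counter

# Invented rows standing in for the exported dataset.
rows = [
    {"colors_regex": ["gray"], "colors_llm": ["silver"]},
    {"colors_regex": ["orange"], "colors_llm": ["orange"]},
    {"colors_regex": [], "colors_llm": ["amber"]},
]


def disagreements(rows):
    """Rows where the two methods returned different sets of colors."""
    return [r for r in rows if set(r["colors_regex"]) != set(r["colors_llm"])]


def color_counts(rows, column):
    """Count how often each color appears in the given array column."""
    return Counter(color for r in rows for color in r[column])


print(len(disagreements(rows)), "of", len(rows), "rows disagree")
print(color_counts(rows, "colors_llm").most_common())
```

The rows where the two columns disagree are the ones worth reading by hand, since they show where context changed the answer. The counts over colors_llm are what would answer the original question of which color witnesses reported most often.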
Wrapping up
Look at the two columns side by side. The regex gives you exactly what you asked for: every color word it could find, with no understanding of which one actually described the UFO. The LLM reads the sentence the way a person would and pulls out the color that matters.
That difference is the whole point of this lesson. Regex is fast, cheap, and predictable. When the pattern is simple and the rules are clear, it is the right tool. But language is messy, and the moment your data starts looking like something a human wrote, rules start breaking down. That is where an LLM earns its place.
A few things worth carrying forward:
Pick the right model for the job. GPT-5 nano handled this task in seconds. A larger model would likely have done the same work at a higher cost with no better result.
Trial before you scale. Ten rows will tell you almost everything you need to know about whether your prompt is working. Ten thousand rows will tell you the same thing, just with a much bigger bill.
Use the LLM where it actually adds value. Not every column needs one. The pipeline is strongest when regex, transforms, and LLMs each do the part of the work they are best suited for.