Using local ChatGPT-like LLMs in OpenRefine for data wrangling

@Sunil_Natraj In one line: it's rocking 🙂

Thanks a lot for the 'History' tab.

One more option would be helpful, I think: a throttle delay (in milliseconds), since those of us using free-tier APIs (like me 🙁) often hit rate limits. Such an option would improve our situation.
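For illustration, this is roughly what I mean, as a client-side sketch (`call_llm` and `THROTTLE_MS` are hypothetical placeholders, not part of the extension):

```python
import time

THROTTLE_MS = 1500  # hypothetical value; tune to your provider's free-tier limit

def call_llm(prompt):
    # Hypothetical stand-in for the real API request.
    return "echo: " + prompt

def run_throttled(prompts):
    results = []
    for prompt in prompts:
        results.append(call_llm(prompt))
        # Sleep between requests so the free-tier rate limit is not exceeded.
        time.sleep(THROTTLE_MS / 1000.0)
    return results

print(run_throttled(["first prompt", "second prompt"]))
```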

-partha

OpenRefine 3.9 worked, but no LLM provider list is available. How can I configure LLM providers?
Congratulations on the great work.
B.

Thank you @psm. Glad this makes the extension more useful. Your suggestion to support a delay between requests makes sense; can you add this request here?

Hi, there are no pre-defined LLM providers; you will need to define them. Please refer to the documentation.
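The exact provider fields are described there. Independently of the extension, you can sanity-check that a local Ollama endpoint is reachable using its standard REST API; a minimal sketch:

```python
import json
from urllib.request import urlopen

# GET /api/tags is part of Ollama's standard REST API; it lists installed models.
with urlopen("http://localhost:11434/api/tags") as resp:
    data = json.load(resp)

for model in data.get("models", []):
    print(model["name"])  # exact model names to use in a provider definition
```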

I am almost there, but what am I doing wrong?

@belfra Could you please save the configuration first and then test Ollama?

It should then return a response like this:

-partha

It must be something in my LLM URL: http://localhost:11434. Something is missing...
Great, great work, congrats to you all.
B.

@belfra You may try this (what I'm using for Ollama):

[In place of the model llama3.1:8b, use your own model. The `ollama list` command will show the names of the models you have installed; use the exact name under the NAME column.]
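For reference, a quick way to confirm the endpoint/model pair works before wiring it into the extension, as a minimal sketch against Ollama's standard `/api/generate` endpoint (adjust the model name and prompt):

```python
import json
from urllib.request import Request, urlopen

payload = {
    "model": "llama3.1:8b",  # replace with the exact name from `ollama list`
    "prompt": "Reply with the single word: ready",
    "stream": False,         # ask for one JSON object instead of a token stream
}
req = Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    print(json.load(resp)["response"])
```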

Thank you so much, you are the MAN!
B.

AI Extension V0.1.2 is now available for download. Link

Thank you for all your inputs and support in this release.

@Sunil_Natraj @psm it may be good to update the AI - LLM Provider Definition Guide page with ready-to-use templates for the different LLM providers.

Hi @Martin, I did ask @psm if he can help with this one.

@Sunil_Natraj and @Martin, good idea! If I provide the content as a word-processor document (covering whatever services I've tried so far), will that suffice?

Yes, that would be helpful. I can add it to the help guide.

@Sunil_Natraj and @Martin

Is this template okay to follow for the other LLM providers?

Please let me know.

NB: The upload does not allow .odt/.docx (PDF format only)
llm-services.pdf (65.3 KB)

Hi @psm, this looks good. PDF format is fine; I will be able to create the Markdown version from it.

@Sunil_Natraj

Please find the first version (to be updated gradually; let's start with this).

Regards

Thanks @psm. I have published guides for Ollama, Groq, and OpenRouter, and included links in the main help page.

As someone quite familiar with OpenRefine and conventional data wrangling, but not with Python, I also want to thank you for this very exciting lesson, which finally got me to take a closer look at Python. With deepseek-r1-distill-llama-8b I've achieved good results for my purposes in the first tests (I tried different LLMs, but my challenges were not model-related in the end).

Learnings:

  • Update your Mac to Sequoia …
  • If the prompt in the script contains an umlaut (even in an English-language prompt), a UTF-8 decoding error is thrown (only in OpenRefine/Jython). That took me a few hours; I just wanted to help out with Austrian month names... (see the sketch after this list)
  • Removing the reasoning process from the output seems easier (for me) in OpenRefine via GREL than doing it in Python
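On the umlaut issue: OpenRefine embeds Jython 2.7, where plain `"..."` literals are byte strings that get implicitly decoded as ASCII when mixed with unicode, so an umlaut in the prompt raises a UnicodeDecodeError. A minimal sketch of the workaround (unicode literals), written as an OpenRefine Jython cell expression:

```python
# OpenRefine Jython expression (Jython 2.7, i.e. Python 2 semantics).
# Plain "..." literals are byte strings; mixing them with unicode forces an
# implicit ASCII decode, which fails on umlauts. u"..." literals avoid this.
months = u"Jänner, Feber, März"  # Austrian month names as a unicode literal
prompt = u"Normalize these month names: " + months

# OpenRefine wraps the expression in a function, so a top-level return is valid.
return prompt
```

And assuming the reasoning you remove is deepseek-r1's `<think>…</think>` blocks, a GREL one-liner like `value.replace(/<think>[\s\S]*?<\/think>/, '').trim()` is indeed hard to beat in Python.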

Learned a lot, thanks!

Hi @colognella, thanks for the feedback! Have you also tried the great LLM Extension by @Sunil_Natraj? It makes life much easier when dealing with local (and also remote) models.