Using local ChatGPT-like LLMs in OpenRefine for data wrangling

@Sunil_Natraj In one line: it's rocking 🙂

Thanks a lot for the 'History' tab.

One more option would be helpful, I think: a throttle delay (in milliseconds), since those of us using free-tier APIs (like me 🙁) often hit rate limits. Such an option would improve our situation.
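For illustration, this is roughly what I mean, as a client-side sketch (`call_llm` and `THROTTLE_MS` are hypothetical placeholders, not part of the extension):

```python
import time

THROTTLE_MS = 1500  # hypothetical value; tune to your provider's free-tier limit

def call_llm(prompt):
    # Hypothetical stand-in for the real API request.
    return "echo: " + prompt

def run_throttled(prompts):
    results = []
    for prompt in prompts:
        results.append(call_llm(prompt))
        # Sleep between requests so the free-tier rate limit is not exceeded.
        time.sleep(THROTTLE_MS / 1000.0)
    return results

print(run_throttled(["first prompt", "second prompt"]))
```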

-partha

OpenRefine 3.9 worked, but no LLM provider list is available. How can I configure LLM providers?
Congratulations on the great work.
B.

Thank you @psm. Glad this makes the extension more useful. Your suggestion to support a delay between requests makes sense; can you add this request here?

Hi, there are no pre-defined LLM providers; you will need to define them. Please refer to the documentation.
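The exact provider fields are described there. Independently of the extension, you can sanity-check that a local Ollama endpoint is reachable using its standard REST API; a minimal sketch:

```python
import json
from urllib.request import urlopen

# GET /api/tags is part of Ollama's standard REST API; it lists installed models.
with urlopen("http://localhost:11434/api/tags") as resp:
    data = json.load(resp)

for model in data.get("models", []):
    print(model["name"])  # exact model names to use in a provider definition
```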

I am almost there, but what am I doing wrong?

@belfra Could you please save the configuration first and then test Ollama?

It should then return a response like this:

-partha

It must be something in my LLM URL: http://localhost:11434. Something is missing...
Great, great work, congrats to you all.
B.

@belfra You may try this (what I'm using for Ollama):

[In place of the model llama3.1:8b, use your own model. The `ollama list` command will show the names of the models you have installed; use the exact name under the NAME column.]
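For reference, a quick way to confirm the endpoint/model pair works before wiring it into the extension, as a minimal sketch against Ollama's standard `/api/generate` endpoint (adjust the model name and prompt):

```python
import json
from urllib.request import Request, urlopen

payload = {
    "model": "llama3.1:8b",  # replace with the exact name from `ollama list`
    "prompt": "Reply with the single word: ready",
    "stream": False,         # ask for one JSON object instead of a token stream
}
req = Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    print(json.load(resp)["response"])
```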

Thank you so much, you are the MAN!
B.

AI Extension V0.1.2 is now available for download. Link

Thank you for all your inputs and support in this release.

@Sunil_Natraj @psm it may be good to update the AI - LLM Provider Definition Guide page with ready-to-use templates for the different LLM providers.

Hi @Martin, I did ask @psm if he can help with this one.

@Sunil_Natraj and @Martin, good idea! If I provide the content as a word-processor document (covering whatever services I've tried so far), will that suffice?

Yes, that would be helpful. I can add it to the help guide.

@Sunil_Natraj and @Martin

Is this template okay to follow for the other LLM providers?

Please let me know.

NB: The upload does not allow .odt/.docx (PDF format only)
llm-services.pdf (65.3 KB)

Hi @psm, this looks good. PDF format is fine; I will be able to create the Markdown version from it.

@Sunil_Natraj

Please find the first version (to be updated gradually; let's start with this).

Regards

Thanks @psm. I have published guides for Ollama, Groq, and OpenRouter, and included links in the main help page.

As someone quite familiar with OpenRefine and conventional data wrangling, but not with Python, I also want to thank you for this very exciting lesson, which finally got me to take a closer look at Python. With deepseek-r1-distill-llama-8b I've achieved good results for my purposes in the first tests (I tried different LLMs, but my challenges were not model-related in the end).

Learnings:

  • Update your Mac to Sequoia …
  • If the prompt in the script contains an umlaut (even in an English-language prompt), a UTF-8 decoding error is thrown (only in OpenRefine/Jython). That took me a few hours; I just wanted to help out with Austrian month names... (see the sketch after this list)
  • Removing the reasoning process from the output seems easier (for me) in OpenRefine via GREL than doing it in Python
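On the umlaut issue: OpenRefine embeds Jython 2.7, where plain `"..."` literals are byte strings that get implicitly decoded as ASCII when mixed with unicode, so an umlaut in the prompt raises a UnicodeDecodeError. A minimal sketch of the workaround (unicode literals), written as an OpenRefine Jython cell expression:

```python
# OpenRefine Jython expression (Jython 2.7, i.e. Python 2 semantics).
# Plain "..." literals are byte strings; mixing them with unicode forces an
# implicit ASCII decode, which fails on umlauts. u"..." literals avoid this.
months = u"Jänner, Feber, März"  # Austrian month names as a unicode literal
prompt = u"Normalize these month names: " + months

# OpenRefine wraps the expression in a function, so a top-level return is valid.
return prompt
```

And assuming the reasoning you remove is deepseek-r1's `<think>…</think>` blocks, a GREL one-liner like `value.replace(/<think>[\s\S]*?<\/think>/, '').trim()` is indeed hard to beat in Python.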

Learned a lot, thanks!

Hi @colognella, thanks for the feedback! Have you also tried the great LLM Extension by @Sunil_Natraj? It makes life much easier when dealing with local (and also remote) models.