Use Sunil's LLM extension to describe images from URL

I didn’t want to revive @Michael_Markert's long thread on LLMs, which ended with @sunil_natraj's extension, so I’m starting a new topic here. @Rory mentioned last year the unfortunate Boston 311 dataset, which includes photos… :grimacing: that one might want to filter out before opening. I’m working on an intermediate OR workshop and I thought that it’d be a great general use case: have some LLM describe the content of an image from a URL and facet the rows based on the results, maybe output JSON, I don’t know. I’m not sure yet what kind of prompt would be best to separate, say, spread eagle from city bike, but for now I’m more interested in the mechanics of fetching an image from a URL and describing it using an LLM (all in OR, naturally).

Using Ollama in OR, I am not able to force any of the models to just fetch the jpg from the URL and describe it. I wasn’t able to do that directly in Ollama either. In my browser, ChatGPT gaslit me in typical GPT fashion: it assumed from the URL that I was looking for an image of Boston, grabbed whatever image of Boston it could find, and claimed it was in fact the image from the URL. It did perform decently well once I fed it the image directly.

This makes me think that perhaps a solution would be OR → column with the URL → Add column based on this column → python code to save images locally and output the file location → add column based on this column → some code to feed the file to an LLM
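For the "python code to save images locally" step of that pipeline, here's a minimal sketch in Python 3 (so it's testable outside OpenRefine; inside OR you'd be stuck with Jython/Python 2 and `urllib2`, as discussed below this isn't trivial). The `local_path_for` helper and the `images` folder name are my own inventions, not anything from the extension:

```python
import os
import urllib.request
from urllib.parse import urlparse

def local_path_for(image_url, out_dir):
    """Derive a local file path from the last segment of the URL's path."""
    name = os.path.basename(urlparse(image_url).path) or "image.jpg"
    return os.path.join(out_dir, name)

def save_image(image_url, out_dir="images"):
    """Download the image and return the path it was saved to."""
    os.makedirs(out_dir, exist_ok=True)
    path = local_path_for(image_url, out_dir)
    urllib.request.urlretrieve(image_url, path)  # needs network access
    return path
```

In an "Add column based on this column" step, the idea would be to call something like `save_image(value)` and output the returned path as the new cell value.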

I tried the first Python step (save images locally) in Python 2 and I didn't get very far. I'd have to install libraries, and if I have to teach graduate students how to get some Python 2.7 code to work on their computers just for funsies, we're looking at an advanced workshop for sure. Now I'm wondering if there would be a different way to go about it? It'd be such a cool workshop if it all worked inside of OR.

Hi Julie,

In my view, to process images from a dataset, you would need to convert them to Base64 before sending them to the model. I haven't done it in OpenRefine, but I do it regularly in another framework. Can it be done in OpenRefine? I don't know. I don't believe @sunil_natraj's extension can handle that, unless it gets an update.
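To illustrate what the Base64 conversion looks like, here's a small sketch in Python 3 (the `to_data_url` helper is my own naming; the data-URL format is what OpenAI-style multimodal APIs accept for inlined images):

```python
import base64

def to_data_url(image_bytes, mime="image/jpeg"):
    """Encode raw image bytes as a data URL for a multimodal chat API."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return "data:%s;base64,%s" % (mime, b64)

# Typically you'd read the bytes from a downloaded file:
# with open("photo.jpg", "rb") as f:
#     data_url = to_data_url(f.read())
```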


Hi there! I had the same issue and question a while back: Specify input type · Issue #22 · sunilnatraj/llm-extension · GitHub

It seems to work out of the box with some LLM services (Gemini), not with others…

Hey, I just tried it using OpenRouter. Unfortunately, I couldn't get it to work with either the OpenRouter Qwen2.5-VL free models or with the Ollama API as the URL. I assume it is related to the way images are handled (they have to be Base64-encoded, as Ollama does not currently support image URLs). Here is the Python script I used in OpenRefine:

# OpenRefine bundles Jython (Python 2.7), so this uses urllib2
# rather than Python 3's urllib.request or the requests library
import urllib2
import json

# OpenRouter's OpenAI-compatible chat completions endpoint
url = "https://openrouter.ai/api/v1/chat/completions"

headers = {
    "Authorization": "Bearer YOUR-API-KEY",
    "Content-Type": "application/json"
}

# Multimodal message: a text prompt plus the image URL taken from the
# current cell ('value' is the cell content in an OpenRefine expression)
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's in this image?"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": value
                }
            }
        ]
    }
]

payload = {
    "model": "google/gemini-2.5-flash-lite",
    "messages": messages
}

# POST the JSON payload and use the model's reply as the new cell value
data = json.dumps(payload)
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)
response_data = response.read()

return json.loads(response_data)['choices'][0]['message']['content']
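Since the problem seems to be that Ollama wants the image inlined rather than referenced by URL, here's a sketch (Python 3, untested against OpenRouter; `build_payload` is my own helper, not part of any API) of the same payload with the image sent as a Base64 data URL instead:

```python
import base64
import json

def build_payload(model, prompt, image_bytes):
    """Build an OpenAI-style chat payload with the image inlined
    as a base64 data URL instead of a remote URL."""
    data_url = "data:image/jpeg;base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
    }

# image_bytes would come from the locally saved file
payload = build_payload("google/gemini-2.5-flash-lite", "What's in this image?", b"\xff\xd8\xff")
body = json.dumps(payload)  # this is what you'd POST to the endpoint
```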

+1


@archilecteur hi, yes, OK, thanks. Not great!

@r0man-ist Hi! Welcome to the forum! I'll +1 your issue and add some of what I've written here

@Michael_Markert yeah, OK, that's what @archilecteur also suggested. I'll have a look at OpenRouter instead. I'm not overly attached to using OpenRefine with LLMs; it was just an excuse to show how to install extensions. It also reminded me that we only bundle Python 2.7, and that really isn't something I'm going to use in graduate course material. Some students started coding years after 2.7 had officially died! They don't even know it exists! Thankfully, I'm planning for next fall at the earliest, so there's plenty of time for updates to come.