Help with Python script to run the apply-operations API

I am attempting to call the apply-operations API from a Python script and am getting this error:
java.lang.IllegalArgumentException: argument "content" is null

I have attached the script below. I know I am successfully getting the CSRF token, and I know the project ID is correct, because when I vary those things, I get different errors. I think the problem is probably with how the command is encoded: maybe JSON decoding is failing on the server, so the content appears to be null?

I can also repro this with a curl command. This is close to what I've done (sorry about the basic auth!):

% curl -X GET -u xxxx:yyyy https://philobiblon.cog.berkeley.edu/openrefine/command/core/get-csrf-token
{"token":"CFowFphGjqd1qBxXLexsMQKHYyT3CDLM"}%
% curl -u xxxx:yyyy -X POST -H "Content-Type: application/json" \
    -d "$(echo '{"operations": [{"op": "core/column-addition", "description": "Add column test", "engineConfig": {"facets": [], "mode": "row-based"}, "newColumnName": "new_column", "columnInsertIndex": 0, "baseColumnName": "TEXT_MANID_QNUMBER", "expression": "value", "onError": "keep-original"}]}')" \
    'https://philobiblon.cog.berkeley.edu/openrefine/command/core/apply-operations?csrf_token=CFowFphGjqd1qBxXLexsMQKHYyT3CDLM&project=1724962668661'

Has anyone done something like this? I would imagine that something like this is how OR backend unit tests are written, no?

Thanks in advance. The script is below.

Max

import requests
from dotenv import load_dotenv
import os
import json

load_dotenv('open-refine.env')

username = os.getenv("username")
password = os.getenv("password")

print(f'{username = } {password = }')

auth = (username, password)

or_server = 'https://philobiblon.cog.berkeley.edu/openrefine'
csrf_response = requests.get(or_server + '/command/core/get-csrf-token', auth=auth)
csrf_token = csrf_response.json().get('token')

print(f'{csrf_token = }')

# Ensure csrf_token is not None
if not csrf_token:
    raise ValueError("CSRF token could not be retrieved.")

operations_array = [
    {
        "op": "core/column-addition",
        "description": "Add column test",
        "engineConfig": {
            "facets": [],
            "mode": "row-based"
        },
        "newColumnName": "new_column",
        "columnInsertIndex": 0,
        "baseColumnName": "TEXT_MANID_QNUMBER",
        "expression": "value",
        "onError": "keep-original"
    }
]

# Add CSRF token to parameters
params = {'csrf_token': csrf_token, 'project': '1724962668661'}

# Use the token in your POST request
headers = {'Content-Type': 'application/json'}
data = {
    'operations': operations_array
}

response = requests.post(or_server + '/command/core/apply-operations', headers=headers, params=params, data=json.dumps(data), auth=auth)

print("Request details:")
print(f"Method: {response.request.method}")
print(f"URL: {response.request.url}")
print(f"Headers: {response.request.headers}")
print(f"Body (if present): {response.request.body}")  # Might be empty or binary

# Check for response and print output
if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}")
    print(response.text)

Hi Max,

Yes, there are several implementations of the OpenRefine API listed on the "OpenRefine API | OpenRefine" page.

But as far as I know, they do not support authentication.
So let's try to fix your script.

In your operations_array:

You have to specify what type of expression this is. In your example this would be grel:value.
But that's not the main problem.

OpenRefine expects form-encoded data, not JSON. So you have to remove the JSON Content-Type header and convert operations_array to a JSON string yourself before putting it into the payload as the value of the operations field.
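
Here is a minimal sketch of both changes, in case it helps. The helper names build_form_payload and apply_operations are made up for illustration, and the session argument is assumed to be a requests.Session (or anything with a requests-style post method). I have not run this against your server.

```python
import json


def build_form_payload(operations):
    """Return the form fields: the operations list serialized to one JSON string."""
    return {"operations": json.dumps(operations)}


def apply_operations(session, or_server, csrf_token, project_id, operations):
    """POST the operations as form-encoded data (hypothetical helper).

    `session` is assumed to behave like requests.Session.
    """
    return session.post(
        or_server + "/command/core/apply-operations",
        params={"csrf_token": csrf_token, "project": project_id},
        # Passing a dict as data= makes requests form-encode the body,
        # so no JSON Content-Type header is sent.
        data=build_form_payload(operations),
    )


operations_array = [
    {
        "op": "core/column-addition",
        "description": "Add column test",
        "engineConfig": {"facets": [], "mode": "row-based"},
        "newColumnName": "new_column",
        "columnInsertIndex": 0,
        "baseColumnName": "TEXT_MANID_QNUMBER",
        "expression": "grel:value",  # expression language made explicit
        "onError": "keep-original",
    }
]

payload = build_form_payload(operations_array)
```

In your script you would create session = requests.Session(), set session.auth = (username, password), and then call apply_operations(session, or_server, csrf_token, '1724962668661', operations_array). The important difference from the original is that only operations_array is passed through json.dumps, while the outer dict is left for requests to form-encode.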

Hope this helps.

Fixed! Thanks very much!