Adding API column with special header

Hi everyone,
I need to send the header "Accept-Language: en" when adding a column by fetching URLs. Normally my GREL expression looks like this: "https://api-zbiory.mnk.pl/api/object/"+value, but to get the data in English I must add this header. I know that I can't set it directly in the GREL expression when creating a new column from an API, but maybe there is another (ideally simple) way?
Someone from support told me that I need to add this header, and it works in Postman.

@Ada the headers supported by OpenRefine when making this type of request are limited. When you use "Add column by fetching URLs", the expression window has a section labelled "HTTP headers to be used when fetching URLs"; click "Show" next to it to see the available headers. At the moment these are:

  • Authorization
  • User-Agent
  • Accept

But unfortunately not "Accept-Language"

Adding more options here isn't difficult but would be an enhancement request for a future release. You can create a request on the OpenRefine GitHub repository.

However, in the meantime you can use Python (Jython) instead of GREL to create a request with any headers you want. The Programming Historian tutorial "Fetching and Parsing Data from the Web with OpenRefine" has a general intro to using Python (or Jython), and there are some notes in "OpenRefine POST request with Jython" which mention adding headers to the request but don't go into detail.

I expect someone will be able to give some detail on using headers in a Python request; otherwise I can work it out and post something. I just don't have it at the top of my mind.

You could try a Jython script like this:

import urllib, urllib2, time
time.sleep(1)
url = "https://api-zbiory.mnk.pl/api/object/" + value
request = urllib2.Request(url, headers={'Accept-Language':'en'})
return urllib2.urlopen(request).read()

It seems to work on my side, as far as I can tell.

Unfortunately, this script doesn't work for me... Should I add this library urllib2 somewhere? Should I install something else?

Hi,
Try the same thing, but instead of "Add column by fetching URLs", use "Add column based on this column".

that's an excellent point @h_piedcoq, I had missed that!

I am afraid that @Ada might encounter the same error when using "add column based on this column", since it seems to be a problem in the evaluation of the Jython expression.

This seems to be a known problem in Jython 2.7.3.

We could consider going back to Jython 2.7.2, it could perhaps fix this problem.

@Ada it would be useful to know which version of OpenRefine you are using (since that will let us determine which version of Jython it includes).

Thank you @h_piedcoq and @antonin_d for responding to me.
Adding a column "based on this column" doesn't work properly either... I have OpenRefine version 3.7.7, so I assume it includes Jython 2.7.3. How can I change this to Jython 2.7.2 in OpenRefine?

I think with the values you show in your example you can use a slight variation on what @antonin_d suggested to avoid the issue:

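import urllib, urllib2, time
# pause for one second before each request
time.sleep(1)
# value.encode() turns the unicode cell value into a byte string
url = "https://api-zbiory.mnk.pl/api/object/" + value.encode()
# attach the Accept-Language header to the request
request = urllib2.Request(url, headers={'Accept-Language':'en'})
return urllib2.urlopen(request).read()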

That worked for me with OpenRefine 3.7.7
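
If it helps to check the header logic outside OpenRefine first, here is a minimal sketch of the same request wrapped in functions. The names build_request and fetch are hypothetical, the import fallback is only there so the sketch also runs on Python 3, and str() stands in for value.encode() on the assumption that the identifiers are plain ASCII. I have not tested this variant inside OpenRefine itself.

```python
import time

try:
    import urllib2 as request_lib  # Jython / Python 2.7, as used by OpenRefine
except ImportError:
    import urllib.request as request_lib  # Python 3, for testing outside OpenRefine

BASE_URL = "https://api-zbiory.mnk.pl/api/object/"


def build_request(object_id, language="en"):
    """Build a request for one object with an Accept-Language header.

    Assumes object_id is an ASCII identifier (hypothetical helper).
    """
    url = BASE_URL + str(object_id)
    return request_lib.Request(url, headers={"Accept-Language": language})


def fetch(object_id, language="en", delay=1.0):
    """Wait `delay` seconds, then fetch the object.

    Returns the response body, or an error string instead of raising,
    so a single failing row stays visible in the new column.
    """
    time.sleep(delay)
    try:
        return request_lib.urlopen(build_request(object_id, language)).read()
    except Exception as error:
        return "ERROR: " + str(error)
```

In an OpenRefine Jython expression, the definitions would presumably be followed by a final line such as `return fetch(value)`, used with "Add column based on this column".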

I tried this code and everything looked great in the preview, but when I clicked "OK" it returned an empty column...

just to check - was that using “add column based on this column” or “add column by fetching urls”?

I was using "add column by fetching urls" but with “add column based on this column” it works!!! Thank you <3
