How much data can OpenRefine handle?

Hi @Nancy_Sack

I can't remember where I first posted this or what it was intended to do, but as written here the expression looks for any words written entirely in uppercase letters and converts only those words to title case (i.e. so they start with a capital letter and the rest is lower case). Words written in any other way are left alone.

forEach(value.split(' '),v,if(isNonBlank(v.match(/([^a-z]*)/)[0]),toTitlecase(v.match(/([^a-z]*)/)[0]),v)).join(" ")

So it would convert:
A CASE OF FAHR'S SYNDROME -> A Case Of Fahr's Syndrome
but it would leave this unchanged:
A case of Fahr's syndrome -> A case of Fahr's syndrome (i.e. do nothing)
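
In case it helps to see the logic outside GREL, here is a rough Python sketch of what I believe the expression does. The function name `titlecase_uppercase_words` is just something I made up for illustration, and I'm assuming that GREL's `match` only succeeds when the whole word matches the pattern, so treat it as an approximation rather than an exact port.

```python
import re


def titlecase_uppercase_words(value):
    """Title-case only the words that contain no lowercase a-z letters.

    Hypothetical helper: an approximation of the GREL expression above,
    not an exact port of OpenRefine's behaviour.
    """
    converted = []
    for word in value.split(" "):
        # Assumption: like GREL's match(), the pattern has to cover the whole word
        if word and re.fullmatch(r"[^a-z]*", word):
            # First character upper case, everything after it lower case
            converted.append(word[0].upper() + word[1:].lower())
        else:
            # Any word containing a lowercase letter is left exactly as it was
            converted.append(word)
    return " ".join(converted)


print(titlecase_uppercase_words("A CASE OF FAHR'S SYNDROME"))
print(titlecase_uppercase_words("A case of Fahr's syndrome"))
```

I've avoided Python's built-in `str.title()` here because it also capitalises the letter after an apostrophe, which would turn FAHR'S into Fahr'S.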

I'm not sure if that's exactly what you are looking for from your description? The challenge of starting with a string like:

A CASE OF FAHR'S SYNDROME

is that there is no way of knowing in advance that there is a proper noun in this string (we can tell by looking at it that Fahr is a proper noun and so should start with a capital, but this is a harder task for a machine).

I think converting titles to "cataloguer case" more accurately is a hard task, because you can't rely on simple rules to know whether to capitalise certain words. In the above example there's no easy way to know that Fahr should be capitalised while case and syndrome should be lower case (and in the case of syndrome I had to look up whether it would be Fahr's Syndrome or Fahr's syndrome, so not an easy task for a human either!).
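
To make the problem concrete, here is a minimal Python sketch of a rule-based approach. It only gets Fahr right because the proper noun is supplied by hand. The function `sentence_case` and the `proper_nouns` set are both hypothetical, purely for illustration, and the possessive handling is a simplifying assumption.

```python
def sentence_case(title, proper_nouns):
    """Lower-case every word except the first one and any known proper nouns.

    Hypothetical helper: `proper_nouns` is a hand-maintained set of
    lowercase names, which is exactly the knowledge a simple rule lacks.
    """
    result = []
    for index, word in enumerate(title.split(" ")):
        lowered = word.lower()
        # Assumption: strip a trailing possessive 's before looking the word up
        base = lowered[:-2] if lowered.endswith("'s") else lowered
        if index == 0 or base in proper_nouns:
            result.append(lowered[:1].upper() + lowered[1:])
        else:
            result.append(lowered)
    return " ".join(result)


# With the proper noun supplied, the result is what a cataloguer would expect
print(sentence_case("A CASE OF FAHR'S SYNDROME", {"fahr"}))
# Without it, the rule has no way of knowing that Fahr is a name
print(sentence_case("A CASE OF FAHR'S SYNDROME", set()))
```

The second call lower-cases Fahr along with everything else, which is the gap that a simple rule can't close without a maintained list of names or some outside source of knowledge.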

I suspect there are tools that can do this work, and it's probably possible to integrate them with OpenRefine, but off the top of my head there's no easy way to do this. To give an example, @michael_markert posted this use of the OpenAI API in another thread, "Using the OpenAI API to apply natural language queries to cells/data". That's not the specific answer in this case, but I suspect it could be adapted to do the work for you here. It would require an OpenAI API account and possibly some payment for the service.