Command help with flipping names

On November 3, I posted a query to the old list about how to flip names from First Last to Last, First. Owen Stephens generously replied, but unfortunately I missed his reply and only noticed it now. He explained that I cannot use the command value.split(',')[1] + ' ' + value.split(',')[0] if the original value doesn’t have a comma. How would I go about doing the opposite: how can I change a name from David Roth to Roth, David? I would be glad to share some of my dataset, but I don’t see a way to do that on the new forum. Thank you very much.

Hi David

I’ve just made a change to the forum configuration, and you should now see a file upload option in the bar at the top of the post editor. Previously it only showed the option to add an image; now it should be possible to add csv, tsv and a range of other file types.

In terms of flipping the names… names are always tricky because they can vary so much in their structure. It’s going to be impossible to have any rule that works in all cases. For example, a pattern like “Owen Thomas Stephens” could be “Thomas Stephens, Owen” or “Stephens, Owen Thomas”, and once we start to consider how names work in a variety of cultures and languages it gets very complicated.

That said, if you have a very restricted set of patterns in your names, there might be ways of doing this for your particular data. Hopefully, now that I’ve changed those settings on the forum, you can share some examples.

Owen

Thank you, Owen! I still cannot upload an Excel file. Can the settings be changed again to accommodate that? If not, is there a format that you recommend? Thanks.

I think xls and xlsx should now be allowed.

I am now receiving a message “Sorry, new users can not upload attachments.”

Adding hebrew names from wikidata.xlsx (8.0 MB)
I am trying to add the Hebrew names from Wikidata in column 3 of the attached file. I know that the flip will not be 100% accurate and I will look things over, but I would like to do an automatic run to convert “Owen Thomas Stephens” to “Stephens, Owen Thomas”, which should cover most of the cases.

Hi David,

I’d suggest trying either the partition or the rpartition function in GREL.

These functions split a string into parts around a defined separator, looking for either the first occurrence of that separator from the left (partition) or the first occurrence from the right (rpartition). The output of both functions is an array (list), and you can then pick the parts you want from that array and join them together. Passing true as the second argument leaves the separator itself out of the array, so you get just the two parts on either side of it.

So for example with the value “Owen Thomas Stephens” you could do:
value.rpartition(" ", true)
to get an array containing two values:
“Owen Thomas”
and
“Stephens”
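
For anyone who wants to check the difference between the two functions outside OpenRefine, here is a rough model in plain Python. Python’s own str.partition and str.rpartition always return three parts (before, separator, after), so the helper names partition_two and rpartition_two are hypothetical wrappers that drop the separator, on the assumption that this mirrors what the true argument does in the GREL call above. It is only a sketch of the idea, not OpenRefine’s implementation.

```python
def partition_two(text, sep=" "):
    """Split on the first (left-most) separator; return [before, after]."""
    before, _found, after = text.partition(sep)
    return [before, after]


def rpartition_two(text, sep=" "):
    """Split on the last (right-most) separator; return [before, after]."""
    before, _found, after = text.rpartition(sep)
    return [before, after]


name = "Owen Thomas Stephens"
print(partition_two(name))   # ['Owen', 'Thomas Stephens']
print(rpartition_two(name))  # ['Owen Thomas', 'Stephens']
```

With three-word names, only the right-most split keeps the middle name together with the first name, which is why rpartition is the one used here.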

You cannot store an array directly in a cell in OpenRefine - you have to turn it back into a string. In this case we also want to swap the order of the array. So you can do something like:
value.rpartition(" ",true).reverse().join(", ")

This will split the string on the right-most space, reverse the array created, and join the values in the array back together with a comma and space.
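
The same split, reverse and join steps can be sketched in plain Python to try the logic on a few sample values before running it on a whole column. The function name flip_name is hypothetical, and the check for values without a space (single-word names) and the trimming of stray spaces are my own additions rather than part of the GREL expression above.

```python
def flip_name(value):
    """Turn 'First [Middle] Last' into 'Last, First [Middle]'."""
    value = value.strip()
    if " " not in value:
        # Nothing to flip: leave single-word values unchanged.
        return value
    # Split on the right-most space, then put the last word first.
    before, _sep, last = value.rpartition(" ")
    return ", ".join([last, before.strip()])


for sample in ["Owen Thomas Stephens", "David Roth", "Madonna"]:
    print(flip_name(sample))
# Stephens, Owen Thomas
# Roth, David
# Madonna
```

Names where the family name is more than one word would still come out wrong with this rule, so a manual check of the results is still needed.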

Looking at the data, you may need to experiment a little to see whether you need partition or rpartition, due to the right-to-left nature of Hebrew text, but I hope this is enough to get you to what you are looking for. Don’t hesitate to ask if not.
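
On the right-to-left question, one thing that may help: Unicode text is normally stored in logical (reading) order, and only the display runs right to left. If that holds for this data, the last space in the stored string is still the one before the last word as read. Here is a small Python check, using escaped code points so that the editor’s display direction does not get in the way. The two Hebrew words are chosen only for illustration and are not taken from the attached file.

```python
# Two Hebrew words written with escapes so the stored order is explicit.
first_word = "\u05d3\u05d5\u05d3"    # the word read first
second_word = "\u05e8\u05d5\u05d8"   # the word read second
name = first_word + " " + second_word

# Split on the last space in the stored (logical) order.
before, _sep, last = name.rpartition(" ")
flipped = last + ", " + before

print(before == first_word, last == second_word)   # True True
print(flipped == second_word + ", " + first_word)  # True
```

If the cells contain extra direction marks or mixed-direction text, the result may differ, so it is still worth testing on a handful of rows first.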

Owen