Transpose values from all rows of a certain column into one cell for a new column

My first thought is that you could use the "Records" functionality in OpenRefine. To do this you'll need to have a column that has a single identifier for the work/book so that OpenRefine can group all the related rows together into a single "Record" and then you can combine all the information that's in a single column together.

Do you already have a column with such an identifier in it?

If so you should:

  • Move the column with the work identifier to be the first column in the project
  • In "Rows" view mode, in this column use Edit cells -> Blank down
  • Switch to "Records" view mode - you should see the rows grouped together with a record number

  • from the 'test' column use Edit column -> Add column based on this column. Use the GREL row.record.cells.test.value.join("|") (where test is the column name containing the page numbers). Call the column "pageoverview"

The outcome should be something like

  • Then in the pageoverview column use Edit cells -> Transform and use the GREL filter(value.split("|"),v,v!=cells.test.value).join("|")

I think this will get the outcome you want

(you can use a different character in the join() if you want them separated in a different way, although using a space, as in your example, could be problematic if any of the labels in test already have a space in them (e.g. front cover)

1 Like