My first thought is that you could use the "Records" functionality in OpenRefine. To do this you'll need to have a column that has a single identifier for the work/book so that OpenRefine can group all the related rows together into a single "Record" and then you can combine all the information that's in a single column together.
Do you already have a column with such an identifier in it?
If so you should:
- Move the column with the work identifier to be the first column in the project
- In "Rows" view mode, in this column use
Edit cells
->Blank down
- Switch to "Records" view mode - you should see the rows grouped together with a record number
- from the 'test' column use
Edit column
->Add column based on this column
. Use the GRELrow.record.cells.test.value.join("|")
(wheretest
is the column name containing the page numbers). Call the column "pageoverview"
The outcome should be something like
- Then in the
pageoverview
column useEdit cells
->Transform
and use the GRELfilter(value.split("|"),v,v!=cells.test.value).join("|")
I think this will get the outcome you want
(you can use a different character in the join()
if you want them separated in a different way, although using a space, as in your example, could be problematic if any of the labels in test already have a space in them (e.g. front cover
)