In this record view, I want to label all records in which all values in IN_lref are identical. I’m not at all familiar with record operations in OR yet, so first, I’m trying to find out how I should address all the values in this column in the record.
But I’m not sure I see why you are using row.record.cells.
Naively (and if I understand you correctly), I would split the row into two (or more) records (see Cell editing | OpenRefine) and then use a duplicates facet.
The reason for using row.record.cells is that I need to do a few operations with it.
Marking whether or not all occurrences in the record are duplicates is one.
Finding the highest value in a set of cells is another.
Facet by duplicates is a good idea, but one has to be careful. OR identifies this as a duplicate
So row.record.cells is a two-dimensional structure called RecordCells.
The first dimension is the columns of the project and the second is "An array of the cells in the given column of the record".
As was already pointed out, you first need to specify the first dimension (the column name) before you can access the second dimension (the cell array).
row.record.cells[columnName]
Note that columnName is not a placeholder but actually a variable containing the name of the current column.
To display the values of the cells you also need to add .value, because OpenRefine does not (yet) have a way of representing "(Cell)Objects" in the GUI.
row.record.cells[columnName].value
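Since .value gives you a plain array, the usual GREL array functions can be applied to it. As a rough sketch for the "highest value" case mentioned earlier in the thread: this assumes the cells in the column hold numbers (string values would be sorted alphabetically instead) and that the record has at least one non-blank cell in the column, otherwise there is no first element to return.

```grel
row.record.cells[columnName].value.sort().reverse()[0]
```

The idea is simply to sort the values in ascending order, reverse the array, and take the first element. I have not tested this against mixed-type columns, so treat it as a starting point.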
Let's wrap this knowledge into a more complex GREL expression that you can use, for example, in a Custom text facet.
This expression checks whether the current column of the record contains only one unique value, using uniques and length. We have to add a second condition, based on record.rowCount, to set apart records that consist of only one row, and combine the two conditions using and.
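The expression itself did not survive in this copy of the thread, so here is a sketch of how the description could translate into GREL. It is reconstructed from the wording above, not copied from the original post:

```grel
and(
  row.record.cells[columnName].value.uniques().length() == 1,
  row.record.rowCount > 1
)
```

The first condition is true when the record has exactly one distinct value in the current column. The second is true when the record spans more than one row, so single-row records are not reported as "all identical".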
Let's also add some sugar to the output by using the conditional if:
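A possible version with if, again as a sketch reconstructed from the description, not the original expression. The two label strings are placeholders of my own choosing:

```grel
if(
  and(
    row.record.cells[columnName].value.uniques().length() == 1,
    row.record.rowCount > 1
  ),
  "all identical",
  "not identical"
)
```

Used as a Custom text facet on the IN_lref column, this should give one facet choice per label, which can then be used to select the records in question.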
Good detective work! Would you be willing to update/correct the documentation based on your research?
The current design seems a little odd / non-regular to me since normally the columnName lookup is done implicitly.
Perhaps we could support row.record.cell (or some other notation) as equivalent to row.record.cells[columnName].
This would be a parallel to the equivalence between cell and cells[columnName]
The current syntax certainly isn't very easy to discover.
Tom
p.s. You can also use the literal column name, e.g. row.record.cells['IN_lref'].value
The current design seems a little odd / non-regular to me since normally the columnName lookup is done implicitly.
Perhaps we could support row.record.cell (or some other notation) as equivalent to row.record.cells[columnName].
This would be a parallel to the equivalence between cell and cells[columnName]
Actually, that's not a great idea, because the equivalence is actually cell and row.cells[columnName]
and in the context of a row or record, we're dealing with all the columns and there is no implicit current column.
Added it to my todo list. Winter is coming, so there will be some time to take care of it.
Not so sure because row.record.cells[columnName] basically returns a subset of the column (multiple cells), whereas row.cells[columnName] returns a single cell.
As far as I remember the code, columnName is usually available in the context for GREL expressions, but not in the context of Clojure and Jython expressions. Not sure how other language extensions build and manage their context.
I’ve written up an issue to correct the documentation. I added one extra detail, which is that this only returns non-blank cells from the record. That may be worth mentioning in the docs.
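One consequence of the non-blank behaviour: the length of the value array can be smaller than row.record.rowCount, so a record with a single filled cell and several blank ones would pass a check based on rowCount. A hedged variant that counts the returned values instead of the rows (the label strings are placeholders, and this is untested):

```grel
if(
  and(
    row.record.cells[columnName].value.uniques().length() == 1,
    row.record.cells[columnName].value.length() > 1
  ),
  "all identical",
  "not identical"
)
```

Whether blank cells should count as "different" or simply be ignored depends on the use case, so the rowCount-based check may still be the right one for some data.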