Yes cross is very similar to. VLOOKUP. You can also use 'cross' within a single project - you just need to put in the current project name in the cross function arguments.
Yes - at least to a degree.
The 'cross' function returns an array of rows from the named project. It's an array because there could be more than one match. You can extract any cell from the rows returned. If we imagine that you had a table like:
Row No |
DOI |
Title |
Author |
1 |
10.1234/5 |
A guide to OpenRefine |
Stephens, O |
2 |
10.1234/6 |
A guide to DOIs |
Stephens, O |
If you were to do a cross() based on the DOI 10.1234/5, you'd get back an array like:
[ row1 ]
but if you were to do a cross() based on the Author name "Stephens, O" you'd get an array like:
[ row1 , row2 ]
So for example we can do:
cell.cross("DOI project", "DOI")[0]
<---- this the first matching row from the project
Since a row is a collection of cells in columns with values, we can then extract the value from a specific cell in the row:
cell.cross("DOI project", "DOI")[0].cells["Title"].value
<----- this is the Title from the first matching row from the project
or
forEach(cell.cross("DOI project", "Author"),r,r.cells["DOI"].value)
<---- the forEach iterates across all matching rows and gets the DOI in each case - the result is an array of DOIs
This gives a way of getting back specific data from a row:
row.cells["column name"].value
But if you want all the values from a row there's another trick which is to make use of the fact that a row contains not just all the cells in the row, but also the column names which you can then iterate through like:
forEach(row.columnNames, cn, row.cells[cn].value)
<---- an array containing all the values from the row
So in the above example you can do something like:
with(cell.cross("DOI project", "DOI")[0],r,forEach(r.columnNames,cn,r.cells[cn].value))
<---- array of all the values from the first matching row
This returns an array - so you'd have to join together to get a value to store in a cell. But once that is done, you can get a new cell containing the full data from the row from the other project - e.g.
with(cell.cross("DOI project", "DOI")[0],r,forEach(r.columnNames,cn,r.cells[cn].value)).join("|||")
This can be then split using the Edit Columns -> Split into several columns ...
And this way you can get all of the data in a single project!
The "cross" function, the use of arrays and the manipulation of rows are definitely in the set of less intuitive areas of OpenRefine but once you get familiar with them they can be pretty powerful.