To check I have understood, in the example the number of matches between TF and Gold is 4 because “Automatic control”, “System theory”, “Vibration” and “Control theory” appear in both the TF and Gold columns?
Thanks for the clarification
In this case, to find (for example) the number of matches between the TF and Gold columns, you can make a custom text facet with the GREL: filter(cells["TF"].value.split("; "),v,cells["Gold"].value.split("; ").inArray(v)).length()
And of course substitute relevant column names to do other comparisons
This splits the cell in the first column mentioned into an array, and then for each value in that array checks whether it appears in the array created by splitting the cell in the second column mentioned. filter() keeps only the values that pass that check, and length() counts them.
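If it helps to see the same logic outside GREL, here is a rough Python sketch of what the expression does. The function name and the sample cell values ("Robotics" and "Damping" in particular) are made up for illustration; only the four matching terms come from the example discussed above.

```python
def count_matches(first_cell, second_cell, sep="; "):
    """Count values from first_cell that also appear in second_cell."""
    # Equivalent of cells["TF"].value.split("; ")
    first_values = first_cell.split(sep)
    # Equivalent of cells["Gold"].value.split("; ")
    second_values = second_cell.split(sep)
    # Equivalent of filter(..., v, second.inArray(v)).length()
    return len([v for v in first_values if v in second_values])


# Hypothetical cell contents, not taken from the real dataset
tf = "Automatic control; System theory; Vibration; Control theory; Robotics"
gold = "Control theory; Vibration; System theory; Automatic control; Damping"

print(count_matches(tf, gold))  # 4
```

Note that every value in the first cell is checked separately, so a value repeated in the first cell is counted once per repetition.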
There are other approaches, but this was the first one that occurred to me based on the data you shared
It’s working as per our expectations. The inArray function is a new entity for us. Thanks for the nice explanation, as usual.
However, in some rows of the Gold column, a value is repeated more than once (the column is handcrafted, so a few errors are present). As a result, the expression produces the wrong number of matches, because it counts the same match more than once rather than counting only unique matches. I’m sorry that I hadn’t noticed this weakness of the dataset when I first reported it. The table below is an example.
Is it possible to filter for unique values in the Gold column first, then match?
Yes, that’s no problem. You can use .uniques() to remove any duplicates from the array. You could either do this on each column before you do the comparison: value.split("; ").uniques().join("; ")
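As a rough illustration of why removing duplicates first changes the count, here is a small Python sketch. The helper names and the sample cells are hypothetical, and the order-preserving behaviour of the uniques helper is a choice made for this sketch rather than a statement about how GREL orders its result.

```python
def uniques(values):
    """Drop repeated values, keeping the first occurrence of each."""
    seen = set()
    result = []
    for v in values:
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result


def count_matches(first_cell, second_cell, sep="; ", dedupe=False):
    """Count values from first_cell that also appear in second_cell."""
    first_values = first_cell.split(sep)
    second_values = second_cell.split(sep)
    if dedupe:
        # Same idea as calling .uniques() on each array before comparing
        first_values = uniques(first_values)
        second_values = uniques(second_values)
    return len([v for v in first_values if v in second_values])


# Hypothetical cells: "Vibration" was entered twice by mistake in Gold
gold = "Vibration; Vibration; Control theory"
tf = "Vibration; Control theory; System theory"

print(count_matches(gold, tf))               # 3, the repeated value is counted twice
print(count_matches(gold, tf, dedupe=True))  # 2, each match is counted once
```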