Creating a conditional (if-then) regex

Hello! I’m new to OpenRefine, and to regular expressions. I wonder if anyone can give me guidance on creating a regex that will accomplish the following:

If column DEED-OF-GIFT contains any text, enter “donation” in column accession_acquisition_type
If column RECORD-TRF-FORM contains any text, enter “transfer” in column accession_acquisition_type


Do you absolutely want Regex or GREL?

This could be solved instead with Facets.

  1. Apply a customized facet → Facet by blank on your column DEED-OF-GIFT.
  2. Then select “false” because you want to filter on rows with no blank values, in other words having “any text”.
  3. Then on column accession_acquisition_type choose Edit Cells → Transform and in the expression editor, change value to "donation" (note the enclosing strings), check the preview and if things look ok then click OK.

Rinse, repeat this strategy on your other columns as needed. Use Undo / Redo if things go astray and try other ways.

Facets are powerful to filter or apply conditionals oftentimes when the logic is somewhat simple:

  • checking for nulls or blanks, simple text filtering, etc.
  • When more logic is needed, you can use a Custom text facet or Custom numeric facet.

If you wanted to use GREL with a Custom text facet, then you could use the expression:


If you wanted to use Regex with a Custom text facet, or just use the Text Filter with something like:


which says, no carriage returns, or new lines, or empty whitespace,
then click invert.
But that regex still does not account for a few other things that may or may not be found in your column values and it doesn’t account for zero characters! Which is why it’s best to skip regex for cases like this, if you are doing something simple like text containment of a condition of “any text” because that simple rule ends up encompassing nearly the entire set of Unicode. And then you have to somehow negate the parts or categories of text that you want or don’t want (whitespaces, all kinds of whitespaces? what about hidden spaces, NBSP, hidden control chars, punctuation, separators, etc. etc.)

By the way, we have GREL contains() which is a wide way to find strings or even patterns if you wanted to experiment with regex, rather than using find() or match(). Experiment!

Finally, the code way that often ends up being generally helpful to folks to know is that of a simple byte length check of a cell’s value with the GREL expression:


which returns 0 zero when the cell contains no character bytes (that might be storing hidden chars, or control chars, any chars). If the length of a cell’s value is 0 then we are not storing any value whatsoever for that cell currently.

Have fun and welcome to OpenRefine!

And if you indeed want some one-liner GREL expressions to accomplish what you asked… well just reply again and @ostephens might construct some for your use case. :slight_smile:

While I’d agree with @thadguidry that this is a classic Facet use case and that saves writing any overly complex GREL, I can’t resist a challenge like this :slight_smile:

You don’t say what the behaviour should be if both columns are non-blank - so I’m going to assume in this scenario, you want to record as a ‘donation’.

In column accession_acquisition_type from the dropdown menu choose Edit cells → Transform and use the following:


This expression checks to see if “DEED-OF-GIFT” has any text (isNonBlank) and if so, inserts “donation” - in this case RECORD-TRF-FORM is ignored. If DEED-OF-GIFT is blank, then the expression then does the check for “RECORD-TRF-FORM”.

Let me know if I can offer any further information


Wow, thanks @thadguidry and @ostephens! I was told I may have to change this data individually, so I’m delighted to have two solid (and quite comprehensible) methods to try. Owen, thanks for stepping up, and Chad, thanks for challenging Owen to do so :slight_smile: