Terminology for lists of operations meant to be re-applied

As part of the reproducibility project, I am working on improvements to the existing Extract/Apply dialogs to make them more usable.

To that end, I would like to introduce a term for the lists of operations that are extracted to be re-applied later. Because they are exposed to the user as a JSON blob, they are often referred to as "a JSON", which I don't find very informative (JSON can represent many different things).
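For readers who have not seen one of these blobs, here is a rough sketch of what an extracted list of operations might look like and how it reads as a sequence of steps. The column names and expressions are invented for illustration, and the fields shown are a simplified assumption about the JSON shape, not a complete or authoritative schema.

```python
import json

# Illustrative only: the values below are made up, and the keys are a
# simplified assumption about the shape of an extracted operation history.
EXTRACTED_JSON = """
[
  {"op": "core/text-transform", "columnName": "name",
   "expression": "value.trim()",
   "description": "Trim whitespace in column name"},
  {"op": "core/column-removal", "columnName": "notes",
   "description": "Remove column notes"}
]
"""


def summarize_steps(raw):
    """Return one human-readable line per operation in the list."""
    operations = json.loads(raw)
    return [
        f"{i}. {op['description']} ({op['op']})"
        for i, op in enumerate(operations, start=1)
    ]


for line in summarize_steps(EXTRACTED_JSON):
    print(line)
```

Seen this way, the object is an ordered list of steps rather than "a JSON", which is the aspect a more descriptive term should capture.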

So I have been thinking about introducing a more descriptive term. I have considered the following options:

  • a recipe, because it captures well the reusable aspect (and the list of steps)
  • a workflow, which also refers to a sequence of steps, but feels perhaps more abstract. Also, @tfmorris does not like it when I use this term to refer to a list of OpenRefine operations, because it reminds him too much of ETL software where operations are directly combined together into a reproducible arrangement (as I understand it). So I have been trying to avoid using this term.
  • a pipeline? If we embrace the "oil refinery" metaphor, then perhaps that makes for a consistent theme, but I am not so keen to go in that direction…
  • a script? It might put more emphasis on the textual nature of the object, which would imply continuing to encourage users to edit those manually
  • a program? It might scare off people who don't see themselves as programmers
  • a macro? Perhaps too old-fashioned and gives the impression that the exact click positions are being recorded?
  • anything else?

Let me know what your thoughts and preferences are!

We call these "recipes" in OpenRefine, and sometimes "step" when talking about workflows in a more generic fashion. We use the term "workflow" to describe the whole import -> recipe -> export process.

We also use the term "snippets" for reusable GREL code.


I like the workflow and recipe descriptions, @abbe98, which align with how I used to train. I think many in the GLAM space use the term "workflow" for their entire processes, often including pre/post data wrangling in other tools like MarcEdit, Koha and Evergreen APIs, TMS cataloguer, Python/PostgreSQL scripts (newsrooms), etc. "Recipe" was picked up simply because we in OpenRefine called the history of operations, particularly when it used reusable GREL snippets, a "recipe", and we have had this term in our old wiki and Google Code since the beginning.

So, I'd rather stick to:

  • Recipe - in OpenRefine, a set of data cleaning and transformation operations that may also include GREL expression snippets.
  • Workflow - an entire data processing chain that might include use of OpenRefine in that chain.

Thanks for the feedback! Great, it sounds like "recipe" is the consensus so far, so I'll go with that for now.
