Terminology for lists of operations meant to be re-applied

antonin_d · February 7, 2025, 9:29am

As part of the reproducibility project, I am working on improvements to the existing Extract/Apply dialogs to make them more usable.

To that end, I would like to introduce a term for the lists of operations that are extracted to be re-applied later. Because they are currently exposed to the user as a JSON blob, they are often referred to as "a JSON", which I don't find very informative (JSON can represent a lot of different things).

So I have been thinking about introducing a more descriptive term. I have considered the following options:

a recipe, because it captures well the reusable aspect (and the list of steps)
a workflow, which also refers to a sequence of steps, but feels perhaps more abstract. Also, @tfmorris does not like it when I use this term to refer to a list of OpenRefine operations, because it reminds him too much of ETL software where operations are directly combined together into a reproducible arrangement (as I understand it). So I have been trying to avoid using this term.
a pipeline? If we embrace the "oil refinery" metaphor, then perhaps that makes for a consistent theme, but I am not so keen to go in that direction…
a script? It might put more emphasis on the textual nature of the object, which would imply continuing to encourage users to edit those manually
a program? It might scare off people who don't see themselves as programmers
a macro? Perhaps too old-fashioned and gives the impression that the exact click positions are being recorded?
anything else?

Let me know what your thoughts and preferences are!

abbe98 · February 7, 2025, 5:19pm

We call these "recipes" in OpenRefine, and sometimes a "step" when talking about workflows in a more generic fashion. We use the term "workflow" to describe the whole import->recipe->export process.

We also use the term "snippets" for reusable GREL code.

thadguidry · February 8, 2025, 12:49am

I like the workflow and recipe descriptions from @abbe98, which align with how I used to train. I think many in the GLAM space use the term "workflow" for their entire processes, often including pre/post data wrangling in other tools like MarcEdit, Koha and Evergreen APIs, TMS cataloguer, Python/PostgreSQL scripts (newsrooms), etc. "Recipe" was picked up simply because we in OpenRefine called the history of operations, particularly one using reusable GREL snippets, a "recipe", and had this term in our old wiki and Google Code since the beginning.

So, I'd rather stick to:

Recipe - in OpenRefine, a set of data cleaning and transformation operations that may also include GREL expression snippets.
Workflow - an entire data processing chain that might include use of OpenRefine in that chain.

antonin_d · February 10, 2025, 2:25pm

Thanks for the feedback! Great, it sounds like there is consensus on "recipe" so far, so I'll go with that for now.
