Tutorial: Integrating OpenRefine into Automated ELT Workflows – Looking for Guidance

Hi everyone,
I'm new to OpenRefine, but I’ve been working in the ETL/ELT world for quite some time.

I'm exploring ways to integrate OpenRefine into a broader, automated data pipeline, and I’d really appreciate any pointers or documentation to help with this.

Here's a quick overview of our setup:

  1. We collect data from a variety of IoT devices.
  2. We run data transformations to clean and standardize it.
  3. We feed the processed data into dashboards for analysis and visualization.

What I’d love to know is:
Can OpenRefine transformations be automated and executed programmatically within such a workflow?

Thanks in advance for your insights and any real-world examples or resources you can share!

Best,
Abdelkrim from Brussels

Hi,

You can find the API documentation on the "OpenRefine API" page of the OpenRefine docs.
Also have a look at "OpenRefine Client", a Python implementation of the API.
To automate edits, open the Undo/Redo tab and use Extract... to see the structure of the JSON that the API expects when applying operations.
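
To make that a bit more concrete, here is a minimal sketch using only the Python standard library. It assumes OpenRefine is running locally on its default address (127.0.0.1:3333), that the project already exists, and that your version expects a CSRF token (recent releases do, as far as I know). The column name `sensor_id`, the project ID, and the helper function names are made up for illustration; the operations list is what you would paste from Undo/Redo > Extract...

```python
import json
import urllib.parse
import urllib.request

# Assumption: default address of a locally running OpenRefine instance.
BASE_URL = "http://127.0.0.1:3333"

# JSON as copied from Undo/Redo > Extract... in the OpenRefine UI.
# The column name "sensor_id" is hypothetical.
OPERATIONS = [
    {
        "op": "core/text-transform",
        "engineConfig": {"facets": [], "mode": "row-based"},
        "columnName": "sensor_id",
        "expression": "value.trim()",
        "onError": "keep-original",
        "repeat": False,
        "repeatCount": 10,
        "description": "Trim whitespace in column sensor_id",
    }
]


def fetch_csrf_token(base_url):
    """Ask OpenRefine for a CSRF token (needed for POST commands)."""
    with urllib.request.urlopen(f"{base_url}/command/core/get-csrf-token") as resp:
        return json.load(resp)["token"]


def build_apply_request(base_url, project_id, operations, csrf_token):
    """Build the POST request for the apply-operations command (no network I/O)."""
    query = urllib.parse.urlencode({"project": project_id, "csrf_token": csrf_token})
    url = f"{base_url}/command/core/apply-operations?{query}"
    # The operation history is sent as a form field containing a JSON string.
    body = urllib.parse.urlencode({"operations": json.dumps(operations)})
    return urllib.request.Request(url, data=body.encode("utf-8"), method="POST")


def apply_operations(base_url, project_id, operations):
    """Fetch a token, apply the operations, and return OpenRefine's JSON reply."""
    token = fetch_csrf_token(base_url)
    request = build_apply_request(base_url, project_id, operations, token)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

In a pipeline step you would then call something like `apply_operations(BASE_URL, "1234567890123", OPERATIONS)` (the project ID here is a placeholder) after the project has been created, and export the rows afterwards. Please check the endpoint and parameter names against the API docs for your OpenRefine version before relying on this.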

Best,
Michael
