There is now another option for Linux users who want to script OpenRefine: orcli.
It is a single-file Bash script that uses standard tools like jq and curl and has no other dependencies. Because it is developed with bashly, it has a friendly command-line interface and is easy to maintain for a shell script.
Compared to openrefine-client it has two advantages:
- integrated batch mode: when using the run command, orcli takes care of starting and stopping OpenRefine with temporary directories.
- error handling: when using the transform command, orcli splits the submitted undo/redo JSON file and uses the specific endpoint (e.g. command/core/split-column instead of the generic command/core/apply-operations) for each operation.
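To illustrate the idea behind the error handling point, here is a minimal sketch in Python (not orcli's actual Bash code). It splits an undo/redo history into single operations and picks an endpoint for each one. The function name `plan_requests`, the lookup table, and the example history are hypothetical. The `core/column-split` op name is an assumption about the exported JSON; only the two endpoints come from the description above. No request is sent, the sketch only shows the planning step.

```python
import json

# Generic endpoint that accepts a whole list of operations at once.
GENERIC_ENDPOINT = "command/core/apply-operations"

# Hypothetical lookup table from the "op" value in the exported history to a
# specific endpoint. Only split-column is named in the post; the op name
# "core/column-split" is an assumption about the export format.
SPECIFIC_ENDPOINTS = {
    "core/column-split": "command/core/split-column",
}


def plan_requests(history_json):
    """Split an undo/redo history (a JSON array) into one planned request per operation."""
    operations = json.loads(history_json)
    if not isinstance(operations, list):
        raise ValueError("expected a JSON array of operations")
    plan = []
    for index, operation in enumerate(operations, start=1):
        op_name = operation.get("op")
        if op_name is None:
            raise ValueError(f"operation {index} has no 'op' field")
        # Fall back to the generic endpoint when no specific one is known.
        endpoint = SPECIFIC_ENDPOINTS.get(op_name, GENERIC_ENDPOINT)
        plan.append((index, endpoint, operation))
    return plan


# Made-up history with two operations, for illustration only.
example_history = json.dumps([
    {"op": "core/column-split", "columnName": "name", "separator": ","},
    {"op": "core/text-transform", "columnName": "name", "expression": "value.trim()"},
])

for index, endpoint, operation in plan_requests(example_history):
    print(index, endpoint, operation["op"])
```

Handling one operation per request means a failure can be traced to a single step in the history instead of to the whole file.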
I’m only working part-time for the next 1-2 years because of the kids, so unfortunately I don’t have time to learn more Python and port openrefine-client to Python 3. This new approach is easier for me to maintain, and I will use it for my own projects in the future. It is intended as a replacement for the openrefine-client and openrefine-batch projects.
The first release, v0.1.0, is fully usable and tested, but it supports only a few importers and exporters so far. I have created issues for the planned features. If someone else wants to use orcli and is missing a particular feature, I’m happy to tackle that next.