Help Understanding OpenRefine Extensions – Where to Start?

Hey everyone,

I have been diving into OpenRefine for some messy data projects and I am impressed by what it can do. I am curious about the development side: how to build or use extensions. I have browsed some of the GitHub content and skimmed through the docs, but I am still not 100% sure where to begin.

I have a decent handle on Java and JSON, but I would like a plain explanation (or even a basic example) of how an extension is structured, how it loads into OpenRefine, and how to test it properly during development. Any guidance would be helpful!

Also, if there are any beginner-friendly sample extensions you would suggest, please drop them here.

By the way, while working on this stuff, I came across a CISSP course online that covered data privacy topics—it made me think more deeply about data handling best practices, even in tools such as OpenRefine.

Thank you.


Welcome to the community! What type of extension are you looking to build?

I have been diving into OpenRefine for some messy data projects and I am impressed by what it can do. I am curious about the development side: how to build or use extensions. I have browsed some of the GitHub content and skimmed through the docs, but I am still not 100% sure where to begin.

What docs did you find? Where did the trail go cold? I'm curious as to where the discovery process broke down.

This page is probably the best starting point: https://openrefine.org/docs/technical-reference/writing-extensions

It's a little out of date and I'm in the process of updating it now. Basically, the material from the "Migrating older extensions" page needs to be incorporated into it.

Also, if there are any beginner-friendly sample extensions you would suggest, please drop them here.

The sample extension has been split out into a separate repository: https://github.com/OpenRefine/sample-extension

Tom
