Dear OpenRefine Community,
I hope this message finds you well. My team is developing a data engine platform focused on aggregating diverse data sources to generate actionable insights. We are exploring how OpenRefine’s powerful data cleaning and transformation capabilities can complement Airbyte’s 600+ connectors and Apache Airflow’s orchestration to optimize our ETL pipelines and data sourcing workflows.
Could you kindly recommend a contributor or expert within the OpenRefine community with experience in integrating OpenRefine with Airflow or Airbyte for data processing use cases?
Hello @Daniel_Njiu,
Thank you for your interest in OpenRefine, and welcome to the forum!
I usually advise against using OpenRefine for large ETL workflows, as it was not designed for this purpose. While I have seen some organizations use it in this way, it often leads to unnecessary complexity and requires additional infrastructure. Typically, these projects serve as temporary solutions until a more scalable framework can be implemented.
Unless you have a very strong reason to use OpenRefine, I recommend starting your project with a programming language or a dedicated data processing tool.
Hi Daniel,
I have solid experience using OpenRefine with both Airbyte and Airflow in ETL workflows.
I have helped teams streamline data cleaning and transformation as part of automated pipelines.
Let me know if you want to explore how this could fit into your setup.
You can reach me by email here.
Colin