User Interviews Results Part 3: Cultivating a Thriving Developer and Trainer Community

Below is the third post in my series presenting the results of 19 interviews. If you missed the previous posts, you can find them here:

This post focuses on feedback regarding how the community operates. Specifically, we will explore the perspectives of two contributor groups:

  • Developers (from 6 interviews)
  • Trainers (from 15 interviews)

Extension Developer

Extension developers appreciate OpenRefine 3.x for its technical documentation, stability and support via the forum.

Here are their suggestions for improving the OpenRefine extension developer experience:

  1. Two developers have indicated that OpenRefine lacks comprehensive and official guides for building extensions.

  2. The way dependencies are inherited in OpenRefine needs to be more explicit. For example, a developer expected to use their version of Jena in their extension, not the one in OpenRefine.

  3. Working with Butterfly is odd and more guidance would be appreciated.

  4. Three users suggested improving the extension mechanism with a plugin manager for easier discovery, installation, and development. Last year, we made an unsuccessful grant application to the Mozilla Infrastructure Fund for that purpose.

  5. One developer is looking for a better way to integrate OpenRefine into a data pipeline, focusing on improving the API for reproducibility so that OpenRefine can be driven programmatically. That would also involve removing the client mode so OpenRefine can be embedded in DAGs run by orchestrators such as Airflow and Dagster for data processing.

  6. One developer indicated it would be helpful to be able to fully customize the CSS using variables.

Extension developers cite time constraints and a lack of knowledge of OpenRefine's development languages (especially among those developing reconciliation services in Python) as barriers to contributing to the core project.

Core Developer

My conversations with three core developers provide insights into how we can improve and grow our developer community. In my opinion, this feedback is crucial if we want to attract and retain more developers. I think it may also explain why, in the past, we have seen extensions with features that we wish were in the core, as well as forks and separate releases.

  1. Five contributors enjoy good community support and highlighted the positive, helpful atmosphere on the mailing list or forum when asking for help. However, three experienced contributors to the project have noticed that some communication between team members can be off-putting or aggressive at times. They also believe that contributing to the project in the long term requires persistence and tenacity, and they are concerned about how newer contributors perceive those exchanges.

  2. I received similar feedback from two developers who integrated and customized OpenRefine to meet their organization's requirements. They created new features and integrations but found it challenging to justify the additional effort required to contribute them upstream. Contributing to OpenRefine may require them to double their development efforts, with no assurance that their contribution will be accepted. They are also worried about encountering friction during the process.

Roadmap and Feature Priorities

  1. Long-Term Roadmap: Five contributors have asked for more transparency about the high-level direction of development and the current roadmap. Although we provide low-level details of our progress, it can be challenging for users to understand the big picture of where the project is heading or how it will impact them. For reference, this is the latest conversation regarding managing OpenRefine's roadmap: "OpenRefine 2032 ... what direction does OpenRefine want to go?"

  2. Short-Term Roadmap: One participant expressed a desire for more bounties to be created for particular features.

  3. Project History: One person mentioned that the project history needs to be clarified. We only have one outdated blog post buried at the end of our blog.

Training, Outreach, Documentation and Support

From those conversations, it appears that trainers play the role of ambassadors, advocates, and educators. Moreover, some of them act as the first line of support within their communities.

  1. Outreach and Ambassadors: One user indicated that we do not have a structured outreach effort and that we should increase OpenRefine's visibility to other communities, including startups and statistics-related groups.

  2. Trainer Support: Three trainers want to connect with like-minded individuals working in a similar context. Suggested ideas to improve collaboration between trainers include:

    • Conduct "Ask Me Anything" (AMA) sessions and webinars within relevant communities.
    • Schedule demos and office hours for trainers.
    • Better communicate new features to the trainer community (see also the previous discussion on the long-term roadmap).

  3. Documentation & Support: Two trainers suggested creating a community-editable tutorial or cheat sheet tailored to their community. They noted that GREL recipes might differ between librarians and journalists, for example.

  4. Documentation: Non-English-speaking trainers recommend translating the documentation rather than the interface. Most users are comfortable with an English-based interface; however, the language barrier is greater when reading the documentation. Two trainers indicated that translated documentation with screenshots and references to the software in English is acceptable.

  5. Four contributors supported building contributor pathways for non-technical contributors in support, training, documentation, and translation. This is something I am currently exploring via the CSCCE Creating Community Playbook workshop. Overall, we are looking to better support the training community via the EOSS-6 grant application.

  6. GitHub for non-developers: Many non-technical users find GitHub difficult to approach, and the forum isn't always clear for reporting bugs or requesting new features. As a result, trainers are unsure about the appropriate channels for different interactions within the OpenRefine community, such as questions and contributions. Despite the availability of GitHub issues, most users are hesitant to create them. Five users cited the following reasons:

    • One trainer does not know the process to create issues.
    • A trainer does not want to trouble volunteer developers with requests for non-essential features.
    • Trainers are afraid of asking the wrong questions on GitHub.
    • One person prefers to connect directly and privately with someone.

Regarding "Documentation & Support" and community-editable tutorials:

What if we had a real wiki? Or, maybe better, could we simply use Wikiversity for that need? Search results for "openrefine" - Wikiversity