Which kind of work, which features, which direction will OpenRefine prioritize, going forward? Which use cases and communities will the tool serve in the next 10 years?
I’ll be bold… I want to propose an OpenRefine strategy- and roadmap-building year in 2023 (code name: OpenRefine 2032?). As I am very interested in its outcomes, and I represent communities for whom OpenRefine is currently quite useful (but will it be in the future?), I want to volunteer some of my time (within limitations) to work on this as well.
I can imagine various approaches (and they can be combined):
- A general in-depth user survey which is very widely advertised (perhaps even advertised inside OpenRefine itself, so that every user hears about it). More specific than the general two-yearly survey, and with the intention to collect many more responses. Really an in-depth exploration of how often users do certain things with the tool, and what kind of tasks they want to do with it going forward. (I have ideas of how to frame, build and distribute such a survey and would be happy to help with it.)
- Roadmapping sprints per OpenRefine user community. Organize a large roadmapping sprint in each significant existing OpenRefine user community, as already identified through OpenRefine’s two-yearly survey. I can imagine a three-pronged approach for each community separately:
  - In-person roadmapping meetings during / attached to major sector conferences in 2023 (e.g. the IFLA conference for the international library community)
  - Online roadmapping and prioritization sessions for the same community around the same time (for people who can’t attend said conference / in-person meetup)
  - And longer-running surveys or dedicated prioritization exercises, possibly here on this forum (can be different from the general survey - e.g. a prioritization survey of what the community has already come up with in the sessions mentioned above)
  - The intended outcome: a prioritized 10-year roadmap / wishlist for each community.
  - For such community-specific roadmapping exercises, OpenRefine can perhaps (via funded projects) provide a (paid) facilitator and a structure that can be re-used across communities, so that the process is uniform and participants just need to use existing materials to get going.
  - I think turnout for each community should also be taken into account in some way: if (despite similar outreach efforts) only three Wikimedians show up, but 80 data journalists and 130 digital humanities scholars do, then that’s something to keep in mind as well.
(I am willing to help with such community-specific exercises for the Wikimedia community.)
This is of course just a proposal and a first idea on how to approach this. I’m curious what others think; that’s why I’m posting this here.
Why am I proposing this? I work a lot with Wikimedians, and I teach OpenRefine in the general cultural sector. The strongest requests I hear from these communities are related to Linked Data use cases: data operations, but with the goal of exporting to or batch editing other databases; reconciliation; and data enrichment. However, I have the impression that other communities would probably prefer OpenRefine to be a tool for cleaning and analyzing ‘big data’, which is a different use case. What are the clearest needs? And how do we expect these needs to evolve over time?
Once that is discussed in depth, a further conversation can start about whether OpenRefine can accommodate all these needs, whether it will take a specific path, and who is willing and able to work on it. For “my” communities, this will be very helpful to know - is OpenRefine going to be a tool of choice for the next 10 years, will it provide sufficient support for common use cases and hence be worth the investment, or should we look in different directions?