OpenRefine presence at Wikimania 2024

Here is a summary of my impressions from the conference :slight_smile:

The training by Asaf was well attended (the room was packed) and well received, with Asaf doing a great job at demonstrating the Wikidata integration on a local dataset (population figures in Poland). Some problems were encountered in the process (I have opened issues about them).

I have met many OpenRefine users who shared their problems or wishes for the tool. The most salient needs were, in my opinion:

  • Using OpenRefine with Wikibase.cloud instances. Multiple attendees brought this up. Having to deploy the old reconciliation service manually is a big hurdle for many, and this is an issue that we are well aware of. I sat down with @Gnoeee to look at his struggles in this regard (running the reconciliation service on his Windows laptop). The service was running but he had issues with type filtering, caused by some misconfiguration in config.py. The fork at github.com/judaicadh/wikibaseopenrefine and the accompanying tutorial were helpful in fixing this configuration bug (if I recall correctly). The new reconciliation service by @abbe98 is a helpful initiative but is designed for a quite specific use case: it intentionally has no support for type hierarchies or data extension, and is not (yet) compatible with the Wikibase integration in OpenRefine. So in my opinion it still makes sense (and is even quite urgent) to get a fully-fledged reconciliation service implemented in Wikibase as an extension. Unfortunately no one from the Wikibase.cloud team attended Wikimania (as far as I could tell), but I will follow up with them separately to check what their plans are regarding this.
  • Lexeme support in OpenRefine was requested by many during the meetup. As @abbe98 summarized in the corresponding issue, a helpful first step would be the ability to edit the statements on lexemes just like on items, without support for forms or senses.
  • Better support for fetching and editing qualifiers was also a popular request at the meetup, alongside other improvements such as support for custom ranks on statements. Some designs have been drafted; I'll try to summarize them in digital format soon. This reminded me of the request to improve the support for "no value" / "some value", which is being considered among the improvements targeting the Wikimedia Commons integration.
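
For readers unfamiliar with the type filtering mentioned in the first point: a reconciliation client such as OpenRefine sends a batch of queries to the service, and each query can carry an optional `type` that restricts candidates to entities of that class. Below is a minimal sketch of how such a batch could be built, assuming the JSON query shape of the Reconciliation Service API. The helper names, the `q0`/`q1` keys and the type ID `Q515` (the Wikidata item for "city") are illustrative assumptions; a Wikibase.cloud instance would use its own entity IDs, and this is not the code of the service discussed above.

```python
import json
from urllib.parse import urlencode


def build_reconciliation_queries(names, type_id=None, limit=3):
    """Build a batch of reconciliation queries, one per name.

    `type_id` is optional: when given, the service is asked to return
    only candidates of that type (this is the "type filtering").
    """
    queries = {}
    for index, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            query["type"] = type_id
        # Keys are arbitrary identifiers chosen by the client.
        queries[f"q{index}"] = query
    return queries


def encode_form_body(queries):
    """Encode the batch as a form body with a single `queries` field."""
    return urlencode({"queries": json.dumps(queries)})


# Hypothetical example: restrict matches to cities (Q515 on Wikidata).
queries = build_reconciliation_queries(["Warsaw", "Kraków"], type_id="Q515")
body = encode_form_body(queries)
print(json.dumps(queries, indent=2, ensure_ascii=False))
```

If the type constraint is ignored or rejected by the service, comparing the results of the same batch with and without `type_id` is one way to narrow down whether the problem lies in the type configuration.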

Getting further funding to work on such improvements seems doable if we can identify the most urgent needs in a clear fashion. Perhaps our ongoing user survey will help towards that.

Overall, I had the impression that OpenRefine is perceived as a quite essential tool by the community, and is viewed as rather stable and sustainable (in comparison to other Wikimedia tools, which generally come from a single volunteer author). People are counting on OpenRefine to remain a central data import tool for data-savvy people who want to contribute to Wikimedia projects.

I also had a good chat with @abbe98 about all sorts of OpenRefine topics (given that it was the first time we met in person), including about the tensions regarding governance and transparency. I wouldn't say we resolved everything, but I think I got a much better understanding of his needs and positions, which I hope will be helpful in finding concrete solutions to the existing frustrations.

I also sat down with @Andre_Costa and @Sebastian from WMSE to check in on the ongoing effort to polish and improve the Wikimedia Commons integration. The coming weeks should be rather quiet on this front because of summer holidays but work will resume after that, to tackle further improvements beyond the support for large uploads which has been completed. The idea to continue this relationship between WMSE and OpenRefine can still be considered, depending on funding and on capacity on WMSE's side.

That's all for now, I hope I have represented the positions of the people mentioned here accurately and didn't forget anything too important.
