Reproducibility project: February 2024 report

Here is the progress report on the reproducibility project for February!

On Zoe's side

Aside from being sick for a week, I made progress on sketching the UI for operation history. Much of the design thinking, sketching, and discussing revolved around the best ways to visually (and textually) signal to users which subsequent operations would be affected by potential changes (deletion, necessary recalculation, etc), and how each step may relate to others in the history. Sketching sessions helped me think through the complexity and experiment with different visualization solutions, both in terms of how Operation History might appear in the overall interface, and the look and UX of the Operation History window itself.

We landed on two different design approaches to test: one relies more heavily on text-based warning panels that tell the user the consequences of a potential change to the history; the other relies more on visualization and color to indicate these possible changes. As we move into the interviews this month and show these designs to more people, I look forward to learning from their feedback.

Another aspect of the interface I worked on was operation logos (or icons, or symbols, as we might refer to them). I’ve been developing logos for each type of operation that could sit next to it in the history. This idea arose from the desk research phase in the previous month, which helped me think more creatively and broadly about different UX metaphors intended to help with usability. Here’s a link to the first iteration of the sketches; I’d love more feedback on them. They’ve been posted in both the forum and GitHub for maximum visibility within the community.

I’ve also been thinking about the way we invite designers into OpenRefine, and how we might make the documentation clearer for designers who are new to the project and new to open source work (GitHub, etc.).

As I look ahead I’ve been thinking about future design projects I’d like to work on with OpenRefine and have shared them here:

Two GitHub issues that came up and sparked conversation:

On Antonin's side

This month, my work on the reproducibility project was split between three main tasks:

  • Addressing bugs discovered during interactive testing by Zoe or myself, in preparation for our first testing campaign.
    In this first testing campaign, we plan to ask experienced OpenRefine users to go through a sample data cleaning task with us, working with OpenRefine from the 4.0 branch. The goal is to observe their reaction to various preliminary changes to our reproducibility improvements: handling of partial results of long-running operations, new process panel, ability to run operations in parallel, and so on. We also anticipate they will discover more bugs during this testing.
  • Work on restructuring the commit structure of the 4.0 branch, following the approach proposed in the November 2023 report.
    This primarily consisted of refactoring the 3.x test suite to align the structure of its tests with that of the 4.0 test suite. See the corresponding pull requests: #6371, #6383, #6388 and #6389. In parallel, I have done similar work on the 4.0 branch, comparing the test suites to identify any new test cases I added.
  • Participating in the design of the history UI changes, together with Zoe. For now we are working on UI mock-ups which aim at relatively small changes to the existing history tab UI, attempting to add support for making various sorts of changes to the list of operations (see below).
    For now, we are not actively looking at ways to integrate a graphical representation of the operations list or to change the Extract/Apply functionality. My intuition is to push for incremental changes, to make sure we have the capacity to deliver relatable and implementable proposals first.
    But the order is debatable. Perhaps we should have started with the graphical history representation and added those new features on top of it. By putting those features first, my hope is to make it clear to the broader team what user needs we are trying to address, as the benefits of a graphical history representation are likely less tangible.

Here are the interactions we are trying to enable with the history tab:

  1. Deleting an old operation without discarding the following operations (#183, #369, mailing list thread).
  2. Re-computing an old operation without discarding the following operations (#655).
  3. Changing the settings of an earlier operation, again without discarding the following operations (no issue yet as far as I know).

Those are all things a user could want to do on a particular history entry. Are there other such actions we should have on our radar? How would you prioritize them?

Internally, the three interactions listed above pose the same sort of challenge in the backend: one needs to be able to detect the potential effects of the action on the grid, and determine which of the later operations can be kept. This will rely on the columnar metadata exposed by operations in the new architecture, letting the backend enforce certain guarantees about which parts of the grid are touched by the operations.
For instance, if the user deletes an operation that creates a new column, any future operations that make edits in that column would also get deleted in the same go, but operations making edits to other columns could be preserved. Similarly, if the user deletes an operation that made changes in a column, any future long-running operation which depends on this column will need re-computing since its input data will have changed. I am therefore working with the assumption that when the user requests the deletion of a particular history entry, the backend will be able to produce a list of operations that will need re-computing or will be discarded. This would be produced on a best-effort basis: by default, in the absence of sufficient columnar metadata, the backend would fall back on discarding as many future operations as needed. The question of how to convey those potential effects to the user was central to this month's design work.
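
To make the reasoning above more concrete, here is a minimal sketch of how such a list could be computed when a history entry is deleted. It is written in Python for brevity and is not OpenRefine code: the names `Operation`, `DeletionPlan` and `plan_deletion`, and the `reads` / `writes` / `creates` fields standing in for the columnar metadata, are all hypothetical. It also assumes that a cheap (not long-running) operation whose input changed can simply be kept, and that an operation without metadata forces everything from that point on to be discarded.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, NamedTuple, Optional


@dataclass(frozen=True)
class Operation:
    """Hypothetical stand-in for a history entry and its columnar metadata."""
    name: str
    reads: Optional[FrozenSet[str]] = None   # None: no metadata available
    writes: Optional[FrozenSet[str]] = None  # existing columns it modifies
    creates: FrozenSet[str] = frozenset()    # new columns it adds
    long_running: bool = False

    @property
    def has_metadata(self) -> bool:
        return self.reads is not None and self.writes is not None


class DeletionPlan(NamedTuple):
    kept: List[str]
    recomputed: List[str]
    discarded: List[str]


def plan_deletion(history: List[Operation], index: int) -> DeletionPlan:
    """Classify the operations that follow history[index] if it is deleted."""
    deleted = history[index]
    later = history[index + 1:]
    if not deleted.has_metadata:
        # Best-effort fallback: nothing is known, so discard everything after.
        return DeletionPlan([], [], [op.name for op in later])

    missing = set(deleted.creates)          # columns that no longer exist
    dirty = set(deleted.writes) - missing   # columns whose contents change
    plan = DeletionPlan([], [], [])

    for position, op in enumerate(later):
        if not op.has_metadata:
            # Opaque operation: discard it and all operations after it.
            plan.discarded.extend(o.name for o in later[position:])
            break
        if (op.reads | op.writes) & missing:
            # Touches a column that is gone: discard it, propagate the loss.
            plan.discarded.append(op.name)
            missing |= op.creates
            dirty |= op.writes - missing
        elif op.reads & dirty:
            # Its input changed, so its output changes as well.
            dirty |= op.writes | op.creates
            target = plan.recomputed if op.long_running else plan.kept
            target.append(op.name)
        else:
            plan.kept.append(op.name)
    return plan


# Illustrative history; the operation names are made up for this example.
history = [
    Operation("trim city", frozenset({"city"}), frozenset({"city"})),
    Operation("fetch country", frozenset({"city"}), frozenset(),
              creates=frozenset({"country"}), long_running=True),
    Operation("uppercase country", frozenset({"country"}), frozenset({"country"})),
    Operation("uppercase name", frozenset({"name"}), frozenset({"name"})),
]

print(plan_deletion(history, 1))  # deleting the column-creating operation
print(plan_deletion(history, 0))  # deleting an edit the fetch depends on
```

In this sketch, deleting "fetch country" discards "uppercase country" but keeps "uppercase name", while deleting "trim city" keeps everything and flags "fetch country" for re-computation. The analysis is deliberately conservative: a column marked as changed stays marked for the rest of the history, even if a later operation overwrites it entirely.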

Apart from that, the workload on general OpenRefine development was noticeably higher this month, primarily due to the release of 3.8-beta1 and the preparation for GSoC.
