Column selection UI for applying recipes

antonin_d · January 2, 2025, 5:49am

As part of the reproducibility project, I am working on improving the UX of applying JSON recipes (in the "History > Apply" dialog).

One feature I implemented last month is the ability to adjust the column names mentioned in a recipe so that they are compatible with the project the recipe is applied to. This means:

mapping any columns read by the recipe to columns already existing in the project;
giving names to columns created by the recipe, so that those names do not exist yet in the project.
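A minimal sketch of these two rules, written in Python for brevity (the dialog itself would do this in JavaScript). The name `validate_mapping` and its arguments are hypothetical and not part of OpenRefine, and the check that two created columns do not share a name is an extra assumption:

```python
def validate_mapping(project_columns, required, created):
    """Check a proposed column mapping against a project (hypothetical helper).

    required: recipe column name -> existing project column chosen by the user
    created:  recipe column name -> name the user wants for the new column
    Returns a dict mapping each problematic recipe column to an error message.
    """
    existing = set(project_columns)
    errors = {}

    # Columns read by the recipe must map to columns that exist in the project.
    for recipe_name, chosen in required.items():
        if chosen not in existing:
            errors[recipe_name] = f"column '{chosen}' does not exist in the project"

    # Columns created by the recipe must not clash with existing columns.
    # Assumption: they should not clash with each other either.
    seen = set()
    for recipe_name, new_name in created.items():
        if new_name in existing:
            errors[recipe_name] = f"column '{new_name}' already exists in the project"
        elif new_name in seen:
            errors[recipe_name] = f"column name '{new_name}' is used twice"
        seen.add(new_name)

    return errors


problems = validate_mapping(
    project_columns=["Title", "Year"],
    required={"Titre": "Title", "Année": "Anee"},
    created={"director": "Year", "producer": "producer"},
)
print(sorted(problems))
```

An empty result means the mapping can be applied; any other result tells the UI which fields to flag.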

Currently, the UI for this looks like this (for an example recipe taken from this tutorial):

Beyond the layout and wording, which can surely be improved, I'd be interested to read what you think is the best way to select existing columns (in the first "Required columns" section). At the moment it uses simple text fields where the user needs to type the name of an existing column, but I think that could be improved to help the user select a valid column name. I can think of multiple ways:

adding an auto-complete widget like for selecting reconciliation entities, adapted to work for columns,
no auto-complete widget, but some visual indication of whether the column name is valid or not (such as a green checkmark / red cross on the right-hand side of the field). The same sort of validation could be used for the created columns (which are required not to exist in the project);
using a UI similar to the Wikibase schema, with drag-and-drop of columns.

I think it would be useful to develop some UI elements that we would reuse in other places of the UI, to have a consistent experience around selecting columns.

thadguidry · January 2, 2025, 7:06am

I think autocomplete works great in this regard.
The demo I have in mind is jQuery UI Autocomplete, which lists many examples on the right side of its page that can be combined for various behaviors.

Particularly useful, I would think, would be combining Accent folding with Scrollable results (using max-height) to cope with many existing column names?
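To illustrate the accent folding idea independently of jQuery UI, here is a small sketch in Python. The names `fold` and `suggest` are hypothetical, and stripping combining marks after NFD normalization is just one common way to fold accents, not necessarily what the jQuery UI demo does. The scrollable list is a styling matter and is not shown:

```python
import unicodedata


def fold(text):
    """Lower-case the text and strip accents by removing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def suggest(term, column_names):
    """Return the column names containing the typed term, ignoring accents and case."""
    needle = fold(term)
    return [name for name in column_names if needle in fold(name)]


columns = ["Année du tournage", "Réalisateur", "Coordonnée en X", "Coordonnée en Y"]
print(suggest("annee", columns))
print(suggest("coordonnee", columns))
```

With this kind of matching, a user typing "annee" on a keyboard without accents would still be offered "Année du tournage".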

abbe98 · January 2, 2025, 8:55pm

At the moment it uses simple text fields where the user needs to type the name of an existing column, but I think that could be improved to help the user select a valid column name. I can think of multiple ways:

Why not a native select element (when picking existing columns)? It is more or less resistant to user errors, accessible by default, an existing pattern in OpenRefine, supports jumps in very long lists, etc.

A better select with search would indeed do good across OpenRefine, but it might be better suited to a larger effort to tackle the current mix of UX patterns. Don't let perfect be the enemy of good: this functionality on its own moves the project forward!

but some visual indication of whether the column name is valid or not (such as a green checkmark / red cross on the right-hand side of the field).

Maybe just a native input with the pattern attribute, and possibly some CSS for border colors and such? It would be reusable in existing places more or less straight away.
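A rough sketch of how the value of such a pattern attribute could be built from the project's column names, in Python for brevity (in the dialog this would be done in JavaScript). Browsers treat the pattern as a JavaScript regular expression that must match the whole field value, so regex syntax characters in column names need escaping. `column_pattern` and `column_input` are hypothetical names:

```python
import html
import re

# Characters with a special meaning in a JavaScript regular expression.
_SYNTAX_CHARS = re.compile(r"[\\^$.*+?()\[\]{}|/]")


def column_pattern(column_names):
    """Build an alternation that matches exactly one of the given column names."""
    escaped = (_SYNTAX_CHARS.sub(lambda m: "\\" + m.group(0), name) for name in column_names)
    return "|".join(escaped)


def column_input(column_names):
    """Render a text input that is only valid for an existing column name."""
    pattern = html.escape(column_pattern(column_names), quote=True)
    return f'<input type="text" required pattern="{pattern}">'


print(column_pattern(["Title", "Price (USD)", "a|b"]))
print(column_input(["Title", "Year"]))
```

The browser then exposes the field's validity through the `:invalid` pseudo-class, which is where the CSS for border colors would hook in.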

Feel free to ping me on PRs.

Martin · January 7, 2025, 1:36pm

Would this UI also be the place to notify the user if the recipe will delete columns that were not present in the project when the JSON was created? This comes from the following feedback:

When you create workflows using JSON history with the step Re-order/remove columns, if you import a new file with extra columns you want to maintain, they are deleted when running the workflow. The workaround is to move and delete each column individually, which is cumbersome. (1 user)

I understand this is a scope change and I am OK having this addressed separately.

antonin_d · January 8, 2025, 11:40am

Yes, this is something I have been thinking about:

github.com/OpenRefine/OpenRefine

Better generalizability for the reorder-columns operation

opened 12:30PM - 19 Jan 23 UTC

wetneb

Type: Feature Request undo/redo/history columns

We offer an operation, `reorder-columns`, which is able to change the order of columns in an arbitrary way and remove any number of columns, in a single step.

![image](https://user-images.githubusercontent.com/309908/213440782-41467e32-c841-424a-bb17-6a049a61691e.png)

The operation which corresponds to this screenshot is represented internally as:

```json
{
  "op": "core/column-reorder",
  "columnNames": [
    "Identifiant du lieu",
    "Année du tournage",
    "Type de tournage",
    "Titre",
    "director",
    "producer",
    "Réalisateur",
    "Producteur",
    "Code postal",
    "Coordonnée en X",
    "Coordonnée en Y",
    "geo_shape",
    "geo_point_2d"
  ],
  "description": "Reorder columns"
}
```

In other words, the operation parameters simply remember what is the final order of the remaining columns after reorder. As explained in #4055, this operation does not generalize well to other datasets, when it is used in the "Apply" dialog of the Undo/Redo tab, because any columns that were not in the original dataset but are present in the new one will be deleted by this operation.

### Proposed solution

We should find other ways to specify this operation so that the actual intent of the user is captured better. It is difficult to come up with a precise specification, but as a litmus test, I would expect the following criteria to be satisfied:

* If I use the dialog above to delete columns only, without reordering the remaining columns, I would expect that the generalization of this operation only deletes the said columns, and leaves any other column (including those not present in the original dataset) untouched. This could be achieved by calling the `column-removal` operation instead, as it supports removing multiple columns at once after #5563.
* If I use the dialog to reorder columns only, without deleting any column, then I expect that the generalization of this operation will not delete any column in any other context. Additional columns not present in the original dataset should be placed in a *sensible* location in the resulting table. It is not a problem if this exact order is complicated to define, as the exact position of columns should generally not have a significant influence on operation workflows except in certain cases (such as the use of transpose operations).

### Alternatives considered

One could decide to break down the operation into multiple steps, applying the operations that remove or move a single column multiple times, to reach the desired state. Working on such a decomposition is likely useful to understand this issue better, but I would rather prefer that this dialog does not generate lengthy lists of steps in the project history.

### Additional context

Follow-up to #4055 and #5563.

I am hoping that the visualization of the recipe (which I propose to show alongside the column mapping UI) will help highlight this issue. In this visualization, the column reorder operation will (so far) be shown as a big unanalyzable block that users should hopefully learn to avoid.

Beyond that, the introduction of more expressive operations for deleting multiple columns in one go (without reordering any other) and reordering columns without deleting any column could help with this issue, assuming we are able to expose them in the UI in a satisfactory way.
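The litmus test from the issue can be made concrete with a small sketch, in Python for brevity. It is not how OpenRefine implements anything: `generalized_reorder` is a hypothetical name, the sketch assumes the original column list is known (the current operation JSON only records the final order), and appending unknown columns at the end is just one possible "sensible" location:

```python
def generalized_reorder(original, final, target):
    """Apply a recorded reorder/remove to a different dataset (hypothetical).

    original: columns of the dataset on which the operation was recorded
    final:    columns left after the operation, in their new order
    target:   columns of the dataset the recipe is now applied to
    """
    final_set = set(final)
    # Only columns the user explicitly dropped are removed.
    removed = {c for c in original if c not in final_set}
    kept = [c for c in target if c not in removed]

    # Pure deletion: the remaining columns were not reordered,
    # so leave the target's own order untouched.
    if final == [c for c in original if c in final_set]:
        return kept

    # Otherwise, known columns follow the recorded order and columns
    # unknown to the recipe are appended, keeping their relative order.
    kept_set = set(kept)
    known = [c for c in final if c in kept_set]
    extras = [c for c in kept if c not in final_set]
    return known + extras


# The extra column "x" survives in both cases.
print(generalized_reorder(["a", "b", "c"], ["a", "c"], ["x", "a", "b", "c"]))
print(generalized_reorder(["a", "b", "c"], ["c", "a", "b"], ["a", "b", "c", "x"]))
```

By contrast, the current behaviour amounts to keeping only the columns listed in `final`, which is why "x" would be dropped today.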

antonin_d · February 27, 2025, 10:49am

The PR for this feature is open: Map recipe columns to project columns in 'Apply' dialog by wetneb · Pull Request #7158 · OpenRefine/OpenRefine · GitHub

It should already let you play with the UI, although full support (when clicking the "Apply" button) will require the other backend PRs (#7153, #7154, #7155, #7156 and #7157).
