Global replace on column names

Whenever creating a project from an XML file, and selecting the first XML element corresponding to the first record to load, you get the name of that element preceding each of it's children's names as column names

For example this xml, you select the row element,

<some_XML>
  <row>
    <foo> bar </foo>
    <bar> baz </bar>
    <baz> foo </baz>
  </row>
  <row>
    <foo> baz </foo>
    <bar> bar </bar>
    <baz> foo </baz>
  </row>
  <!-- and so on and so forth -->
</some_XML>

you get columns

row - foo | row - bar | row - baz

This is useful when elements have grandchildren (and great-grandchildren and so on), but for a flat XML file with just parent-child nodes and no deeper, I end up removing the row - in my column names, by hand, one by one. I have to do this a lot, so I wonder: is there a way to do a global replace on column names?

So this is the workaround I am currently using:

  1. Make sure you have an id column at the beginning (if not just create one using row.index).
  2. Use Transpose to create a key and a value column from the other columns.
  3. Transform the key column (e.g. using replace or other mass operations.
  4. Fill down the id column.
  5. Transpose the key and value columns back to the column based format.

If you have some deeper nesting in your XML data, make sure you keep the empty cells to be able to reconstruct the original layout. There is an option for that in the Transpose dialog.

1 Like

Thanks! It did take me some trial and error. A few additional notes:

  1. You need to be in record mode

  2. I did need to create an ID column. The All-column (with the auto-numbered rows) does not have the Transpose menu item.

  3. The menu item you then need is Transpose -> Transpose cells across columns into rows.

  4. Do not fall into the trap of leaving From Column selected on the ID created in step 1, or the ID value will end in a row, like this (and can't be filled down):

Schermafbeelding 2024-06-25 103718

You want it like this.

Schermafbeelding 2024-06-25 104114

So for From Column, select the row below the ID to transpose from. Reading can be really hard. (Perhaps the ID column shouldn't show up there at all if it makes no sense to select it?)

From there on it's straightforward, just replace (in this case ROW - with nothing), fill down the ID column and then, again Transpose -> Columnize by Key/Value Columns.

Nice!

1 Like