One of OpenRefine's most useful basic features is the 'record' mode, which allows you to replicate nested structures. However, when I try to export to JSON format using the 'templating' output menu, I lose this hierarchy. I can invoke it to populate a field (row.record.cells), but each object in the output file is flat, one object per row:
{
  "rows" : [
    {
      "key_1" : "value of column 1 in row 1",
      "key_2" : "value of column 2 in row 1",
      "key_3" : "value of column 3 in row 1",
      "key_4" : "value of column 4 in row 1"
    }
  ]
},
{
  "rows" : [
    {
      "key_1" : "value of column 1 in row 2 (which is empty)",
      "key_2" : "value of column 2 in row 2 (which is empty)",
      "key_3" : "value of column 3 in row 2",
      "key_4" : "value of column 4 in row 2"
    }
  ]
}
Whereas the structure I need would look like this:
{
  "records" : [
    {
      "key_1" : "value of column 1 in row 1",
      "key_2" : "value of column 2 in row 1",
      "parentKey" : [
        {
          "key_3" : "value of column 3 in row 1",
          "key_4" : "value of column 4 in row 1"
        },
        {
          "key_3" : "value of column 3 in row 2",
          "key_4" : "value of column 4 in row 2"
        }
      ]
    }
  ]
}
As I said, I can invoke the 'record' index to populate the missing values in the second row with the values from the first row, but I still get two objects.
It's probably not obvious, but depending on the hierarchy you want (formatting, arrays, etc.), you will sometimes need to jsonize the output that the overlay model gives you from row.record.cells, and sometimes not.
Simple example where my first column "A" is the record identifier/key:
Actually, I always forget about the most obvious workaround...
Jsonize and build the JSON you want inside a new OpenRefine column in the data grid, then export and select only that column? Whether that's more pain or less pain, dunno.
Hey @Thadguidry,
Sorry for the delay in responding!
Thanks again for the prompt responses and helpful suggestions.
In between bites of turkey, I finally explored your suggestions for tackling this challenging task. Selective use of jsonize does indeed allow me to mimic nesting a bit more, but I get several rows with identical content, not true nesting. Is this something I'm doing wrong?
As suggested, I've added a thumbs-up to @answerquest's "feature request" (from Dec. 2018!). Indeed, a JSON exporter would be ideal.
I considered building the JSON object in the main interface; that's what I've done in the past. In this case, with three dozen columns, a three-level hierarchy, and multiple properties per nested object, I admit I'm afraid I'll get tangled up and make mistakes.
In the end, I resorted to a Python script (in a notebook) to get my objects nested correctly, relying on a unique identifier produced after some careful ordering in OpenRefine.
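For anyone landing here, this is a minimal sketch of that kind of post-processing, not the actual script. It assumes the rows were exported flat (one object per row) and already ordered, that a blank identifier cell means "same record as the row above" (as in record mode), and it reuses the placeholder names key_1 to key_4 and parentKey from the example above. The function name nest_records and its parameters are made up for illustration.

```python
import json


def nest_records(rows, record_key, parent_fields, child_fields, child_key="parentKey"):
    """Group ordered flat rows into nested records.

    A row starts a new record when its identifier cell is non-blank and
    differs from the current record's identifier. Rows with a blank (or
    repeated, i.e. filled-down) identifier are attached to the current record.
    """
    records = []
    current_id = None
    for row in rows:
        # Blank identifier: inherit the identifier of the record above.
        row_id = row.get(record_key) or current_id
        if row_id is None:
            raise ValueError("the first row has no record identifier")
        if row_id != current_id:
            # Parent-level values are taken from the record's first row.
            record = {field: row.get(field) for field in parent_fields}
            record[child_key] = []
            records.append(record)
            current_id = row_id
        # Every row contributes one child object to the current record.
        records[-1][child_key].append({field: row.get(field) for field in child_fields})
    return {"records": records}


# Hypothetical flat export: the second row belongs to the first record.
rows = [
    {"key_1": "id-1", "key_2": "b1", "key_3": "c1", "key_4": "d1"},
    {"key_1": "", "key_2": "", "key_3": "c2", "key_4": "d2"},
    {"key_1": "id-2", "key_2": "b2", "key_3": "c3", "key_4": "d3"},
]

nested = nest_records(rows, "key_1", ["key_1", "key_2"], ["key_3", "key_4"])
print(json.dumps(nested, indent=2))
```

A three-level hierarchy would need the same grouping applied again inside each child list. Note that this sketch cannot tell apart two consecutive records that share the same identifier, which is why the unique identifier and the ordering matter.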
A solution within OpenRefine is always preferable, of course.