Extra backslashes in MARC fields after OpenRefine TSV export

I’m working with MARCEdit and OpenRefine and running into an issue with backslashes.

Here’s my workflow:

  1. I export MARC data from MARCEdit in JSON format using the “Export for OpenRefine” option.

  2. I upload that JSON file into OpenRefine, clean the data, and then export it as a TSV.

  3. I re‑import the TSV back into MARCEdit using the “Import from OpenRefine” option.

The problem: after re‑import, every backslash in the MARC fields is doubled. For example, the correct 020 field should look like:

=020 \\$a9789351611066

But after import it becomes:

=020 \\\\$a9789351611066

Has anyone else encountered this? Is there a way to configure OpenRefine’s TSV export (or MARCEdit’s import) so that the backslashes are not doubled?

I am using OpenRefine version 3.10.0. My colleague is using version 3.5.1 and does not see this problem there.

Does MARCEdit really export JSON, but want TSV for import? That seems unusual. Can you tell where the extra backslash escaping is being added, i.e. is it already in the exported TSV?

Yes. When we export from MARCEdit for OpenRefine, it converts the data to JSON. We then upload that JSON file into OpenRefine.

Once we clean the data and export it as a TSV file, we can open it in MARCEdit using the option Import from OpenRefine.

That is when the extra backslashes appear: they are added when we export the data from OpenRefine as a .TSV file.

For example, before cleaning the data in OpenRefine:

=LDR 09130nam a2200769Ia 4500
=001 \\$c38638$d38638
=008 231226s9999||||xx\||||||||||||||\||und||
=020 \\$a9789353162511
=040 \\$aRGPV
=041 \\$aEnglish
=082 \\$a621.38$bKAL
=100 \\$aKALSI$92225
=245 \0$aElectronic Instrumentation And Measurements 4e
=260 \\$bMc Graw Hill$c2019$aNew Delhi
=300 \\$a994$e1

After cleaning in OpenRefine, exporting a .TSV file, and opening it again in MARCEdit:

=LDR 09130nam a2200769Ia 4500
=001 \\\\$c38638$d38638
=008 231226s9999||||xx\\||||||||||||||\\||und||
=020 \\\\$a9789353162511
=040 \\\\$aRGPV
=041 \\\\$aEnglish
=082 \\\\$a621.38$bKAL
=100 \\\\$aKALSI$92225
=245 \\0$aElectronic Instrumentation And Measurements 4e
=260 \\\\$bMc Graw Hill$c2019$aNew Delhi
=300 \\\\$a994$e1

You can see the difference: every backslash has been doubled.

I have also checked the .TSV file in Notepad++, and the doubled backslashes appear there too.

Yes, unfortunately this seems to be a behavior change in the new exporter that we adopted in 2023, and the exporter doesn't offer an option to control it. It always escapes TAB, CR, LF, and backslash.

I've created issue #7704 to track this and created a patch for it.
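
In the meantime, one possible workaround is to post-process the exported TSV before importing it into MARCEdit. The following is only a minimal sketch, not part of OpenRefine or MARCEdit: it assumes the exporter writes each literal backslash as two backslashes, and the function names and file paths are hypothetical. It leaves any escaped TAB, CR, or LF sequences untouched, since turning those back into real characters would break the TSV layout.

```python
def collapse_escaped_backslashes(text: str) -> str:
    """Turn each escaped backslash pair back into a single backslash.

    Assumption: the TSV exporter represents a literal backslash as two
    backslashes. Other escape sequences are deliberately left alone.
    """
    return text.replace("\\\\", "\\")


def fix_tsv(src_path: str, dst_path: str) -> None:
    """Copy a TSV file line by line, collapsing doubled backslashes.

    newline="" keeps the original line endings unchanged.
    """
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(collapse_escaped_backslashes(line))


if __name__ == "__main__":
    # Hypothetical file names; adjust to your own export.
    fix_tsv("openrefine_export.tsv", "openrefine_export_fixed.tsv")
```

Run against a copy of the export first and compare a few records, because a file produced by an older OpenRefine version that did not escape backslashes would be damaged by this step.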

Tom