Problems importing from JSON and RDF files

I’m having problems creating a project from JSON and RDF sample files. The data in both files is the same: an export from a MediaWiki website (https://tunearch.org/) whose data I would like to upload to Wikidata. I already have their permission.

I attach a zip with the sample files.

The problem with both files is that they don’t create the right project, which should contain only seven records, whichever record path I specify for the JSON file. There is nothing I can configure for the RDF file, but it has the same problem. I wonder if there is a problem with their format.

The import should produce just those seven records, one per tune, but the projects created contain anything but those records.

I would appreciate it if someone could confirm whether they can create a project from these files, or whether they see the same problem as I do.

Thank you very much.

sample-files.zip (3.8 KB)

I can confirm that I see the same problem with the JSON. The problem is caused by the use of the tune’s name as a JSON property, rather than a consistent property that is the same for each tune. E.g. we see:

 "results": {
        "Green Joke (The)": {
            "printouts": {
... 
 },
        "Harp that in Darkness": {
            "printouts": {
                "Also known as": [
etc.

rather than something more like:

 "results": {
        "tune": {
            "name": "Green Joke (The)",
            "printouts": {
... 
 },
        "tune": {
            "name": "Harp that in Darkness",
            "printouts": {
                "Also known as": [
etc.

The use of a different property for each entry in the results object means that there isn’t a way for OpenRefine to understand the structure correctly.
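
If the export itself can’t be changed, one workaround is to restructure the JSON before importing, so that the tune name becomes a value rather than a property name. Here is a rough Go sketch of that idea (untested; the file names and the output "tunes"/"name" keys are placeholders of my own, and I’m assuming "results" sits at the top level as in the snippet above):

package main

import (
	"encoding/json"
	"fmt"
	"os"
)

func main() {
	// Read the Semantic MediaWiki JSON export (placeholder file name).
	raw, err := os.ReadFile("tunes.json")
	if err != nil {
		panic(err)
	}

	var doc map[string]json.RawMessage
	if err := json.Unmarshal(raw, &doc); err != nil {
		panic(err)
	}

	// "results" maps each tune name to its record, as in the snippet above.
	var results map[string]map[string]any
	if err := json.Unmarshal(doc["results"], &results); err != nil {
		panic(err)
	}

	// Hoist each key into a "name" property so every record has the same shape.
	tunes := make([]map[string]any, 0, len(results))
	for name, record := range results {
		record["name"] = name
		tunes = append(tunes, record)
	}

	out, err := json.MarshalIndent(map[string]any{"tunes": tunes}, "", "  ")
	if err != nil {
		panic(err)
	}
	if err := os.WriteFile("tunes-fixed.json", out, 0o644); err != nil {
		panic(err)
	}
	fmt.Printf("wrote %d tunes\n", len(tunes))
}

With every tune sitting in the same array, OpenRefine’s JSON importer can then treat each array element as one record.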

However, the RDF basically works for me. I get 7 records plus some lines generated from the
<owl:DatatypeProperty> and <owl:ObjectProperty> statements. The rest looks, at a glance, OK to me - but as I’m not familiar with the data I could easily be missing something.

How does the project created by the RDF import look to you?

Thanks, @ostephens. I realize now that the JSON has that problem. I also see in the RDF the extra entries you mention, which is quite messy. At least I can confirm that it’s not me but the files, which will not be of much use. A simple CSV is much more useful for now.

What seems odd to me is that those files are generated by the Semantic MediaWiki extension, and I assumed they would be compatible with OpenRefine.

Although there is good integration between OpenRefine and various MediaWiki platforms, they are completely separate projects - OpenRefine is not a MediaWiki project. So, as far as I know, there has never been any discussion of how OpenRefine should work with the Semantic MediaWiki extension. I hadn't come across this extension until you mentioned it here, so I don't really know how it works, but a brief glance does suggest that it has quite a lot of configuration options - so possibly the way it's been configured for tunearch is also an issue, but I really don't know :frowning:

Is there a SPARQL endpoint for the Traditional Tune Archive?

No, there is not. It’s probably a matter of configuration. Thanks anyway. I’ll use the CSV files.

If you are not afraid of the Go programming language, you could try ojg, which is an extremely fast JSON parser that recently landed a modify() function in its JSONPath expressions: jp feature request: Set() that only replaces existing values · Issue #99 · ohler55/ojg (github.com)

You could ask Peter in an issue whether and how it might help you to move things around.
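
For a flavour of it, something along these lines, using ojg’s jp package for the JSONPath query (an untested sketch with made-up sample data; this only shows the Get side, not the new modify()):

package main

import (
	"fmt"

	"github.com/ohler55/ojg/jp"
	"github.com/ohler55/ojg/oj"
)

func main() {
	// A tiny stand-in for the tunearch export discussed above.
	src := `{"results": {
		"Green Joke (The)": {"printouts": {}},
		"Harp that in Darkness": {"printouts": {"Also known as": []}}
	}}`

	data, err := oj.ParseString(src)
	if err != nil {
		panic(err)
	}

	// JSONPath: every child of "results", whatever its property name is.
	x, err := jp.ParseString("$.results.*")
	if err != nil {
		panic(err)
	}
	for _, tune := range x.Get(data) {
		fmt.Println(oj.JSON(tune))
	}
}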
