Best practices with SKOS files


I'm interested in best practices for working with SKOS files in OpenRefine (OR). Any suggestions?

Thanks in advance.

Hi @olea - are you trying to do something directly to the SKOS files in OpenRefine, or are you trying to match/reconcile data in an OpenRefine project against data in the SKOS?

At first I was interested in how practical OR is for processing the content of SKOS files.

Now I'm curious about the matching task. How do people get it done? By setting up an ad hoc reconciliation service?

You should be able to import SKOS in various RDF formats into OpenRefine - rdf/xml, ttl, nt etc. Generally the more compact the syntax, the smaller the file, and the easier it is to import into OR (i.e. you may find that nt is more efficient than rdf/xml for the same SKOS when importing to OR).

Once in OpenRefine, it's likely that a single subject will have multiple related rows - so you will want to work in Records mode to keep those rows together.

I can't say I've got much experience beyond that and it might depend on the SKOS you are working with, the properties included and what task you are trying to complete. Essentially each property should end up as a column which means the column names can be a bit verbose e.g. "SKOS Simple Knowledge Organization System Namespace Document - HTML Variant, 18 August 2009 Recommendation Edition"

I will say that the OpenRefine records mode isn't perfect - you have to understand the structure of the SKOS and how it has come into OR. Essentially, don't treat each 'row' as a set of connected statements - the values that share a row may not be related to each other.
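To make the point about rows concrete, here is a rough sketch in plain Python of how a multi-valued concept can end up laid out as a record. It is only an approximation of what an RDF import produces, not OpenRefine's actual importer code; the `ex:` identifiers, the sample triples and the `to_record_rows` helper are all made up for illustration.

```python
from collections import defaultdict
from itertools import zip_longest

SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical triples describing a single concept (illustrative data only).
triples = [
    ("ex:cat", SKOS + "prefLabel", "cat"),
    ("ex:cat", SKOS + "altLabel", "feline"),
    ("ex:cat", SKOS + "altLabel", "kitty"),
    ("ex:cat", SKOS + "broader", "ex:mammal"),
    ("ex:cat", SKOS + "broader", "ex:pet"),
]

def to_record_rows(triples):
    """Lay triples out as one column per property, one record per subject.

    The subject appears only on the first row of its record; extra values
    of a property spill onto following rows with a blank subject cell.
    """
    by_subject = defaultdict(lambda: defaultdict(list))
    for subject, prop, value in triples:
        by_subject[subject][prop].append(value)

    columns = ["subject"] + sorted({prop for _, prop, _ in triples})
    rows = []
    for subject, props in by_subject.items():
        value_lists = [props.get(col, []) for col in columns[1:]]
        # Values are paired up by position only, not by meaning.
        for i, values in enumerate(zip_longest(*value_lists, fillvalue="")):
            rows.append([subject if i == 0 else ""] + list(values))
    return columns, rows

columns, rows = to_record_rows(triples)
for row in [columns] + rows:
    print(row)
```

In this toy layout the second row holds "kitty" next to "ex:pet" purely because both happen to be the second value of their property. That is the sense in which a row is not a set of connected statements: only the record as a whole describes the concept.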

In terms of matching - I asked because I was wondering about this thread: "RDF extension missing clear matches".

Also worth mentioning regarding SKOS and matching: SkoHub, and in particular its SkoHub Reconcile part (which seems to be down at the moment, but I am sure it's temporary).
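If you just want a feel for the matching task before setting up a reconciliation service, a small local script can go some way. The sketch below is plain fuzzy string matching with Python's standard `difflib`, run against a hand-built index of concept labels. It is not how SkoHub Reconcile or the RDF extension score candidates, and the concept URIs, labels, the `best_matches` helper and the 0.8 threshold are all assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical index: concept URI -> its prefLabel and altLabels.
concepts = {
    "http://example.org/c/1": ["Cats", "Felines"],
    "http://example.org/c/2": ["Dogs", "Canines"],
}

def normalise(text):
    """Lower-case and collapse whitespace so trivial differences don't matter."""
    return " ".join(text.lower().split())

def best_matches(value, concepts, threshold=0.8):
    """Return (uri, score) pairs whose best label similarity meets the threshold."""
    query = normalise(value)
    scored = []
    for uri, labels in concepts.items():
        # Score a concept by its closest label, preferred or alternative.
        score = max(
            SequenceMatcher(None, query, normalise(label)).ratio()
            for label in labels
        )
        if score >= threshold:
            scored.append((score, uri))
    # Highest-scoring candidates first.
    return [(uri, round(score, 2)) for score, uri in sorted(scored, reverse=True)]

print(best_matches("cats ", concepts))
print(best_matches("zebra", concepts))
```

Scoring against alternative labels as well as the preferred one matters for SKOS, since the value in your data may well match an altLabel rather than the prefLabel. A real reconciliation service would add type and property constraints on top of this.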