The reconciliation service takes the column you reconcile on as its basic search tokens.
It is up to the reconciliation service where it searches for these tokens and how it scores the results.
The additional columns are used for filtering and/or boosting the results.
How exactly they are used also depends on the reconciliation service.
The numbers (scores) returned may also differ from one reconciliation service to another.
- Some attempt to give a score between 0 and 100 with 100 meaning a perfect match.
- Some will return numbers between 0 and 1.0 where 1 is a perfect match.
- Some will return some arbitrary numbers between 0 and 200 where generally bigger is better.
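If you want to compare scores from different services, one option is to map them onto a common 0 to 1 range. The following is only a minimal sketch: the function name and the example scales are assumptions for illustration, and you would have to find out each service's actual scale yourself.

```python
def normalize_score(score: float, scale_max: float) -> float:
    """Map a raw reconciliation score onto 0.0-1.0.

    scale_max is the maximum you ASSUME the service uses (100, 1.0, 200, ...).
    """
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    # Clamp the result, since a service may go beyond its nominal range.
    return max(0.0, min(score / scale_max, 1.0))


# Hypothetical raw scores from three differently scaled services.
examples = {"0-100 service": (83, 100), "0-1 service": (0.83, 1.0), "0-200 service": (166, 200)}
for name, (raw, scale_max) in examples.items():
    print(name, round(normalize_score(raw, scale_max), 2))
```

A normalized value is still only comparable within one service's own logic; it does not make a 0.83 from one service mean the same as a 0.83 from another.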
Column Name:
In the column Name I used only the name to reconcile against Wikidata.
Note that the painter has a score of 83 because my search token matches the title/name of the Wikidata object very well (but not perfectly).
The other items have a low score of around 30, because the search tokens "Rosaria" and "Quesada" can be found somewhere in the Wikidata object, but the fields where they occur are apparently considered less important.
Column url:
In the column url I used only the URL to reconcile against Wikidata.
As a result I receive only the painter's Wikidata URL, because it is the only item where a match could be found. But again the score is quite low, because my search token (the URL) and the title/name of the Wikidata object are quite different.
Column combined:
In the column combined I used the name "Rosario Quesada" as the search token but added the column "url" as property P973 to boost my search results.
Therefore the score of the painter's result is boosted (from 83 to 88) and the scores of the other items are lowered.
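To make the "name plus property" idea more concrete, here is a rough sketch of the kind of query batch a reconciliation client sends in this case. It follows the general shape of the Reconciliation Service API (a "query" string plus "properties" with "pid" and "v"), but the helper name, the limit and the example URL are assumptions for illustration, and the exact payload OpenRefine sends may differ between versions.

```python
import json


def build_reconciliation_query(name: str, url: str, limit: int = 5) -> dict:
    """Build a one-entry query batch: search by name, pass the URL as P973."""
    return {
        "q0": {
            "query": name,  # the main search token (column being reconciled)
            "limit": limit,
            # Additional column sent as a property; the service decides
            # whether it filters or boosts with it.
            "properties": [{"pid": "P973", "v": url}],
        }
    }


# Hypothetical URL value, standing in for the content of the "url" column.
payload = build_reconciliation_query(
    "Rosario Quesada", "https://example.org/artists/rosario-quesada"
)
# The batch is typically sent as a form field named "queries".
form_data = {"queries": json.dumps(payload)}
print(form_data["queries"])
```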
Using SPARQL:
The reconciliation protocol is used to search a database that sometimes works like a black box with unknown rules.
For a precise query like "Return the Wikidata item where P973 equals url", SPARQL is far more efficient. For this you could use the Wikidata Query Service and the "Add column by fetching URLs" feature in OpenRefine with an expression like this:
"https://query.wikidata.org/sparql?query=SELECT%20DISTINCT%20%3Fitem%20WHERE%20%7B%0A%20%20%3Fitem%20p%3AP973%20%3Fstatement0.%0A%20%20%3Fstatement0%20(ps%3AP973)%20%3C"+escape(value, "url")+"%3E.%0A%7D%0ALIMIT%201"
Please use a reasonable throttle delay so as not to overload the query service with automated requests via OpenRefine. You will receive XML as a result, from which you have to extract the Wikidata QID, e.g. via a regular expression like:
(?<=http:\/\/www\.wikidata\.org\/entity\/)Q\d+
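The percent-encoded fetch expression is hard to read, so here is a sketch in Python that builds an equivalent request URL from the plain SPARQL text. The function name and the example URL are assumptions for illustration; the query itself is the decoded form of the expression used for fetching.

```python
from urllib.parse import quote


def build_query_url(url_value: str) -> str:
    """Return a Wikidata Query Service URL that looks up an item by P973."""
    sparql = (
        "SELECT DISTINCT ?item WHERE {\n"
        "  ?item p:P973 ?statement0.\n"
        f"  ?statement0 (ps:P973) <{url_value}>.\n"
        "}\n"
        "LIMIT 1"
    )
    # Percent-encode the whole query, including the embedded URL value.
    return "https://query.wikidata.org/sparql?query=" + quote(sparql, safe="")


# Hypothetical value of the "url" column.
print(build_query_url("https://example.org/artists/rosario-quesada"))
```

This only builds the URL and does not send a request; the throttling advice still applies to whatever tool actually fetches it.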
Example response:
<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
<head>
<variable name='item'/>
</head>
<results>
<result>
<binding name='item'>
<uri>http://www.wikidata.org/entity/Q23902679</uri>
</binding>
</result>
</results>
</sparql>
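If you process such responses outside OpenRefine, the QID can be pulled out as sketched below. This is only one possible approach: it parses the XML with Python's standard library and then applies the same lookbehind pattern to the item URI. The function name is an assumption for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

RESPONSE = """<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
<head>
<variable name='item'/>
</head>
<results>
<result>
<binding name='item'>
<uri>http://www.wikidata.org/entity/Q23902679</uri>
</binding>
</result>
</results>
</sparql>"""

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}
QID_PATTERN = re.compile(r"(?<=http://www\.wikidata\.org/entity/)Q\d+")


def extract_qid(xml_text: str) -> Optional[str]:
    """Return the first QID found in a SPARQL XML result, or None."""
    # Parse from bytes so the encoding declaration is handled by the parser.
    root = ET.fromstring(xml_text.encode("utf-8"))
    uri = root.find(".//sr:uri", NS)
    if uri is None or uri.text is None:
        return None  # no match for this row
    match = QID_PATTERN.search(uri.text)
    return match.group(0) if match else None


print(extract_qid(RESPONSE))
```

Returning None for an empty result set makes rows without a match easy to spot.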