OpenRefine is designed to run on a local machine, and our documentation about running it as a remote server features some prominent warnings.
At the moment I see:
- considerable interest from users in hosted versions: for instance, the documentation about OpenRefine on Wikidata recommends running OpenRefine on PAWS, a cloud hosting service of the Wikimedia movement
- some tensions in the dev team about how to approach this topic: whether we could officially support this use case, whether the impact of a particular change in a hosted use case should be taken into account, whether to accept a security advisory as a vulnerability, and so on.
From my perspective, we are struggling to find the right balance between meeting users' interest by advertising OpenRefine as a tool that can be hosted, and keeping users safe by warning them about the security implications of running a tool in an environment it is not designed for.
I think it is worth discussing this topic as an attempt to improve the situation.
Our documentation on this topic is not really satisfactory: we do not do a good job of explaining what sort of problems come with running a hosted OpenRefine. Also, I think it is worth not lumping all use cases together. I see two aspects of the problem:
- running OpenRefine's server on a different machine than the one running the browser used to access it
- having different users access the same OpenRefine instance, potentially concurrently
In my opinion, each of those comes with its own issues, and the two sets of issues are fairly distinct. Do you see other useful distinctions to make between various "hosted" use cases?
I would like to document the known issues with using OpenRefine in those different contexts and improve the documentation accordingly. Once that is done, it would provide a good basis for classifying security vulnerabilities, because reports that fall within the scope of those known issues can be rejected on that basis.
Then, I would like us to reach an agreement on which use cases we are interested in eventually supporting, meaning that we would welcome improvements that address usability or security issues in those contexts.
If you agree with the approach, let's start mapping the different sorts of hosted use cases and the issues we are aware of for them.