Adding error reporting with Sentry

As I am testing the new architecture before a 4.0-alpha2 release, I often find myself in situations where some errors are thrown in the backend or frontend, which I then fix. When this new version gets tested more extensively by other people, more of those errors will inevitably happen and I would like to streamline the process of reporting those.

As proposed in #4332, I think it would be really useful to have an automated error-reporting mechanism.

It would work like this:

  • In the frontend, we would set up a generic handler to catch any uncaught exception or any failing HTTP request to the backend.
  • When such an event happens, we would show a dialog notifying the user about the error and offering to report it.
  • In a dedicated view (hidden by default), the user would be able to see all the information included in the report (exception details, stack traces, server logs, system information?). We could also consider letting them disable some of those.
  • If the user chooses to report the issue, the report would be submitted to Sentry (hosted on sentry.io, using a free plan for open source projects).
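
To make the flow above more concrete, here is a rough TypeScript sketch of the report-building and consent steps. Everything in it is an assumption for illustration: the `ErrorReport` fields, the `ReportOptions` toggles, and the `confirm` / `submit` callbacks are hypothetical names, not existing OpenRefine or Sentry APIs. The actual submission is left as a pluggable function so that the sketch does not depend on any particular provider.

```typescript
// Hypothetical shape of a report; field names are illustrative only.
interface ErrorReport {
  message: string;
  stack?: string;
  serverLogs?: string;
  systemInfo?: string;
}

// Optional sections the user can switch off before submitting.
interface ReportOptions {
  includeStack: boolean;
  includeServerLogs: boolean;
  includeSystemInfo: boolean;
}

// Extra data gathered by the application (assumed to be available).
interface ReportContext {
  serverLogs?: string;
  systemInfo?: string;
}

// Build the report, keeping only the sections the user left enabled.
function buildReport(
  error: unknown,
  context: ReportContext,
  options: ReportOptions
): ErrorReport {
  // Thrown values are not always Error instances, so normalize first.
  const err = error instanceof Error ? error : new Error(String(error));
  const report: ErrorReport = { message: err.message };
  if (options.includeStack && err.stack !== undefined) {
    report.stack = err.stack;
  }
  if (options.includeServerLogs && context.serverLogs !== undefined) {
    report.serverLogs = context.serverLogs;
  }
  if (options.includeSystemInfo && context.systemInfo !== undefined) {
    report.systemInfo = context.systemInfo;
  }
  return report;
}

// Placeholders for the dialog and for the provider-specific upload.
type ConfirmFn = (report: ErrorReport) => Promise<boolean>;
type SubmitFn = (report: ErrorReport) => Promise<void>;

// Show the report to the user and submit it only if they agree.
async function handleError(
  error: unknown,
  context: ReportContext,
  options: ReportOptions,
  confirm: ConfirmFn,
  submit: SubmitFn
): Promise<boolean> {
  const report = buildReport(error, context, options);
  const agreed = await confirm(report);
  if (agreed) {
    await submit(report);
  }
  return agreed;
}
```

In a browser, `handleError` could be called from listeners on the standard `error` and `unhandledrejection` window events, and from wherever failed HTTP requests are detected. Nothing is sent unless `confirm` resolves to `true`.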

To avoid being locked in by a provider, we could consider making the Sentry instance URL configurable, for instance by storing it in a JSON file hosted on openrefine.org, which we could update later on if we want to migrate to a different provider. (For instance, Sentry can be self-hosted.)
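
As a sketch of how the configurable endpoint could be read, assuming a JSON file with a single hypothetical key `reportingEndpoint` (the key name and the example URLs are made up for illustration), the client could fall back to a built-in default whenever the file is missing, malformed, or does not contain an HTTPS URL:

```typescript
// Pick the reporting endpoint from a JSON config string, or use the fallback.
// The "reportingEndpoint" key is a hypothetical name, not an existing setting.
function resolveEndpoint(configJson: string, fallback: string): string {
  try {
    const parsed: unknown = JSON.parse(configJson);
    if (typeof parsed === "object" && parsed !== null) {
      const candidate = (parsed as Record<string, unknown>)["reportingEndpoint"];
      // Only accept HTTPS URLs, since reports may contain sensitive data.
      if (typeof candidate === "string" && candidate.indexOf("https://") === 0) {
        return candidate;
      }
    }
  } catch (e) {
    // Malformed JSON: ignore it and use the fallback below.
  }
  return fallback;
}

// Example with made-up URLs:
const endpoint = resolveEndpoint(
  '{"reportingEndpoint": "https://errors.example.org/submit"}',
  "https://default.example.org/submit"
);
```

Migrating to another provider would then only require updating the hosted JSON file, as long as the new provider accepts the same report format.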

I am not sure which criterion we would use to decide who has access to the Sentry instance. Error reports can include sensitive information (even though we can try to set up some anonymizing heuristics, they never work perfectly). But in my experience, reviewing the reports that come in is quite some work, so I would be keen to spread the load across trusted contributors who would be interested in this work.

On top of this, we could consider making it possible to generate a report even if no exception was thrown anywhere, for instance to share it when submitting a support request on the forum or a GitHub issue. This would be particularly useful on macOS, where accessing the logs manually is quite difficult. Making this possible would address an important part of #5557 (I don’t think we can/want to automate the submission of the generated report to the forum).

In 4.0, I have done some preliminary work to improve the error reporting in the backend, for instance by making sure the commands return meaningful HTTP status codes. This makes it a lot easier to detect in the frontend when a problem happened.
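
With meaningful status codes, the frontend check can be as simple as the sketch below. The category names and the choice to treat every 4xx and 5xx response as a failure worth surfacing are assumptions made for illustration, not the behavior of the current code.

```typescript
// Coarse classification of an HTTP response status (illustrative names).
type StatusOutcome = "ok" | "client_error" | "server_error" | "other";

function classifyStatus(status: number): StatusOutcome {
  if (status >= 200 && status < 300) return "ok";
  if (status >= 400 && status < 500) return "client_error";
  if (status >= 500 && status < 600) return "server_error";
  return "other"; // informational, redirects, or out-of-range values
}

// Assumed policy: any 4xx or 5xx response counts as a failed request.
function isFailedRequest(status: number): boolean {
  const outcome = classifyStatus(status);
  return outcome === "client_error" || outcome === "server_error";
}
```

A stricter policy, such as only offering a report for 5xx responses, would be a one-line change in `isFailedRequest`.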

What do you think?

I think we should rather implement an open metrics protocol of some kind. Sentry might be open source but they play the lock-in card.

Ok… do you have more info to share about this lock-in? Are there any alternatives you would recommend? Is this open metrics protocol a custom submission system we would implement ourselves?
So far we don’t have any server for the project where we can run things, so going for existing hosted solutions is generally quite tempting.

Sentry defaults to its own protocol, and while it’s open source, it’s a massive pain to self-host given their focus on SaaS and enterprise. OpenTelemetry and OpenMetrics are two open alternatives protocol-wise, and at least OpenTelemetry appears to be supported by most vendors.

I’m very supportive of the broader use case and would be rather interested in it for our own use, and by supporting something like OpenTelemetry I would be able to point OpenRefine to our existing Prometheus instance and be good to go.