Since the release, there has not been much talk about this feature, and there have been no user questions in the forum. In fact, the feature was broken from its first release, and nobody noticed for quite some time.
As I personally really enjoy this feature, I suggest giving a short presentation on what can be done with it, followed by a discussion of what is missing.
Format
Guided workshop. Probably 30-45 minutes long.
Session goals
Learn about this new feature and identify use cases.
The new clustering feature introduced in OpenRefine 3.9 allows users to define custom clustering functions. It is possible to combine several clustering algorithms into a single function to speed up the process, for example:
value.fingerprint().ngramFingerprint(2)
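To make the idea of chaining two keyers more concrete, here is a rough sketch in plain Python of what feeding a fingerprint key into an n-gram fingerprint could look like, as in the expression above. It only approximates OpenRefine's built-in keyers and is not their actual implementation. The function names `to_ascii`, `fingerprint`, `ngram_fingerprint` and `combined_key`, as well as the n-gram size of 2, are assumptions made for illustration.

```python
import re
import unicodedata


def to_ascii(text):
    # Drop diacritics (e.g. "ü" becomes "u"); a rough stand-in for ASCII normalization.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fingerprint(text):
    # Lowercase, remove punctuation, then sort the unique whitespace-separated tokens.
    cleaned = re.sub(r"[^\w\s]", "", to_ascii(text.strip().lower()))
    return " ".join(sorted(set(cleaned.split())))


def ngram_fingerprint(text, n=2):
    # Remove everything except letters and digits, then sort the unique n-grams.
    cleaned = re.sub(r"[\W_]", "", to_ascii(text.lower()))
    grams = {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}
    return "".join(sorted(grams))


def combined_key(text):
    # Chain both keyers: the fingerprint output is the n-gram keyer's input.
    return ngram_fingerprint(fingerprint(text), n=2)


if __name__ == "__main__":
    # Values that share a key would land in the same cluster.
    print(combined_key("Cruz, Tom") == combined_key("tom cruz"))
```

Values that produce the same combined key would end up in the same cluster, so a single pass covers differences in word order, punctuation and case.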
Custom clustering functions can also use standard expressions. For example, the replace() function can remove terms that may interfere with clustering. Example use case: removing common words such as “Place” or “Street” when clustering place names.
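The place-name use case can be sketched in plain Python as follows. This is only an illustration of the logic, not OpenRefine code: the term list `STOP_WORDS` and the helpers `strip_terms` and `place_key` are hypothetical, and the key function is a simplified fingerprint-style keyer.

```python
import re

# Hypothetical list of terms to ignore; adapt it to the data at hand.
STOP_WORDS = ("place", "street")


def strip_terms(text, terms=STOP_WORDS):
    # Replace whole-word matches of the given terms with a space, ignoring case.
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    return re.sub(pattern, " ", text, flags=re.IGNORECASE)


def place_key(text):
    # Remove the interfering terms first, then build a simple fingerprint-style key.
    cleaned = re.sub(r"[^\w\s]", "", strip_terms(text).lower())
    return " ".join(sorted(set(cleaned.split())))


if __name__ == "__main__":
    for name in ["Baker Street", "baker place", "Street, Baker"]:
        print(repr(name), "->", repr(place_key(name)))
```

In OpenRefine itself, the same idea would presumably be written as a single expression that applies replace() before a keyer, something along the lines of `value.replace(/\b(Place|Street)\b/, "").fingerprint()`; treat that expression as an untested assumption.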
Related tutorials and resources
Examples and tutorials related to clustering in German:
Participants also discussed calling external services to expand functionality. One approach is to call external functions via FastAPI using Jython in OpenRefine. For example