Dear all,
We have a dataset (around 750K records) obtained from the Medline/PubMed baseline repository and curated in OpenRefine. It is meant for an experiment with a machine learning framework that takes training data in the following TSV format: column 1 holds the text and column 2 holds the MeSH descriptor(s) (URIs separated by a space).
Before using this dataset to train different machine learning backends, we want to arrange the records in random order, both to avoid any bias from their current sequence and to obtain a representative sample of Medline/PubMed (which has around 3000K bibliographic records up to 2023).
How can we randomize the order of rows/records in this dataset within OpenRefine? We are running OpenRefine 3.8.4 with the GoKB extension.
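To make the goal concrete, here is a minimal Python sketch of the operation we would otherwise have to run outside OpenRefine on the exported TSV. It is only an illustration of the intended result, not something OpenRefine provides: the function names `shuffle_rows` and `shuffle_tsv` and the `seed` parameter are our own hypothetical choices, and the sketch assumes the file has no header row and exactly one record per line.

```python
import random


def shuffle_rows(rows, seed=None):
    """Return a new list with the rows in random order; the input is not modified."""
    rng = random.Random(seed)  # passing a fixed seed makes the order reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled


def shuffle_tsv(src_path, dst_path, seed=None):
    """Shuffle the lines of a headerless TSV file (assumed: one record per line)."""
    with open(src_path, encoding="utf-8") as src:
        lines = [line.rstrip("\n") for line in src]
    with open(dst_path, "w", encoding="utf-8") as dst:
        for line in shuffle_rows(lines, seed):
            dst.write(line + "\n")
```

Each line is treated as an opaque string, so the text and the space-separated MeSH URIs in a record stay together. We would prefer to achieve the same effect inside OpenRefine if that is possible.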
Thanks and regards,
-Parthasarathi