I have a basic question about cleaning data.
I’d like to use my list of correct product names to clean a data set. Its product name column contains many typos and other errors, but also some completely new products, which I’d like to add to my list at the end of the process.
Ideally, I would build a facets JSON file with a growing “from” list for each of my “to” items.
OpenRefine 3.7.2
I think you can create a new column and apply the cleaning there. Then use “Edit cells” → “Replace…” to replace the incorrect product names with the correct ones. The dialog has a regular expression option, so a single pattern can match several variations of the same product name.
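To make the “from”/“to” idea concrete outside the UI, here is a minimal Python sketch. Everything in it is illustrative: the product names, the `CORRECTIONS` mapping and the function names are made up, and the JSON layout in `to_mass_edit` follows what I remember of OpenRefine's exported `core/mass-edit` operations, so compare it with an export from your own project before relying on it.

```python
import json

# Hypothetical mapping: each correct ("to") name with its known variants ("from").
CORRECTIONS = {
    "Widget Pro": ["Wiget Pro", "Widgett Pro"],
    "Gadget Mini": ["Gadgit Mini"],
}

def build_lookup(corrections):
    """Invert {to: [from, ...]} into {normalized from: to}."""
    lookup = {}
    for to_name, from_names in corrections.items():
        for name in [to_name, *from_names]:
            # Ignore case and surrounding whitespace when comparing names.
            lookup[name.strip().casefold()] = to_name
    return lookup

def clean_names(raw_names, corrections):
    """Return (cleaned values, names that are not in the list yet)."""
    lookup = build_lookup(corrections)
    cleaned, unknown = [], []
    for raw in raw_names:
        key = raw.strip().casefold()
        if key in lookup:
            cleaned.append(lookup[key])
        else:
            # Either a new product or a typo nobody has mapped so far.
            cleaned.append(raw)
            if raw not in unknown:
                unknown.append(raw)
    return cleaned, unknown

def to_mass_edit(corrections, column_name):
    """Build an operation list in the assumed core/mass-edit layout."""
    edits = [
        {"from": from_names, "fromBlank": False, "fromError": False, "to": to_name}
        for to_name, from_names in corrections.items()
    ]
    return [{
        "op": "core/mass-edit",
        "engineConfig": {"facets": [], "mode": "row-based"},
        "columnName": column_name,
        "expression": "value",
        "edits": edits,
    }]

if __name__ == "__main__":
    cleaned, unknown = clean_names(["Wiget Pro", "gadget mini ", "Sprocket X"], CORRECTIONS)
    print(cleaned)
    print(unknown)
    print(json.dumps(to_mass_edit(CORRECTIONS, "product name"), indent=2))
```

The `unknown` list is what you would review by hand: real new products get added as new “to” entries, and typos get appended to an existing “from” list.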
Isn’t the reconciliation feature a good fit for this?
The reconciliation feature works fine too, but it is still worth creating a new column in your data set and applying the reconciliation there, so the original values stay intact.
Then use “Reconcile” → “Start reconciling…” in that column's menu to match the values in your data set against a reconciliation service.
However, which approach works best depends on the specific needs of your project.
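If it helps to see what approximate matching against a list of correct names looks like, here is a rough Python sketch using the standard library's `difflib`. This is not how OpenRefine's reconciliation works internally; it is only a stand-in to show the idea of accepting close matches and flagging everything else for review. The names and the 0.8 cutoff are arbitrary assumptions, and the comparison is case-sensitive.

```python
from difflib import get_close_matches

# Hypothetical list of correct product names.
KNOWN_PRODUCTS = ["Widget Pro", "Gadget Mini"]

def suggest(raw_name, known=KNOWN_PRODUCTS, cutoff=0.8):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = get_close_matches(raw_name.strip(), known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

if __name__ == "__main__":
    for raw in ["Wiget Pro", "Gadgit Mini", "Sprocket X"]:
        # None means: probably a new product, check it by hand.
        print(raw, "->", suggest(raw))
```

A lower cutoff catches more typos but also produces more false matches, so anything near the threshold is worth checking before it goes into the “from” list.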