I have a spreadsheet of attendees from an event registration that I'm trying to clean up. Each ticket that an attendee may have purchased gets exported as a row. I'm trying to transform the sheet so that instead of having multiple rows for one person's different tickets, there is one row per attendee with their different ticket types listed in columns across one row. But also, some people only have one ticket type and therefore one row. I want to maintain the additional information in the other columns.
So for example, I want there to be one Kevin McAleese row that has columns saying I have a ticket for "Thursday" and "Friday & Saturday," plus columns for pronouns, WORC, etc.
I'm new to OpenRefine and I'm getting lost with the transposing... Any help?
Welcome to the OpenRefine community! I don't think you need to worry about transpose. Here's a quick sketch of way to do what you want:
- Create a Full Name column from First Name + " " + "Last Name"
- Duplicate the Full Name column as Normalized Name
- Move Normalized Name to the far left
- Deduplication Normalized Name by using clustering (steps 2-4 unnecessary if you're sure the names all have correct and consistent spellings)
- Blank Down on Normalized Name (or Full Name is skipping 2-4)
- Join Multivalued Cells on Registration Column using an appropriate separator character which won't conflict with your data
- Join Multivalued Cells on the remaining columns
- Optionally uniquify your list(s) by using a transform like value.split('|').uniques().join('|') (replacing | with your actual separator character)
I hope this sketch helps you get started.