Data cleaning (emoji)

Hi there, I’m analyzing some Twitter data and usually use regular expressions to transform text.
For example, I use this to delete all words in a column except hashtags:
value.replace(/(?<!\S)[^#]\S*/, "")

Could someone suggest a similar one for deleting all text except emojis?

Thanks a lot

To delete all text except emojis, I would use the Unicode “Other Symbol” category (So) and negate it with an uppercase P, so that \P{So} matches every character that is not in that category.

value.replace(/\P{So}+/, "")

For details on Unicode categories, read Unicode Categories on Regular-Expressions.info.
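
If you want to check which characters the So category actually keeps before running this on a whole column, here is a rough sketch in Python, outside OpenRefine. It is not the GREL expression itself, and the function name keep_other_symbols is made up for this illustration. Python's built-in re module does not support \p{...} classes, so the sketch looks up each character's category with unicodedata instead.

```python
import unicodedata


def keep_other_symbols(text: str) -> str:
    """Keep only characters whose Unicode general category is So (Other Symbol).

    Hypothetical helper for illustration; it mimics what removing every
    \\P{So} run would leave behind, one character at a time.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) == "So")


if __name__ == "__main__":
    # Letters, spaces, and "#" (category Po) are dropped; the emojis remain.
    print(keep_other_symbols("Good morning 😀 #coffee ☕"))
    # Skin-tone modifiers are category Sk, so only the base emoji survives.
    print(keep_other_symbols("nice 👍🏽"))
```

Two things to keep in mind: So is broader than emojis (symbols such as © and ® are in it too), and parts of composed emojis, such as skin-tone modifiers, zero-width joiners, and variation selectors, belong to other categories and would be removed.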


Many thanks. It works!