I need to find out how many quotes in the column “quotes” in my document have words with 9 or more letters. This has to happen via “add column based on this column”. Does anyone know which GREL function I need for this?
A GREL expression could be something like forEach(value.split(' '),v,length(v)). It takes the text, splits it on spaces, and gives the length of each word. The result is an array of numbers representing the word lengths.
Likewise, you could use forEach(value.split(' '),v,(length(v)>8).toString()).inArray("true"). The result will be only “true” or “false”, depending on whether at least one word has 9 or more letters.
PS: there is probably a more elegant way to do it.
The challenge with counting words is that you have to decide what qualifies as a new word. The way it’s done for the built-in word text facet is:
value.split(/(?U)[\W]+/)
This will split a value into an array of words based on the regular expression “non-word” characters - this will catch situations where two words are separated only by punctuation (the example from @Nicolas_VIGNERON will only split based on a space character so would miss situations where only punctuation was used). But this also means hyphenated words would count as two words instead of one.
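If hyphenated words should stay together, one possible variation (a sketch, not what the built-in facet does) is to split on everything that is neither a word character nor a hyphen:

```
value.split(/(?U)[^\w\-]+/)
```

With this pattern a hyphenated word stays as one token and the hyphen counts toward its length, while a free-standing dash between spaces becomes a one-character token of its own.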
Anyway - whatever expression you decide to use in the split, you then need to iterate over the array to find words of 9 or more letters. I’d suggest
filter(value.split(/(?U)[\W]+/),v,v.length()>8)
This gives you an array for the quote that only contains words of 9 letters or more. From there you could use a further length check to get the number of such words per quote, or you could just check that the filtered array has non-zero length to get a true/false output:
filter(value.split(/(?U)[\W]+/),v,v.length()>8).length()>0
The output of this would be true = there is at least one word 9 or more letters long in the quote, false = there are no words in the quote that are 9 or more letters long
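Dropping the final comparison gives the count per quote instead of true/false. A sketch using the same split as above:

```
filter(value.split(/(?U)[\W]+/),v,v.length()>8).length()
```

For a cell containing “a universal truth” this should give 1, since only “universal” has 9 letters or more. To count the quotes, either version can go in the new column, followed by a text facet on that column.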
Thank you to both @Nicolas_VIGNERON and @ostephens for your replies!
I actually happen to have the same assignment as OP; while I imagine the solutions you’ve given would work, we haven’t seen the filter() function or the ?U operator in our class–only more basic Regex/GREL.
One of my attempts to solve this looks like this:
value.match(/.*(\b\w{9,}\b).*/).toString()
The problem is that it only returns the last instance of a word that is nine letters or more in each cell. For example, if the cell content is:
This life is what you make it. No matter what, you're going to mess up sometimes, it's a universal truth. But the good part is you get to decide how you're going to mess it up. Girls will be your friends - they'll act like it anyway. But just remember, some come, some go. The ones that stay with you through everything - they're your true best friends. Don't let go of them. Also remember, sisters make the best friends in the world. As for lovers, well, they'll come and go too. And baby, I hate to say it, most of them - actually pretty much all of them are going to break your heart, but you can't give up because if you give up, you'll never find your soulmate. You'll never find that half who makes you whole and that goes for everything. Just because you fail once, doesn't mean you're gonna fail at everything. Keep trying, hold on, and always, always, always believe in yourself, because if you don't, then who will, sweetie? So keep your head high, keep your chin up, and most importantly, keep smiling, because life's a beautiful thing and there's so much to smile about.
then my function only returns
[beautiful]
whereas the contents include other words of nine letters or more (for example “sometimes”, “universal”, “everything” and “importantly”).
My understanding is that the array returned by the match() function will only contain as many elements as there are pairs of parentheses in the pattern given as an argument to the function (in my case, (\b\w{9,}\b) is the part of the pattern that is in parentheses).
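A small illustration of that behaviour, using a made-up string rather than the real data: match() has to match the whole value, and the first .* is greedy, so it consumes as much text as it can before the single capture group gets its turn.

```
"a universal truth about everything".match(/.*(\b\w{9,}\b).*/)
```

This should return an array with one element, everything, even though universal also qualifies: one pair of parentheses gives one captured value, and the greedy .* pushes that capture to the last qualifying word.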
As we haven’t seen any recursion in class, I tried just adding that pattern multiple times with the ? operator, like this:
value.match(/.*(\b\w{9,}\b)?.*(\b\w{9,}\b)?.*(\b\w{9,}\b).*/).toString()
but that then returns a null object in the array; for the quote above, for example, it returns
[null, null, beautiful]
and if I try different combinations of operators, like this:
value.match(/.*(\b\w{9,}\b)?.*(\b\w{9,}\b).*(\b\w{9,}\b).*/).toString()
then I only get matches for quotes with at least two words of the desired minimum length.
Do you know if there is any combination of operators or a different pattern I could use in order to get a hit for every word in every cell that is nine or more letters long?
I do want to add that I’m not sure it’s necessary for me to get every word: it’s the second-to-last question in the assignment, and the last one only requires filtering on rows that contain one or more words of the specified length. The specific wording of the question is this:
You must first figure out how many quotes contain words of 9 letters or more.
To do so, you need a GREL function. With it, you can add a new column based on the quotes column, which contains the first nine letters of each word with at least nine letters.
I hope you’ll excuse my very long question, especially after you’ve proposed viable solutions.
Thanks in advance!
My god, I'm embarrassed; I just needed to use the find() function instead. Posting in case it helps someone in the future.
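For anyone landing here later, a sketch of what that can look like (the exact patterns are an assumption, not necessarily what the assignment expects). In recent OpenRefine versions, find() returns an array with every match of the pattern, not just one value per pair of parentheses:

```
value.find(/\b\w{9,}\b/).join(", ")
```

For the long quote above this should list sometimes, universal, everything (three times), importantly and beautiful. Two possible variations, each a separate expression: the first keeps only the first nine letters of every such word by matching exactly nine word characters from a word boundary, the second gives the number of such words per quote.

```
value.find(/\b\w{9}/).join(", ")
value.find(/\b\w{9,}\b/).length()
```

Note that \w also matches digits and underscores, so these count as “letters” here.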
