I generated a small dictionary of unigrams, bigrams, and trigrams as follows:
dat <- data.frame(
  word1 = c("will", "like", "get", "look", "next", "social",
            "cinco_de", "manufacturer_custom", "custom_built"),
  word2 = c(" ", " ", " ", "like", "week", "media", "mayo", "built", "painted"),
  frequency = c(5153, 5081, 4821, 559, 478, 465, 172, 171, 171))
Here is a function that looks up a word or phrase in the dictionary:
library(dplyr)
nxtword1 <- function(word) { dat %>% filter(word1 == word) %>% select(-word1) }
Here is a function that shortens an n-gram to an (n-1)-gram by dropping its first word:
library(stringr)
less1gram <- function(x) { str_replace(x, "^[^_ ]+_", "") }
I tested these functions, and they were okay.
The following code looks up a text string in the dictionary. If no match is found, the string is shortened and looked up again.
match <- nxtword1("new_manufacturer_custom")
if (nrow(match) > 0) {
print(match)
} else (nxtword1(less1gram("new_manufacturer_custom")))
The code worked correctly when I typed in “new_manufacturer_custom”, which wasn’t in the dictionary.
> match <- nxtword1("new_manufacturer_custom")
> if(nrow(match)>0){print(match)
+ } else(nxtword1(less1gram("new_manufacturer_custom")))
  word2 frequency
1 built       171
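For reference, the look-up-then-shorten idea described above can be sketched as a loop. This is only an illustration under some assumptions: `lookup`, `shorten`, and `backoff_lookup` are hypothetical names, the first two are base-R stand-ins for `nxtword1` and `less1gram` (so the block runs without dplyr or stringr), and the loop keeps shortening until nothing is left to drop, whereas the code above shortens only once.

```r
# Same toy dictionary as above, rebuilt so this block runs on its own.
dat <- data.frame(
  word1 = c("will", "like", "get", "look", "next", "social",
            "cinco_de", "manufacturer_custom", "custom_built"),
  word2 = c(" ", " ", " ", "like", "week", "media", "mayo", "built", "painted"),
  frequency = c(5153, 5081, 4821, 559, 478, 465, 172, 171, 171),
  stringsAsFactors = FALSE
)

# Base-R stand-in for nxtword1(): rows whose word1 equals the phrase.
lookup <- function(phrase) {
  dat[dat$word1 == phrase, c("word2", "frequency"), drop = FALSE]
}

# Base-R stand-in for less1gram(): drop the leading word and its underscore.
shorten <- function(phrase) {
  sub("^[^_ ]+_", "", phrase)
}

# Hypothetical helper: look up the phrase; on a miss, shorten it and retry.
backoff_lookup <- function(phrase) {
  repeat {
    hits <- lookup(phrase)
    if (nrow(hits) > 0) return(hits)
    shorter <- shorten(phrase)
    # Nothing left to drop: return the empty result.
    if (identical(shorter, phrase)) return(hits)
    phrase <- shorter
  }
}

backoff_lookup("new_manufacturer_custom")
```

The result of each lookup is stored under a name (`hits`) that differs from the name of the function, so `nrow()` is always applied to a data frame.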
Next I put the code into a function.
match <- function(phrase){
nxtword1(phrase)
if (nrow(match) > 0){
print(match)
} else {
(nxtword1(less1gram(phrase)))
}
}
Defining the function worked, but calling it resulted in an error message:
> match<- function(phrase){nxtword1(phrase)
+ if(nrow(match)>0){print(match)
+ } else{(nxtword1(less1gram(phrase)))
+ }
+ }
> match("old_manufacturer_custom")
Error in if (nrow(match) > 0) { : argument is of length zero
Why didn't the code work when it was made into a function?