Selecting what a function will print out

Quack · July 22, 2019, 9:15pm

My n-gram dictionary looks like this:

word1        word2      frequency  nprefix

during the   day             1206        2
during the   christmas        566        2
during the   pm               480        2
during the   recovery         440        2
during the   night            406        2
during the   dayi             395        2
during the   month            373        2
during the   weekend          321        2
during the   campaign         239        2
during the   scottish         217        2

Here nprefix is the number of words in word1.

My function was designed to match a typed-in phrase against word1 in the dictionary. Once a match is found, it looks up the next word and prints it. If no match is found, the phrase is shortened by dropping its first word and looked up again.
Sometimes there is more than one choice for the next word. Before printing, the candidates are ranked by how often they occurred in the training data. Here are the functions used:

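library(dplyr)    # %>%, filter(), select()
library(stringr)  # str_replace(), str_count()
# Look up the next word of the phrase
nxtword1 <- function(word) {
  dat %>% filter(word1 == word) %>% select(-word1)
}
# Change an n-gram to an (n-1)-gram by dropping its first word
less1gram <- function(x) {
  str_replace(x, "^[^ ]+ ", "")
}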
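
To make the backoff step concrete, here is a small sketch of what dropping the first word does to a phrase. It uses base R's `sub()` instead of `str_replace()` so it runs without stringr, and the name `drop_first_word` is hypothetical, not part of the original code.

```r
# Hypothetical base-R stand-in for less1gram():
# remove the first word together with the space that follows it
drop_first_word <- function(x) sub("^[^ ]+ ", "", x)

p1 <- drop_first_word("the faith during the")  # "faith during the"
p2 <- drop_first_word(p1)                      # "during the"
p3 <- drop_first_word("the")                   # no space to match, so it stays "the"
```

A single word is returned unchanged, which is why the loop also needs the word-count check to terminate.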


whatsnext1 <- function(phrase) {
  nwords <- str_count(phrase, pattern = " ")
  while (!(phrase %in% dat$word1) && nwords >= 1) {
    phrase <- less1gram(phrase)
    nwords <- str_count(phrase, pattern = " ")
    print(nxtword1(phrase)[1:5, 1], col.names = FALSE)
  }
}

I tried to complete this phrase:

whatsnext1("the faith during the")

When the function printed the top five word choices, an extra row of NAs appeared above them.

[1] NA NA NA NA NA
[1] "day" "christmas" "pm" "recovery" "night"

How can I get rid of the row of NAs?

cderv · July 23, 2019, 6:16am

You print something at each iteration. When no match is found, nxtword1 selects nothing, so you get an empty table. But you then select [1:5, 1] on this empty table, so it gets coerced to NA, I guess.
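
The effect described above can be reproduced in isolation. This is a minimal sketch that assumes the lookup returns a plain base R data.frame (a tibble may behave differently); the `empty` table is a made-up stand-in for the result of `nxtword1()` when nothing matches.

```r
# Stand-in for the table nxtword1() returns when no row matches
# (assumed to be a plain data.frame; column names follow the question)
empty <- data.frame(word2 = character(0),
                    frequency = integer(0),
                    nprefix = integer(0))

# Rows 1..5 do not exist, so each requested element comes back as NA
top5 <- empty[1:5, 1]
print(top5)

# head() returns only the rows that exist, so nothing is padded with NA
safe <- head(empty, 5)[, 1]
length(safe)
```

Because the print sits inside the loop, each unmatched intermediate phrase produces one such row of NAs before the matching phrase is reached.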

Hope it helps.

Quack · July 23, 2019, 3:37pm

I fixed it after heeding your advice. I moved the print() to outside the while loop.

whatsnext5 <- function(phrase) {
  nwords <- str_count(phrase, pattern = " ")
  while (!(phrase %in% dat$word1) && nwords >= 1) {
    phrase <- less1gram(phrase)
    nwords <- str_count(phrase, pattern = " ")
    answer <- nxtword1(phrase)
  }
  print(answer[1:5, 1])
  # print(nxtword1(phrase)[1:5, 1])
}

The output was:

whatsnext5("then you must be")
[1] "short" "postdictable" "what" "killed" "factored"
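
One thing to watch for: in whatsnext5, `answer` is assigned only inside the loop, so a phrase that is already in the dictionary never enters the loop and `answer` does not exist when `print()` runs. Below is a hedged sketch of one way around that, not the only one. It uses base R so it runs without packages, the function name `next_words` is hypothetical, the toy `dat` holds only three rows copied from the question, and sorting by frequency inside the function is an assumption (the real table may already be sorted).

```r
# Toy n-gram table: three rows copied from the question, for illustration only
dat <- data.frame(
  word1 = c("during the", "during the", "during the"),
  word2 = c("day", "christmas", "pm"),
  frequency = c(1206, 566, 480),
  nprefix = c(2, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical variant of whatsnext5 that returns the words instead of printing
next_words <- function(phrase, table = dat, n = 5) {
  # Back off: drop the leading word until the phrase matches or one word is left
  while (!(phrase %in% table$word1) && grepl(" ", phrase, fixed = TRUE)) {
    phrase <- sub("^[^ ]+ ", "", phrase)
  }
  # Look up once, after the loop, so an immediate match works too
  hits <- table[table$word1 == phrase, ]
  hits <- hits[order(-hits$frequency), ]
  # head() gives at most n words and never pads with NA
  head(hits$word2, n)
}

r1 <- next_words("the faith during the")  # backs off to "during the"
r2 <- next_words("during the")            # matches without entering the loop
r3 <- next_words("no match")              # nothing found: empty character vector
```

Returning the vector and leaving the printing to the caller also makes it easy to check for the empty case before showing anything.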

Thank you :)

cderv · July 23, 2019, 4:19pm

Glad it worked out!
If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems.
