Is there a way to tokenize sentences and apply div and withTags to the text?

I want to apply div and withTags to a text file after tokenizing it. I was able to apply div and withTags to the untokenized text, but I get an error when I apply them after tokenizing. Is there any way around this?

text<- "It's unclear what Florida victory McEnany and Trump were referring to, but last month, the Supreme Court rejected an emergency petition for Democrat-led attempts in Florida to overturn a law preventing former felons from voting if they haven't paid all their fines or restitution. That is not necessarily related to mail-in balloting.
 Meanwhile, the President's team is fighting to preserve aspects of mail-in voting they hope will offer Trump a strategic advantage."
library(htmltools)
library(tokenizers)
library(quanteda)

b <- tokenize_sentences(text)

trial <- b

buslist <- lapply(
  seq_along(trial),
  function(x, k){
    bus <- withTags(div(id = k,x[k]))
    return(bus)},
  x = trial)

names(buslist) <- q

I get this error:

Error in writeImpl(text) : 
  Text to be written must be a length-one character vector

Hi,

Based on this post, it seems the issue is that you are supplying a vector instead of a single character string to the div function.

The cause is tokenize_sentences, which returns a list of character vectors rather than a plain character vector. When you use lapply, all three sentences are supplied to div at once, because they are grouped in a single sublist.
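You can see this by comparing the structure of the raw result with the flattened one. A minimal sketch using a short two-sentence sample string (the sample text is my own, not from the original post):

```r
library(tokenizers)

text <- "First sentence. Second sentence."

raw <- tokenize_sentences(text)
# raw is a list of length 1 (one element per input document);
# raw[[1]] holds both sentences, so indexing the list and passing
# the result to div() hands it a length-2 vector -> the error above
length(raw)       # 1
length(raw[[1]])  # 2

flat <- unlist(raw)
length(flat)      # 2, one element per sentence
```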

To solve this, simply unlist the result of tokenize_sentences:

b <- unlist(tokenize_sentences(text))
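Putting it together, here is a sketch of the corrected pipeline: one div per sentence, indexed by position. The sample text is a stand-in for the original; the variable names follow the original post.

```r
library(htmltools)
library(tokenizers)

text <- "First sentence. Second sentence here."

# unlist() flattens the one-element list returned by tokenize_sentences()
# into a plain character vector of sentences
b <- unlist(tokenize_sentences(text))

# build one <div> per sentence; each b[k] is now a length-one string,
# so div() no longer receives a multi-element vector
buslist <- lapply(
  seq_along(b),
  function(k) withTags(div(id = k, b[k]))
)

# each element renders as HTML, e.g. <div id="1">First sentence.</div>
as.character(buslist[[1]])
```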

Hope this helps,
PJ

