htmlParse with multiple html files

Hey there,
Is there any way to use htmlParse with multiple HTML files? I have a folder with a bunch of files, but when I pass my list of files I just get the file names back. The code works fine on a single file.
files <- list.files(path = ".", recursive = TRUE, pattern = "\\.html$", full.names = TRUE)

doc = htmlParse(files, asText=TRUE)
plain.text <- xpathSApply(doc, "//p", xmlValue)
plain.text<-gsub("Â|\n","",plain.text)
plain.text <- stri_remove_empty(plain.text, na_empty = FALSE)
cat(paste(plain.text, collapse = "\n"))

Result:
C:/Users/richard dobler/OneDrive/Desktop/38079/38079_10-Q_2006-05-10_0001104659-06-033149.htmlC:/Users/richard dobler/OneDrive/Desktop/38079/38079_10-Q_2006-08-09_0001104659-06-053129.htmlC:/Users/richard dobler/OneDrive/Desktop/38079/38079_10-Q_2006-11-09_0001104659-06-073607.html

You already know how to handle one file.
Now wrap that code in a function, say fun, that takes a single file name and returns the extracted text.
Then apply that function to your list of files, say html_files, by specifying:

lapply(html_files,fun)

The result will be a list with one element per file, each holding the result of fun applied to that HTML file.
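
As a rough sketch of what that could look like (the name extract_paragraphs is made up for illustration, and I'm assuming the XML package as in the question): as far as I can tell, asText=TRUE tells htmlParse to treat its argument as HTML content rather than as a file name, which would explain why only the file names come back. The sketch drops it and parses one file per call.

```r
library(XML)

# Hypothetical helper (not from the original post): parse ONE HTML file
# and return the text of its <p> elements as a character vector.
extract_paragraphs <- function(path) {
  # No asText = TRUE here: `path` is a file name, not HTML content.
  doc <- htmlParse(path)
  text <- xpathSApply(doc, "//p", xmlValue)
  # xpathSApply returns an empty list when nothing matches, so flatten it.
  text <- as.character(unlist(text))
  # Same clean-up as in the question: drop stray "Â" (\u00c2) and newlines.
  text <- gsub("\u00c2|\n", "", text)
  # Drop empty strings (base R stand-in for stri_remove_empty).
  text[nzchar(text)]
}

html_files <- list.files(path = ".", pattern = "\\.html$",
                         recursive = TRUE, full.names = TRUE)

# One list element per file, named by the file it came from.
results <- lapply(html_files, extract_paragraphs)
names(results) <- html_files
```

To print the text of the first file, something like cat(paste(results[[1]], collapse = "\n")) should work, the same way as in the single-file version.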
