I am attempting to scrape the webpage Kinotoppen 2.0 - Filmweb for the title and other information about each movie. On other webpages I have managed fine with a couple of lines using html_nodes() and html_text(), picking the CSS selectors with SelectorGadget, like so:
library(rvest)

html <- read_html("https://www.filmweb.no/kinotoppen/")
title <- html %>%
  html_nodes(".Kinotoppen_MovieTitle__2MFbT") %>%
  html_text()
However, when running those lines on this webpage I only get an empty character vector. Inspecting the page further, I see that the content is rendered by JavaScript.
I tried extracting the scripts with html_nodes("script") and running them with the V8 package, but to no avail.
What is the best way to deal with javascript-rendered webpages so that I can scrape data with rvest?
The development of PhantomJS has been suspended, according to its website.
The solution (with help from Stack Overflow) was to use RSelenium to automate a real web browser: let the browser execute the JavaScript, then hand the fully rendered HTML to rvest. A general approach:
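A minimal sketch of that approach, assuming a Selenium-compatible browser driver is available on your machine (rsDriver() will try to download and start one); the fixed Sys.sleep() wait and the .Kinotoppen_MovieTitle__2MFbT class from the question are assumptions and may need adjusting:

```r
library(RSelenium)
library(rvest)

# Start a Selenium server and open a browser session
rD <- rsDriver(browser = "firefox", verbose = FALSE)
remDr <- rD$client

# Navigate to the page and give the JavaScript time to render
remDr$navigate("https://www.filmweb.no/kinotoppen/")
Sys.sleep(5)  # crude wait; a polling loop on the selector is more robust

# Grab the rendered HTML from the browser and scrape it with rvest as usual
page_source <- remDr$getPageSource()[[1]]
html <- read_html(page_source)

title <- html %>%
  html_nodes(".Kinotoppen_MovieTitle__2MFbT") %>%
  html_text()

# Clean up the browser and the Selenium server
remDr$close()
rD$server$stop()
```

The key idea is that remDr$getPageSource() returns the DOM *after* the scripts have run, so the same selectors that failed on the raw HTTP response now match.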