Can anyone help me figure out how to use rvest to scrape the details of the events listed in this web page and return them in a data frame with one row per event?
After using SelectorGadget to find what I thought was the required selector, I tried the following to drill down to the individual events, but it returned an empty node set.
I also asked this question on Stack Overflow (see here). The solution I got there takes advantage of the fact that the event feed on that page is loaded via JavaScript from a JSON file, so the events never appear in the static HTML that rvest parses. A link to the JSON file can be found by inspecting the page source. So:
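library(jsonlite)
# The page loads its events from this JSON feed, so read it directly instead of scraping the HTML
feed <- fromJSON("https://zen-hypatia-739ed6.netlify.app/feed")
# fromJSON() simplifies the array of event records into a data frame, one row per event
dat <- feed$events
str(dat)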
'data.frame': 313 obs. of 22 variables:
$ id : int 78 404 260 224 286 108 187 265 326 334 ...
$ public_description : chr "Meet up with signs for:\r\nVote Biden, protect rights of people with disabilities, protect Roe VS Wade, prote"| __truncated__ "The womxn of the Oceti Sakowin, the Seven Sacred Council Fires of the Great Sioux Nation are marching to the po"| __truncated__ "As part of Worcester County's regular Blue Honk and Wave sign holding event (every Friday until the election), "| __truncated__ "Standout for Social Justice \r\nWear Mask \r\nMaintain physical distance of at least 6 feet\r\nBring your signs"| __truncated__ ...
$ campaign : chr "oct-17-march" "oct-17-march" "oct-17-march" "oct-17-march" ...
$ lat : num 42.4 44.1 38.3 42.3 40.8 ...
$ lng : num -71.1 -103.2 -75.1 -71.4 -111.9 ...
$ title : chr "Get Up, Stand Up - Stand Up for Your Rights!" "Oceti Sakwin Womxn’s March 2020" "Honor RBG and Stand for Democracy" "Social Justice" ...
$ event_doors_open_at : logi NA NA NA NA NA NA ...
$ venue : chr "Public island at a major 4 way stop. Intersection of North Harvard St and Western Ave Boston MA 02134" "Zoom webinar. https://aclu.zoom.us/j/5351676736 Rapid City SD 57701" "West Ocean City Park and Ride. 12940 Inlet Isle Lane Ocean City MD 21842" "Rt126 x Rt135. Rt126 x Rt135 Framingham MA 01702" ...
$ hasCapacity : int 1 1 1 1 1 1 1 1 1 1 ...
$ city : chr "Boston" "Rapid City" "Ocean City" "Framingham" ...
$ state : chr "MA" "SD" "MD" "MA" ...
$ zip : chr "02134" "57701" "21842" "01702" ...
$ start_datetime : chr "2020-10-16 11:00:00.000000" "2020-10-16 10:00:00.000000" "2020-10-16 15:00:00.000000" "2020-10-16 17:00:00.000000" ...
$ starts_at_utc : chr "2020-10-16 15:00:00.000000" "2020-10-16 16:00:00.000000" "2020-10-16 19:00:00.000000" "2020-10-16 21:00:00.000000" ...
$ end_datetime : logi NA NA NA NA NA NA ...
$ categories : chr "oct-17-march" "oct-17-march" "oct-17-march" "oct-17-march" ...
$ event_is_virtual : int 0 1 0 0 0 0 0 0 0 0 ...
$ is_official : int 0 0 0 0 0 0 0 0 0 0 ...
$ is_team : int 0 0 0 0 0 0 0 0 0 0 ...
$ url : chr "https://act.womensmarch.org/event/oct-17-march/78/" "https://act.womensmarch.org/event/oct-17-march/404/" "https://act.womensmarch.org/event/oct-17-march/260/" "https://act.womensmarch.org/event/oct-17-march/224/" ...
$ start_datetime_formatted: chr "Friday Oct 16 11:00 AM" "Friday Oct 16 10:00 AM" "Friday Oct 16 3:00 PM" "Friday Oct 16 5:00 PM" ...
$ end_datetime_formatted : logi NA NA NA NA NA NA ...
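Since dat is already a data frame with one row per event, the remaining work is mostly tidying. Here is a minimal sketch of how that might look, assuming you want to drop the columns that are entirely NA, parse the UTC timestamp, and turn the 0/1 flags into logicals. The small data frame below is only a stand-in built from the first two rows of the str() output above, so the example runs without hitting the feed; the name events is my own choice, not something from the Stack Overflow answer.

```r
# Stand-in for `dat`: first two rows, a few columns, values copied from str(dat) above
dat <- data.frame(
  id               = c(78L, 404L),
  city             = c("Boston", "Rapid City"),
  state            = c("MA", "SD"),
  starts_at_utc    = c("2020-10-16 15:00:00.000000", "2020-10-16 16:00:00.000000"),
  event_is_virtual = c(0L, 1L),
  end_datetime     = c(NA, NA),
  stringsAsFactors = FALSE
)

# Drop columns in which every value is NA (only those; partially filled columns are kept)
all_na <- vapply(dat, function(col) all(is.na(col)), logical(1))
events <- dat[, !all_na, drop = FALSE]

# Parse the UTC timestamp string; %OS accepts the fractional seconds
events$starts_at_utc <- as.POSIXct(
  events$starts_at_utc,
  format = "%Y-%m-%d %H:%M:%OS",
  tz = "UTC"
)

# Convert the 0/1 integer flag to TRUE/FALSE
events$event_is_virtual <- events$event_is_virtual == 1L

str(events)
```

With the real feed you would skip the stand-in and run the same steps on feed$events. Whether end_datetime and the other logical columns are NA for all 313 events is not visible from the truncated str() output, which is why the sketch checks each column rather than dropping them by name.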