Scraping event details from web page?

Can anyone help me figure out how to use rvest to scrape the details of the events listed in this web page and return them in a data frame with one row per event?

After using SelectorGadget to find what I thought was the right CSS selector, I tried the following to drill down to the individual events, but it returns an empty node set.

library(tidyverse)
library(rvest)

marches <- read_html("https://map.womensmarch.com/?source=website")

events <- marches %>% html_nodes("event-list-item")
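
For anyone hitting the same symptom, here is a rough sanity check, sketched on a made-up HTML snippet rather than the real page. The snippet, the `event-list` id, and `app.js` are all hypothetical. It illustrates two things that can each produce an empty result: a bare name like "event-list-item" matches an element with that tag name (a class needs a leading dot), and nodes that a script adds later are simply not in the HTML that read_html() downloads.

```r
library(rvest)

# Hypothetical static HTML: an empty container that a script would fill in
# later in the browser. read_html() never runs that script.
page <- read_html('<html><body>
  <div id="event-list"></div>
  <script src="app.js"></script>
</body></html>')

# Class selector (note the leading dot). Nothing matches, because the
# items do not exist in the static markup.
items <- html_nodes(page, ".event-list-item")
length(items)

# Listing the scripts the page loads is one way to start hunting for
# the real data source.
scripts <- html_attr(html_nodes(page, "script"), "src")
scripts
```

If the count is zero even with a correct selector, the data is probably coming from somewhere else, and the browser's network tab or the page source is the next place to look.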

I also asked this question on Stack Overflow, and the solution I got there takes advantage of the fact that the event feed on that page is loaded via JavaScript from a JSON file, a link to which can be found by inspecting the page source. So:

library(jsonlite)

feed <- fromJSON("https://zen-hypatia-739ed6.netlify.app/feed")
dat <- feed$events

str(dat)

'data.frame':   313 obs. of  22 variables:
 $ id                      : int  78 404 260 224 286 108 187 265 326 334 ...
 $ public_description      : chr  "Meet up with signs for:\r\nVote  Biden, protect rights of people with disabilities,  protect Roe VS Wade, prote"| __truncated__ "The womxn of the Oceti Sakowin, the Seven Sacred Council Fires of the Great Sioux Nation are marching to the po"| __truncated__ "As part of Worcester County's regular Blue Honk and Wave sign holding event (every Friday until the election), "| __truncated__ "Standout for Social Justice \r\nWear Mask \r\nMaintain physical distance of at least 6 feet\r\nBring your signs"| __truncated__ ...
 $ campaign                : chr  "oct-17-march" "oct-17-march" "oct-17-march" "oct-17-march" ...
 $ lat                     : num  42.4 44.1 38.3 42.3 40.8 ...
 $ lng                     : num  -71.1 -103.2 -75.1 -71.4 -111.9 ...
 $ title                   : chr  "Get Up, Stand Up - Stand Up for Your Rights!" "Oceti Sakwin Womxn’s March 2020" "Honor RBG and Stand for Democracy" "Social Justice" ...
 $ event_doors_open_at     : logi  NA NA NA NA NA NA ...
 $ venue                   : chr  "Public island at a major 4 way stop. Intersection of North Harvard St and Western Ave Boston MA 02134" "Zoom webinar. https://aclu.zoom.us/j/5351676736 Rapid City SD 57701" "West Ocean City Park and Ride. 12940 Inlet Isle Lane Ocean City MD 21842" "Rt126 x Rt135. Rt126 x Rt135 Framingham MA 01702" ...
 $ hasCapacity             : int  1 1 1 1 1 1 1 1 1 1 ...
 $ city                    : chr  "Boston" "Rapid City" "Ocean City" "Framingham" ...
 $ state                   : chr  "MA" "SD" "MD" "MA" ...
 $ zip                     : chr  "02134" "57701" "21842" "01702" ...
 $ start_datetime          : chr  "2020-10-16 11:00:00.000000" "2020-10-16 10:00:00.000000" "2020-10-16 15:00:00.000000" "2020-10-16 17:00:00.000000" ...
 $ starts_at_utc           : chr  "2020-10-16 15:00:00.000000" "2020-10-16 16:00:00.000000" "2020-10-16 19:00:00.000000" "2020-10-16 21:00:00.000000" ...
 $ end_datetime            : logi  NA NA NA NA NA NA ...
 $ categories              : chr  "oct-17-march" "oct-17-march" "oct-17-march" "oct-17-march" ...
 $ event_is_virtual        : int  0 1 0 0 0 0 0 0 0 0 ...
 $ is_official             : int  0 0 0 0 0 0 0 0 0 0 ...
 $ is_team                 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ url                     : chr  "https://act.womensmarch.org/event/oct-17-march/78/" "https://act.womensmarch.org/event/oct-17-march/404/" "https://act.womensmarch.org/event/oct-17-march/260/" "https://act.womensmarch.org/event/oct-17-march/224/" ...
 $ start_datetime_formatted: chr  "Friday Oct 16 11:00 AM" "Friday Oct 16 10:00 AM" "Friday Oct 16 3:00 PM" "Friday Oct 16 5:00 PM" ...
 $ end_datetime_formatted  : logi  NA NA NA NA NA NA ...
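
Since the goal was one row per event, here is a minimal sketch of a possible clean-up step. It assumes the column names shown in the str() output above and uses a tiny inline stand-in for the feed (three events, a subset of fields) so it runs without network access. `tidy_events()` is a hypothetical helper name, not something from jsonlite.

```r
library(jsonlite)

# Inline stand-in for the feed: a few fields from the first events shown above.
sample_json <- '{"events": [
  {"id": 78,  "city": "Boston",     "state": "MA", "event_is_virtual": 0,
   "starts_at_utc": "2020-10-16 15:00:00.000000"},
  {"id": 404, "city": "Rapid City", "state": "SD", "event_is_virtual": 1,
   "starts_at_utc": "2020-10-16 16:00:00.000000"},
  {"id": 260, "city": "Ocean City", "state": "MD", "event_is_virtual": 0,
   "starts_at_utc": "2020-10-16 19:00:00.000000"}
]}'

dat <- fromJSON(sample_json)$events

# Hypothetical helper: parse the UTC timestamp, turn the 0/1 flag into a
# logical, keep a few columns, and sort by start time.
tidy_events <- function(dat) {
  dat$starts_at_utc <- as.POSIXct(dat$starts_at_utc,
                                  format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")
  dat$event_is_virtual <- as.logical(dat$event_is_virtual)
  keep <- c("id", "city", "state", "event_is_virtual", "starts_at_utc")
  dat[order(dat$starts_at_utc), keep]
}

events <- tidy_events(dat)
events
```

The same function should work on the real `feed$events` as long as those columns are present. Columns that are entirely NA, such as end_datetime, can simply be left out of `keep`.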
