Hi folks,
I have been looking at the rvest package and tried to use it to scrape a table from the Wikipedia page "List of 5G NR networks".
Here is my code:
library(rvest)
library(tidyverse)
url <- "https://en.wikipedia.org/wiki/List_of_5G_NR_networks"
wikipage <- read_html(url)
data_table <- html_nodes(wikipage, "table")
# pick the second table using pluck from purrr
nr_table <- data_table %>%
  html_table(header = TRUE) %>%
  purrr::pluck(2)
# The fill argument is deprecated according to the documentation, so I left fill = TRUE out of the html_table() call.
# When I view the imported table, I notice that some observations are in the wrong column. For example, "Vodafone" is now under the country or territory column, and so is "n5: 10 MHz(Mar 2021)". This is not correct.
View(nr_table)
How can I make sure that rvest preserves the table structure when I read the page with read_html() and parse it with html_table(), so that observations do not get shifted into the wrong column (variable)?
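In case it helps narrow things down, here is a minimal sketch of the behaviour I would expect. The small table is made up for illustration (it is not the actual Wikipedia markup, and the names toy_html, toy_node and toy_table are my own), and it assumes rvest 1.0.0 or later, where html_table() is documented to handle cells that span several rows or columns.

```r
library(rvest)

# Hypothetical toy table, NOT the real Wikipedia markup:
# the "Australia" cell spans two rows via rowspan.
toy_html <- minimal_html('
  <table>
    <tr><th>Country</th><th>Operator</th><th>Band</th></tr>
    <tr><td rowspan="2">Australia</td><td>Vodafone</td><td>n5</td></tr>
    <tr><td>Telstra</td><td>n78</td></tr>
  </table>
')

# Select the single table node and parse it into a tibble.
toy_node  <- html_element(toy_html, "table")
toy_table <- html_table(toy_node, header = TRUE)

# Inspect which column each value ended up in.
print(toy_table)
```

If a toy table like this parses cleanly but the Wikipedia table does not, my guess is that the shifted cells come from that page's own rowspan/colspan markup rather than from read_html() itself, but I have not confirmed this.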
Thanks in advance.