Can't extract hrefs from table

I'm trying to get the links for the 2 tables in this webpage:
https://www.ipdb.org/lists.cgi?puid=43799&browser=1694266618&list=top300

For some reason neither a CSS selector nor XPath seems to pull the tables directly, so I've had to first pull the tables into a list and then extract each one. That gets me the tables, but I can't get the links. I've tried various combinations of selectors. If I include the 'a' part of the selector I get nothing; it's as if rvest can't see the links on this page. Or, more likely, I'm doing something wrong.

My code is below. Any help will be greatly appreciated because I've run out of ideas.

library(rvest)  # read_html(), html_nodes(), html_table(), html_attr()
library(dplyr)  # mutate()

url <- "https://www.ipdb.org/lists.cgi?anonymously=true&list=top300"
download.file(url, destfile = 'machines_top_300.html')
page <- read_html("machines_top_300.html")

# put the 3 tables into a list
top_300_tables <- page %>%
  html_nodes(xpath = "//table[.//th[contains(., 'Rank')]]")

# Get the table for the electronic machines (list item 2)
machines_top_300_electronic <- 
  top_300_tables[2] %>%
  html_table(fill = TRUE) %>% 
  as.data.frame() %>% 
  mutate(Category = "Electronic")

# Get the hrefs for the electronic machines
machines_top_300_electronic_links <- 
  top_300_tables[2] %>%
  html_nodes('tr > td:nth-child(3) > a') %>% 
  html_attr('href')

The URL in your code is different from the page you took a screenshot of.
There are no links when I look at https://www.ipdb.org/lists.cgi?anonymously=true&list=top300
Perhaps you see a different view when you are logged in rather than browsing anonymously?
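
One way to check this is to count the anchor tags inside tables in the HTML that R actually reads, before trying more specific selectors. Below is a minimal sketch of that idea. The two HTML strings are made-up stand-ins for the anonymous and logged-in views (they are not copied from ipdb.org), and `count_table_links()` is a hypothetical helper name, not part of rvest.

```r
library(rvest)

# Made-up stand-ins for the two views of the page (assumption: the real
# markup differs; only the presence or absence of <a> tags matters here).
html_anonymous <- '<table>
  <tr><th>Rank</th><th>Game</th></tr>
  <tr><td>1</td><td>Example Game</td></tr>
</table>'

html_logged_in <- '<table>
  <tr><th>Rank</th><th>Game</th></tr>
  <tr><td>1</td><td><a href="example.cgi?id=1">Example Game</a></td></tr>
</table>'

# Hypothetical helper: count <a> elements that sit anywhere inside a table.
# The broad "table a" selector avoids depending on the exact row/cell nesting.
count_table_links <- function(html) {
  read_html(html) %>%
    html_nodes("table a") %>%
    length()
}

count_table_links(html_anonymous)  # 0: nothing for html_attr('href') to find
count_table_links(html_logged_in)  # 1
```

For the real case, passing the path of the saved file (here 'machines_top_300.html') to `read_html()` and running the same `html_nodes("table a")` count should show whether the download contains any links at all. A count of zero would point at the downloaded content rather than at the selector.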


Oh my goodness thank you! I was inspecting the online content and seeing links, and had assumed that's what I'd downloaded. But the downloaded page didn't have links. So of course R wasn't finding any. I downloaded the logged-in version and now have the links I needed.

I feel very dopey for not spotting my error.
