How to download images from this site with RSelenium?

Hi community, I want to download images with RSelenium.

library(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "chrome")
remDr$open()

remDr$navigate(url = "https://www.genesys-pgr.org/a/images/v2ZW8lQwlep") # a specific search that shows images

I want to click on an image and download it. How can I do this?

This site has an R library, rgenesys, but it does not currently have any function for downloading images.

Thanks!

Hi,

So this code is not fully tested, but I think the gist of this will work:

# for parsing the src:
library(stringr)

# Get all the images:
images <- remDr$findElements(using="tag name", "img")

# create images directory
dir.create("images")

for (image in images) {
  # get the image src
  src <- image$getElementAttribute("src")

  # extract the end of the url and hope it is unique. should be good to go based on a quick inspection
  src_name <- gsub("/", "", str_extract(src, "[^/]+/[^/]+.jpg"))

  # download!
  download.file(src, paste("./images/", src_name, sep=""))
}

Hi @joesho112358, I ran this code but it shows this:

Error in download.file(src, paste("./images/", src_name, sep = "")) : 
  invalid 'url' argument

I also tried it this way, but it doesn't download the images:

url <- "https://www.genesys-pgr.org/10.18730/PQ4YJ"
destfile <- "C:\\Users\\macosta\\Downloads\\archivo_descargado.png" # Downloads a corrupt .png image

destfile_2 <- "C:\\Users\\macosta\\Downloads\\archivo_descargado.txt" # With this format I get the page information.

download.file(url, destfile = destfile)

How can I fix this to get the images?
Thanks!
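
A note on the attempt above: the URL https://www.genesys-pgr.org/10.18730/PQ4YJ looks like a landing page rather than a direct image file, so download.file() may simply be saving HTML under a .png name. That would explain both the "corrupt" image and the page text in the .txt file. Below is a minimal sketch for checking what was actually saved by looking at the first bytes of the file. The helper name file_kind is hypothetical, and only the PNG and JPEG signatures are checked.

```r
# Hypothetical helper: guess what a downloaded file really is from its
# leading bytes, regardless of the extension it was saved with.
file_kind <- function(path) {
  bytes <- readBin(path, what = "raw", n = 8)
  png_sig  <- as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))
  jpeg_sig <- as.raw(c(0xff, 0xd8, 0xff))
  if (length(bytes) >= 8 && identical(bytes[1:8], png_sig)) return("png")
  if (length(bytes) >= 3 && identical(bytes[1:3], jpeg_sig)) return("jpeg")
  "other (possibly an HTML page)"
}

# Demo with two temporary files: one starts with the PNG signature,
# the other holds HTML text saved under a .png name.
real_png <- tempfile(fileext = ".png")
writeBin(as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a)), real_png)

fake_png <- tempfile(fileext = ".png")
writeLines("<!DOCTYPE html><html><body>page</body></html>", fake_png)

print(file_kind(real_png))
print(file_kind(fake_png))
```

For a real download, calling download.file(url, destfile = destfile, mode = "wb") and then file_kind(destfile) should show whether an image or a page came back. The mode = "wb" argument matters on Windows, where the default text mode can damage binary files.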

Before delving into that, can you see what's wrong with the src that was extracted? I.e., the reason for this error?

invalid 'url' argument

As a heads up, I am on Ubuntu and not Windows, so file paths I put probably need to be updated.

Of course, I checked each point. The link does exist.

src <- image$getElementAttribute("src");src
[[1]]
[1] "https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg"
src_name <- gsub("/", "", str_extract(src, "[^/]+/[^/]+.jpg"));src_name
[1] "27c35558-af48-4bd7-be4b-3e1563460814300x300.jpg"

Sorry, but I still don't know if I understand what the problem is. Are you saying that the image at this location:

[1] "https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg"

was already downloaded and that is throwing an invalid url error?

src and src_name show data, but when I run the for loop it shows the error message.

I don't know why, but in download.file the url is wrong.

Ah, so it might not be the first one. Do any files get downloaded? Also, have you tried printing out what the src is as it goes?

for (image in images) {
  src <- image$getElementAttribute("src")
  src_name <- gsub("/", "", str_extract(src, "[^/]+/[^/]+.jpg"))
  print(src)
  print(src_name)
  download.file(src, paste("./images/", src_name, sep=""))
}

After running this code, it shows this:

[[1]]
[1] "https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg"

[1] "27c35558-af48-4bd7-be4b-3e1563460814300x300.jpg"
Error in download.file(src, paste("./images/", src_name, sep = "")) : 
  invalid 'url' argument
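
The `[[1]]` in the printed output is a hint: getElementAttribute() returns a list, not a plain character string, while download.file() expects a single character string for url. That is likely the source of the invalid 'url' argument error, rather than the destination path. Here is a small sketch that reproduces the shape of the value without a browser; the list is built by hand to mimic the printed output.

```r
# Mimic what was printed above: a list holding one URL string.
src <- list("https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg")

print(is.character(src))   # FALSE: src is a list
print(class(src))          # "list"

# Pull the string out of the list before passing it on.
src_url <- src[[1]]
print(is.character(src_url) && length(src_url) == 1)   # TRUE
```

In the loop, that would mean writing src <- image$getElementAttribute("src")[[1]], or wrapping the call in unlist().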

Hi. You're on Windows, right? Shouldn't you have updated it to something like:

download.file(src, paste(".\images\", src_name, sep = ""))

Hi, yes. I'm a Windows user.

With a single ' \ ':

download.file(src, paste(".\images\", src_name, sep = ""))
Error: '\i' is an unrecognized escape in character string (<input>:6:31)

With two ' \ \ ':

Error in download.file(src, paste(".\\images\\", src_name, sep = "")) : 
  invalid 'url' argument

I tried with "/", but the error was the same.

One more attempt was:

download.file(src, paste("'.\images\'", src_name, sep = ''))
Error: '\i' is an unrecognized escape in character string (<input>:6:32)
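
Putting the pieces together, here is a hedged sketch of the loop, untested against the live site. It assumes the remDr session from the first post is open and the page has finished loading. The helper names image_file_name and download_images are hypothetical. The sketch takes the string out of the list returned by getElementAttribute(), builds the destination with file.path() (forward slashes also work on Windows, so no backslash escaping is needed), and passes mode = "wb" so binary files are not altered on Windows.

```r
# Hypothetical helper: build a file name from the last two URL segments,
# e.g. ".../<uuid>/300x300.jpg" becomes "<uuid>_300x300.jpg".
image_file_name <- function(src) {
  parts <- strsplit(src, "/", fixed = TRUE)[[1]]
  paste(tail(parts, 2), collapse = "_")
}

# Hypothetical helper: download every .jpg found among the given elements.
download_images <- function(images, dest_dir = "images") {
  dir.create(dest_dir, showWarnings = FALSE)
  for (image in images) {
    # getElementAttribute() returns a list; it may be empty if src is missing
    attr_value <- image$getElementAttribute("src")
    if (length(attr_value) == 0) next
    src <- attr_value[[1]]
    # skip <img> tags without a usable .jpg source
    if (!is.character(src) || length(src) != 1 || !grepl("\\.jpg$", src)) next
    dest <- file.path(dest_dir, image_file_name(src))
    # mode = "wb" keeps binary files intact on Windows
    download.file(src, destfile = dest, mode = "wb")
  }
  invisible(NULL)
}

# Offline check of the naming helper with the URL from this thread:
print(image_file_name("https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg"))

# With a live session it would be called as:
# images <- remDr$findElements(using = "tag name", "img")
# download_images(images)
```

Note that the src values shown in this thread sit under a _thumbs path and end in 300x300.jpg, so this would most likely save thumbnails. Getting full-size images may require a different URL, which the thread does not show.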
