Hi community, I want to download images with RSelenium.
library(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "chrome")
remDr$open()
remDr$navigate(url = "https://www.genesys-pgr.org/a/images/v2ZW8lQwlep") # a specific search that shows images
I want to click on an image and download it. How can I do this?
This site has an R library, rgenesys,
but it does not currently have any function for downloading images.
Thanks!
Hi,
So this code is not fully tested, but I think the gist of this will work:
# for parsing the src:
library(stringr)
# Get all the images:
images <- remDr$findElements(using="tag name", "img")
# create images directory
dir.create("images")
for (image in images) {
  # get the image src
  src <- image$getElementAttribute("src")
  # extract the end of the url and hope it is unique. should be good to go based on a quick inspection
  src_name <- gsub("/", "", str_extract(src, "[^/]+/[^/]+.jpg"))
  # download!
  download.file(src, paste("./images/", src_name, sep=""))
}
Hi @joesho112358, I ran this code, but it shows this error:
Error in download.file(src, paste("./images/", src_name, sep = "")) :
invalid 'url' argument
I also tried it this way, but it doesn't download the images:
url <- "https://www.genesys-pgr.org/10.18730/PQ4YJ"
destfile <- "C:\\Users\\macosta\\Downloads\\archivo_descargado.png" # saved as .png, the downloaded file is a corrupt image
destfile_2 <- "C:\\Users\\macosta\\Downloads\\archivo_descargado.txt" # saved as .txt, the file contains the page information
download.file(url, destfile = destfile)
How can I fix this to get the images?
Thanks!
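One way to see what was actually saved is to look at the first bytes of the file. The URL above looks like a page address rather than a direct image link, so the saved file may be HTML whatever its extension is. PNG and JPEG files start with fixed byte signatures, while a web page starts with text. Below is a minimal sketch of that check; `looks_like_image()` is a hypothetical helper, and the temporary files only stand in for real downloads. Separately, on Windows `download.file()` generally needs `mode = "wb"` for binary files such as images, otherwise they can end up corrupt.

```r
# Hypothetical helper: TRUE if the file starts with a PNG or JPEG signature.
looks_like_image <- function(path) {
  header <- readBin(path, what = "raw", n = 8)
  png_sig  <- as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A))
  jpeg_sig <- as.raw(c(0xFF, 0xD8, 0xFF))
  is_png  <- length(header) >= 8 && identical(header[1:8], png_sig)
  is_jpeg <- length(header) >= 3 && identical(header[1:3], jpeg_sig)
  is_png || is_jpeg
}

# Stand-ins for downloaded files: an HTML page saved with a .png extension,
# and a file that begins with the PNG signature.
html_file <- tempfile(fileext = ".png")
writeLines("<!DOCTYPE html><html><body>page</body></html>", html_file)

png_file <- tempfile(fileext = ".png")
writeBin(as.raw(c(0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)), png_file)

looks_like_image(html_file)  # FALSE
looks_like_image(png_file)   # TRUE
```

If the check returns FALSE for a real download, the URL most likely points at a page and the image's own `src` address is needed instead.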
Before delving into that, can you see what's wrong with the src that was extracted? I.e., the reason for this error?
invalid 'url' argument
As a heads up, I am on Ubuntu and not Windows, so the file paths I used probably need to be updated.
Of course, I checked each point. The link exists.
src <- image$getElementAttribute("src");src
[[1]]
[1] "https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg"
src_name <- gsub("/", "", str_extract(src, "[^/]+/[^/]+.jpg"));src_name
[1] "27c35558-af48-4bd7-be4b-3e1563460814300x300.jpg"
Sorry, but I still don't know if I understand what the problem is. Are you saying that the image at this location:
[1] "https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg"
was already downloaded and that is throwing an invalid url error?
src and src_name show data, but when I run the for loop, the error message appears.
I don't know why, but the url in download.file is wrong.
Ah, so it might not be the first one. Do any files get downloaded? Also, have you tried printing out what the src is as it goes?
for (image in images) {
  src <- image$getElementAttribute("src")
  src_name <- gsub("/", "", str_extract(src, "[^/]+/[^/]+.jpg"))
  print(src)
  print(src_name)
  download.file(src, paste("./images/", src_name, sep=""))
}
After running this code, it shows this:
[[1]]
[1] "https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg"
[1] "27c35558-af48-4bd7-be4b-3e1563460814300x300.jpg"
Error in download.file(src, paste("./images/", src_name, sep = "")) :
invalid 'url' argument
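A note on the output above: the `[[1]]` printed before the URL suggests that `src` is a list rather than a character string, which is what RSelenium's `getElementAttribute()` returns. `download.file()` expects a single character string for `url`, which would explain the `invalid 'url' argument` error even though the address itself is fine. Below is a minimal sketch of unwrapping the value first; `attr_to_url()` is a hypothetical helper, not part of RSelenium, and `src` is simulated here so the example runs without a browser session.

```r
# Hypothetical helper: turn the list returned by getElementAttribute()
# into a single URL string, or NA if the element has no usable src.
attr_to_url <- function(attr_value) {
  value <- unlist(attr_value)
  if (length(value) == 0 || is.na(value[1]) || !nzchar(value[1])) {
    return(NA_character_)
  }
  value[1]
}

# Simulated return value, shaped like the output printed above.
src <- list("https://cdn.genesys-pgr.org/repository/d/_thumbs/27c/27c35558-af48-4bd7-be4b-3e1563460814/300x300.jpg")

is.character(src)               # FALSE: a list, not a string
is.character(attr_to_url(src))  # TRUE: a plain string
```

Inside the loop this would mean passing `attr_to_url(src)` to `download.file()` instead of `src`, and skipping any element for which the result is `NA`, since some `img` tags may have no `src` at all.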
Hi. You're on Windows, right? Shouldn't you have updated it to something like:
download.file(src, paste(".\images\", src_name, sep = ""))
Hi, yes. I'm a Windows user.
With a single ' \ ':
download.file(src, paste(".\images\", src_name, sep = ""))
Error: '\i' is an unrecognized escape in character string (<input>:6:31)
With two ' \\ ':
Error in download.file(src, paste(".\\images\\", src_name, sep = "")) :
invalid 'url' argument
I also tried with "/", but the error was the same.
One extra attempt was:
download.file(src, paste("'.\images\'", src_name, sep = ''))
Error: '\i' is an unrecognized escape in character string (<input>:6:32)
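For what it's worth, the destination path is probably not the cause here: the error names the `url` argument, and R accepts forward slashes in paths on Windows. Below is a minimal sketch of building the destination with `file.path()`, which avoids backslash escapes altogether. The temporary directory is an assumption for illustration only.

```r
# Build the destination path without typing any backslashes.
img_dir <- file.path(tempdir(), "images")
dir.create(img_dir, showWarnings = FALSE)

src_name <- "27c35558-af48-4bd7-be4b-3e1563460814300x300.jpg"
dest <- file.path(img_dir, src_name)

basename(dest)       # the file name only
dir.exists(img_dir)  # TRUE once the directory has been created

# If a backslash is really needed inside a string, it must be doubled.
# Each doubled backslash is stored as a single character:
nchar(".\\images\\")  # 9
```

A single backslash followed by `i` is read as an escape sequence, which is why `".\images\"` fails to parse before `download.file()` is even called.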