Hi everyone.
I'm new at rvest and web scraping data. I've been trying automate data colect retrieved at http://www6.sistemaindustria.org.br/gpc/externo/listaResultados.faces?codPesquisa=100.
The data is in excel format and we download it at the page.
What I have been done is to save all the table nodes in a object named "tables" (see bellow) but I can't find where the desired informations (excel tables) are retrieved. They must be inside the "table nodes" object but after investigating carefully I could not find them.
Does anyone coul gently help me?
Dear @M_AcostaCH , thank you for the answer. Actually I've been trying to downlad some data bases that are inside of page not that table itself (please, see image attached).
In fact, the table you showed me I was able to download without any problems.
I have no idea how to automate this data download. Any help will be appretiated.
But you need install a specific java version and chrome drivers.
Im use xpath for make click in a element. In the page the xpath change in numbers Result:0:checkValores for select other items. You could replace this number if you want others.
library(RSelenium)
# open cmd for set the drivers folders.
# run .jar file by cmd
# java -jar selenium-server-standalone-3.2.0.jar -port 4444
# Make conection--------------------------------------------------
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444, browserName = "chrome")
remDr$open()
Sys.sleep(5)
url_data <- "http://www6.sistemaindustria.org.br/gpc/externo/listaResultados.faces?codPesquisa=100"
remDr$navigate(url = "http://www6.sistemaindustria.org.br/gpc/externo/listaResultados.faces?codPesquisa=100")
# put click in [Valores] -----------------------------------------------------
Sys.sleep(1)
remDr$findElement(using = 'xpath', "//*[@id='listaResultadoForm:dtResult:0:checkValores']")$clickElement()
# put click in [Valores2] -----------------------------------------------------
Sys.sleep(1)
remDr$findElement(using = 'xpath', "//*[@id='listaResultadoForm:dtResult:1:checkValores']")$clickElement()
# download the select files
Sys.sleep(1)
remDr$findElement(using = 'xpath', "//*[@id='listaResultadoForm:btnExportarExcel2']")$clickElement()
@M_AcostaCH This is really great. Your code checks the checkboxes before finally pressing the "Export" button. However, the user still needs to interact manually with the browser by pressing the "OK" button on the download popup. Do you know if there is any way to do this programmatically?
The user should not interact since after selecting the boxes to download, it tells Rselenium that it can download the selected files. In the last part of the script download the select files this is done automatically.