I am trying to write a script that does the following:
- Log into a site with a name and password
- Select a value from a first dropdown
- Select a value from a second dropdown
- Select a value from a third dropdown
- Click a button that generates a CSV file based on those selections

Thus far, I have gotten through Step #2, but I'm struggling with the next three.
For the record, I cannot provide a completely reproducible example given the proprietary nature of what I'm trying to do, but I will be as detailed as possible.
library(rvest)
# Address of the login webpage
login <- "http://redditfakesite.com"
# Create a web session with the desired login address
pgsession <- html_session(login)
pgform <- html_form(pgsession)[[1]] # in this case the submit is the 2nd form
filled_form <- set_values(pgform, "ctl00$MainContent$Login1$UserName" = "abc", "ctl00$MainContent$Login1$Password" = "xyz")
pg <- submit_form(pgsession, filled_form)
results <- html_nodes(pg, "select[name='CustomerNumberList'] > option") %>% html_attr("value") %>% html_nodes("select[name='StatusCode'] > option")
It is on that final line that I see the first error:
Error in UseMethod("xml_find_all") : no applicable method for 'xml_find_all' applied to an object of class "character"
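A likely cause of that error is that `html_attr()` returns a plain character vector, so piping its result into `html_nodes()` leaves no document to search. The sketch below reproduces the idea offline on a made-up page. The customer values and the shortened option lists are placeholders, not the real site's data.

```r
library(rvest)

# Hypothetical stand-in for the page returned after login
page <- read_html('
  <select name="CustomerNumberList">
    <option value="1001">Customer A</option>
    <option value="1002">Customer B</option>
  </select>
  <select name="StatusCode">
    <option selected="selected" value="OPEN">ALL OPEN ORDERS</option>
    <option value="INVOICED">ALL INVOICED ORDERS</option>
  </select>')

# html_attr() ends the node pipeline: the result is a character vector
customer_values <- page %>%
  html_nodes("select[name='CustomerNumberList'] > option") %>%
  html_attr("value")

# The second dropdown is therefore queried from the page again
status_values <- page %>%
  html_nodes("select[name='StatusCode'] > option") %>%
  html_attr("value")
```

Keeping one pipeline per dropdown, each starting from the page object, avoids handing a character vector to `html_nodes()`.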
The StatusCode element and its outer HTML look like this:
<select name="StatusCode" id="StatusCode" onchange="ShowHideInvoiceNumber(this)" style="width:200px;" class="">
<option selected="selected" value="OPEN">ALL OPEN ORDERS</option>
<option value="BOOKED"> BOOKED</option>
<option value="RESERVED"> RESERVED</option>
<option value="CUT"> CUT</option>
<option value="DIRECT"> DIRECT ORDERS</option>
<option value="ENTERED"> ENTERED (DIRECT ORDERS)</option>
<option value="CONFIRMED"> CONFIRMED (DIRECT ORDERS)</option>
<option value="BOOKING REQUESTED"> BOOKING REQUESTED (DIRECT ORDERS)</option>
<option value="BOOKING CONFIRMED"> BOOKING CONFIRMED (DIRECT ORDERS)</option>
<option value="SHIPPED"> SHIPPED (DIRECT ORDERS)</option>
<option value="ALL PART ORDERS"> ALL PART ORDERS</option>
<option value="POP ORDERS"> POP ORDERS</option>
<option value="REPLACEMENT PART ORDERS"> REPLACEMENT PART ORDERS</option>
<option value="PHOTOGRAPHY ORDERS"> PHOTOGRAPHY ORDERS</option>
<option value="PHOTOGRAPHY ORDERS-IN PHOTOGRAPHY"> IN PHOTOGRAPHY</option>
<option value="PHOTOGRAPHY ORDERS-WAITING ON APPROVAL"> WAITING ON APPROVAL</option>
<option value="PHOTOGRAPHY ORDERS-BOOKED"> BOOKED</option>
<option value="INVOICED">ALL INVOICED ORDERS</option>
</select>
I'd like to select the value in that final option: <option value="INVOICED">ALL INVOICED ORDERS</option>
The HTML for the third and final dropdown is:
<select name="OutputFormat" id="OutputFormat" style="width:150px;" class="">
<option selected="selected" value="HTML">HTML (Screen)</option>
<option value="Excel">Excel Spreadsheet</option>
</select>
Say I want to click the Excel option.
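With rvest there is no real "click": one possible approach is to set the dropdown fields on the parsed form, the same way the login fields were set. The sketch below is offline and assumes both dropdowns sit inside a single `<form>`, which may not match the real page. The form's `action` and the shortened option lists are placeholders.

```r
library(rvest)

# Hypothetical stand-in for the post-login page.
# Assumption: both dropdowns live inside one <form>.
page <- read_html('
  <form action="/report" method="post">
    <select name="StatusCode">
      <option selected="selected" value="OPEN">ALL OPEN ORDERS</option>
      <option value="INVOICED">ALL INVOICED ORDERS</option>
    </select>
    <select name="OutputFormat">
      <option selected="selected" value="HTML">HTML (Screen)</option>
      <option value="Excel">Excel Spreadsheet</option>
    </select>
  </form>')

# Pick the form, then overwrite the selected value of each dropdown by name
report_form <- html_form(page)[[1]]
report_form <- set_values(report_form,
                          StatusCode = "INVOICED",
                          OutputFormat = "Excel")

status_choice <- report_form$fields$StatusCode$value
format_choice <- report_form$fields$OutputFormat$value
```

On the live session, the filled form would presumably be passed to `submit_form()` together with the session, as in the login step.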
And finally, I need to click the button as described below, which then triggers the download:
My last question is: How does the download work in R? Does it just generate the file into my working directory?
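On the download question, a hedged sketch: as far as I understand, nothing is saved automatically. The session returned by `submit_form()` carries the server's reply, and its raw body has to be written to disk explicitly. The bytes and the file name below are made-up stand-ins; on a live session the raw body would presumably come from something like `result$response$content`, where `result` is a hypothetical name for the submitted session.

```r
# Stand-in for the raw body of the server's reply (hypothetical CSV content)
csv_bytes <- charToRaw("order,status\n1,INVOICED\n")

# Choose the destination explicitly; here a temporary directory is used
out_path <- file.path(tempdir(), "orders.csv")
writeBin(csv_bytes, out_path)

# Read the saved file back to confirm it parses as a CSV
orders <- read.csv(out_path)
```

Writing to `file.path(getwd(), "orders.csv")` instead would place the file in the working directory.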