rvest::html_session -- Load pages that "initiate" with a loading circle

I have a simple web scraping script that logs into bill.com and downloads a few reports for me. However, it appears they have updated their login page. It used to load right away and I was able to extract the form. Now when I go to the login page it shows a loading bar and THEN the website loads. My apologies for not knowing what this is called. I would love to learn what this is called! When I look at the session made by html_session() and inspect the HTML it returns, I am greeted with the spinning wheel, which disappears after a few seconds, and then nothing loads.

I believe that it is trying to load or call something. Does anyone have a way to get the session to load past this? I just need to log in, that's all! So any other methods for logging in so I can navigate the site are also good with me :grin:

Sadly, RSelenium is not a possibility: my work environment does not allow the dependencies needed to get it up and running.


library(rvest)

url <- "https://app.bill.com/neo/login"

(bill_session <- html_session(url))
#> <session> https://app.bill.com/neo/login
#>   Status: 200
#>   Type:   text/html
#>   Size:   35411

(bill_form <- html_form(bill_session))
#> list()

Created on 2020-05-27 by the reprex package (v0.3.0)

Thanks in advance for any resources and/or solutions to move past this hurdle!


As always with scraping, I first check whether there is an API, and it seems there is one here.

Did you try it?

To get data out of a program, an API is better suited to machine-to-machine (M2M) exchange than scraping.

As for headless browsing, there are now solutions that use the Chrome DevTools Protocol from R, so you can run a headless browser and control it from R. You just need a browser that speaks the DevTools Protocol (a Chromium-based browser).

See these packages:

They are new, but they work and help with this kind of problem. They are also rather low level, so you need to dig into the documentation of the DevTools Protocol to know what to do in your browser.
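To make the idea concrete, here is a minimal sketch of fetching the rendered login page through the DevTools Protocol. It assumes the chromote package (one R client for the protocol, not necessarily one of the packages meant above) and a locally installed Chromium-based browser. `fetch_rendered_html()` and its `settle_seconds` argument are hypothetical names made up for this example, and the fixed pause is a crude guess at how long the page's JavaScript needs.

```r
# Sketch only: assumes the chromote, xml2 and rvest packages plus a
# Chromium-based browser are available on the machine.
# fetch_rendered_html() is a hypothetical helper, not part of any package.
fetch_rendered_html <- function(url, settle_seconds = 5) {
  b <- chromote::ChromoteSession$new()
  on.exit(b$close(), add = TRUE)

  # Register interest in the load event *before* navigating, then wait for it,
  # so the event cannot fire before we start listening.
  loaded <- b$Page$loadEventFired(wait_ = FALSE)
  b$Page$navigate(url, wait_ = FALSE)
  b$wait_for(loaded)

  # Crude pause so the page's JavaScript can replace the loading spinner
  # with the real content (assumption: a few seconds is enough).
  Sys.sleep(settle_seconds)

  # Ask the browser for the DOM as it looks after JavaScript has run.
  res <- b$Runtime$evaluate("document.documentElement.outerHTML")

  # Parse the rendered markup so the usual rvest helpers can inspect it.
  xml2::read_html(res$result$value)
}
```

Something like `rvest::html_form(fetch_rendered_html("https://app.bill.com/neo/login"))` would then show whether the login form exists once the scripts have run. Note that the cookies live in the browser session, not in an rvest session, so the login itself would probably have to be driven through the browser as well.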

Hope it helps


Hey @cderv! Thanks for the response! Yes, I do use their API for a few tasks such as extracting bills, however, there are quite a few things that bill.com provides through their UI but are not available through their API.

Thank you for bringing those packages to light! I will definitely look into those today!

My apologies for the cross-post; I wanted to reach as many people as possible. I should have thought about reading the guidelines on this topic. I have deleted the SO post so there is no conflict. Thanks @mfherman!

