Error in open.connection(x, "rb") : HTTP error 403.

eham06 · December 1, 2019, 10:52pm

I am using the read_html function from rvest, and I get the following error message on one specific website.

library('rvest')
library('dplyr')
webpage <- read_html("https://www.edmunds.com/ford/escape/2018/cost-to-own/")
Error in open.connection(x, "rb") : HTTP error 403.

Are specific websites blocking R from scraping them?

mara · December 2, 2019, 2:46pm

It's not specific to R, but the site might block web scraping. A 403 status means the server received the request and refused to serve it. Two places to look are the site's robots.txt and its Terms of Service:
https://www.edmunds.com/robots.txt
https://www.edmunds.com/about/visitor-agreement.html
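
As a rough illustration of the robots.txt check, here is a minimal base-R sketch. The function name path_allowed and the example_robots rules are hypothetical, made up for this example; they are not the real contents of the Edmunds file. It only handles plain Disallow prefixes for one user agent, so Allow rules and wildcards are ignored.

```r
# Minimal robots.txt check in base R (a sketch, not a full parser).
# Returns TRUE if `path` does not start with any Disallow prefix listed
# for `agent`. Allow rules, wildcards, and Crawl-delay are ignored.
path_allowed <- function(robots_lines, path, agent = "*") {
  # Drop comments, surrounding whitespace, and empty lines
  lines <- trimws(sub("#.*$", "", robots_lines))
  lines <- lines[nzchar(lines)]

  current_agents <- character(0)  # agents named by the current group
  in_rules <- FALSE               # TRUE once the group has listed a rule
  disallowed <- character(0)      # Disallow prefixes that apply to `agent`

  for (line in lines) {
    if (!grepl(":", line, fixed = TRUE)) next
    field <- tolower(trimws(sub(":.*$", "", line)))
    value <- trimws(sub("^[^:]*:", "", line))

    if (field == "user-agent") {
      # A User-agent line after rules starts a new group
      if (in_rules) {
        current_agents <- character(0)
        in_rules <- FALSE
      }
      current_agents <- c(current_agents, tolower(value))
    } else if (field %in% c("allow", "disallow")) {
      in_rules <- TRUE
      if (field == "disallow" && nzchar(value) &&
          tolower(agent) %in% current_agents) {
        disallowed <- c(disallowed, value)
      }
    }
  }

  !any(startsWith(path, disallowed))
}

# Hypothetical rules for illustration only
example_robots <- c(
  "User-agent: *",
  "Disallow: /private/",
  "Disallow: /search",
  "",
  "User-agent: examplebot",
  "Disallow: /"
)

path_allowed(example_robots, "/ford/escape/2018/cost-to-own/")  # TRUE
path_allowed(example_robots, "/private/report.html")            # FALSE
path_allowed(example_robots, "/ford/", agent = "examplebot")    # FALSE
```

Keep in mind that robots.txt is advisory: a path that passes this check can still return a 403 if the server filters requests some other way, and the Terms of Service may prohibit scraping regardless. For real use, the robotstxt package on CRAN is likely a better choice than a hand-rolled parser like this one.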

eham06 · December 2, 2019, 8:37pm

Thanks for the response...
