Source of the project is the below RStudio webinar on web scraping:
Hello, my reprex is below.
1. I believe paths_allowed(), base_url, pages, html_nodes() and html_attr() are all updated correctly (given the website design changes since the recording of the webinar).
2. My pages value shows all the individual property URLs in View(pages). (They end in "overview", not .html.)
3. I've discarded a few properties in Canada.
4. I changed quiet = F in the walk() command so I could see the progress in the console.
5. You will have to create your own data_dir (I don't know how to reprex that?).
6. I end up with a file in 00_data-html called "overview" that has the HTML code for one property.
7. There should be 30 separate files, one for each La Quinta location.
Issues/Questions
1. I'm wondering whether outcomes #6 and #7 above are because my URLs end in "overview" and not .html? I've played around with arguments in the download.file() function to no avail.
2. I can't figure out why the walk() command/function is not generating a separate file for each property, as in the webinar example.
3. I'm not sure why the walk() function output is not in the reprex. I'll add it in a reply to this thread.
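On the question of why only one file appears, a possible cause is that basename() returns only the last path component of each URL, which is "overview" for every property here, so each download would be written to the same destfile and overwrite the previous one. The following is a minimal sketch under that assumption; the two URLs copy the shape of the ones in the console output, and the helper name hotel_file_name is hypothetical, not something from the webinar.

```r
# Two URLs in the same shape as the La Quinta property links
urls <- c(
  "https://www.wyndhamhotels.com/laquinta/cullman-alabama/la-quinta-cullman/overview",
  "https://www.wyndhamhotels.com/laquinta/dothan-alabama/la-quinta-dothan/overview"
)

# basename() keeps only the last path component, so both names collide
basename(urls)
#> [1] "overview" "overview"

# Hypothetical helper: name each file after the hotel slug (the parent
# path component) and append an .html extension
hotel_file_name <- function(url) {
  paste0(basename(dirname(url)), ".html")
}

hotel_file_name(urls)
#> [1] "la-quinta-cullman.html" "la-quinta-dothan.html"
```

If this is indeed the cause, passing destfile = file.path(data_dir, hotel_file_name(url)) to download.file() should give one file per property. This assumes the hotel slugs are unique across locations, which has not been checked here.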
I hope this reprex is easy enough for you to use. I ran it separately and it did "reprex"!
# Load packages -----------------------------------------------------
library(tidyverse)
library(rvest)
#> Loading required package: xml2
#>
#> Attaching package: 'rvest'
#> The following object is masked from 'package:purrr':
#>
#> pluck
#> The following object is masked from 'package:readr':
#>
#> guess_encoding
library(stringr)
library(robotstxt)
#library(reprex)
#library(shiny)
# Step 0: Check bot permission --------------------------------------
paths_allowed("https://www.wyndhamhotels.com/laquinta/locations")
#> www.wyndhamhotels.com
#>
#> [1] TRUE
# Step 1: Create list of links to hotels ----------------------------
base_url <- "https://www.wyndhamhotels.com"
pages <- read_html(file.path(base_url, "laquinta/locations")) %>%
  html_nodes(".headline-d~ .property-list a:nth-child(1)") %>%
  html_attr("href") %>%
  #discard(is.na) %>%
  discard(str_detect, "laquinta/oshawa-ontario/la-quinta-oshawa/overview") %>%
  discard(str_detect, "laquinta/richmond-british-columbia/la-quinta-inn-vancouver-airport/overview") %>%
  file.path(base_url, .)
pages <- head(pages, 30) # use a subset for testing code
# Step 2 : Save hotels pages locally ---------------------------------
# Create a directory to store downloaded hotel pages
data_dir <- "00_data-html/"
dir.create(data_dir, showWarnings = FALSE)
# Create a progress bar
p <- progress_estimated(length(pages))
# Download each hotel page
walk(pages, function(url) {
  download.file(url, destfile = file.path(data_dir, basename(url)), quiet = F)
  p$tick()$print()
})
Created on 2019-12-13 by the reprex package (v0.3.0)
Output from above reprex
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-birmingham-hoover/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 207 KB
|==== | 3% ~38 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-inn-birmingham-inverness/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 205 KB
|========= | 7% ~42 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/cullman-alabama/la-quinta-cullman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
|============= | 10% ~33 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/daphne-alabama/la-quinta-mobile-daphne/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
|================== | 13% ~32 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/decatur-alabama/la-quinta-inn-decatur/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
|======================= | 17% ~32 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/dothan-alabama/la-quinta-dothan/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 248 KB
|=========================== | 20% ~31 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/fultondale-alabama/la-quinta-fultondale-birmingham-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 197 KB
|================================ | 23% ~30 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/homewood-alabama/la-quinta-birmingham-homewood/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 226 KB
|==================================== | 27% ~30 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/huntsville-alabama/la-quinta-inn-huntsville-research-park/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 210 KB
|========================================= | 30% ~28 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/madison-alabama/la-quinta-huntsville-airport-madison/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
|============================================== | 33% ~27 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-inn-and-suites-mobile-i-65-airport-blvd/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 226 KB
|================================================== | 37% ~25 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-mobile-tillmans-corner/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
|======================================================= | 40% ~24 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/montgomery-alabama/la-quinta-montgomery/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 219 KB
|=========================================================== | 43% ~23 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/opelika-alabama/la-quinta-opelika/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
|================================================================ | 47% ~21 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/oxford-alabama/la-quinta-oxford-anniston/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 210 KB
|===================================================================== | 50% ~20 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/prattville-alabama/la-quinta-prattville/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 234 KB
|========================================================================= | 53% ~19 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/satsuma-alabama/la-quinta-mobile-satsuma-saraland/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 246 KB
|============================================================================== | 57% ~17 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/tuscaloosa-alabama/la-quinta-inn-suites-by-wyndham-tuscaloosa-university/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 223 KB
|================================================================================== | 60% ~16 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/anchorage-alaska/la-quinta-anchorage-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 255 KB
|======================================================================================= | 63% ~15 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/fairbanks-alaska/la-quinta-fairbanks-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 278 KB
|============================================================================================ | 67% ~14 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 265 KB
|================================================================================================ | 70% ~12 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff-east-i-40/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
|===================================================================================================== | 73% ~11 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/kingman-arizona/la-quinta-kingman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 231 KB
|========================================================================================================= | 77% ~9 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-mesa-superstition-springs/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
|============================================================================================================== | 80% ~8 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-phoenix-mesa-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 262 KB
|=================================================================================================================== | 83% ~7 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/page-arizona/la-quinta-page-at-lake-powell/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 241 KB
|======================================================================================================================= | 87% ~5 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/peoria-arizona/la-quinta-phoenix-west-peoria/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
|============================================================================================================================ | 90% ~4 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-chandler/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 258 KB
|================================================================================================================================ | 93% ~3 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-i-10-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 233 KB
|===================================================================================================================================== | 97% ~1 s remaining trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-inn-phoenix-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 224 KB
|==========================================================================================================================================|100% ~0 s remaining
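A side note on the double slash after the host name in the URLs above: it appears to come from file.path(), which inserts its own "/" separator even when the next piece already starts with one. The downloads evidently still succeed, so this is cosmetic. A small sketch of the difference, assuming the scraped href values begin with "/" (inferred from the output, not verified against the site):

```r
base_url <- "https://www.wyndhamhotels.com"

# Assumed shape of one scraped href value (leading slash included)
href <- "/laquinta/cullman-alabama/la-quinta-cullman/overview"

# file.path() always adds a separator, which gives "//" after the host
file.path(base_url, href)

# paste0() concatenates as-is, so the leading slash is used only once
paste0(base_url, href)
```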
In breaking the code down, I also get the output below when I comment out the p object and function.
# Create a directory to store downloaded hotel pages
data_dir <- "00_data-html/"
dir.create(data_dir, showWarnings = T)
# Create a progress bar
#p <- progress_estimated(length(pages))
# Download each hotel page
walk(pages, function(url) {
  download.file(url, destfile = file.path(data_dir, basename(url)), quiet = F)
  #p$tick()$print()
})
#> Error in walk(pages, function(url) {: could not find function "walk"
Created on 2019-12-13 by the reprex package (v0.3.0)
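The "could not find function" error above is what one would expect when a snippet is rendered on its own: reprex() runs the code in a fresh R session, so packages and objects from the interactive session (here purrr and pages) are not available unless the snippet loads or defines them itself. A minimal self-contained sketch follows; the two-element pages vector and the message() call are placeholders standing in for the real URL list and for download.file().

```r
# Load the package that provides walk() inside the snippet itself
library(purrr)

# Placeholder input so the snippet does not depend on earlier objects
pages <- c(
  "https://example.com/a/overview",
  "https://example.com/b/overview"
)

# walk() calls the function once per element, for its side effects,
# and returns its input invisibly
walk(pages, function(url) {
  message("would download: ", url)
})
```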
Yet it appears the pages are downloading somewhere. Below is the console output, which is not shown in the second reprex. The downloaded items seen below are not in the created directory 00_data-html/.
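To check where the downloads actually land, it may help to print the working directory and list the target folder. Since data_dir is a relative path, it is resolved against whatever the working directory is at the time. As far as I know, reprex() renders in a temporary directory by default, so files written by a rendered reprex would not show up in the project folder. A sketch using only base R:

```r
# Same relative path as in the code above
data_dir <- "00_data-html/"

# Relative paths are resolved against the current working directory
getwd()

# Absolute location that data_dir points to (it need not exist yet)
normalizePath(data_dir, mustWork = FALSE)

# Files currently in that folder; if every download is saved under the
# same name, only one file would be listed here
list.files(data_dir)
```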
> # Download each hotel page
> walk(pages, function(url){
+ download.file(url, destfile = file.path(data_dir, basename(url)), quiet = F)
+ #p$tick()$print()
+ })
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-birmingham-hoover/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 208 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-inn-birmingham-inverness/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 205 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/cullman-alabama/la-quinta-cullman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/daphne-alabama/la-quinta-mobile-daphne/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/decatur-alabama/la-quinta-inn-decatur/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/dothan-alabama/la-quinta-dothan/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 248 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fultondale-alabama/la-quinta-fultondale-birmingham-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 196 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/homewood-alabama/la-quinta-birmingham-homewood/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/huntsville-alabama/la-quinta-inn-huntsville-research-park/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 209 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/madison-alabama/la-quinta-huntsville-airport-madison/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-inn-and-suites-mobile-i-65-airport-blvd/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 225 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-mobile-tillmans-corner/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/montgomery-alabama/la-quinta-montgomery/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 219 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/opelika-alabama/la-quinta-opelika/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/oxford-alabama/la-quinta-oxford-anniston/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 210 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/prattville-alabama/la-quinta-prattville/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 234 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/satsuma-alabama/la-quinta-mobile-satsuma-saraland/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 246 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/tuscaloosa-alabama/la-quinta-inn-suites-by-wyndham-tuscaloosa-university/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 223 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/anchorage-alaska/la-quinta-anchorage-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 255 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fairbanks-alaska/la-quinta-fairbanks-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 278 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 264 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff-east-i-40/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/kingman-arizona/la-quinta-kingman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 232 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-mesa-superstition-springs/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-phoenix-mesa-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 261 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/page-arizona/la-quinta-page-at-lake-powell/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 242 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/peoria-arizona/la-quinta-phoenix-west-peoria/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-chandler/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 258 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-i-10-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 234 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-inn-phoenix-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 224 KB
> reprex()
Rendering reprex...
Rendered reprex is on the clipboard.
> data_dir <- "00_data-html/"
> dir.create(data_dir, showWarnings = T)
Warning message:
In dir.create(data_dir, showWarnings = T) : '00_data-html' already exists
>
> # Create a progress bar
> #p <- progress_estimated(length(pages))
>
> # Download each hotel page
> walk(pages, function(url){
+ download.file(url, destfile = file.path(data_dir, basename(url)), quiet = F)
+ #p$tick()$print()
+ })
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-birmingham-hoover/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 207 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-inn-birmingham-inverness/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 206 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/cullman-alabama/la-quinta-cullman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/daphne-alabama/la-quinta-mobile-daphne/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/decatur-alabama/la-quinta-inn-decatur/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/dothan-alabama/la-quinta-dothan/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 249 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fultondale-alabama/la-quinta-fultondale-birmingham-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 196 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/homewood-alabama/la-quinta-birmingham-homewood/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 226 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/huntsville-alabama/la-quinta-inn-huntsville-research-park/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 209 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/madison-alabama/la-quinta-huntsville-airport-madison/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-inn-and-suites-mobile-i-65-airport-blvd/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 225 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-mobile-tillmans-corner/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/montgomery-alabama/la-quinta-montgomery/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 219 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/opelika-alabama/la-quinta-opelika/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/oxford-alabama/la-quinta-oxford-anniston/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 210 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/prattville-alabama/la-quinta-prattville/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 234 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/satsuma-alabama/la-quinta-mobile-satsuma-saraland/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 246 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/tuscaloosa-alabama/la-quinta-inn-suites-by-wyndham-tuscaloosa-university/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 223 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/anchorage-alaska/la-quinta-anchorage-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 256 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fairbanks-alaska/la-quinta-fairbanks-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 277 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 264 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff-east-i-40/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/kingman-arizona/la-quinta-kingman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 231 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-mesa-superstition-springs/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-phoenix-mesa-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 262 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/page-arizona/la-quinta-page-at-lake-powell/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 241 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/peoria-arizona/la-quinta-phoenix-west-peoria/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-chandler/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 258 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-i-10-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 233 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-inn-phoenix-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 224 KB
> # Create a character vector of names of all files in directory
> files <- dir(data_dir, pattern = "*.html", full.names = F)
> dir.create(data_dir, showWarnings = T)
Warning message:
In dir.create(data_dir, showWarnings = T) : '00_data-html' already exists
>
> # Create a progress bar
> #p <- progress_estimated(length(pages))
>
> # Download each hotel page
> walk(pages, function(url){
+ download.file(url, destfile = file.path(data_dir, basename(url)), quiet = F)
+ #p$tick()$print()
+ })
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-birmingham-hoover/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 207 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-inn-birmingham-inverness/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 205 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/cullman-alabama/la-quinta-cullman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/daphne-alabama/la-quinta-mobile-daphne/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/decatur-alabama/la-quinta-inn-decatur/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/dothan-alabama/la-quinta-dothan/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 248 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fultondale-alabama/la-quinta-fultondale-birmingham-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 197 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/homewood-alabama/la-quinta-birmingham-homewood/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 226 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/huntsville-alabama/la-quinta-inn-huntsville-research-park/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 209 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/madison-alabama/la-quinta-huntsville-airport-madison/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-inn-and-suites-mobile-i-65-airport-blvd/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 225 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-mobile-tillmans-corner/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/montgomery-alabama/la-quinta-montgomery/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 219 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/opelika-alabama/la-quinta-opelika/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/oxford-alabama/la-quinta-oxford-anniston/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 210 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/prattville-alabama/la-quinta-prattville/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 234 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/satsuma-alabama/la-quinta-mobile-satsuma-saraland/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 247 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/tuscaloosa-alabama/la-quinta-inn-suites-by-wyndham-tuscaloosa-university/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 223 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/anchorage-alaska/la-quinta-anchorage-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 256 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fairbanks-alaska/la-quinta-fairbanks-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 277 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 265 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff-east-i-40/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/kingman-arizona/la-quinta-kingman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 232 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-mesa-superstition-springs/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-phoenix-mesa-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 262 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/page-arizona/la-quinta-page-at-lake-powell/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 241 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/peoria-arizona/la-quinta-phoenix-west-peoria/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-chandler/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 258 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-i-10-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 234 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-inn-phoenix-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 224 KB
> library(purrr)
> walk(pages, function(url){
+ download.file(url, destfile = file.path(data_dir, basename(url)), quiet = F)
+ #p$tick()$print()
+ })
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-birmingham-hoover/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 207 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/birmingham-alabama/la-quinta-inn-birmingham-inverness/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 205 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/cullman-alabama/la-quinta-cullman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/daphne-alabama/la-quinta-mobile-daphne/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/decatur-alabama/la-quinta-inn-decatur/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 227 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/dothan-alabama/la-quinta-dothan/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 248 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fultondale-alabama/la-quinta-fultondale-birmingham-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 196 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/homewood-alabama/la-quinta-birmingham-homewood/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 226 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/huntsville-alabama/la-quinta-inn-huntsville-research-park/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 210 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/madison-alabama/la-quinta-huntsville-airport-madison/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-inn-and-suites-mobile-i-65-airport-blvd/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 226 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mobile-alabama/la-quinta-mobile-tillmans-corner/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 228 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/montgomery-alabama/la-quinta-montgomery/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 219 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/opelika-alabama/la-quinta-opelika/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 214 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/oxford-alabama/la-quinta-oxford-anniston/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 211 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/prattville-alabama/la-quinta-prattville/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 234 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/satsuma-alabama/la-quinta-mobile-satsuma-saraland/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 247 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/tuscaloosa-alabama/la-quinta-inn-suites-by-wyndham-tuscaloosa-university/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 224 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/anchorage-alaska/la-quinta-anchorage-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 256 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/fairbanks-alaska/la-quinta-fairbanks-airport/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 278 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 264 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/flagstaff-arizona/la-quinta-flagstaff-east-i-40/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 222 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/kingman-arizona/la-quinta-kingman/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 231 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-mesa-superstition-springs/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 221 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/mesa-arizona/la-quinta-phoenix-mesa-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 261 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/page-arizona/la-quinta-page-at-lake-powell/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 241 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/peoria-arizona/la-quinta-phoenix-west-peoria/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 220 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-chandler/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 258 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-phoenix-i-10-west/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 233 KB
trying URL 'https://www.wyndhamhotels.com//laquinta/phoenix-arizona/la-quinta-inn-phoenix-north/overview'
Content type 'text/html; charset=UTF-8' length unknown
downloaded 224 KB