From this question (Scrapping 400 pages using rvest and purr - #3 by hassannasir), I believe you want to scrape a URL of this form: https://www.dawn.com/archive/2019-05-22 (News Archives for 2019-05-22 - DAWN.COM).
Here you are passing parameters to the URL. Per the documentation and the way URLs work, query parameters are for URLs of the form https://example.com/page?key=value. That is not the same thing: in the archive URL, the date is part of the path.
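A small illustration of the difference (example.com and the q / page parameter names here are made up, and httr is only used to show the query form): query parameters are appended after a "?", whereas the archive date is a segment of the path itself.
library(httr)
# Query-parameter form: gives "https://example.com/search?q=news&page=2"
modify_url("https://example.com/search", query = list(q = "news", page = 2))
# Path form used by the Dawn archive: the date is part of the path, not a query parameter
paste0("https://www.dawn.com/archive/", "2019-05-22")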
Using polite, the function you are interested in is nod(). Look at the examples in the ?scrape help page.
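Assuming the polite and purrr packages, and a dates vector you build yourself (the range below is only an illustration, replace it with the dates you actually need), the setup could look like this:
library(polite)
library(purrr)
# Create the polite session once for the host
dawnsession <- bow("https://www.dawn.com")
# Archive dates to visit, as "YYYY-MM-DD" strings (illustrative range only)
dates <- as.character(seq(as.Date("2019-05-01"), as.Date("2019-05-22"), by = "day"))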
You would then need something like this:
fulllinks <- map(dates, ~ {
  # nod() changes the path within the session, then scrape() fetches it politely
  nod(dawnsession, paste0("archive/", .x), verbose = TRUE) %>%
    scrape()
})
However, it seems you are not allowed to scrape this part of the website:
dawnsession <- polite::bow("https://www.dawn.com")
#> No encoding supplied: defaulting to UTF-8.
polite::nod(dawnsession, "archive/2019-04-01")
#> <polite session> https://www.dawn.com/archive/2019-04-01
#> User-agent: polite R package - https://github.com/dmi3kno/polite
#> robots.txt: 12 rules are defined for 1 bots
#> Crawl delay: 5 sec
#> The path is not scrapable for this user-agent
The path is not scrapable for this user-agent, and this can be verified directly in the robots.txt file:
robotstxt::paths_allowed("https://www.dawn.com/archive/2019-04-01")
#>  www.dawn.com
#> No encoding supplied: defaulting to UTF-8.
#> [1] FALSE
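If you want to check all the archive dates at once, paths_allowed() accepts a vector of paths, so something like this should work (dates being the vector sketched above):
# Expected to return one logical per date, all FALSE here because /archive/* is disallowed
robotstxt::paths_allowed(paste0("https://www.dawn.com/archive/", dates))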
rt <- robotstxt::robotstxt("https://www.dawn.com")
#> No encoding supplied: defaulting to UTF-8.
rt$permissions
#>       field useragent            value
#> 1  Disallow         *          */print
#> 2  Disallow         *   */authors/*/1*
#> 3  Disallow         *   */authors/*/2*
#> 4  Disallow         *   */authors/*/3*
#> 5  Disallow         *   */authors/*/4*
#> 6  Disallow         *   */authors/*/5*
#> 7  Disallow         *   */authors/*/6*
#> 8  Disallow         *   */authors/*/7*
#> 9  Disallow         *   */authors/*/8*
#> 10 Disallow         *   */authors/*/9*
#> 11 Disallow         * /newspaper/*/20*
#> 12 Disallow         *       /archive/*
You can see that /archive/* is disallowed.
So you are not allowed to scrape this part of the website programmatically (with a robot), sorry. See the resources about scraping responsibly. You should contact the website to ask for permission, or to request the information from them directly.