Error Message or Print Statement

I'm trying to webscrape data from Google Scholar on a large list of people. I made a similar post here.

Someone mentioned the error message may not be an error, but instead a print statement. I've tried modifying my code to recognize the 429 error print statement, but the results are still the same.

scholar_ids <- character(nrow(info))
scholar_ids[] <- NA  # Initialize with NA values

last_successful_iteration <- 0  # Initialize last successful iteration

i <- 1  # Start at the first row

while (i <= nrow(info)) {
  id <- NULL
  
  output <- capture.output({
    tryCatch({
      id <- get_scholar_id(last_name = info$last_name[i],
                           first_name = info$first_name[i],
                           affiliation = "School Name")
      last_successful_iteration <- i  # Update last successful iteration
    }, error = function(err) {
      cat("Error message:", err$message, "\n")  # Print the error message
      stop(err)  # Stop on other errors
    }, warning = function(warn) { cat("Warning: ", warn$message, "\n") })
  }, type = "message")
  
  # Check if the output contains "429"
  if (any(grepl("429", output))) {
    cat("Timeout. Pausing for 15 minutes.\n")
    Sys.sleep(900)  # Pause for 15 minutes (900 seconds)
    next  # Retry the same iteration
  }
  
  if (!is.null(id) && length(id) > 0) {
    scholar_ids[i] <- id  # Store the ID if successfully retrieved
  }
  
  i <- i + 1  # Increment the loop variable to move to the next iteration

  # Random sleep to avoid hitting rate limits
  sleep_time <- runif(n = 1, min = 10, max = 12)
  Sys.sleep(sleep_time)
}

If this is relevant, the 429 error code message is output in red in the console, which is why I thought it was a real error. I also tried the advice from the person in my previous post, but it didn't work. I would appreciate any help on this issue.

What are the results? Can you show the output?

I don't think you can use capture.output and then inspect the result if you re-throw the errors that you catch with stop().

Demo of your problem:

output1 <- capture.output({stop("istoppedit")})
output1
# Error: object 'output1' not found

output2 <- capture.output(tryCatch(stop("istoppedit"),
                                   error = function(err){
                                     # print the error, but then (problematically) re-throw with stop()
                                     cat("Error message:", err$message, "\n")  
                                     stop("arbitrary re stop")
                                   }))
output2
# Error: object 'output2' not found 

output3 <- capture.output(tryCatch(stop("istoppedit"),
                                   error = function(err){
                                     # print the error and do not re-throw
                                     cat("Error message:", err$message, "\n")  # Print the error message
                                   }))
output3
# "Error message: istoppedit "
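A related point, as a minimal sketch: the code in the question passes type = "message" to capture.output(). That captures only the message stream (stderr), while cat() writes to standard output by default. Text printed by cat() inside a handler is therefore not captured when type = "message" is set. The variable names below are only for illustration.

```r
# cat() writes to stdout, so type = "message" does not capture it;
# the text goes to the console instead.
from_cat <- capture.output(cat("Error message: 429\n"), type = "message")
length(from_cat)
# 0

# message() writes to stderr, so type = "message" does capture it.
from_message <- capture.output(message("Error message: 429"), type = "message")
from_message
# "Error message: 429"

# With the default type = "output", cat() is captured.
from_cat_default <- capture.output(cat("Error message: 429\n"))
from_cat_default
# "Error message: 429"
```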

In your code as presented here, I would expect any(grepl("429", output)) to fail because there is no output object: the handler re-throws the error with stop(err), so capture.output never returns and output is never assigned. The exception is when output already exists because it was created at least once without hitting an error. In that case the error text will never be in output, and the if (any(...)) test will always be FALSE, because you are always testing the output of the last successful attempt.
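One possible way around this, offered as a sketch and not tested against Google Scholar: have tryCatch return the condition message as a value instead of printing and re-throwing it, then test that value for "429". This assumes the 429 reaches R as an error or warning whose message contains "429"; if the package only prints it, this will not see it. fetch_with_retry, fetch_fun, pause_seconds and max_retries are hypothetical names, not part of any package.

```r
# Hypothetical helper: call fetch_fun() and retry when the condition
# message mentions "429". fetch_fun is a stand-in for a zero-argument
# wrapper around get_scholar_id().
fetch_with_retry <- function(fetch_fun, pause_seconds = 900, max_retries = 3) {
  for (attempt in seq_len(max_retries + 1)) {
    # Return the condition message as a value instead of re-throwing it.
    result <- tryCatch(
      list(id = fetch_fun(), msg = NA_character_),
      error   = function(e) list(id = NA_character_, msg = conditionMessage(e)),
      warning = function(w) list(id = NA_character_, msg = conditionMessage(w))
    )
    rate_limited <- !is.na(result$msg) && grepl("429", result$msg, fixed = TRUE)
    if (!rate_limited) {
      return(result)  # success, or a non-429 problem reported in msg
    }
    if (attempt <= max_retries) {
      message("Rate limited (429). Pausing for ", pause_seconds, " seconds.")
      Sys.sleep(pause_seconds)
    }
  }
  result  # still rate limited after all retries
}
```

Inside the loop from the question, this could be called as fetch_with_retry(function() get_scholar_id(last_name = info$last_name[i], first_name = info$first_name[i], affiliation = "School Name")), storing result$id in scholar_ids[i] only when an ID was actually returned.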