Only run a qmd chunk if a pin has been updated on Connect

I would like to set up a Quarto document (an Rmd would work as well) on Connect so that it exits early, or doesn't run at all, unless a pin has been updated. But I am finding that I have to write more code than I expected. Specifically, I feel like I am doing something wrong, because I have to write a hash to its own pin and then use that to check whether the pin has changed. This qmd successfully fetches new data and does something with it (makes a plot) if the data has changed, but exits early if the data has not changed:

---
title: "Untitled"
format: 
  html:
    self-contained: true
---

```{r}
library(pins)

board <- board_connect()


# Get the hash of the pin
pin_hash <- pin_meta(board, "sam/quakes")$pin_hash

# Load the previous hash from a pin
if (pin_exists(board, "sam/prev_pin_hash")) {
  prev_pin_hash <- pin_read(board, "sam/prev_pin_hash")
} else {
  prev_pin_hash <- "no hash"
}

# Check if the hash has changed
if (pin_hash != prev_pin_hash) {
  # Run the chunk of code that uses the pin
  results_proc <- pin_read(board, "sam/quakes")
  message("data downloaded")
  print(paste0("n rows: ", nrow(results_proc)))
  print(pin_meta(board, "sam/quakes"))
  # Update the previous hash with the current hash
  prev_pin_hash <- pin_hash
  
  # Save the updated previous hash to a pin
  pin_write(board, prev_pin_hash, "sam/prev_pin_hash")
} else {
  message("Nothing new downloaded")
  results_proc <- pin_read(board, "sam/quakes")
  print(paste0("n rows: ", nrow(results_proc)))
  knitr::knit_exit()
}
```


```{r}
plot(results_proc$mag, results_proc$depth, col = "red", pch = 20, cex = 3)
```

Is there some functionality in pins that I am missing that provides this? I am looking for something like an is_pin_changed() function that returns a logical. The relation to Connect is that I expect this to be a piece of Connect content that runs on a schedule, watches for changes in a pin, and only executes some code if the pin has changed. I am also not set on using pins, though it does seem nice and convenient.

How could the hypothetical is_pin_changed() function determine its value? It would need some sort of "known state" to compare with, e.g. a known hash value or a known creation date. And you would have to save that "known state" somewhere, e.g. in another pin, just like you are doing now.
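To make that concrete, here is a minimal sketch of what such a wrapper could look like. Note that `is_pin_changed()` and the `state_name` argument are hypothetical and not part of {pins}; the sketch only bundles the same hash comparison into one function, and it uses `board_temp()` so it can be tried without a Connect server.

```r
library(pins)

# Hypothetical helper (NOT part of pins): returns TRUE when the hash of pin
# `name` differs from the hash remembered in the pin `state_name`, and then
# stores the new hash as the "known state".
is_pin_changed <- function(board, name,
                           state_name = paste0(name, "_prev_hash")) {
  current <- pin_meta(board, name)$pin_hash

  # No state pin yet means the pin has never been seen before
  previous <- if (pin_exists(board, state_name)) {
    pin_read(board, state_name)
  } else {
    NA_character_
  }

  changed <- is.na(previous) || !identical(current, previous)
  if (changed) {
    pin_write(board, current, state_name, type = "rds")
  }
  changed
}

# Demo on a throwaway local board instead of board_connect()
board <- board_temp()
pin_write(board, head(datasets::quakes), "quakes", type = "rds")

first  <- is_pin_changed(board, "quakes")  # no stored state yet
second <- is_pin_changed(board, "quakes")  # same data as before

pin_write(board, tail(datasets::quakes), "quakes", type = "rds")
third  <- is_pin_changed(board, "quakes")  # data was replaced
```

In the qmd this would reduce the first chunk to something like `if (!is_pin_changed(board, "sam/quakes")) knitr::knit_exit()`. One caveat of this sketch: the state is updated as soon as the change is detected, so if the rest of the render fails, the change will not be picked up again on the next run. Writing the state pin at the end of the document instead would avoid that.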

Alternatively, you could assume that your report runs every x hours and run it only if the latest update to the pin is less than x hours old. This requires keeping your schedule and your code in sync. And it can be problematic if either a non-scheduled report is rendered or if a run fails for other reasons.
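A rough sketch of that time-based check, assuming the `created` field returned by `pin_meta()` is a date-time (which is what I see in current versions of {pins}). The helper name `pin_updated_within()` and its arguments are made up for illustration:

```r
library(pins)

# Hypothetical helper (NOT part of pins): TRUE if the pin's `created`
# timestamp is less than `hours` hours before `now`.
pin_updated_within <- function(board, name, hours, now = Sys.time()) {
  created <- pin_meta(board, name)$created
  age_hours <- as.numeric(difftime(now, created, units = "hours"))
  age_hours < hours
}

# Demo on a throwaway local board instead of board_connect()
board <- board_temp()
pin_write(board, head(datasets::quakes), "quakes", type = "rds")

# The pin was just written, so it counts as updated within the last 6 hours
fresh <- pin_updated_within(board, "quakes", hours = 6)

# Pretend the report runs 7 hours from now: the pin is then too old
stale <- pin_updated_within(board, "quakes", hours = 6,
                            now = Sys.time() + 7 * 3600)
```

Here `hours` has to match the Connect schedule by hand, which is exactly the synchronization problem mentioned above.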

Right. My question, though, is whether {pins} can handle this all on its own. {targets} is the gold standard for this, in that it only updates when a file has changed (even remote files). And yes, {targets} has a cache to save the "known state". The nice thing is that this is handled on the targets end, so for the user it just works. I imagine this would be reasonably involved with pins, because it handles more than a few backends, potentially requiring some custom code for each.

And yeah

> And it can be problematic if either a non-scheduled report is rendered or if a run fails for other reasons.

That is the scenario I am in :).

I have never used remote files with {targets}, but adding pin as a supported format would make sense to me. Interestingly, there already is a feature request for it: Pins as a possible future target data object · Issue #1045 · ropensci/targets · GitHub
However, that issue has been closed, since @wlandau sees pins as downstream objects. Maybe you can convince him with your use case? You can also deploy {targets} pipelines to Connect, cf. Solutions - Connect & targets.

I am not sure if such an extension would make sense within {pins}. I don't see that package in the business of storing such data locally. But feel free to raise a feature request there. I am not maintaining that package.
