Hi all,
I have a question about pins and a use case with AWS S3 boards, and I'm not sure what the recommended approach is. Here at Montoux, we develop a SaaS web-based platform for life insurance companies to model their portfolios from an actuarial analysis point of view. We're developing a number of data science models in R, and we're figuring out the mechanics of productising these into our platform.
We've previously been using s3mpi (https://github.com/robertzk/s3mpi) to fetch data from S3 and cache it locally. Some of the data we consume is pretty large, so this has worked well for exploratory analysis. However, we feel like most of the community traction is around pins, so we're looking at how we can use it.
One area that's a little unknown to us is how we should approach data that hasn't yet been cached/pinned, for example data that was uploaded directly by a user or produced as part of some other data processing, i.e. the data.txt metadata file hasn't been produced. One way to approach this would be to use aws.s3 to pull the file and then pin it, but this seems somewhat inefficient, and it would be nice if there was a way for pins to populate the cache from an existing S3 object. Is there anything I'm missing here, or is this a use case that is outside the scope of pins?
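For concreteness, the workaround I have in mind looks roughly like the sketch below. The function name, bucket names, object key and board name are all placeholders I've made up for illustration, and I'm assuming the legacy pins API (board_register_s3() and pin(), i.e. pins before 1.0) together with aws.s3::save_object(), with AWS credentials set in the usual environment variables.

```r
# Sketch only: all names here are hypothetical placeholders.
# Assumes the aws.s3 and pins (< 1.0) packages are installed.
pin_existing_s3_object <- function(bucket, key, pin_name, board = "s3") {
  # Download the raw object to a temporary file with aws.s3
  local_path <- file.path(tempdir(), basename(key))
  aws.s3::save_object(object = key, bucket = bucket, file = local_path)

  # Pin the downloaded file to the board; as I understand it, this writes
  # the file back to S3 along with the data.txt metadata
  pins::pin(local_path, name = pin_name, board = board)
}

# Hypothetical usage:
# pins::board_register_s3(name = "s3", bucket = "my-pins-bucket")
# pin_existing_s3_object("my-upload-bucket", "uploads/portfolio.csv", "portfolio")
```

This means every object is downloaded once and then uploaded again just to generate the metadata, which is the inefficiency I'd like to avoid.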
I'd really appreciate any experiences or recommendations anyone has.
Thanks!
Glynn