Hey all,
I have a pickle file on S3 (a pickled Python/pandas DataFrame), and I want to read it into R. I know from a previous question how to read in a CSV, and in Python I know how to read a pickle from S3, but I am having difficulty combining the two in R with reticulate.
In Python, I run the following:
import pandas as pd
import pickle
import boto3
from io import BytesIO
bucket = 'my_bucket'
filename = 'my_filename.pkl'
s3 = boto3.resource('s3')
with BytesIO() as data:
    # Download the object into the in-memory buffer, then rewind before reading
    s3.Bucket(bucket).download_fileobj(filename, data)
    data.seek(0)
    df1 = pickle.load(data)
which works successfully.
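For reference, here is a minimal sketch of the same Python steps wrapped in a function, in case that makes the logic easier to port. The function name `load_pickle_from_bucket` and its parameters are my own invention for illustration; it only assumes that `bucket` exposes `download_fileobj(key, fileobj)`, as a boto3 `s3.Bucket(...)` object does.

```python
import pickle
from io import BytesIO


def load_pickle_from_bucket(bucket, key):
    """Download `key` from `bucket` into memory and unpickle it.

    `bucket` is assumed to be any object exposing
    download_fileobj(key, fileobj), such as boto3's s3.Bucket(name).
    """
    with BytesIO() as buffer:
        # Stream the object's bytes into the in-memory buffer
        bucket.download_fileobj(key, buffer)
        # Rewind so pickle reads from the first byte
        buffer.seek(0)
        # Only unpickle data that comes from a source you trust
        return pickle.load(buffer)


# Hypothetical usage with the names from this post (needs boto3 and credentials):
#   import boto3
#   s3 = boto3.resource('s3')
#   df1 = load_pickle_from_bucket(s3.Bucket('my_bucket'), 'my_filename.pkl')
```

Passing the bucket object in, rather than creating the boto3 resource inside the function, keeps the download-and-unpickle step separate from the AWS setup, so it can be tried against a stand-in object without touching S3.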
So I tried to convert this to R, but it failed:
library(reticulate)
reticulate::use_condaenv("base2", required = TRUE)
boto3 <- reticulate::import("boto3")
pickle <- reticulate::import("pickle")
io <- reticulate::import("io")
data <- io$BytesIO()
s3 <- boto3$resource("s3")
bucket <- 'my_bucket'
filename <- "my_filename.pkl"
s3$Bucket(bucket)$download_fileobj(filename, data)
data$seek(0)
#> Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: integer argument expected, got float
df1 <- pickle$load(data)
#> Error in py_call_impl(callable, dots$args, dots$keywords): UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 4: invalid start byte
Created on 2020-07-23 by the reprex package (v0.3.0)
Can anyone help with the Python-to-R conversion?