That's a big dataset you are loading. The largest instance size seems to be 8 GB, even on the more expensive plans. At least that's what I see when clicking on "performance boost" on shinyapps.io.
Just wondering: do you need the whole dataset for your shiny app? I also work regularly with large genomic datasets like this, and I know it's a challenge. My tip: pre-summarize the data in a separate script and save the result to a (much smaller) file that the shiny app can load. If pre-summarizing isn't possible, maybe you can drop the parts of the object you are sure are never used.
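For example, a pre-processing script could look something like this (just a sketch; the file names and the per-gene summary are placeholders for whatever your app actually needs to show):

library(Matrix)  # single-cell counts are usually stored as sparse matrices

# Run this once, outside the app
counts <- readRDS("full_counts.rds")  # genes x cells, the big object

# Keep only what the app needs, e.g. per-gene summaries
gene_summary <- data.frame(
  gene      = rownames(counts),
  mean_expr = rowMeans(counts),
  n_cells   = rowSums(counts > 0)
)

saveRDS(gene_summary, "gene_summary.rds")  # much smaller file for the app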
Another important thing is to load the data only once, at startup of the application. If you load it inside the server function, it gets loaded again for every user session, which makes a huge difference with such a big dataset.
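Concretely, anything you define at the top of app.R (or in global.R), outside the server function, is read once at startup and shared by all sessions. A rough sketch, reusing the hypothetical gene_summary.rds file from above:

library(shiny)

# Loaded once at startup, outside server(), and shared by every session
gene_summary <- readRDS("gene_summary.rds")

ui <- fluidPage(
  selectInput("gene", "Gene", choices = gene_summary$gene),
  tableOutput("tbl")
)

server <- function(input, output, session) {
  # No readRDS() in here: server() runs for every connected user
  output$tbl <- renderTable({
    gene_summary[gene_summary$gene == input$gene, , drop = FALSE]
  })
}

shinyApp(ui, server)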
Maybe it would be better to upload the dataset as a database file alongside your app, and have each session query only what it needs? The documentation covers storage. I'm not sure what the storage limit is, though.
Thank you so much @ginberg! I followed your tip and reduced the size of the object. The object contains data from a single-cell RNA-seq experiment (16K genes and 120K cells). I randomly removed some cells, and with 50K cells it runs fine! Not ideal, but at least it works.
Regarding your second comment - how do I load an object outside of a server function?
Thank you, @nwerth! I don't have experience working with databases. Do you happen to know a good post that describes incorporating databases into a shiny app?
SQLite (with the RSQLite package) is perfect for quickly setting up a single-app database. Most RSQLite examples use a database created in RAM, but that defeats the point here. Instead, you can create a single file that acts as the database:
library("RSQLite")
# Create a database file named db.sqlite
con <- dbConnect(RSQLite::SQLite(), "db.sqlite")
dbWriteTable(con, "mydata", mydata)
subtable <- dbGetQuery(
con, "SELECT * FROM mydata WHERE name = 'John Doe'"
)
dbWriteTable can also read data from a delimited file straight into the database, with no leftover objects in the R session, which is helpful with huge data. It takes most of the same arguments as read.table:
dbWriteTable(
conn = con,
name = "mydata",
value = "mydata.csv",
sep = ",",
header = TRUE,
overwrite = TRUE
)
Pair this with opening the connection outside the server function, and you'll have a single database shared by all sessions.
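Roughly like this (only a sketch; the table and column names are the made-up ones from above):

library(DBI)
library(RSQLite)
library(shiny)

# One connection, opened at startup and shared by all sessions
con <- dbConnect(RSQLite::SQLite(), "db.sqlite")
onStop(function() dbDisconnect(con))

ui <- fluidPage(
  textInput("name", "Name", value = "John Doe"),
  tableOutput("tbl")
)

server <- function(input, output, session) {
  output$tbl <- renderTable({
    # Only the matching rows ever come into R's memory
    dbGetQuery(
      con,
      "SELECT * FROM mydata WHERE name = ?",
      params = list(input$name)
    )
  })
}

shinyApp(ui, server)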
Thank you, @nwerth! This helps a ton when I'm experimenting with a data frame.
However, my giant object is a container for sequencing data and metadata with several slots. Can objects that aren't data frames be converted to a database file?
First, if the app works when you keep the object in RAM outside the shiny sessions, then stick with that.
If that doesn't work, your next step should be looking through Bioconductor or asking on the rOpenSci forum for any packages that might help. Chances are you're not the first person with this problem, so hopefully somebody has already solved it.
Otherwise, you'd have to create your own solution. It's possible to store complex objects in what are called NoSQL databases (or non-relational databases). The Databases task view on CRAN gives a list of recommended packages.
Thanks so much! Keeping the object outside the Shiny sessions and reducing its size prevents memory from running out, but the app is very slow. I will look into NoSQL databases.