Issue with dput()

jcblum · June 12, 2018, 4:50am

I agree that a very large dataset is not a good fit for the dput() strategy (some people will argue that there are very few problems where you really need to include all of a large dataset in your reproducible example). There have been a couple of discussions here with ideas for sharing data beyond dput():

Best Practices: how to prepare your own data for use in a `reprex` if you can’t, or don’t know how to reproduce a problem with a built-in dataset? tidyverse

@EconomiCurtis split this out of FAQ: What's a reproducible example (`reprex`) and how do I do one?. Curious if you have anything additional to add specifically on "how to prepare your own data for use in a reprex if you can't, or don't know how to reproduce a problem with a built-in dataset." I think @jessemaegan's post is about 80% there. The piece it is missing, if your average stack overflow post is any indication, is an explanation about how to prepare your own data for use in a reprex if you can't, or don't know how to reproduce a problem with a built-in dataset. Some handy things to know for this situation: deparse() The ugly as sin, gold standard: head(my_data, 2) %>% depa…

I’m curious what’s going wrong for you when you try dput() while selecting just a few rows. You said you don’t get output that starts with structure() — what do you get? What happens when you try running dput() on a slice of a built-in dataset? For example:

```r
dput(head(ggplot2::diamonds))
```

Here is what I get:

```r
structure(list(carat = c(0.23, 0.21, 0.23, 0.29, 0.31, 0.24),
    cut = structure(c(5L, 4L, 2L, 4L, 2L, 3L), .Label = c("Fair",
    "Good", "Very Good", "Premium", "Ideal"), class = c("ordered",
    "factor")), color = structure(c(2L, 2L, 2L, 6L, 7L, 7L), .Label = c("D",
    "E", "F", "G", "H", "I", "J"), class = c("ordered", "factor"
    )), clarity = structure(c(2L, 3L, 5L, 4L, 2L, 6L), .Label = c("I1",
    "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"), class = c("ordered",
    "factor")), depth = c(61.5, 59.8, 56.9, 62.4, 63.3, 62.8),
    table = c(55, 61, 65, 58, 58, 57), price = c(326L, 326L,
    327L, 334L, 335L, 336L), x = c(3.95, 3.89, 4.05, 4.2, 4.34,
    3.94), y = c(3.98, 3.84, 4.07, 4.23, 4.35, 3.96), z = c(2.43,
    2.31, 2.31, 2.63, 2.75, 2.48)), .Names = c("carat", "cut",
"color", "clarity", "depth", "table", "price", "x", "y", "z"), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
```
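If it helps to check that the pasted output really does rebuild your data, here is a minimal sketch of the round trip. It is only an illustration: it assumes the built-in `iris` data frame stands in for your own data, and the names `my_data`, `small`, `code_text`, `rebuilt`, `starts_ok`, and `round_trip_ok` are made up for this example.

```r
# Assumption: iris (built into base R) stands in for your own data frame.
my_data <- iris

# Keep the example small: a few rows and only the columns that matter.
small <- head(my_data[, c("Sepal.Length", "Species")], 3)

# dput() prints R code that rebuilds the object. Capture it as text here
# so the round trip can be checked without copy-pasting.
code_text <- capture.output(dput(small))

# For a data frame, the dput() output should begin with structure(
starts_ok <- startsWith(code_text[1], "structure(")

# Re-run the captured code, as someone pasting it from your post would.
rebuilt <- eval(parse(text = code_text))

# all.equal() reports whether the rebuilt copy matches the original.
round_trip_ok <- isTRUE(all.equal(small, rebuilt))
```

If `starts_ok` is not `TRUE` for your own data, printing `class(my_data)` would be a reasonable next thing to look at, since the output of `dput()` depends on what kind of object it is given.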