@EconomiCurtis split this out of FAQ: What's a reproducible example (`reprex`) and how do I do one?.
Curious if you have anything additional to add specifically on "how to prepare your own data for use in a reprex
if you can't, or don't know how to reproduce a problem with a built-in dataset."
I think @jessemaegan's post is about 80% there. The piece it is missing, if your average Stack Overflow post is any indication, is an explanation of how to prepare your own data for use in a reprex if you can't, or don't know how to, reproduce a problem with a built-in dataset.
Some handy things to know for this situation:
- deparse()
The ugly as sin, gold standard:
head(my_data, 2) %>%
deparse()
which, once printed with `cat()` (or produced directly by `dput()`), looks something like:
structure(list(date = list(structure(-61289950328, class = c("POSIXct",
"POSIXt"), tzone = ""), structure(-61258327928, class = c("POSIXct",
"POSIXt"), tzone = "")), id = c("0001234", "0001235"), ammount = c("$18.50",
"-$18.50")), class = "data.frame", .Names = c("date", "id", "ammount"
), row.names = c(NA, -2L))
Which is not beginner friendly... what's a `structure`? But it is really the only method that will not mess with the data types. It also works with both listy structures and data.frame-ish ones.
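To make the "will not mess with the data types" point concrete, here is a minimal base-R sketch of the round trip. The `my_data` object below is a hypothetical stand-in for your own data (its values are made up for illustration), and `code` and `rebuilt` are names chosen just for this example.

```r
# Hypothetical stand-in for your own data; replace with the real object.
my_data <- data.frame(
  date = as.POSIXct(
    c("2016-10-27 21:00:00", "2016-10-28 21:05:00", "2016-10-29 09:30:00"),
    tz = "UTC"
  ),
  id = c("0001234", "0001235", "0001236"),
  ammount = c("$18.50", "-$18.50", "$2.00"),
  stringsAsFactors = FALSE
)

# deparse() returns a character vector of code lines,
# so collapse it into a single string before printing or pasting.
code <- paste(deparse(head(my_data, 2)), collapse = "\n")
cat(code, "\n")

# What a reader of the reprex would do: evaluate the pasted text.
rebuilt <- eval(parse(text = code))

# Column types survive the trip: the date is still POSIXct and
# the ids are still character, with their leading zeros.
str(rebuilt)
```

In a reprex you would paste the printed `structure(...)` text after `my_data <-` rather than calling `eval(parse())` yourself; that step is only here to show that the text rebuilds the object.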
- tibble::tribble()
Handy if you have the patience to hand type some data for your audience in a pretty format. There is a severe limitation in that not all data types can be represented in a `tribble()`. The previous would be something close to:
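library(dplyr) # provides %>% and mutate()
tibble::tribble(
~date, ~id, ~ammount,
"27/10/2016 21:00", "0001234", "$18.50",
"28/10/2016 21:05", "0001235", "-$18.50"
) %>%
# parse the date strings into date-times; separators in the data are ignored
mutate(date = lubridate::parse_date_time(date, orders = "dmY HM"))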
With the trailing `mutate()` to fix the date that could not be represented. It would be remiss of me not to plug `datapasta::tribble_paste()`, which can save you some typing here.
- readr::read_csv()
It's possible to represent your data, complete with type specification, as a `read_csv()` call. The previous would be:
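readr::read_csv('date, id, amount
"27/10/2016 21:00", 0001234, $18.50
"28/10/2016 21:05", 0001235, -$18.50',
# column types are matched by position: date-time, then two character columns
col_types = readr::cols( readr::col_datetime(format = "%d/%m/%Y %H:%M"),
readr::col_character(), readr::col_character() )
)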
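If your audience may not have readr installed, the same idea can be sketched with base R's `read.csv(text = ...)`. This is a swapped-in alternative, not the readr approach above: it reads every column as character and then converts the date by hand. The name `csv_text` and the choice of the UTC time zone are assumptions made for this example.

```r
# Inline CSV text; no spaces after the commas, so nothing needs trimming.
csv_text <- 'date,id,amount
"27/10/2016 21:00",0001234,$18.50
"28/10/2016 21:05",0001235,-$18.50'

# Read everything as character so "0001234" keeps its leading zeros.
my_data <- read.csv(text = csv_text, colClasses = "character")

# Convert the date column explicitly; the time zone is an assumption.
my_data$date <- as.POSIXct(my_data$date, format = "%d/%m/%Y %H:%M", tz = "UTC")

str(my_data)
```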
- krlmlr/deparse
Not yet on CRAN. A nicer version of `deparse()` (the first option) that can also get you directly to `tribble()` code (the second option) in some cases. https://github.com/krlmlr/deparse
Edit: you can always use data.frame(), tibble(), list() etc!
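For small inputs, writing the constructor call by hand is often the most readable option. A minimal sketch, using the same two made-up rows as above; the object names `my_data` and `my_list` and the UTC time zone are assumptions for illustration only.

```r
# A data.frame built by hand; types are exactly what you write.
my_data <- data.frame(
  date = as.POSIXct(c("2016-10-27 21:00", "2016-10-28 21:05"), tz = "UTC"),
  id = c("0001234", "0001235"),
  ammount = c("$18.50", "-$18.50"),
  stringsAsFactors = FALSE # keep strings as character in older R versions
)
str(my_data)

# The same idea for a listy structure that would not fit in a data.frame.
my_list <- list(
  id = "0001234",
  transactions = list(
    list(date = as.Date("2016-10-27"), ammount = "$18.50"),
    list(date = as.Date("2016-10-28"), ammount = "-$18.50")
  )
)
str(my_list)
```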