Programming with cols()

I'd like to write a function to read a CSV that takes 2 arguments, a file name and a column name, and then uses the column name as an argument to cols().

This is a cut-down version of what I'm trying to do:

Read_a_file <- function(my_file, my_col){
  read_csv(my_file,
    col_names = c('xdate', my_col), # File with 2 columns, "xdate" and one other
    col_types = cols(
      xdate = col_date(format = '%d/%m/%Y'),
      my_col = col_double())) # This doesn't work because I need the string specified by my_col not "my_col"
}


If that's all there is to it:

  read_csv(file.path('../data', my_file)) %>% select(all_of(my_col))

I'd like to be able to specify a col_type for the my_col column. At present I can't do this because in cols() I need the string contained within the object my_col, not the string "my_col".
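To make the problem concrete, here is a small sketch (assuming readr is attached, and using "hp" purely as an example column name) showing that cols() keeps the argument name as written rather than looking up the value of a variable with that name:

```r
library(readr)

my_col <- "hp"

# The argument name is taken literally, so this spec describes a column
# called "my_col", not a column called "hp".
spec <- cols(my_col = col_double())
spec_names <- names(spec$cols)
print(spec_names)
#> [1] "my_col"
```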

Understood. My comment was about which question was being asked:

  1. I'm trying to bring in only one column from a CSV.
  2. How do I pass the value of an object to a function within another function.

I was answering the first.

For the second, the my_col variable is never evaluated inside cols(): the name is used as written, so the value passed to the function is never looked up. The problem is getting the right object passed in the right way.

It's hard to be sure without a reprex. See the FAQ: How to do a minimal reproducible example (reprex) for beginners

Have you tried get(my_col) = col_double()?
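As far as I can tell, a call such as get(my_col) = col_double() will not parse as a named argument, because the left-hand side of = inside a function call has to be a literal name. One possible alternative is to build the names with setNames() and hand the list to cols() through do.call(). This is only a sketch: the helper name build_spec is made up for illustration, and the date format is taken from the original question.

```r
library(readr)

# Hypothetical helper: build a column specification in which one column name
# comes from a character variable rather than being typed literally.
build_spec <- function(my_col) {
  types <- setNames(
    list(col_date(format = "%d/%m/%Y"), col_double()),
    c("xdate", my_col)
  )
  # do.call() passes the list elements to cols() as named arguments
  do.call(cols, types)
}

spec <- build_spec("hp")
print(names(spec$cols))
```

The resulting object could then be passed as col_types to read_csv() in the same way as a specification written out by hand.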

readr provides an as.col_spec() function to convert a list to a column specification. You can also set the default column specification with the .default name; in this case we want to skip any columns that are not specified. Combining these two things, you can create the function you were attempting.


library(readr)

# just creating a dataset as an example
mtcars$xdate <- format(Sys.Date(), "%d/%m/%Y")

write_csv(mtcars, "mtcars.csv")

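Read_a_file <- function(my_file, my_col){
  # start with the fixed columns and skip everything not listed
  types <- list(
    xdate = col_date(format = '%d/%m/%Y'),
    .default = col_skip()
  )
  # add the column whose name is held in my_col
  types[[my_col]] <- col_double()
  types <- as.col_spec(types)
  read_csv(my_file, col_types = types)
}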

Read_a_file("mtcars.csv", "hp")
#> # A tibble: 32 x 2
#>       hp xdate     
#>    <dbl> <date>    
#>  1   110 2021-04-01
#>  2   110 2021-04-01
#>  3    93 2021-04-01
#>  4   110 2021-04-01
#>  5   175 2021-04-01
#>  6   105 2021-04-01
#>  7   245 2021-04-01
#>  8    62 2021-04-01
#>  9    95 2021-04-01
#> 10   123 2021-04-01
#> # … with 22 more rows

Created on 2021-04-01 by the reprex package (v1.0.0)
