Trying to Sort by Date

I am trying to sort a data frame by year. I know this should be the easiest thing in the world, but I keep getting error after error and have no idea how to solve it. I've tried to make a "minimal reproducible example", but I don't know how to create a data frame that reproduces the problem, and I am not allowed to post the data online. The first problem, which I thought I had solved, was changing the year column to numeric because it was a factor. Anyway, please help; I'm about to punch a hole in a wall.

#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>     filter, lag
#> The following objects are masked from 'package:base':
#>     intersect, setdiff, setequal, union
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>     date
#> Loading required package: SparseM
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>     backsolve

sports_top25000 <- read.csv('index_sales_export_categories.csv')
#> Warning in file(file, "rt"): cannot open file
#> 'index_sales_export_categories.csv': No such file or directory
#> Error in file(file, "rt"): cannot open the connection
baseball_top25000 <- sports_top25000 %>%  
  select(graded_title,vcp_card_grade_id,category,year,date,price) %>%
  filter(category == 'Baseball')
#> Error in eval(lhs, parent, parent): object 'sports_top25000' not found
#> Error in eval(expr, envir, enclos): object 'baseball_top25000' not found

prewar_baseball_top25000 <- baseball_top25000 %>%
  transform(prewar_baseball_top25000, year = as.numeric(year)) %>%
#> Error in eval(lhs, parent, parent): object 'baseball_top25000' not found

#> Error in head(prewar_baseball_top25000): object 'prewar_baseball_top25000' not found

Created on 2019-07-30 by the reprex package (v0.3.0)

The errors I am getting the most are

  • 'Error in prewar_baseball_top25000(.):
    could not find function "prewar_baseball_top25000" '

  • ' Error in data.frame(list(graded_title = c(9669L, 9669L, 9669L, :
    arguments imply differing number of rows: 1041953, 0 '

The errors you are getting suggest your problem starts before you attempt to sort by date. For example, your read.csv() call returns an error. Let's get that straightened out first. Can you call read.csv() without getting an error? Try using the full path to the file. If you are on Windows, that would probably be something like

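# Use forward slashes (or doubled backslashes): a single backslash starts an escape sequence in an R string
sports_top25000 <- read.csv("C:/Users/YourUserName/Documents/index_sales_export_categories.csv")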
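
Before retrying, it may help to confirm where R is actually looking for the file. A minimal sketch, assuming the CSV sits in your Documents folder (the folder and user name are placeholders, not your real path). The path uses forward slashes because a single backslash inside an R string starts an escape sequence:

```r
# Show the directory that relative paths such as 'file.csv' are resolved against
getwd()

# Build the full path; the folder below is a placeholder (assumption)
csv_path <- file.path("C:/Users/YourUserName/Documents",
                      "index_sales_export_categories.csv")

# Only read the file if R can actually see it
if (file.exists(csv_path)) {
  sports_top25000 <- read.csv(csv_path)
} else {
  message("File not found: ", csv_path)
}
```

If file.exists() returns FALSE, the path is wrong, and none of the later steps can work until that is fixed.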

Here is an example of sorting a data frame by the year, after calculating the year from the date.

library(dplyr, warn.conflicts = FALSE)
library(lubridate)
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>     date

# Make an example data frame
df <- data.frame(DATE = ymd(c("2017-01-01", "2018-01-01", "2019-01-01",
                              "2017-06-01", "2018-06-01", "2019-06-01")),
                 A = LETTERS[1:6])
df
#>         DATE A
#> 1 2017-01-01 A
#> 2 2018-01-01 B
#> 3 2019-01-01 C
#> 4 2017-06-01 D
#> 5 2018-06-01 E
#> 6 2019-06-01 F
# Add a YEAR column and then sort by it in ascending order
df <- df %>% mutate(YEAR = year(DATE)) %>% arrange(YEAR)
df
#>         DATE A YEAR
#> 1 2017-01-01 A 2017
#> 2 2017-06-01 D 2017
#> 3 2018-01-01 B 2018
#> 4 2018-06-01 E 2018
#> 5 2019-01-01 C 2019
#> 6 2019-06-01 F 2019

Created on 2019-07-30 by the reprex package (v0.2.1)
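
If you would rather not depend on dplyr or lubridate, roughly the same sort can be sketched in base R with order(). This is an alternative to the approach above, not a replacement for it, and it uses the same toy data:

```r
# Same toy data as above, built with base R only
df <- data.frame(DATE = as.Date(c("2017-01-01", "2018-01-01", "2019-01-01",
                                  "2017-06-01", "2018-06-01", "2019-06-01")),
                 A = LETTERS[1:6])

# Extract the year from the date as a number
df$YEAR <- as.numeric(format(df$DATE, "%Y"))

# Sort by year, breaking ties by the full date
df_sorted <- df[order(df$YEAR, df$DATE), ]
df_sorted

# Newest year first
df[order(df$YEAR, decreasing = TRUE), ]
```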


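I can't be sure this is your only problem, because you haven't provided sample data, but the transform() step in your code makes no sense: it pipes baseball_top25000 into transform() while also passing prewar_baseball_top25000 as an argument, and the line ends with a dangling %>%. I guess you are trying to do something like this instead: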

prewar_baseball_top25000 <- baseball_top25000 %>%
    mutate(year = as.numeric(year))


If I'm guessing wrong, could you explain what you were trying to accomplish with that code?

I was trying to change the type to numeric because it was a factor. Whenever I tried to filter by year it wouldn't let me, and I figured that was the problem. Now that I have changed it to numeric, the year is being changed from something like 1952 to 46.

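Converting a factor straight to numeric returns its internal level codes rather than its labels, which is why 1952 became 46. Convert to character first: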

prewar_baseball_top25000 <- baseball_top25000 %>%
    mutate(year = as.numeric(as.character(year)))
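
To see what the extra as.character() changes, here is a small sketch with made-up years (not your data), comparing the direct conversion with the two-step one:

```r
# Toy factor of years (made-up values for illustration)
year_factor <- factor(c("1952", "1909", "1933", "1952"))

# Converting the factor directly gives the level codes, not the years
as.numeric(year_factor)
#> [1] 3 1 2 3

# Converting to character first recovers the actual years
year_numeric <- as.numeric(as.character(year_factor))
year_numeric
#> [1] 1952 1909 1933 1952

# Once numeric, filtering and sorting behave as expected
sort(year_numeric[year_numeric < 1942])
#> [1] 1909 1933
```

A column of pure numbers would normally not be read as a factor, so it may be worth checking for entries in year that are not plain numbers; those become NA (with a warning) after the conversion.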
