Hi,
I have a data frame in which one of the columns contains times (HH:MM:SS) stored as character strings.
I am trying to convert it into minutes to make further calculations easy.
I have a function like this that does it using matrix multiplication.
It works beautifully with a new column 'ride_time_in_minutes' of num datatype.
But when I try it on the entire DF, it fails with the message
arguments imply differing number of rows: 1698572, 1697551
I have already checked for null values and 0s, and the row numbers are also consistent in the original set.
I am confused about how to proceed with debugging.
Any idea what is going wrong?
Thanks in advance!
The aim is for all columns to have the same number of rows. dim(object)[1] is meant to return the number of rows: dim() returns a vector of length two holding the row and column counts, and nrow() or ncol() give either count directly. All of these, however, only work on an object with more than one dimension. A single column of a data frame has only a length(). As a result, dim(cleanDF$ride_length)[1] evaluates to NULL, and that makes the result of
equal to zero.
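The difference between a data frame and one of its columns can be checked on a small example. This is only a sketch: `toy` and `col` are made-up names for illustration, not objects from the original post.

```r
# Hypothetical toy data frame standing in for the real one
toy <- data.frame(
  ride_length = c("00:10:00", "01:02:03", NA),
  id = 1:3
)

dim(toy)    # row and column counts of the whole data frame
nrow(toy)   # row count only

# A single column is a plain vector, so it has no dim attribute
col <- toy$ride_length
dim(col)             # NULL
dim(col)[1]          # still NULL: indexing NULL gives NULL
length(dim(col)[1])  # 0, because NULL has length zero
length(col)          # 3, the number of elements in the column
```

For a single column, length() is the function that reports how many entries there are.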
This
tells us that 15,164 rows have one or more columns with missing values. With a dataset as big as this, discarding them is sensible.
recleanedDF <- cleanDF[complete.cases(cleanDF), ]
which reads: subset cleanDF by taking all rows with no NA values, and all columns.
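As a small illustration of the same idea, here is a sketch on made-up data. The name `toy` is hypothetical, and `sum(!complete.cases(...))` is just one way to count incomplete rows, not necessarily the call used above.

```r
# Hypothetical toy data with missing values in two different rows
toy <- data.frame(
  ride_length = c("00:10:00", NA, "01:02:03", "00:05:30"),
  station     = c("A", "B", NA, "D")
)

# complete.cases() returns one TRUE/FALSE per row:
# TRUE when the row has no NA in any column
n_incomplete <- sum(!complete.cases(toy))
n_incomplete

# Keep only the complete rows, and all columns
cleaned <- toy[complete.cases(toy), ]
nrow(cleaned)
```

Note the comma before the closing bracket: the row selector goes before it and the (empty) column selector after it.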
I got distracted by the title of the post and went off in the wrong direction. Try this on a copy of your data:
# Load the lubridate package
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
# Create the data frame
d <- data.frame(
hms = c("3:10:01", "23:10:02", "", NA, "asdf", "212", "23:10:02"),
stuff = rep(TRUE, 7),
more = seq(1:7),
last = c("A", NA, LETTERS[3:7])
)
# Convert the hms column to a period object using the hms() function from lubridate
d$hms <- hms(d$hms)
#> Warning in .parse_hms(..., order = "HMS", quiet = quiet): Some strings failed
#> to parse, or all strings are NAs
# Convert the period object to seconds using the as.numeric() function
d$hms <- as.numeric(d$hms, "seconds")
# Display the modified data frame
d
#>     hms stuff more last
#> 1 11401  TRUE    1    A
#> 2 83402  TRUE    2 <NA>
#> 3    NA  TRUE    3    C
#> 4    NA  TRUE    4    D
#> 5    NA  TRUE    5    E
#> 6    NA  TRUE    6    F
#> 7 83402  TRUE    7    G
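Since the original question asked for minutes rather than seconds, here is a minimal base-R sketch of the same conversion without lubridate. It assumes a valid entry has exactly three colon-separated numeric fields; `hms_to_minutes` is a hypothetical helper name, not the function from the question. Anything that does not fit the pattern becomes NA, so the result always has one value per input and can be assigned as a column without a row-count mismatch.

```r
# Hypothetical helper: convert "HH:MM:SS" strings to minutes.
# Returns exactly one value per input element; malformed entries give NA.
hms_to_minutes <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- suppressWarnings(as.numeric(p))
    # Anything other than three numeric fields is treated as unparseable
    if (length(p) != 3L || anyNA(p)) {
      return(NA_real_)
    }
    # hours * 60 + minutes + seconds / 60
    sum(p * c(60, 1, 1 / 60))
  }, numeric(1))
}

x <- c("3:10:01", "23:10:02", "", NA, "asdf", "212", "23:10:02")
minutes <- hms_to_minutes(x)
length(minutes) == length(x)  # TRUE: safe to assign as a new column
minutes
```

Counting the NAs in the result, for example with sum(is.na(minutes)), shows how many entries in the source column did not match the expected format.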
Thanks for sharing this wonderful knowledge.
It works beautifully with a new column 'ride_time_in_minutes' of numeric datatype. However, when I try it on the entire DataFrame, it fails with the following message:
I have an issue when trying to create a plot for multivariate time series data in RStudio. I have a dataset with multiple variables recorded over time, and I'm attempting to visualize the relationships between these variables. When I use the ggplot2 package or other plotting libraries to create a time series plot with multiple lines (one for each variable), I run into an error. The error message I receive is somewhat cryptic: "Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ."
@Demetrius675 The issue discussed in this thread was resolved. If you have another (possibly similar) coding issue that you would like support with, I would encourage you to start a new thread where that can be discussed.
I also recommend that you review the following guide, FAQ: Tips for writing R-related questions.
For example, the guide emphasizes asking coding questions with formatted code-chunks and a reprex.
You may have noticed folks here requesting minimal reprexes. That is because asking questions this way saves answerers a lot of time.
Reproducible Examples:
help make your question clear and replicable,
increase the probability folks will reach out and try to help,
reduce the number of back-and-forths required to understand the question,
and make your question and suggested solutions more useful to folks in the future researching similar problems.