Need help with a lubridate function to convert a string to numeric

Gne_Koda · January 2, 2025, 12:59am

Hello everyone,

I am learning RStudio and working on a case study (the final lesson of the learning program before receiving the junior data analyst certificate). The raw data table is 44.4 MB; please look at the screenshot of the first 15 rows.
[Screenshot 2025-01-01 15.01.19: first 15 rows of the table]

In my RStudio script I used the function dmy_hm() to try to convert every row of the column "ActivityMinute" from string to numeric, but it failed.

I also tried changing the format of the whole "ActivityMinute" column with this script:
format(minuteStepsNarrow_merged$ActivityMinute, format = "%b/%d/%Y %I:%M %p")

Result: the code runs without errors, but the column's class is still character.

The final goal: separate the original column "ActivityMinute" into two columns, "date" and "time", each holding the corresponding component. For example, the column "date" should be numeric and print as "4/12/2016", and the column "time" should be numeric and print as "12:09:00 AM".

So far, I have written this script: minuteStepsNarrow_merged$date <- as.Date(minuteStepsNarrow_merged$ActivityMinute)

Then I hit "Run". After it finished, I used class() to check the data type, and R reports it as Date, so the column "date" is done as I wanted.

Now for the remaining column, "time".
I tried the separate() function with this script:
minuteStepsNarrow_merged <- separate(minuteStepsNarrow_merged,
col = "ActivityMinute",
into = c("date", "time"),
sep = " ")

The result was NA for the whole column "time".

I also tried format() together with as.POSIXct(), but the result was not what I expected. The script was: minuteStepsNarrow_merged$time <- format(as.POSIXct(minuteStepsNarrow_merged$ActivityMinute), format = "% %I:%M %p")

I spent nearly 10 hours searching Google and visiting websites by other data scientists, but I am still stuck on the column "time".

Please help me review my process and guide me to a solution.

FJCC · January 2, 2025, 1:46am

Your use of dmy_hm() failed because your values include values for seconds. Try using dmy_hms().

From there, you can use separate() to make a Day column and a Time column, each of which will be character. Then as.Date() can change the Day column to a numeric date, and as_hms() from the hms package can make a numeric time. However, I think it is unlikely that splitting the date-time value into separate day and time values is a useful thing to do.
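
A minimal sketch of these steps, under some assumptions: the lubridate and hms packages are installed, the sample values below are invented to resemble the "4/12/2016 12:09:00 AM" style quoted in the question, and the values are in month/day/year order, which is why mdy_hms() appears here (dmy_hms() would be the choice if the day comes first). The data frame name `steps` is hypothetical. The sketch derives the date and time directly from the parsed value rather than going through separate().

```r
library(lubridate)  # mdy_hms(), as_date()
library(hms)        # as_hms()

# Hypothetical sample standing in for minuteStepsNarrow_merged;
# the values are invented for illustration.
steps <- data.frame(
  ActivityMinute = c("4/12/2016 12:00:00 AM",
                     "4/12/2016 12:09:00 AM",
                     "4/13/2016 1:30:00 PM"),
  stringsAsFactors = FALSE
)

# Parse the text into a real date-time (POSIXct).
# Assumption: month/day/year order; use dmy_hms() if the day comes first.
steps$ActivityMinute <- mdy_hms(steps$ActivityMinute)

# Derive the two requested columns from the parsed value.
steps$date <- as_date(steps$ActivityMinute)  # class "Date", days since 1970-01-01 underneath
steps$time <- as_hms(steps$ActivityMinute)   # class "hms", seconds since midnight underneath

str(steps)

# format() is only for display: it returns character again,
# so it should not be used to change a column's type.
format(steps$ActivityMinute, "%m/%d/%Y %I:%M:%S %p")
```

Both new columns are numeric underneath (as.numeric() shows the stored values), while still printing as a date and a time of day.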

jrkrideau · January 2, 2025, 2:37am

I was just looking at something similar, and I am not sure you need to convert to character. There are {lubridate} functions that will handle this. My apologies for the {data.table} syntax, but it is late here, and I would take far too long and make far too many mistakes doing it in base R or {tidyverse} syntax.

The main things are the two functions lubridate::as_date() and hms::as_hms().


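library(data.table)  # data.table() and the DT[ , .()] syntax
library(lubridate)   # ymd_hms(), as_date(); hms::as_hms() needs the hms package installed
DT <- data.table(timbit = ymd_hms(c("24-12-15 01:32:11", "25-01-01 21:21:00")))
DT[ , .(mydate = lubridate::as_date(timbit) , mytime = hms::as_hms(timbit))]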
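
For readers who prefer base R syntax, a rough equivalent of the snippet above might look like the following. It assumes the lubridate and hms packages are installed; the data frame name `df` is made up for illustration, and the two timestamps are written with four-digit years to avoid any ambiguity.

```r
library(lubridate)  # ymd_hms(), as_date()

# The same two timestamps in a plain data frame (four-digit years).
df <- data.frame(
  timbit = ymd_hms(c("2024-12-15 01:32:11", "2025-01-01 21:21:00"))
)

# Add the date part and the time-of-day part as new columns.
df$mydate <- as_date(df$timbit)      # class "Date"
df$mytime <- hms::as_hms(df$timbit)  # class "hms"

# Show the first class of each column to confirm the types.
sapply(df, function(col) class(col)[1])
```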

Gne_Koda · January 2, 2025, 2:03pm

Dear FJCC, I attached a screenshot of the script I wrote when using the function dmy_hm().

I will follow your guidance and try again. Thank you again for giving me an idea that is outside the box.
I am a new learner, so I am trying to do the basics.

Gne_Koda · January 2, 2025, 2:07pm

Dear jrkrideau,
You gave me a new idea, and I thank you very much for helping in such a short time.
Thanks again for the guidance. I will continue to research and try things out.

jrkrideau · January 2, 2025, 2:09pm

Please do not send screenshots. They are difficult to work with.
It is much better to copy your code and paste it here between

```

```

This gives us formatted code that we can copy, paste, and run.

It would also help if we had a sample of your data. A handy way to supply data is to use the dput() function. Do dput(mydata), where "mydata" is the name of your dataset. For really large datasets, dput(head(mydata, 100)) will probably do. Again, paste the output between
```

```

Thanks

jrkrideau · January 2, 2025, 4:22pm

I got your message. It is better to post here so others can see and possibly help.

As I posted before, the way to supply us with sample data is to use dput().
Do dput(mydata), where "mydata" is the name of your dataset. For really large datasets, dput(head(mydata, 100)) will probably do. Paste the output between
```

```
You cannot upload a .csv file.

Here is a short example of how to use dput().

# Two small example data frames; cc is an integer column in dat1 and a factor in dat2
dat1 <- data.frame(aa = LETTERS[1:10], bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9,  3), cc = 1:10)
dat2 <- data.frame(aa = LETTERS[1:10], bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9,  3), cc = as.factor(1:10))

dput(dat1)
dput(dat2)

This will give you

 structure(list(aa = c("A", "B", "C", "D", "E", "F", "G", "H", 
          "I", "J"), bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9, 3), cc = 1:10), class = "data.frame", row.names = c(NA,  -10L))

 structure(list(aa = c("A", "B", "C", "D", "E", "F", "G", "H", 
        "I", "J"), bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9, 3), 
        cc = structure(1:10, levels = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "factor")),
        class = "data.frame", row.names = c(NA, -10L))

which you can copy and paste in your message between
```

```

Please also copy all of your code and paste it between
```

```

This will give us well formatted data.
Thanks.
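
To illustrate why dput() output is so convenient for helpers, here is a small round-trip sketch in base R. The data frame `dat` and its values are hypothetical stand-ins for something like head(mydata, 100); the captured text plays the role of what would be pasted into a post.

```r
# Hypothetical sample; in practice this would be head(mydata, 100).
dat <- data.frame(
  ActivityMinute = c("4/12/2016 12:00:00 AM", "4/12/2016 12:01:00 AM"),
  Steps = c(0L, 4L),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; capture it as text
# to mimic what gets pasted into a forum post.
pasted <- paste(capture.output(dput(dat)), collapse = "\n")
cat(pasted, "\n")

# A reader recreates the data by evaluating the pasted text.
rebuilt <- eval(parse(text = pasted))

# Check that the rebuilt copy matches the original,
# including column types.
all.equal(dat, rebuilt)
```

Unlike a screenshot or a retyped table, the pasted structure() call preserves each column's type, so helpers see exactly whether a column is character, numeric, or a date.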

Gne_Koda · January 6, 2025, 12:27pm

Thank you for the guidance about formatting in the community.

I will continue to research the dput() function.

I still have another full-time job (just minimum wage). I am learning to become a junior data analyst, with the goal of moving from minimum wage to a better-paying job (I hope to land a salaried job instead of hourly pay).

Thank you again for your time.

system · January 27, 2025, 12:27pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.