Count Total time from occurences of event

Hello Everyone,

I have following data frame.
Below dataframe contains Time.stamp ,battery voltage, condition(which test if voltage greater than 50 it says yes if it is less than 50 it says no)

Time Battery.Voltage Condition
00:07:00 54.5205 Yes
00:12:00 54.5205 Yes
00:18:00 54.4447 Yes
00:23:00 54.5205 Yes
00:28:00 45 No
00:33:00 54.5205 Yes
00:38:00 54.5205 Yes
00:43:00 43.2 No
00:48:00 45 No
00:53:00 54.5205 Yes
00:58:00 54.5584 Yes
01:03:00 54.5205 Yes
01:08:00 54.4826 Yes
01:13:00 54.5205 Yes
01:18:00 54.5205 Yes
01:23:00 54.5205 Yes
01:28:00 48 No
01:33:00 46 No
01:38:00 54.5205 Yes
01:43:00 54.5205 Yes
01:48:00 54.5205 Yes
01:53:00 54.5205 Yes
01:58:00 54.5205 Yes
02:03:00 54.5205 Yes
02:08:00 54.5205 Yes
02:13:00 54.5205 Yes
02:18:00 45 No
02:23:00 41 No
02:28:00 49 No
02:33:00 54.5205 Yes
02:38:00 54.5205 Yes
02:43:00 54.5205 Yes
02:48:00 54.5205 Yes
02:53:00 54.5205 Yes
02:58:00 54.5205 Yes
03:03:00 54.5205 Yes
03:08:00 54.5205 Yes
03:13:00 54.5205 Yes
03:18:00 44 No
03:23:00 47 No
03:28:00 54.5205 Yes
03:33:00 54.5205 Yes
03:38:00 54.5205 Yes

So from the above dataframe, I need to calculate the following,
Condition where it is NO i.e if it is NO how long it is NO which is 5mins, or 10mins etc.

So I expect output of the dataframe as,
Day wise -> 20mins(like day 1 ->5mins, day 2-> 5mins)
month wise -> 120mins(like jan 20mins, feb 20mins)
Year wise -> 500 mins.

Since your question is not reproducible, I'll just write down the approach I would have taken:

  1. Right now you only have beginning of the episode. You can add end of the episode with dplyr::lag(Time).
  2. Next, you can find out sessions where there was same status with rle function. You can use it to assign each session it's ID.
  3. You can then use dplyr::group_by and dplyr::summarize to find out beginning and end of each session.
  4. At this stage you are free to use resulting dataframe and produce any number of summaries. To do that, you need to define a granularity -- e.g., if you want to find out time per day then you need to create a column with date in it, group by that date and summarize by summing all sessions with No in column Condition.
2 Likes

As @mishabalyasin notes, your question is currently not in a reproducible format (and then gives some solid advice, :+1:).

Just to give you more detail re. reproducible examples:

It will help us help you if you can turn this into a self-contained reprex (short for minimal reproducible example).

Right now the best way to install reprex is:

# install.packages("devtools")
devtools::install_github("tidyverse/reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ, linked to below.

Thanks mara and mishabalyasin for your time.

Hi mishabalyasin,
For your point 4 approach
Take for Example I have a date column, can I derive as following,
groupby(date) %>% summarise(Time = sum(Time),condition == "No")

Here how to sum all session with "no" in it.

Once again I appreciate you for your reply.

If any doubts will try to post.

summarize doesn't have an argument called condition, so you'll better off filtering out what you don't need before that step, something like this:

library(magrittr)

set.seed(42)

data <- tibble::tibble(length = rnorm(n = 100), 
                       condition = sample(c("Yes",  "No"), size = 100, replace = TRUE), 
                       group = sample(c(1, 2, 3, 4), size = 100,  replace = TRUE))

data %>% 
  dplyr::group_by(group) %>% 
  dplyr::filter(condition == "No") %>%
  dplyr::summarize(outage = sum(length))
#> # A tibble: 4 x 2
#>   group  outage
#>   <dbl>   <dbl>
#> 1    1.  0.410 
#> 2    2. -0.321 
#> 3    3. -1.27  
#> 4    4. -0.0961

Hi mishabalyasin,

Thanks for your comments.

Now I am clear with my question.
Here my question is straightforward.
I need sum total no of mins which is running in battery(i.e voltage less than 50 or condition = "No")

Ques 1: How to sum total mins by day wise, month wise, year wise as following table.
01/01/2017 15mins
02/01/2017 20mins etc.

Month wise
Jan : 200mins etc

Note: I have a date column here, using dplyr and in summarise it is throwing error. If you throw me some command approach I can catch it up.

Ques 2:
I need to find max time(of running in battery) of each day.
Say from 10 to 10.15 i.e 15mins is max time(other time it needs to be less) it needs to be printed.

Ex. as following dataframe.
date maxtime running in battery
01/02/2017 20mins
02/02/2017 15mins

Once again thanks for your time. Please help me out.

Here is my sample dataframe,

Time stamp Battery.Voltage Power f_device_time_date Condition
00:07:00 54.5205 5997.756589 01/02/2017 Yes
00:12:00 54.5205 6179.146292 01/02/2017 Yes
00:18:00 54.4447 6144.672398 01/02/2017 Yes
00:23:00 54.5205 6071.506469 01/02/2017 Yes
00:28:00 54.5205 6059.550123 01/02/2017 Yes
00:33:00 54.5205 6021.680184 01/02/2017 Yes
00:38:00 54.5205 6071.501017 01/02/2017 Yes
00:43:00 54.5205 6047.588326 01/02/2017 Yes
00:48:00 54.5205 6005.727486 01/02/2017 Yes
00:53:00 54.5205 6011.708385 01/02/2017 Yes
00:58:00 54.5584 6019.881107 01/02/2017 Yes
01:03:00 54.5205 6161.209048 01/02/2017 Yes
01:08:00 54.4826 5993.592688 01/02/2017 Yes
01:13:00 54.5205 6011.713837 01/02/2017 Yes
01:18:00 54.5205 5977.823894 01/02/2017 Yes
01:23:00 54.5205 6053.569224 01/02/2017 Yes

Please see above in re. providing a reproducible example. As you've currently given it, your data is not in an "R-friendly" format:

Hi Zara,

Please find the reprex() output at below,

rs

In this output, you haven't loaded the necessary package for the pipe operator %>% (magrittr) as @mishabalyasin did in the reprex he produced for you. With reprex you should be able to paste the entirety of the code and output here – that's what the package does for you.

If you're having trouble with clipboard access, see the method described above using input and outfile arguments.

Also note that in your sample data pasted above, none of the observations meet these criteria:

I need sum total no of mins which is running in battery(i.e voltage less than 50 or condition = "No")