Determine number of days in a time series where a value is exceeded

Hi,

I have a .csv file that contains water level data taken over a 15-minute interval for 2 years. I have attached the first 11 records of the file below for reference. I want to determine how many dates (and what dates) exceed a depth of 0.385 metres. Does anybody have any suggestions as to how to accomplish this? I know that none of the dates show exceed 0.385 m (the below data is just to provide an idea of my data structure).

TIA

     X.        Date.Time   Temp Depth
1     1 2022-08-03 15:09 24.158 0.018
2     2 2022-08-03 15:24 18.426 0.148
3     3 2022-08-03 15:39 17.855 0.149
4     4 2022-08-03 15:54  17.76 0.150
5     5 2022-08-03 16:09  17.76 0.150
6     6 2022-08-03 16:24  17.76 0.151
7     7 2022-08-03 16:39 17.665 0.152
8     8 2022-08-03 16:54 17.665 0.156
9     9 2022-08-03 17:09 17.665 0.158
10   10 2022-08-03 17:24 17.665 0.159
11   11 2022-08-03 17:39 17.665 0.158

It would be helpful if you could share the data in an easier to read fashion. Within R, use the dput(head(datasetname, 11)) function and share the output of that. It will help get more precise help.

Also, what have you tried?

Thank you. Below is the output of my dput() function:

dput(head(blanchard.dat,11))
structure(list(X. = 1:11, Date.Time = c("2022-08-03 15:09", "2022-08-03 15:24", 
"2022-08-03 15:39", "2022-08-03 15:54", "2022-08-03 16:09", "2022-08-03 16:24", 
"2022-08-03 16:39", "2022-08-03 16:54", "2022-08-03 17:09", "2022-08-03 17:24", 
"2022-08-03 17:39"), Temp = c("24.158", "18.426", "17.855", "17.76", 
"17.76", "17.76", "17.665", "17.665", "17.665", "17.665", "17.665"
), Depth = c(0.018, 0.148, 0.149, 0.15, 0.15, 0.151, 0.152, 0.156, 
0.158, 0.159, 0.158)), row.names = c(NA, 11L), class = "data.frame")

I first tried to filter the rows where depth exceeded 0.385 metres using the code below

# Filter rows where Depth > 0.385
blanchard_Filter <- blanchard.dat[blanchard.dat$Depth > 0.385, ]

I then tried to count the number of unique dates where the depth exceeded 0.385 metres using the code:

# Count the number of unique days
blanchard_days <- length(unique(blanchard_Filter$Date))

# Print the result
print(blanchard_days)

The function print(blanchard_dates) gave me a value of 1. But I am not sure if this is correct or if I have an error in my code. I am also unsure how to find what dates exceeded the 0.3850 m threshold.

Your data frame does not have a column named Date. The code above only works because R will do partial matching of column names. If you look at the result of blanchard_days, you will see that it contains timestamp values. Do you want to see how many days have values over the threshold or how many timestamps exceed the threshold?
The code below shows how to get either timestamps or dates that are over the threshold. I lowered the threshold to have some values above it in the sample data set.

blanchard.dat = structure(list(X. = 1:11, Date.Time = c("2022-08-03 15:09", "2022-08-03 15:24", 
                                        "2022-08-03 15:39", "2022-08-03 15:54", "2022-08-03 16:09", "2022-08-03 16:24", 
                                        "2022-08-03 16:39", "2022-08-03 16:54", "2022-08-03 17:09", "2022-08-03 17:24", 
                                        "2022-08-03 17:39"), 
               Temp = c("24.158", "18.426", "17.855", "17.76", 
                        "17.76", "17.76", "17.665", "17.665", "17.665", "17.665", "17.665"
               ), 
               Depth = c(0.018, 0.148, 0.149, 0.15, 0.15, 0.151, 0.152, 0.156, 
                         0.158, 0.159, 0.158)), row.names = c(NA, 11L), class = "data.frame")

blanchard_Filter <- blanchard.dat[blanchard.dat$Depth > 0.152, ]

unique(blanchard_Filter$Date)
#> [1] "2022-08-03 16:54" "2022-08-03 17:09" "2022-08-03 17:24" "2022-08-03 17:39"
blanchard_days <- length(unique(blanchard_Filter$Date))
blanchard_days
#> [1] 4

blanchard_Filter$DateOnly = as.Date(blanchard_Filter$Date.Time)
unique(blanchard_Filter$DateOnly)
#> [1] "2022-08-03"

Created on 2024-12-18 with reprex v2.1.1

Thank you @FJCC . I want to see how many days have values over the threshold, not how many timestamps exceed the threshold.

This should do what you want. The filter function from the dplyr package is very helpful. I've added two additional fake rows of data to your dput output to illustrate how the code works.

library(dplyr)

blanchard.dat <- structure(
  list(
    Date.Time = c(
      "2022-08-03 15:09", "2022-08-03 15:24", "2022-08-03 15:39",
      "2022-08-03 15:54", "2022-08-03 16:09", "2022-08-03 16:24",
      "2022-08-03 16:39", "2022-08-03 16:54", "2022-08-03 17:09",
      "2022-08-03 17:24", "2022-08-03 17:39", "2022-08-03 18:01",
      "2022-08-03 18:25"
    ),
    Temp = c(
      "24.158", "18.426", "17.855", "17.76", "17.76", "17.76",
      "17.665", "17.665", "17.665", "17.665", "17.665", "17.665",
      "17.665"
    ),
    Depth = c(
      0.018, 0.148, 0.149, 0.15, 0.15, 0.151, 0.152, 0.156,
      0.158, 0.159, 0.158, 0.385, 0.400
    )
  ),
  row.names = c(NA, -13L),
  class = "data.frame"
)

blanchard.dat |>
  filter(Depth >= 0.385)

If you run this, it will output only the rows that fit the criteria listed. Here, I've made the bound inclusive (greater than or equal to), but you could edit to make it strictly greater than.

         Date.Time   Temp Depth
1 2022-08-03 18:01 17.665 0.385
2 2022-08-03 18:25 17.665 0.400

To count the number of values that fit the criteria, you can apply nrow to the filtered data, which will count the number of rows matching your criterion. E.g.

high_depth <- blanchard.dat |>
  filter(Depth >= 0.385)

nrow(high_depth)

This will output 2.

I realized - I really like piping, even when it's not always necessary. :slight_smile: Alternatively (no pipes):

high_depth <- filter(blanchard.dat, Depth >= 0.385)

high_depth
nrow(high_depth)

I changed the data so there are three days, one below a value of 0.3 and two above.

library(tidyverse)

blanchard.dat <- structure(
  list(
    Date.Time = c(
      "2022-08-03 15:09", "2022-08-04 15:24", "2022-08-05 15:39",
      "2022-08-03 15:54", "2022-08-04 16:09", "2022-08-05 16:24",
      "2022-08-03 16:39", "2022-08-04 16:54", "2022-08-05 17:09",
      "2022-08-03 17:24", "2022-08-05 17:39", "2022-08-05 18:01",
      "2022-08-04 18:25"
    ),
    Temp = c(
      "24.158", "18.426", "17.855", "17.76", "17.76", "17.76",
      "17.665", "17.665", "17.665", "17.665", "17.665", "17.665",
      "17.665"
    ),
    Depth = c(
      0.018, 0.148, 0.149, 0.15, 0.15, 0.151, 0.152, 0.156,
      0.158, 0.159, 0.158, 0.385, 0.400
    )
  ),
  row.names = c(NA, -13L),
  class = "data.frame"
)

blanchard.dat |>  
  mutate(Date = as_date(ymd_hm(Date.Time))) |> 
  group_by(Date) |> summarise(Max = max(Depth)) |> 
  filter(Max > 0.3) |> 
  count()
#> # A tibble: 1 × 1
#>       n
#>   <int>
#> 1     2

Created on 2024-12-19 with reprex v2.1.1

Thank you. When I run the code below,

library(tidyverse)
blanchard.dat |>  
  mutate(Date = as_date(ymd_hm(Date.Time))) |> 
  group_by(Date) |> summarise(Max = max(Depth)) |> 
  filter(Max > 0.213) |> 
  count()

I get the output

# A tibble: 1 × 1
      n
  <int>
1   116

Does this mean that there are 116 fifteen-minute intervals where Depth is greater than 0.213 metres, or how do I interpret this output? Also how do I know what dates have a Depth exceeding 0.213 m?

Stepping through your last bit of code, as_date() stripped off the time component of the date-time. The group_by(Date) combined with summarize() should yield one entry per day, where Max yields the daily maximum depth. The count of 116 should therefore give you the number of days where the max depth was above your cut-off of 0.213.

If you want the dates, remove the last pipe and count() and save all of that to a variable:

library(tidyverse)
blanchard.filtered <- blanchard.dat |>  
  mutate(Date = as_date(ymd_hm(Date.Time))) |> 
  group_by(Date) |> summarise(Max = max(Depth)) |> 
  filter(Max > 0.213)

blanchard.filtered$Date  # This yields the list of dates
count(blanchard.filtered)  # This yields the number of days
1 Like

Thank you so much for your help. That appears to have worked.