You can do something using dplyr workflow
var <- c(0,1,0,0,0, 1,1, 1,0, 0,1,1,0,0,1,1,1,1,1,0, 0, 1, 0, 1, 1)
library(dplyr)
#>
#> Attachement du package : 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
res <- tibble(var = var) %>%
# use lag to see if next value is equal to previous
mutate(diff = lag(var, 1, default = 9)) %>%
mutate(equal = var != diff) %>%
# get an id for each group
mutate(seq_id = cumsum(equal)) %>%
# we are interesting only by the 1s
filter(var == 1) %>%
# count how many 1s in each group
count(var, seq_id) %>%
# count how many times there is n 1s
count(n)
res
#> # A tibble: 4 x 2
#> n nn
#> <int> <int>
#> 1 1 2
#> 2 2 2
#> 3 3 1
#> 4 5 1
You can then see if there is a SQL translation that works for your database. Here is a generic translation.
library(dbplyr)
#>
#> Attachement du package : 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#>
#> ident, sql
lazy_frame(var = var, src = simulate_dbi()) %>%
# use lag to see if next value is equal to previous
mutate(diff = lag(var, 1, default = 9)) %>%
mutate(equal = var != diff) %>%
# get an id for each group
window_order(equal) %>% # needed. see warning without
mutate(seq_id = cumsum(equal)) %>%
# we are interesting only by the 1s
filter(var == 1) %>%
# count how many 1s in each group
count(var, seq_id) %>%
# count how many times there is n 1s
count(n) %>% show_query()
#> <SQL> SELECT "var", "n", COUNT() AS "nn"
#> FROM (SELECT "var", "seq_id", COUNT() AS "n"
#> FROM (SELECT *
#> FROM (SELECT "var", "diff", "equal", sum("equal") OVER (ORDER BY "equal" ROWS UNBOUNDED PRECEDING) AS "seq_id"
#> FROM (SELECT "var", "diff", "var" != "diff" AS "equal"
#> FROM (SELECT "var", LAG("var", 1, 9.0) OVER () AS "diff"
#> FROM "df") "ykyweekteo") "axbeluvfyo") "nhimatobvm"
#> WHERE ("var" = 1.0)) "qxojiberyj"
#> GROUP BY "var", "seq_id") "rrkfkuqlhj"
#> GROUP BY "var", "n"
otherwise, in memory, you have also the data.table framework that can help you to get the same results. It has included an efficient rleid
function.
var <- c(0,1,0,0,0, 1,1, 1,0, 0,1,1,0,0,1,1,1,1,1,0, 0, 1, 0, 1, 1)
library(data.table)
tab <- data.table(var = var)
tab[, id := rleid(var)]
res <- tab[var == 1, .(nb1 = .N), keyby = .(id, var)][, .(nb2 = .N), keyby = nb1]
res
#> nb1 nb2
#> 1: 1 2
#> 2: 2 2
#> 3: 3 1
#> 4: 5 1
the number of 1 consecutive 1s, is 2
the number of 2 consecutive 1s is 2
the number of 3 consecutive 1s is 1
the number of 4 consecutive 1s is 0
the number of 5 consecutive 1s us 1