# First occurrence of a string value within a group

I have example data below. I would like to flag, in a new variable, the first occurrence within each SUBJ where VAR does not equal "1". Can you help?

``````
data <- read.table(header = TRUE, text = "
SUBJ	TIME	VAR
1	TIME1	1
1	TIME2	1
1	TIME3	1
1	TIME4	1
1	TIME5	1
1	TIME6	1
1	TIME7	5
1	TIME8	1
1	TIME9	3
1	TIME10	1
2	TIME1	1
2	TIME2	1
2	TIME3	6
2	TIME4	1
2	TIME5	3
2	TIME6	2
2	TIME7	1
2	TIME8	1
3	TIME1	1
3	TIME2	1
3	TIME3	1
3	TIME4	4
3	TIME5	2
3	TIME6	1
3	TIME7	1
3	TIME8	8
4	TIME1	1
5	TIME1	1
5	TIME2	1
5	TIME3	2
5	TIME4	1
5	TIME5	4
5	TIME6	1
")
``````

Here is one approach. For SUBJ 4 it returns NA, since there is no row with VAR != 1. That is easily fixed if it is a problem.

``````
data <- read.table(header = TRUE, text = "
SUBJ    TIME    VAR
1    TIME1   1
1    TIME2   1
1    TIME3   1
1    TIME4   1
1    TIME5   1
1    TIME6   1
1    TIME7   5
1    TIME8   1
1    TIME9   3
1    TIME10  1
2    TIME1   1
2    TIME2   1
2    TIME3   6
2    TIME4   1
2    TIME5   3
2    TIME6   2
2    TIME7   1
2    TIME8   1
3    TIME1   1
3    TIME2   1
3    TIME3   1
3    TIME4   4
3    TIME5   2
3    TIME6   1
3    TIME7   1
3    TIME8   8
4    TIME1   1
5    TIME1   1
5    TIME2   1
5    TIME3   2
5    TIME4   1
5    TIME5   4
5    TIME6   1
")
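# Return the position of the first element of X that is not 1
# (NA when every element equals 1)
FindFirst <- function(X) {
  which(X != 1)[1]
}
library(dplyr)

# Within each SUBJ, compare every row number with the position of the first
# non-1 VAR; FIRST is NA for a subject such as SUBJ 4 that has no such row
data <- data %>%
  group_by(SUBJ) %>%
  mutate(Index = FindFirst(VAR), ROW = row_number(), FIRST = Index == ROW)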
#> # A tibble: 7 x 6
#> # Groups:   SUBJ [1]
#>    SUBJ TIME    VAR Index   ROW FIRST
#>   <int> <fct> <int> <int> <int> <lgl>
#> 1     1 TIME1     1     7     1 FALSE
#> 2     1 TIME2     1     7     2 FALSE
#> 3     1 TIME3     1     7     3 FALSE
#> 4     1 TIME4     1     7     4 FALSE
#> 5     1 TIME5     1     7     5 FALSE
#> 6     1 TIME6     1     7     6 FALSE
#> 7     1 TIME7     5     7     7 TRUE
data <- data %>% select(-Index, -ROW)
``````

Created on 2020-05-04 by the reprex package (v0.3.0)
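
If the NA for a subject with no VAR != 1 is a problem, one possible fix is to build the flag from a running count instead of an index. This is only a sketch on made-up toy data (the `toy` and `flagged` names are my own, and it assumes VAR itself contains no NA):

```r
library(dplyr)

# Toy data (hypothetical, for illustration only): subject 2 never leaves VAR == 1
toy <- data.frame(
  SUBJ = c(1, 1, 1, 1, 2, 2),
  VAR  = c(1, 5, 1, 3, 1, 1)
)

flagged <- toy %>%
  group_by(SUBJ) %>%
  # cumsum(VAR != 1) counts the non-1 values seen so far within the subject;
  # the first non-1 row is the only non-1 row where that count equals 1
  mutate(FIRST = VAR != 1 & cumsum(VAR != 1) == 1) %>%
  ungroup()

flagged$FIRST
```

Because the comparison never involves an NA index, a subject with no qualifying row simply gets FALSE on every row.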

My objective in finding the first occurrence where VAR does not equal "1" within each SUBJ is to keep only the TIME rows before that event, so that I can calculate the final TIME from TIME1. I revised your code to do this filtering:

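``````
# Keep only the rows before the first VAR != 1 in each SUBJ.
# Note: Index is NA for a subject with no such row (SUBJ 4),
# so filter() drops every row of that subject.
data1 <- data %>%
  group_by(SUBJ) %>%
  mutate(Index = FindFirst(VAR),
         ROW = row_number()) %>%
  filter(ROW < Index)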
``````

Thank you!
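
One thing to watch with `filter(ROW < Index)`: when Index is NA the comparison is NA, and `filter()` drops those rows, so a subject that never has VAR != 1 disappears entirely. If such subjects should be kept in full, a running count may work instead. A rough sketch on made-up toy data (`toy`, `before_event` and `last_time` are hypothetical names, and VAR is assumed to contain no NA):

```r
library(dplyr)

# Toy data (hypothetical): subject 1 has an event at TIME3, subject 2 has none
toy <- data.frame(
  SUBJ = c(1, 1, 1, 1, 2, 2),
  TIME = c("TIME1", "TIME2", "TIME3", "TIME4", "TIME1", "TIME2"),
  VAR  = c(1, 1, 5, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Keep rows while no non-1 value has been seen yet within the subject
before_event <- toy %>%
  group_by(SUBJ) %>%
  filter(cumsum(VAR != 1) == 0) %>%
  ungroup()

# Last TIME label that remains for each subject
last_time <- before_event %>%
  group_by(SUBJ) %>%
  summarise(LAST_TIME = last(TIME))
```

Here subject 2 keeps both of its rows. A subject whose very first row is already non-1 would still end up with no rows, which seems consistent with "before the event".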
