df <- read.delim(sep="|", text=c("
M1 M2
A/A G/G
T/T NA
NA C/C
A/A G/G
A/A A/A
"))
now I would like to match row1 of M1 column with itself and with other rows of the same column (iterative matching) if it matches then 0 otherwise 1 but any row of M1 column has NA I do not want include that row in matches and I do not match this row with other rows of M1 column and I would like to repeat this with M2 column too and finally I will sum column at bottom (6 for M1 and 10 for M2). any help in this regard is highly appreciated
finally my results looks like this
Your data and explanation are a bit confusing. You show a very small example of input and output, but then you start talking about B3 - B5 which is mentioned nowhere in the data.
I suggest you create a reprex so we can try and help you out. A reprex consists of the minimal code and data needed to recreate the issue/question you're having. You can find instructions how to build and share one here:
Hi @genetist,
I suggest you first split your columns into "first_letter" and "second_letter", and then do the comparisons to see if they are the same:
# Add to your data to show various extra data combinations
df <- read.delim(sep="|", text=c("
|IND|M1|M2|
|1|A/A|G/G|
|2|T/T|NA|
|3|NA|C/C|
|4|T/C|A/T|
|5|G/T|A/A|
"))
df
#> X IND M1 M2 X.1
#> 1 NA 1 A/A G/G NA
#> 2 NA 2 T/T <NA> NA
#> 3 NA 3 <NA> C/C NA
#> 4 NA 4 T/C A/T NA
#> 5 NA 5 G/T A/A NA
suppressPackageStartupMessages({
library(tidyverse)
})
df %>%
select(!c(1,5)) %>%
separate(col=M1, sep="/", into=c("M1_1", "M1_2")) %>%
separate(col=M2, sep="/", into=c("M2_1", "M2_2")) %>%
mutate(m1 = ifelse(M1_1 == M1_2, 1, 0),
m2 = ifelse(M2_1 == M2_2, 1, 0)) -> out.df
out.df
#> IND M1_1 M1_2 M2_1 M2_2 m1 m2
#> 1 1 A A G G 1 1
#> 2 2 T T <NA> <NA> 1 NA
#> 3 3 <NA> <NA> C C NA 1
#> 4 4 T C A T 0 0
#> 5 5 G T A A 0 1
out.df %>%
select(!c(2,3,4,5))
#> IND m1 m2
#> 1 1 1 1
#> 2 2 1 NA
#> 3 3 NA 1
#> 4 4 0 0
#> 5 5 0 1
Dear PJ
Good Afternoon
Sorry my bad post and thanks for your valuable time, Now I edited it I hope It will be better now. I would like to upload my out file being a new member i do not have permission to do that.
Thanking you very much
Dear DavoWW
Good Afternoon
Thanks for your kind reply and for your valuable time to help me. Now i edited my post hope it will be better now. sorry for my bad post
Dear nirgrahamuk
Good Morning
Thank you very much for reply. No problem even if we keep NA in final data, i am much interested in sum row i.e. 26th Row in base_df output and your code working like i expected
thanking you very much for your valuable time spent to help on my problem
Regards