New question. Here's my dataframe.
data <- data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
Sampleid = c("AVM_360","AVM_360","AVM_360",
"AVM_362","AVM_362","AVM_362","AVM_362"),
Currentid = c("Bibasis vasutana",
"Bibasis vasutana","Bibasis vasutana","Parnara ganga",
"Parnara ganga","Parnara ganga","Parnara ganga"),
`%Match` = c(100, 100, 99.5, 100, 98.6, 97.5, 96.5),
Matchid = c("Bibasis vasutana",
"Burara vasutana","Bibasis nikos","Parnara ganga","Parnara batta",
"Parnara batta","Parnara batta")
)
Here's some code someone helped me write earlier:
(First I do a quick filtering step to select only those values that have a match of >=99)
data[,3] <- sapply(data[,3], as.numeric)
ffilter <- function(x, y) {
  # keep only the rows of x whose value in y is at least 99
  filt <- y >= 99
  x[filt, ]
}
exfilt <- ffilter(data, data$`%Match`)
And then:
library(dplyr)  # provides %>%, group_by() and summarise()
em <- exfilt %>% group_by(Sampleid) %>%
  summarise(any_match = any(Currentid == Matchid),
            any_not_match = any(Currentid != Matchid))
This creates a table that checks, for each Sampleid, whether there are any matches and any mismatches between Currentid and Matchid.
If I then write:
which(em$any_match==TRUE & em$any_not_match==TRUE)
I get
[1] 1
which corresponds to the AVM_360 sample.
Now here's what I want to do: in this case, Sampleid AVM_360 has both a correct match and an incorrect match when filtering at >=99. I would like to extract that data into a new dataframe that looks exactly like my previous one ("data", see above) but contains only the values for which I have both a correct and an incorrect match, as described.
Is this possible?
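One possible approach, sketched here under the assumption that dplyr is loaded and that "both a correct and an incorrect match" is judged after the >=99 filter: a grouped filter() keeps every row of a Sampleid whose group satisfies both conditions. The name mixed is made up for illustration.

```r
library(dplyr)

# Copy of the dataframe from the question
data <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  Sampleid = c("AVM_360", "AVM_360", "AVM_360",
               "AVM_362", "AVM_362", "AVM_362", "AVM_362"),
  Currentid = c("Bibasis vasutana", "Bibasis vasutana", "Bibasis vasutana",
                "Parnara ganga", "Parnara ganga", "Parnara ganga",
                "Parnara ganga"),
  `%Match` = c(100, 100, 99.5, 100, 98.6, 97.5, 96.5),
  Matchid = c("Bibasis vasutana", "Burara vasutana", "Bibasis nikos",
              "Parnara ganga", "Parnara batta", "Parnara batta",
              "Parnara batta")
)

# "mixed" is a hypothetical name for the new dataframe
mixed <- data %>%
  filter(`%Match` >= 99) %>%                      # same threshold as ffilter()
  group_by(Sampleid) %>%
  filter(any(Currentid == Matchid) &              # group has a correct match
         any(Currentid != Matchid)) %>%           # and an incorrect match
  ungroup()

print(mixed)
```

With this data, only the three AVM_360 rows should remain, since AVM_362 has a single row at >=99 and that row is a correct match. If the rows below 99 for those samples are also wanted, subsetting the original with data[data$Sampleid %in% mixed$Sampleid, ] is one way to get them.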
And a second question: is there any way to select specific columns inside these brackets [ ] without using numbers? I.e., let's say I have a column named "%Match" in position 3 and I want to see all row values of that column. I could write data[,3], but is it possible to use the column name instead? I would like to write something like data[,data$`%Match`], but that doesn't work.
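For what it's worth, base R accepts a quoted column name inside the brackets. The sketch below uses a small made-up dataframe df (not the real data) to show a few equivalent spellings; data[,data$`%Match`] fails because it passes the column's values, not its name, as the column index.

```r
# Small illustrative dataframe; "df" and its values are made up
df <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  Sampleid = c("AVM_360", "AVM_362"),
  Currentid = c("Bibasis vasutana", "Parnara ganga"),
  `%Match` = c(100, 98.6)
)

by_position <- df[, 3]           # column selected by number
by_name     <- df[, "%Match"]    # column selected by quoted name
by_double   <- df[["%Match"]]    # same column via [[ ]]
by_dollar   <- df$`%Match`       # $ needs backticks for non-syntactic names

# Several columns by name; this returns a dataframe rather than a vector
two_cols <- df[, c("Sampleid", "%Match")]
```

The first four all return the same numeric vector, so data[, "%Match"] can replace data[,3] in the code above.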
Thank you!!!