Hi
Here's my dataframe:
data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
Sampleid = c("AVM_360", "AVM_360", "AVM_360"),
Currentid = c("Bibasis vasutana",
"Bibasis vasutana","Bibasis vasutana"),
`%Match` = c(100, 100, 99.5),
Matchid = c("Bibasis vasutana", "Burara vasutana", "Bibasis nikos")
)
I want to select the rows with the highest "%Match" value. As you can see, two rows both have a 100.0 match, but their "Matchid" values differ. How should I write code that keeps the highest value within each group (Sampleid) and, if several rows are tied for the highest value, keeps all of them?
To keep only the rows with the highest `%Match` in each Sampleid group, including ties, I'd recommend using {dplyr}
(make sure it's installed!)
dat = data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
Sampleid = c("AVM_360", "AVM_360", "AVM_360"),
Currentid = c("Bibasis vasutana",
"Bibasis vasutana","Bibasis vasutana"),
`%Match` = c(100, 100, 99.5),
Matchid = c("Bibasis vasutana", "Burara vasutana", "Bibasis nikos")
)
dat
#> Sampleid Currentid %Match Matchid
#> 1 AVM_360 Bibasis vasutana 100.0 Bibasis vasutana
#> 2 AVM_360 Bibasis vasutana 100.0 Burara vasutana
#> 3 AVM_360 Bibasis vasutana 99.5 Bibasis nikos
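# Group by Sampleid, then keep every row equal to its group's maximum (ties included)
dat_grouped <- dplyr::group_by(dat, Sampleid)
as.data.frame(dplyr::filter(dat_grouped, `%Match` == max(`%Match`)))
#> Sampleid Currentid %Match Matchid
#> 1 AVM_360 Bibasis vasutana 100 Bibasis vasutana
#> 2 AVM_360 Bibasis vasutana 100 Burara vasutana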
Created on 2022-02-24 by the reprex package (v2.0.1)
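If you'd rather not depend on a package, the same "keep every row tied for the group maximum" idea can probably be done in base R with `ave()`. This is only a sketch: the `AVM_361` rows and the names `group_max` and `best` are made up for illustration (the original data has a single Sampleid, so a second group is added to show that the maximum is taken per group).

```r
# Illustration only: the AVM_361 rows are hypothetical, added to show a second group.
dat <- data.frame(
  stringsAsFactors = FALSE,
  check.names = FALSE,
  Sampleid = c("AVM_360", "AVM_360", "AVM_360", "AVM_361", "AVM_361"),
  `%Match` = c(100, 100, 99.5, 98, 97),
  Matchid = c("Bibasis vasutana", "Burara vasutana", "Bibasis nikos",
              "Species A", "Species B")
)

# ave() returns, for every row, the maximum %Match within that row's Sampleid.
group_max <- ave(dat[["%Match"]], dat$Sampleid, FUN = max)

# Keep every row whose %Match equals its group's maximum, so ties are all retained.
best <- dat[dat[["%Match"]] == group_max, ]
best
```

Here `best` keeps both 100-match rows for `AVM_360` and the single top row for `AVM_361`. Note that comparing with `==` works for values like these, but for computed floating-point scores you may want a tolerance instead.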