I have a table of counties, states and their minimum and maximum temperatures. I need to select only the counties that have temperatures in a range such as -15 degrees to 40 degrees. What function would I use?
Hi, can you turn your question into a reprex (a minimal reproducible example)?
That would make it much easier for everyone here to help you.
One possible approach is to group by county and summarize each group with the minimum and maximum of its temperatures. You can then use this summary to keep only the counties that fall within the range, and join the result back to your original table by county name.
@mishabalyasin
climate.minmax <-
  climate.data %>%
  group_by(County, State) %>%
  summarise(temp_min = min(temp_min),
            temp_max = max(temp_max))
I did summarize it and that is all I have. Now I need to filter out the ones that do not fit in my desired range. How do I do that?
You can use temp_min and temp_max in your new dataset to create a new variable with mutate (something like mutate(include = temp_min >= -15 & temp_max <= 40)). Then you filter to keep only the rows where it is TRUE and use dplyr::semi_join on your original data.
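Putting those steps together, here is a minimal sketch. The toy values in climate.data are made up for illustration, and the names in_range and climate.in.range are hypothetical; only the column names and climate.minmax come from the thread. It assumes dplyr 1.0.0 or later for the .groups argument.

```r
library(dplyr)

# Hypothetical toy data: column names follow the thread, values are invented
climate.data <- tibble(
  County   = c("A", "A", "B", "B", "C"),
  State    = c("X", "X", "X", "X", "Y"),
  temp_min = c(-10, -5, -20, -3, 0),
  temp_max = c(30, 35, 25, 38, 45)
)

# One row per county with its overall minimum and maximum
climate.minmax <- climate.data %>%
  group_by(County, State) %>%
  summarise(temp_min = min(temp_min),
            temp_max = max(temp_max),
            .groups = "drop")

# Flag counties whose whole range lies within [-15, 40], then keep them
in_range <- climate.minmax %>%
  mutate(include = temp_min >= -15 & temp_max <= 40) %>%
  filter(include)

# Keep only the original rows that belong to an in-range county
climate.in.range <- climate.data %>%
  semi_join(in_range, by = c("County", "State"))
```

semi_join keeps the rows of the original table that have a match in in_range without adding any of its columns, so the result has the same shape as climate.data.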
If you are comfortable with SQL, you could also try the data.table package, which is known for being very fast on grouped operations like this one.
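library(data.table)
setDT(climate.data)  # convert to a data.table by reference (no pipe needed)
# Summarise per county, then keep rows in the range (inclusive, as in the mutate suggestion above)
climate.data[, .(temp_min = min(temp_min),
                 temp_max = max(temp_max)),
             by = .(County, State)][
  temp_min >= -15 & temp_max <= 40]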
FYI, data.table's syntax resembles SQL. The general form DT[i, j, by] reads roughly as:
from[where, select, group by]
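To make that mapping concrete, here is a small sketch with the rough SQL equivalent in a comment. The table dt, its values, and the output column hottest are hypothetical and only serve the illustration.

```r
library(data.table)

# Hypothetical toy data: column names follow the thread, values are invented
dt <- data.table(
  County   = c("A", "A", "B", "B"),
  State    = c("X", "X", "X", "X"),
  temp_min = c(-10, -5, -20, -3),
  temp_max = c(30, 35, 25, 38)
)

# Roughly: SELECT County, State, MAX(temp_max) AS hottest
#          FROM dt WHERE temp_min > -15
#          GROUP BY County, State
res <- dt[temp_min > -15,                 # i:  where
          .(hottest = max(temp_max)),     # j:  select
          by = .(County, State)]          # by: group by
```

Note that i filters rows before grouping, like WHERE. Filtering on a summarised value, as in the county example above, needs a second chained [ ], which plays the role of HAVING.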