Erica
November 21, 2019, 10:42pm
1
I have a large dataset in which the variable EA holds character strings; for example, EA for observation 1 is "Los Angeles, CA".
I want to select only the rows whose EA value contains "CA".
Is there a way to do this?
Erica
November 21, 2019, 10:46pm
2
I used grepl to get a vector of TRUE and FALSE values. How do I use it to select the rows?
You can do something like this, though you will very likely need to refine the regex pattern for your actual data.
library(tidyverse)
large_dataset <- data.frame(stringsAsFactors = FALSE,
                            EA = c("Los Angeles, CA", "Other text")
)
large_dataset %>%
  filter(str_detect(EA, pattern = "CA"))
#>                EA
#> 1 Los Angeles, CA
Created on 2019-11-22 by the reprex package (v0.3.0.9000)
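On refining the pattern: a bare "CA" matches those two letters anywhere in the string, so a value such as "CARSON CITY, NV" would also be kept. Here is a rough sketch of one way to tighten it, assuming the state code always comes at the end of EA after a comma. The sample rows and the names loose and strict are made up for illustration; only the column name EA comes from the question.

```r
library(dplyr)
library(stringr)

# Hypothetical sample data; only the column name EA is from the original question
large_dataset <- data.frame(
  EA = c("Los Angeles, CA", "CARSON CITY, NV", "Sacramento, CA", "Other text"),
  stringsAsFactors = FALSE
)

# Unanchored pattern: also keeps "CARSON CITY, NV" because it contains "CA"
loose <- large_dataset %>%
  filter(str_detect(EA, "CA"))

# Anchored pattern (an assumption about the data layout):
# a comma, optional whitespace, then "CA" at the very end of the string
strict <- large_dataset %>%
  filter(str_detect(EA, ",\\s*CA$"))

strict$EA
```

If the state code can appear elsewhere in your strings, the anchor would need to change, so check the pattern against a sample of your real values first.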
lmcewen
December 16, 2019, 8:08pm
4
You can just use square brackets to subset the rows with a logical (TRUE/FALSE) vector, such as the one grepl returns:
Example:
df <- df[grepl('CA', df$EA),]
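To make the intermediate step explicit, here is a small base R sketch that stores the grepl result first and then uses it as the row index. The data and the names keep and ca_rows are hypothetical. Two optional choices are assumptions on my part rather than part of the answer above: fixed = TRUE treats "CA" as a literal string instead of a regex, and drop = FALSE keeps the result a data frame even when df has only one column (without it, a one-column data frame collapses to a plain vector).

```r
# Hypothetical data; only the column name EA is from the original question
df <- data.frame(
  EA = c("Los Angeles, CA", "Other text", "San Diego, CA"),
  stringsAsFactors = FALSE
)

# Logical vector, one element per row: TRUE where EA contains "CA"
keep <- grepl("CA", df$EA, fixed = TRUE)

# Use the logical vector in the row position; leave the column position empty
ca_rows <- df[keep, , drop = FALSE]

ca_rows$EA
```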