I have an existing data frame. I want to search one of its columns for entries that contain a particular word. In the end, I want a new tibble with only the rows (and all their columns) where that column contains the word I'm looking for.
Example: this is my data frame. I want a new tibble with abc, def, ghi, jkl and mno whenever pqr contains "leg".
abc  def      ghi  jkl    mno  pqr
1    cracker  3    apple  5    blue
4    bubbles  5    cart   9    leg
7    andy     4    spud   3    leg
I want this tibble:
abc  def      ghi  jkl   mno  pqr
4    bubbles  5    cart  9    leg
7    andy     4    spud  3    leg
It helps if you provide a reproducible example, using reprex or something else:
Why reprex?
Getting unstuck is hard. Your first step here is usually to create a reprex, or reproducible example. The goal of a reprex is to package your code, and information about your problem so that others can run it and feel your pain. Then, hopefully, folks can more easily provide a solution.
What's in a Reproducible Example?
Parts of a reproducible example:
background information - Describe what you are trying to do. What have you already done?
complete set up - include any library() calls and data to reproduce your issue.
data for a reprex: Here's a discussion on setting up data for a reprex
make it run - include the minimal code required to reproduce your error on the data…
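As a rough illustration of the "complete set up" point, one common way to share data is to paste the output of base R's dput(). This is only a sketch: the data frame df below is a made-up stand-in, not anyone's real data.

```r
# Hypothetical data frame standing in for your real data
df <- data.frame(
  abc = c(1, 4, 7),
  pqr = c("blue", "leg", "leg")
)

# dput() prints R code that recreates the object;
# paste that output into your post so others can rebuild df
dput(df)
```

For a large data frame, dput(head(df)) keeps the pasted output short.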
I think you're after something like this though.
Example dataset (there are three different species):
library(tidyverse)
iris2 <- as_tibble(iris)
iris2
# A tibble: 150 x 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
# ... with 140 more rows
Then filter on the text you want (in this case the string "set"), which narrows it down to 50 rows:
iris2 %>%
  filter(str_detect(Species, "set"))
# A tibble: 50 x 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
# ... with 40 more rows
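Applied to the data in your question, the same approach might look like the sketch below. The column names come from your example; the object names df and legs are just placeholders I picked, and I'm assuming pqr is a character column.

```r
library(dplyr)
library(stringr)
library(tibble)

# Rebuild the example data from the question (df is a placeholder name)
df <- tribble(
  ~abc, ~def,      ~ghi, ~jkl,    ~mno, ~pqr,
  1,    "cracker", 3,    "apple", 5,    "blue",
  4,    "bubbles", 5,    "cart",  9,    "leg",
  7,    "andy",    4,    "spud",  3,    "leg"
)

# Keep only the rows where pqr contains the string "leg"
legs <- df %>%
  filter(str_detect(pqr, "leg"))

legs
```

Note that str_detect() matches anywhere in the string, so a value like "legend" would also be kept. If you only want exact matches, filter(pqr == "leg") is probably the simpler choice.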