Hi there,
I was playing around with a dataset in the rrcov library and decided to do something very simple and mundane: sub-set the dataset being used.
library("rrcov")
library("dplyr")
library("stringr")
# Package with the Fish and many other datasets
data(fish)
# The fish dataset requires a little data wrangling.
fish <- fish %>%
  mutate(
    Species = case_when(
      Species == "1" ~ "Bream",
      Species == "2" ~ "Whitewish",
      Species == "3" ~ "Roach",
      Species == "4" ~ "Parkki",
      Species == "5" ~ "Smelt",
      Species == "6" ~ "Pike",
      Species == "7" ~ "Perch"
    )
  )
# Species before filter:
Summary1 <- fish %>% count(`Species`, sort = TRUE)
# Decided to filter out a few selected species
unique(fish$Species)
# These are the ones to be removed:
rmv_fish <- c("Parkki","Whitewish", "Smelt")
# Initially thought of creating a vector containing the undesired species
# Once issue identified, tried typing it out...
With rmv_fish my intention was to remove those specific fish using str_detect() within a filter() to create the desired sub-set.
Creating the sub-set for the desired fish:
# Create a sub-set:
fish2 <- fish %>%
  rename(mass_g = Weight, length_cm = Length3) %>%
  select(mass_g, length_cm, Height, Width, Species) %>%
  filter(
    !str_detect(Species, pattern = c("Parkki", "Whitewish", "Smelt"))
  )
# Checking
Summary2 <- fish2 %>% count(`Species`, sort = TRUE)
# Trying to see if spelling was off or something..
fish2 %>%
  filter(
    Species == c("Parkki", "Whitewish", "Smelt")
  )
# Bamboozled, not all fish are being removed, dunno why...
Summary1
Summary2
Summary1
    Species  n
1     Perch 56
2     Bream 35
3     Roach 20
4      Pike 17
5     Smelt 14
6    Parkki 11
7 Whitewish  6
Summary2
    Species  n
1     Perch 56
2     Bream 35
3     Roach 20
4      Pike 17
5     Smelt 10
6    Parkki  8
7 Whitewish  4
I noticed that the fish I wanted to remove were still in the sub-set, so I decided to run a reprex on that bit of code; the below is what I got. In my own session the number of observations does drop, from 159 to 150, and no errors or warnings are raised. The error only appears when creating a reprex:
fish2 <- fish %>%
  rename(mass_g = Weight, length_cm = Length3) %>%
  select(mass_g, length_cm, Height, Width, Species) %>%
  filter(
    !str_detect(Species, pattern = c("Parkki", "Whitewish", "Smelt"))
  )
#> Error in fish %>% rename(mass_g = Weight, length_cm = Length3) %>% select(mass_g, : could not find function "%>%"
Created on 2021-06-17 by the reprex package (v2.0.0)
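As far as I understand, the "could not find function" error is a separate matter from the filtering problem: reprex evaluates the snippet in a fresh R session, so the snippet has to attach its own packages. A minimal sketch of a self-contained version, using a small made-up data frame (`toy` and its values are hypothetical, not the real fish data):

```r
# reprex evaluates the snippet in a fresh R session,
# so the packages have to be attached inside the snippet itself
library(dplyr)

# Hypothetical stand-in for the fish data (not the real rrcov values)
toy <- data.frame(
  Species = c("Parkki", "Smelt", "Perch", "Bream"),
  Weight  = c(55, 7, 120, 340),
  Length3 = c(16.2, 10.8, 24.0, 30.0),
  stringsAsFactors = FALSE
)

# With dplyr attached, %>%, rename() and select() are all found
toy2 <- toy %>%
  rename(mass_g = Weight, length_cm = Length3) %>%
  select(mass_g, length_cm, Species)

names(toy2)
```

So the reprex error probably says nothing about why some of the fish survive the filter.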
Any insight on what I'm not seeing?
It must be something very simple, but I can't seem to identify what is causing the issue.
The Species names don't appear to be written any differently.
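One possibility I have not ruled out is vector recycling. `==` compares element by element, so a length-3 vector on the right-hand side would be lined up against rows 1, 2, 3, 1, 2, 3, and so on, instead of every row being checked against all three names. The same may apply to the vector passed to `pattern` in `str_detect()`, which is documented as vectorised over `string` and `pattern`. A minimal sketch on a made-up data frame (`toy` and its values are hypothetical), with `%in%` as a set-membership alternative:

```r
library(dplyr)

# Hypothetical stand-in for the fish data
toy <- data.frame(
  Species = c("Parkki", "Parkki", "Parkki", "Smelt", "Smelt", "Perch"),
  stringsAsFactors = FALSE
)
rmv_fish <- c("Parkki", "Whitewish", "Smelt")

# Element-wise comparison: rmv_fish is recycled along the rows,
# so only rows that happen to line up with the same name match
recycled <- toy %>% filter(Species == rmv_fish)

# Set membership: each row is checked against every name in rmv_fish
kept <- toy %>% filter(!Species %in% rmv_fish)

nrow(recycled)
kept$Species
```

If that is what is happening, it would fit the 159 to 150 drop: only the rows whose position happens to coincide with the matching name get removed.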
Thanks for the time.