How to filter column with percent data?

Tom_Dale · August 28, 2021, 6:32pm

Hello! I'm new to R and could use some help.

I'm running into a problem that I'm sure has an easy answer, but I feel like an idiot for not finding it in R's documentation.

I'm trying to filter two columns; one has cells with numeric data and the other one has cells that are expressed in percentages.

My filter function is filtering out the former correctly, but it's not filtering the column with percents. Also, I'm not getting any error messages for the latter, which is why I'm so confused.

My project is here:

And the lines of code should be 25-26, or

best_trimmed_flavors_df <- trimmed_flavors_df %>%
  filter("Cocoa\nPercent" >= 75, Rating >= 3.9)

HanOostdijk · August 28, 2021, 6:48pm

Hello Tom,

Shouldn't you just filter on 0.75 (independent of your formatting, which I can't see)?

Tom_Dale · August 28, 2021, 7:20pm

Yeah, I've tried that and it still wouldn't work.

HanOostdijk · August 28, 2021, 8:16pm

Tom,

Maybe this helps (if not, please provide a reprex for your question):

library(dplyr) 
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

toms_data <- data.frame(
   "Cocoa\nPercent" = c(80,45) ,
   Rating = c(4.5, 2.1)
 )
 
toms_data
#>   Cocoa.Percent Rating
#> 1            80    4.5
#> 2            45    2.1
 
toms_data %>%
  filter(Cocoa.Percent >=75, Rating >= 3.9)
#>   Cocoa.Percent Rating
#> 1            80    4.5
Created on 2021-08-28 by the reprex package (v2.0.0)

Tom_Dale · August 28, 2021, 9:46pm

OK, so really all I'm trying to do is take a csv with over 1700 rows of data and
shrink it to just a handful of the most important observations.

cacao_df <- read_csv("cacao_cleaned.csv")
trimmed_cacao_df <- cacao_df %>%
  select(Rating, Company, "Company\nLocation", "Cocoa\nPercent")

The following is the line of code that seems to be tripping me up. The ratings are being filtered as instructed, BUT the percentage column is not being filtered!
The original csv had a general text format, so I changed the format of the column
to number, but the column wasn't filtered in either format. So I'm stuck.

best_trimmed_cacao_df <- trimmed_cacao_df %>% 
 filter("Cocoa\nPercent" >= 0.75 & Rating >= 3.9)
view(best_trimmed_cacao_df)
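
Since the column started out in a text format, it may be worth checking how R actually sees it before filtering. The following is only a sketch under assumptions: the vector `cocoa_text` and its values are hypothetical stand-ins for a percent column that was read in as strings such as "80%", which may or may not match the real file.

```r
# Hypothetical values standing in for a percent column read as text
cocoa_text <- c("80%", "45%", "75%")

# A character column is compared as text, not as numbers
class(cocoa_text)

# Strip the percent sign, convert to numeric, and rescale to a 0-1 fraction
cocoa_num <- as.numeric(sub("%", "", cocoa_text, fixed = TRUE)) / 100

# Numeric comparison now behaves as expected
cocoa_num >= 0.75
```

If `class()` already reports "numeric" for the real column, no conversion is needed and the threshold just has to match the scale of the stored values (75 versus 0.75).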

Tom_Dale · August 28, 2021, 9:58pm

Ope, I found the solution!

Thanks for your help

My problem was using quotation marks for the column name in the code instead of backticks. Here's the correct code:

best_trimmed_cacao_df <- trimmed_cacao_df %>% 
 filter(`Cocoa\nPercent` >= 0.75, Rating >= 3.9)
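
For anyone who finds this later, here is a small sketch of why the quoted version runs without an error. Inside filter(), "Cocoa\nPercent" in double quotes is a plain string constant, not a column reference, so the comparison produces one and the same logical value for every row (in typical locales, TRUE, because the number is coerced to text and compared as text). Backticks, or the `.data` pronoun, refer to the column itself. The data frame `demo_df` and its values are made up for illustration.

```r
library(dplyr)

# Made-up stand-in for the cacao data; check.names = FALSE keeps the
# newline in the column name instead of converting it to "Cocoa.Percent"
demo_df <- data.frame(
  "Cocoa\nPercent" = c(0.80, 0.45),
  Rating = c(4.5, 4.0),
  check.names = FALSE
)

# Double quotes: a string constant is compared with 0.75, so this condition
# is the same for every row and only Rating effectively filters
quoted <- demo_df %>%
  filter("Cocoa\nPercent" >= 0.75, Rating >= 3.9)

# Backticks: the column itself is compared with 0.75
backticked <- demo_df %>%
  filter(`Cocoa\nPercent` >= 0.75, Rating >= 3.9)

# Alternative: the .data pronoun accepts the column name as a string
pronoun <- demo_df %>%
  filter(.data[["Cocoa\nPercent"]] >= 0.75, Rating >= 3.9)

nrow(quoted)
nrow(backticked)
nrow(pronoun)
```

The backtick and `.data` versions keep only the row with 0.80, while the quoted version lets the 0.45 row through whenever its rating passes.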
