Filter in dplyr HELP

chanuae · April 27, 2021, 9:10pm

Hi, I was hoping someone could help me with the following.
I have a "hotel reviews" data set. There are 7 different hotels in total, with columns such as good review, bad review, avg score, date, etc. First I had to calculate the MCC score for each hotel. I did it with the following:
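statsConfusionMatrix <- function(sentlabels, preds) {
  # Confusion matrix: rows are true labels, columns are predictions.
  # Indexing by position assumes both classes occur in labels and predictions.
  mytab <- table(sentlabels, preds)
  TP <- as.numeric(mytab[2, 2])
  TN <- as.numeric(mytab[1, 1])
  FN <- as.numeric(mytab[2, 1])
  FP <- as.numeric(mytab[1, 2])
  # Matthews correlation coefficient
  MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
  return(list(MCC))
}
myresults <- list()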

Now, for each hotel with a high enough score (MCC > 0.2), I have to check whether there are negative comments about the beds.
I don't know how to proceed from here.

EconomiCurtis · April 27, 2021, 10:33pm

Welcome. I am not totally sure what your question is. If you have a question about how to achieve your goal with code, it's helpful to pose it as a reproducible example (reprex). This makes it much easier to understand your issue, and a reprex is a great starting point for offering you a suggestion.

It sounds like you are setting up a basic filtering question. That is, you have data, and you'd like to see a subset of your data based on a few conditions. The Tidyverse package dplyr has a filter() function that can be really helpful. Here's a chapter of R4DS that helps you step into these kinds of data wrangling tasks.
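
As a rough illustration of what filter() does, here is a minimal sketch on made-up data. The data frame and its column names (hotel, avg_score, bad_review) are assumptions for the example, not your actual columns.

```r
library(dplyr)

# Hypothetical toy data; the column names are assumptions for illustration only
reviews <- data.frame(
  hotel = c("A", "A", "B", "B", "C"),
  avg_score = c(8.1, 6.4, 9.0, 5.2, 7.3),
  bad_review = c("bed was too hard", "no complaints", "noisy street",
                 "uncomfortable beds", "small room"),
  stringsAsFactors = FALSE
)

# filter() keeps only the rows for which every condition is TRUE
low_scoring_b <- reviews %>%
  filter(hotel == "B", avg_score < 6)

print(low_scoring_b)
```

Conditions separated by commas are combined with AND, so only rows for hotel B with an average score below 6 remain.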

You might then group by specific hotel, and calculate summary statistics of those hotels (e.g. negative and positive comments). That R4DS chapter covers basics of those operations as well.
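
Putting those pieces together, one possible shape for the whole task is sketched below. Everything here is assumed for illustration: the toy data, the column names (sentlabel, pred, bad_review), the 0/1 label coding, and the helper compute_mcc(), which is just a variant of your function that returns a plain number. Searching the text for "bed" with grepl() is also only one simple way to find bed-related comments.

```r
library(dplyr)

# Hypothetical data: true and predicted sentiment per review (0 = negative, 1 = positive)
reviews <- data.frame(
  hotel = rep(c("A", "B"), each = 8),
  sentlabel = c(1, 1, 1, 1, 0, 0, 0, 0,  1, 1, 1, 1, 0, 0, 0, 0),
  pred      = c(1, 1, 1, 0, 0, 0, 0, 1,  1, 0, 1, 0, 1, 0, 1, 0),
  bad_review = c("bed was too hard", "none", "Beds squeaked", "none",
                 "small room", "none", "noisy", "none",
                 "uncomfortable bed", "none", "cold shower", "none",
                 "slow wifi", "none", "thin walls", "none"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: MCC as a single number.
# Fixing the factor levels keeps the table 2 x 2 even if a class is missing.
compute_mcc <- function(labels, preds) {
  tab <- table(factor(labels, levels = c(0, 1)), factor(preds, levels = c(0, 1)))
  TP <- as.numeric(tab[2, 2])
  TN <- as.numeric(tab[1, 1])
  FN <- as.numeric(tab[2, 1])
  FP <- as.numeric(tab[1, 2])
  (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
}

# One MCC value per hotel
hotel_mcc <- reviews %>%
  group_by(hotel) %>%
  summarise(mcc = compute_mcc(sentlabel, pred))

# Hotels whose MCC clears the threshold
good_hotels <- hotel_mcc %>%
  filter(mcc > 0.2)

# Negative comments mentioning beds, restricted to those hotels
bed_complaints <- reviews %>%
  semi_join(good_hotels, by = "hotel") %>%
  filter(grepl("bed", bad_review, ignore.case = TRUE))

print(hotel_mcc)
print(bed_complaints)
```

semi_join() keeps only the reviews whose hotel appears in good_hotels, so the text search runs only on hotels that passed the MCC threshold.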

Update: there's a similar question over here: "unequal length combining data".

Given that this is the second of two very similar questions, I should make you aware of our homework policy (FAQ: Homework Policy). We are happy to help with homework, but be sure to mark such questions as homework.
