Hi, complete newbie here. I've got a .csv with several columns. One of them is gender (coded 1 for female and 2 for male, both in the same column), and another is willingness to pay (coded from 1 to 5 based on a WTP range).
I want to know whether there are significant differences between females' WTP and males' WTP, but I have no idea how to do it.
I already tried several functions, such as "subset" and "which", to select the WTP values for females and for males so I could compare them, but nothing worked (I guess I'm doing something wrong).
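install.packages("dplyr")
install.packages("readr")
install.packages("tibble")
library(dplyr)   # provides %>%, filter() and select()
library(tibble)  # provides as_tibble()
wtpmf <- as_tibble(read.csv("full pathname of your csv file here", stringsAsFactors = FALSE))
men <- wtpmf %>% filter(gender == 2) %>% select(gender, wtp)    # 2 = male
women <- wtpmf %>% filter(gender == 1) %>% select(gender, wtp)  # 1 = female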
Now, I'm making assumptions about your csv header line, so replace gender and wtp with the actual column names in your file.
I suspect that you don't really need separate objects by gender, but this will serve as an introduction to some incredibly helpful tools in the tidyverse that make data manipulation much, much easier.
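To illustrate the point about not needing separate objects, here is a minimal sketch that counts responses per gender and WTP level in a single data frame. The data frame wtp_toy and its values are made up for illustration; the column names gender and wtp are the same assumptions as above.

```r
library(dplyr)

# Hypothetical toy data standing in for the real csv (values are made up)
wtp_toy <- data.frame(
  gender = c(1, 1, 1, 2, 2, 2, 1, 2),
  wtp    = c(3, 4, 5, 2, 3, 3, 4, 1)
)

# Label the gender codes, then count respondents per gender/WTP combination
wtp_counts <- wtp_toy %>%
  mutate(gender = factor(gender, levels = c(1, 2), labels = c("female", "male"))) %>%
  count(gender, wtp)

print(wtp_counts)
```

Keeping everything in one object means the gender column itself does the grouping, so there is nothing to keep in sync between a men and a women object.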
When you've done that (feel free to ask questions if you get stuck), come back and we can talk about how you would apply a test, such as chi-squared, to your object. For that discussion, we'll need a reproducible example; see the FAQ "What's a reproducible example (`reprex`) and how do I do one?", as pointed out below.
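As a rough sketch of where that is heading, a chi-squared test of independence works on a table of counts with one row per gender and one column per WTP level. The counts below are invented purely for illustration; with the real data, something like table(wtpmf$gender, wtpmf$wtp) would build the table instead, assuming those column names.

```r
# Hypothetical counts (made up): rows are gender, columns are WTP levels 1 to 5
wtp_table <- matrix(
  c(10, 12, 15, 14,  9,
    14, 13, 12, 11, 10),
  nrow = 2, byrow = TRUE,
  dimnames = list(gender = c("female", "male"), wtp = as.character(1:5))
)

# Pearson chi-squared test of independence between gender and WTP level
wtp_test <- chisq.test(wtp_table)
print(wtp_test)

# Expected counts under independence; very small values here would make
# the chi-squared approximation questionable
print(wtp_test$expected)
```

A small p-value would suggest that the distribution of WTP levels differs between genders; it does not say in which direction, so looking at the table itself is still necessary.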
I wanted to give OP a chance to get in some thinking about fundamentals. Probably the next step is the [in]famous Berkeley Graduate Admissions dataset illustrated briefly here https://cran.cnr.berkeley.edu/doc/manuals/R-data.html and available simply by
data(UCBAdmissions)
And there is an embarrassment of examples of how that data has been analyzed, which should provide insight into how the OP can approach their analysis. Just add Google.
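For a concrete starting point, here is one possible sketch using that dataset: collapse the three-way table over departments and test admission against gender. This is just one of the many ways the data has been analyzed, not a recommended final analysis.

```r
data(UCBAdmissions)

# UCBAdmissions is a 3-way table: Admit x Gender x Dept.
# Sum over departments to get a 2 x 2 Admit-by-Gender table.
admit_by_gender <- margin.table(UCBAdmissions, margin = c(1, 2))
print(admit_by_gender)

# Chi-squared test of independence between admission and gender
ucb_test <- chisq.test(admit_by_gender)
print(ucb_test)
```

Note that pooling over departments hides department-level differences, which is a large part of why this dataset is so often used as a cautionary example.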