Test for uniform ranking (Likert-type data)

R19 · September 20, 2019, 1:11am

Hi Valeri and everyone,

Firstly, I'm using the output from my whole data set (i.e. "mydata"), produced with the R code solution posted earlier by FJCC.

As explained earlier, the next step for me is:

I would like to test if the mean ranking in Stats1 is uniform i.e. is there actually a difference in the respondents' ranking in Stats1?

The expected mean ranking score (i.e. the ExpectedImptMeanRanking vector) for each of model, location, education, fee, and income was calculated as follows:

Expected Mean Ranking Score = ((264/5)*1 +(264/5)*2 +(264/5)*3 +(264/5)*4 +(264/5)*5))/264 = 2.6

Because the expected ranking is uniform, there is no difference in the expected ranking scores, so the expected mean ranking score is the same for every variable: 2.6 for each of model, location, education, fee, and income (this is the ExpectedImptMeanRanking vector).
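As a sanity check on the arithmetic in the formula above, here is a minimal sketch in R. The variable names (n_respondents, ranks, per_rank) are illustrative only. Evaluated as written, the formula gives 3 rather than 2.6, so the 2.6 figure used in the rest of this thread may be worth rechecking.

```r
# Expected mean rank if each of the 5 ranks is chosen equally often
n_respondents <- 264
ranks <- 1:5

# Under uniformity, each rank is given by 264/5 respondents
per_rank <- n_respondents / length(ranks)

# Weighted sum of ranks divided by the number of respondents
expected_mean <- sum(per_rank * ranks) / n_respondents
expected_mean

# Equivalent shortcut: the plain average of the ranks 1..5
mean(ranks)
```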

H0: Mean rank(model) = Mean rank (location) = Mean rank(education) = Mean rank(fee) = Mean rank(income) = expected uniform mean ranking score = 2.6
i.e. there is no difference in the respondents’ ranks, so the mean ranking is uniform.

H1: At least one of Mean rank(model), Mean rank(location), Mean rank(education), Mean rank(fee), Mean rank(income) differs from the others, i.e. there is a difference in the respondents’ ranks and the mean ranking is not uniform.

In the reprex you can see how I set up and then ran a chi squared test to test this BUT IT DID NOT WORK!!

The output from the chi squared test is as follows (as you will also see towards the end in the reprex):

Error in chisq.test(ImptMeanRanking, ExpectedImptMeanRanking) : 
  'x' and 'y' must have at least 2 levels

library("tidyverse")
library("reprex")

#Calculate means
Columns1 <- mydata %>% select(model:income)
Col1_tall <- Columns1 %>% gather(key = Feature, value = Rank, model:income)
Stats1 <- Col1_tall %>% group_by(Feature) %>% summarize(Avg = mean(Rank))
Stats1

#> # A tibble: 5 x 2
#>   Feature     Avg
#>   <chr>     <dbl>
#> 1 education   2.647727
#> 2 fee         3.481061
#> 3 income      3.037879
#> 4 location    2.852273
#> 5 model       2.981061

#Now I would like to test if the mean ranking in Stats1 is uniform
#i.e. is there a difference in the respondents' ranking in Stats1?
#H0: Mean rank(model) = Mean rank (location) = Mean rank(education) = Mean rank(fee) = Mean rank(income) = expected uniform mean ranking = 2.6
#i.e. there is no difference in the respondents' ranks i.e. the mean ranking is uniform.
#H1: Any of Mean rank(model), Mean rank (location), Mean rank(education), Mean rank(fee), Mean rank(income) is different from each other
#i.e. there is a difference in the respondents' ranks i.e. the mean ranking is not uniform.
#I set and ran a chi squared test to test this as follows BUT IT DID NOT WORK!!

ModelMeanRank <- round(mean(mydata$model), digits = 6)
LocationMeanRank <- round(mean(mydata$location), digits = 6)
EducationMeanRank <- round(mean(mydata$education), digits = 6)
FeeMeanRank <- round(mean(mydata$fee), digits = 6)
IncomeMeanRank <- round(mean(mydata$income), digits = 6)

ModelMeanRank
#> [1] 2.981061
LocationMeanRank
#> [1] 2.852273
EducationMeanRank
#> [1] 2.647727
FeeMeanRank
#> [1] 3.481061
IncomeMeanRank
#> [1] 3.037879
ImptMeanRanking <- c(ModelMeanRank, LocationMeanRank, EducationMeanRank, FeeMeanRank, IncomeMeanRank)

#ImptExpectedMeanRanking for each of model, location, education, fee, and income was calculated
#as follows: ImptExpectedMeanRanking = ((264/5)*1 +(264/5)*2 +(264/5)*3 +(264/5)*4 +(264/5)*5))/264 = 2.6
ExpectedImptMeanRanking <- c(2.6, 2.6, 2.6, 2.6, 2.6)

#I then ran the chi squared test to check whether the observed mean ranking scores (i.e. ImptMeanRanking)
#are statistically different from the expected uniform ranking scores (i.e. ExpectedImptMeanRanking)

chisq.test(ImptMeanRanking, ExpectedImptMeanRanking)
#> Error in chisq.test(ImptMeanRanking, ExpectedImptMeanRanking) : 
#>   'x' and 'y' must have at least 2 levels

I am clearly doing something wrong here and I don't know how to fix it.

Any help or suggestions about the appropriate statistical tool and the correct R code would be gratefully appreciated.

Many thanks.

valeri · September 23, 2019, 5:18pm

Hi @R19,

I think there is a fundamental misunderstanding here about what a test (any test, really) does. A test is based on a so-called test statistic, which is a function of some data (and, importantly for your case, ideally not of a single data point!) and as such is a random variable. This test statistic has some (typically asymptotic) distribution, such as a t distribution or an F distribution, depending on what function of the data the test statistic actually is. For example, in a t-test you would normally ask whether the mean of some sample of data is statistically different from some value, e.g., 0. You then construct the test statistic, which is the sample average divided by an estimate of the standard error of the sample mean. In this case the numerator is asymptotically normal, while the denominator is the square root of an asymptotically \chi^2-distributed quantity (scaled by its degrees of freedom), which makes the ratio (the actual test statistic) follow asymptotically a t-distribution.

Given a particular sample of data, the test statistic is computed on the data and compared to some critical values (which are simply quantiles of that distribution). These depend on the significance level of the test as well as on whether it is a two-sided or a one-sided test.
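To make this construction concrete, here is a minimal sketch that computes a one-sample t statistic by hand and compares it with a two-sided critical value. The names x, mu0, and alpha are illustrative, and the data are simulated, not taken from mydata.

```r
set.seed(123)
x <- rnorm(10)   # simulated sample (assumption: stands in for real observations)
mu0 <- 2.6       # hypothesised mean
alpha <- 0.05    # significance level (assumed for illustration)
n <- length(x)

# Test statistic: (sample mean - hypothesised mean) / estimated standard error
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided critical value: a quantile of the t distribution with n - 1 df
crit <- qt(1 - alpha / 2, df = n - 1)
reject <- abs(t_stat) > crit

# Two-sided p-value from the same distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# The built-in test should agree with the hand computation
builtin <- t.test(x, mu = mu0)
all.equal(unname(builtin$statistic), t_stat)
```

The point of the sketch is that both sd(x) and n enter the statistic, which is why a single summary number such as a mean is not enough to run the test.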

Now, this is all maybe too theoretical, but what I am trying to say is that you cannot test whether something is equal to something else when you only have one data point to test it with. For example, in order to test the model mean rank against 2.6 you should use all of the mydata$model observations, not their mean.
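As for the error message itself, here is a small sketch of where it likely comes from. When chisq.test() is given two vectors, it converts both to factors and cross-tabulates them. A vector of five identical values has only one level, which triggers the error. The names observed_means and expected_means are illustrative stand-ins for the vectors in the reprex.

```r
# Values copied from the reprex output; variable names are illustrative
observed_means <- c(2.981061, 2.852273, 2.647727, 3.481061, 3.037879)
expected_means <- rep(2.6, 5)

# chisq.test(x, y) treats both vectors as categorical factors
nlevels(factor(observed_means))  # five distinct values
nlevels(factor(expected_means))  # a single distinct value

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    chisq.test(observed_means, expected_means)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```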

I hope that helps you further.

R19 · September 25, 2019, 1:57am

Hi Valeri,

Thank you for the explanation; it does help.

I understand your point that, to test the model mean rank against 2.6, I should use all of the mydata$model observations rather than their mean, as I did above.

My problem is that I don't know how to do this in R.

Do you have any suggestions for how I could test the model mean rank against 2.6 using all of the mydata$model observations?

Many thanks.

valeri · September 28, 2019, 4:27pm

Here is one way with some examples:

set.seed(123)

x1 <- rnorm(10) # x1 has a mean of zero so we would expect a test to reject mean of x1 = 2.6
print(t.test(x1-2.6))
#> 
#>  One Sample t-test
#> 
#> data:  x1 - 2.6
#> t = -8.3729, df = 9, p-value = 1.535e-05
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  -3.207670 -1.843078
#> sample estimates:
#> mean of x 
#> -2.525374

x2 <- rnorm(10, mean = 2.6) # x2 has a mean of 2.6 so we would expect a test not to reject mean of x2 = 2.6
print(t.test(x2-2.6))
#> 
#>  One Sample t-test
#> 
#> data:  x2 - 2.6
#> t = 0.63552, df = 9, p-value = 0.5409
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  -0.5339710  0.9512149
#> sample estimates:
#> mean of x 
#>  0.208622
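The same idea can be written with the mu argument of t.test() instead of subtracting 2.6 by hand. The sketch below uses simulated ranks as a hypothetical stand-in for mydata$model, which is not available here, so the numbers it produces say nothing about the real data. Note also that a t-test treats the ranks as numeric, which is an assumption for ordinal data.

```r
set.seed(42)

# Hypothetical stand-in for mydata$model: 264 ranks drawn from 1..5
model_ranks <- sample(1:5, size = 264, replace = TRUE)

# One-sample t-test of H0: mean rank = 2.6, using all observations
res <- t.test(model_ranks, mu = 2.6)

# Equivalent formulation: shift the data and test against 0
res_shifted <- t.test(model_ranks - 2.6)

# Both formulations give the same statistic and p-value
all.equal(unname(res$statistic), unname(res_shifted$statistic))
all.equal(res$p.value, res_shifted$p.value)
```

With the real data, the call would be t.test(mydata$model, mu = 2.6), repeated for each of the other columns.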
