Dear all,
Thank you for all the support and help I have received so far from this community.
I'm working on the following dataset, and I would like to calculate the range of values that covers 95% of the speakers, for each of the columns votvd and votvl.
    speaker      votvd      votvl
1  00-M-f04 0.05381864 0.02563282
2  00-Y-f03 0.05909734 0.02136499
3  00-Y-f02 0.04568184 0.01828234
4  00-M-f01 0.05474888 0.02120949
5  00-M-f06 0.06269178 0.01647195
6  70-Y-f03 0.05463603 0.02231716
7  00-Y-f06 0.05470651 0.01782035
8  70-O-f03 0.05123922 0.01738909
9  00-O-f03 0.04375921 0.01616929
10 70-M-f01 0.04228886 0.01998891
11 00-O-f01 0.04210959 0.01687892
12 00-O-f02 0.04471604 0.02048789
13 70-M-f02 0.03971611 0.01403043
14 70-Y-f02 0.06074638 0.01355691
15 70-O-f04 0.04915699 0.02119257
16 00-O-f05 0.05579494 0.01725491
17 70-Y-f01 0.03735125 0.01577395
18 70-M-f04 0.04616147 0.01901408
19 70-Y-f04 0.04636063 0.01615609
20 00-M-f03 0.05671241 0.02621205
21 70-M-f07 0.05455009 0.01966456
22 70-O-f01 0.05379974 0.02257897
23 00-Y-f01 0.04546661 0.01847809
Here is the data in reproducible form (dput output):
data <- structure(list(speaker = c("00-M-f04", "00-Y-f03", "00-Y-f02", "00-M-f01",
"00-M-f06", "70-Y-f03", "00-Y-f06", "70-O-f03", "00-O-f03", "70-M-f01",
"00-O-f01", "00-O-f02", "70-M-f02", "70-Y-f02", "70-O-f04", "00-O-f05",
"70-Y-f01", "70-M-f04", "70-Y-f04", "00-M-f03", "70-M-f07", "70-O-f01",
"00-Y-f01"), votvd = c(0.0538186361816087, 0.0590973443704265,
0.0456818407451248, 0.0547488762884262, 0.062691784096462, 0.054636032040423,
0.0547065128257382, 0.0512392236172749, 0.0437592077504489, 0.0422888589173195,
0.0421095882310396, 0.0447160447066727, 0.0397161050321998, 0.0607463788135851,
0.04915699000058, 0.055794941335901, 0.0373512463572469, 0.0461614729033426,
0.0463606295363043, 0.0567124147450744, 0.0545500851509402, 0.0537997365006125,
0.0454666136349681), votvl = c(0.0256328208390501, 0.0213649868637071,
0.0182823350591374, 0.0212094920417251, 0.0164719453186502, 0.0223171564809505,
0.0178203531852858, 0.0173890929808758, 0.0161692865783799, 0.0199889141195467,
0.0168789203574063, 0.0204878908105645, 0.0140304290078088, 0.0135569091088139,
0.0211925748569302, 0.0172549136324653, 0.0157739488880231, 0.0190140833820649,
0.0161560917047786, 0.02621204948485, 0.0196645571410369, 0.0225789744983796,
0.0184780850804763)), row.names = c(NA, 23L), class = "data.frame")
More specifically, I want to say something like:
Most speakers (95%) have an overall value between ... and ..., compared to the population mean.
So, if I use something like:
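library(dplyr)      # provides %>%
library(tidyr)      # provides pivot_longer()
library(bayestestR) # provides ci(); assumed from the "ETI" label in the output
d1 <- data %>%
  pivot_longer(!speaker, names_to = "vot", values_to = "value")
sapply(d1, ci, ci = 0.95)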
$value
95% ETI: [14.25, 60.54]
Is this correct? That is, does this mean that 95% of the speakers have an overall value between 14.25 and 60.54? Or does it mean that 95% of the values fall within this range, with no reference to the percentage of speakers involved in the calculation? Am I missing something?
I would like a way to support the first interpretation, please.
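For reference, here is a minimal sketch of what the first interpretation could look like in base R. It is an assumption about the intended analysis, not a definitive method: because each speaker contributes exactly one value per column, the 2.5% and 97.5% quantiles of a column are quantiles over speakers. Pivoting to long format first would instead pool all 46 votvd and votvl values into one interval. The names ranges and inside are introduced here for illustration only, and the values are copied from the printed table above (rounded to 8 decimals).

```r
# Values copied from the printed table above, one row per speaker.
data <- data.frame(
  speaker = c("00-M-f04", "00-Y-f03", "00-Y-f02", "00-M-f01", "00-M-f06",
              "70-Y-f03", "00-Y-f06", "70-O-f03", "00-O-f03", "70-M-f01",
              "00-O-f01", "00-O-f02", "70-M-f02", "70-Y-f02", "70-O-f04",
              "00-O-f05", "70-Y-f01", "70-M-f04", "70-Y-f04", "00-M-f03",
              "70-M-f07", "70-O-f01", "00-Y-f01"),
  votvd = c(0.05381864, 0.05909734, 0.04568184, 0.05474888, 0.06269178,
            0.05463603, 0.05470651, 0.05123922, 0.04375921, 0.04228886,
            0.04210959, 0.04471604, 0.03971611, 0.06074638, 0.04915699,
            0.05579494, 0.03735125, 0.04616147, 0.04636063, 0.05671241,
            0.05455009, 0.05379974, 0.04546661),
  votvl = c(0.02563282, 0.02136499, 0.01828234, 0.02120949, 0.01647195,
            0.02231716, 0.01782035, 0.01738909, 0.01616929, 0.01998891,
            0.01687892, 0.02048789, 0.01403043, 0.01355691, 0.02119257,
            0.01725491, 0.01577395, 0.01901408, 0.01615609, 0.02621205,
            0.01966456, 0.02257897, 0.01847809)
)

# Per-column 2.5% and 97.5% quantiles: the central 95% range across speakers.
# The result is a 2 x 2 matrix with rows "2.5%" and "97.5%".
ranges <- sapply(data[c("votvd", "votvl")], quantile, probs = c(0.025, 0.975))
print(ranges)

# Check: share of speakers whose value falls inside each column's range.
inside <- sapply(c("votvd", "votvl"), function(col) {
  mean(data[[col]] >= ranges[1, col] & data[[col]] <= ranges[2, col])
})
print(inside)
```

With only 23 speakers, the quantile-based range excludes just the lowest and the highest speaker in each column, so it actually covers 21 of 23 speakers (about 91%) rather than exactly 95%.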
Thank you in advance!