Summarize function produces wrong mean

Hi All,

I am trying to create an interaction plot, for which I need the mean of linear growth. The mean I get from the Summarize function is different from the mean I calculated manually in Excel.

Here is a subset of my data.

The mean for the linear growth for yellow at time 2 is 118.7050667.

Using the Summarize function in R:
sum = Summarize( Time_factor+ Colour)
sum$se= sum$sd / sqrt(sum$n)

  Time_factor Colour  n     mean       sd min    Q1 median     Q3 max        se
1      Time_0   Blue 15 78.60000 26.23193  22 65.50   83.0  91.00 123  6.773056
2      Time_1   Blue 15 84.53333 46.72697  13 48.00   94.0 118.00 147 12.064851
3      Time_2   Blue 15 45.26667 50.18461   2 11.50   31.0  40.50 148 12.957611
4      Time_0  Brown 20 76.95000 25.22629  46 55.00   74.5 102.00 120  5.640770
5      Time_1  Brown 20 86.40000 42.91411   3 65.25   99.5 117.50 138  9.595887
6      Time_2  Brown 20 83.70000 51.79524  10 34.50   88.0 131.75 150 11.581769
7      Time_0 Yellow 15 76.86667 24.27187  45 62.50   69.0  85.50 132  6.266971
8      Time_1 Yellow 15 93.73333 45.38324   1 66.00  110.0 126.50 142 11.717901
9      Time_2 Yellow 15 46.60000 49.46543   4 16.00   27.0  42.50 145 12.771919

This is what I got. The means for every colour at every time point differ from the ones I calculated manually in Excel.

Does anyone know why this happens? Or is there a better function to use?

Here is all the code I typed into the console:


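library(FSA)      # provides Summarize()
library(ggplot2)

sum = Summarize(lineargrown ~ color+Time,)
sum$se = sum$sd / sqrt(sum$n)

pd = position_dodge(.2)

# The opening ggplot() call was missing from the post; plotting the summary
# table `sum` is an assumption based on the columns used below.
ggplot(sum,
       aes(x = Time,
           y = mean,
           color = color)) +
  geom_point(shape = 15,
             size = 4,
             position = pd) +
  geom_errorbar(aes(ymin = mean - se,
                    ymax = mean + se),
                width = 0.2,
                size = 0.7,
                position = pd) +
  theme_bw() +
  theme(axis.title = element_text(face = "bold")) +
  ylab("Linear Growth")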

Please provide the relevant data needed to reproduce your issue. If I run the FSA::Summarize() function on the data you posted above, it works as expected:

data <- data.frame(
  time_factor = rep('Time2',15),
  colour = rep('Yellow',15),
  lin_ext_mm = c(121.524, 103.82, 99.026, 114.38, 109.053,
                 103.55, 96.09, 123.45, 110.53, 163.88,
                 136.013, 95.275, 122.13, 126.66, 155.195)
)
sum <- FSA::Summarize(object = data$lin_ext_mm ~ data$time_factor + data$colour)
sum$se <- sum$sd / sqrt(sum$n)

#>   data$time_factor data$colour  n     mean       sd    min      Q1 median
#> 1            Time2      Yellow 15 118.7051 20.45019 95.275 103.685 114.38
#>        Q3    max       se
#> 1 125.055 163.88 5.280216

Created on 2022-08-30 by the reprex package (v2.0.1)

So please provide your data (at least everything you have in R that is associated with data$time_factor == "Time2" and data$color == "Yellow"), just so we can check what is going on. With the values given in the screenshot, however, there seems to be no issue at all.
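In the meantime, a quick check along the following lines may help narrow things down. This is only a sketch: the data frame `dat` and its column names are hypothetical stand-ins for your real data, and it uses base R only, so the result does not depend on FSA.

```r
# Hypothetical stand-in for the real data; all names here are assumptions.
dat <- data.frame(
  time_factor = rep(c("Time_1", "Time_2"), each = 3),
  colour      = rep("Yellow", 6),
  lin_ext_mm  = c(93.1, 101.4, 88.7, 121.524, 103.82, 99.026)
)

# 1. Check how R stores each column. The measurement column should be
#    numeric ("num"), not a factor or a character vector.
str(dat)

# 2. Cross-check the group means with base R, independently of FSA.
group_means <- aggregate(lin_ext_mm ~ time_factor + colour,
                         data = dat, FUN = mean)
print(group_means)
```

If the base R means agree with Excel but the Summarize() output does not, the two calls are probably not being fed the same column.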

Kind regards

Thanks for this.

I found the issue. It happened when I converted the linear extension variable to numeric. I redid that step and it works fine now!
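For anyone who lands here with the same symptom: the thread does not say exactly what went wrong in the conversion, but one common cause (an assumption here, not something confirmed above) is calling as.numeric() directly on a factor. That returns the internal level codes rather than the measured values, which would be consistent with the whole-number min and max values in the summary table. A minimal sketch using four of the values from the reply above:

```r
# Four of the measurements from the reply, stored as a factor. That the
# original column was a factor is an assumption made for illustration.
x <- factor(c("121.524", "103.82", "99.026", "114.38"))

# Direct conversion returns the integer level codes, not the measurements.
wrong <- as.numeric(x)

# Converting to character first recovers the actual values.
right <- as.numeric(as.character(x))

print(wrong)   # small integers (level codes)
print(right)   # the original measurements
print(mean(wrong))
print(mean(right))
```

Running class() or str() on the column before summarising shows whether it is a factor in the first place.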


