Hi all,
I've been trying to write a simple script that calculates z-scores for a dataset using dplyr, along these lines:
zscores <- mydata %>%
mutate_at(c(x:y), function, na.rm = TRUE)
My current function is:
function(x, na.rm = FALSE) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
which uses the mean and SD of all values in a column when calculating the z-score. I'd like to edit this script so that the mean and SD are calculated using only the values that come from group 1, for each respective column.
For example:
grp | trial 1 | trial 2 | trial 3...
1 | 4 | 6 | 3
2 | 3 | 8 | 7
1 | 3 | 5 | 9
3 | 8 | 2 | 2
4 | 7 | 7 | 1
For each column of trials, the script would hopefully calculate a mean and SD from only the values from grp 1 (4 and 3 in trial 1's case). Then, it would use those values to create a z-score for every value in the column.
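To make the target concrete, the calculation described above could be sketched roughly as follows. This is only an illustration under assumptions: the column names grp, trial1, trial2 and trial3 are stand-ins for the example table, and it uses across() (dplyr 1.0.0 or later) in place of mutate_at().

```r
library(dplyr)

# Example data from the table above (column names are assumed, not the real ones)
mydata <- data.frame(
  grp    = c(1, 2, 1, 3, 4),
  trial1 = c(4, 3, 3, 8, 7),
  trial2 = c(6, 8, 5, 2, 7),
  trial3 = c(3, 7, 9, 2, 1)
)

# z-score every value in x against the mean and SD of a reference subset only
z_by_reference <- function(x, is_ref, na.rm = TRUE) {
  ref <- x[is_ref]
  (x - mean(ref, na.rm = na.rm)) / sd(ref, na.rm = na.rm)
}

# grp is looked up in the data frame, so grp == 1 marks the group 1 rows;
# only the trial columns are transformed, grp itself is left unchanged
zscores <- mydata %>%
  mutate(across(starts_with("trial"), ~ z_by_reference(.x, grp == 1)))
```

With the example data, trial1 would be scaled by the mean and SD of 4 and 3 only, which is the behaviour I'm after.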
I've tried editing the function to something like:
(x, na.rm = FALSE) (x - mean(filter(group == "1"), na.rm = TRUE) / sd(filter(group == "1"), na.rm = TRUE))
or
(x, na.rm = FALSE) (x - mean(x[group==1]), na.rm = TRUE) / sd(x(group==1), na.rm = TRUE))
but neither has worked as intended. I feel like this should have an easy solution, but I'm having a lot of trouble figuring it out.
Thanks so much in advance for your help!