I've got a beginner question about applying computations to data frames. I'm having trouble articulating what I want to do, which makes it hard to search for answers, so I was hoping someone here could lend me a hand. Basically, I've got some (example) data that looks like this:
What I want to do is transform it into a new frame with three columns, where I've computed a jitter value:
Jitter Total Class
0.5 0.1 1
0.4 0.1 2
I've created a handy function to compute the jitter for me, which I've defined here:
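meanJitterByClass <- function(d, c) {
  # Keep only the rows of d that belong to class c
  z <- filter(d, Class == c)
  jitter <- c()
  # Index the filtered frame z (not d) so only rows of class c contribute
  for (r in seq_len(nrow(z))) {
    jitter <- c(jitter, z$NWCRT[r] - z$NBCRT[r])
  }
  return(mean(jitter))
}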
Basically, I just call something like meanJitterByClass(filter(df, Total=="0.1"), "1") and obtain the mean for a particular class within the data frame, out of the rows in which I have a particular total. But I would like to get this as a data frame with a mean jitter for each combination of class and total (so I can make a grouped barplot).
I've tried using a custom double for loop to iterate over all utilisations, then over all classes, appending a row to a data frame each time. But it's not quite working, and I feel like there's a much more elegant way to do it in R.
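One common way to express "a mean for each combination of two columns" with dplyr is group_by() followed by summarise(). The following is only a minimal sketch under stated assumptions: the toy data frame and all of its values are invented for illustration (only the column names Total, Class, NWCRT and NBCRT come from the post), and jitter is taken to be NWCRT - NBCRT, as in the function above.

```r
library(dplyr)

# Hypothetical toy data: the values are made up, the column names follow the post
toy <- data.frame(
  Total = c(0.1, 0.1, 0.1, 0.1, 0.2, 0.2),
  Class = c(1, 1, 2, 2, 1, 2),
  NWCRT = c(10, 12, 20, 24, 30, 40),
  NBCRT = c(9, 9, 18, 20, 25, 38)
)

# One row per (Total, Class) pair, holding the mean of NWCRT - NBCRT
jitter_summary <- toy %>%
  group_by(Total, Class) %>%
  summarise(Jitter = mean(NWCRT - NBCRT), .groups = "drop") %>%
  select(Jitter, Total, Class)

print(jitter_summary)
```

Because the grouping is done on the columns themselves, no explicit loop over totals or classes is needed, and the result is already in the three-column Jitter / Total / Class layout.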
To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:
Of course, here is a self-contained example (the data is fetched from a GitHub Gist):
library(dplyr)
library(lattice)
# Function to compute jitter
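meanJitterByClass <- function(d, c) {
  # Keep only the rows of d that belong to class c
  z <- filter(d, Class == c)
  jitter <- c()
  # Index the filtered frame z (not d) so only rows of class c contribute
  for (r in seq_len(nrow(z))) {
    jitter <- c(jitter, z$NWCRT[r] - z$NBCRT[r])
  }
  return(mean(jitter))
}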
# Data presentation mode
# Major groups (columns) are things like your Utilisation, or Chain Length
# Minor groups have a row label, like "Class" being "0", "1", or "2"
# Prior to reshape, we're going to format our data as follows:
# VALUES (i.e. jitter) | GROUP (i.e. Total) | SUBGROUP (i.e. Class)
raw <- read.csv(url("https://gist.githubusercontent.com/Micrified/cd3c00bbf8429e5701f0af31b54ed109/raw/c01e7a0a35dc6bceb7b14d6ed38df8210256a7c9/data.csv"))
raw <- as.data.frame(raw)
colnames(raw) <- c("ID", "Utilisation", "Class", "NWCRT", "NBCRT", "NACRT", "Seed", "Total")
df <- NULL
# For each major group (the group column is called "Total" in the data, and has possible values [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90])
for (i in 1:9) {
  subset <- filter(raw, Total == (i * 0.10))
  # Append one row per minor group (the subgroup column is called "Class" and has possible values [0, 1, 2])
  df <- rbind(df, c(meanJitterByClass(subset, "0"), i, 0))
  df <- rbind(df, c(meanJitterByClass(subset, "1"), i, 1))
  df <- rbind(df, c(meanJitterByClass(subset, "2"), i, 2))
}
# Not elegant? Also, has NA inside?
# We can modify our data for a base barplot using reshape (??)
# base <- reshape()
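A likely source of the NA values is the test Total == (i * 0.10). In double-precision arithmetic, some computed multiples of 0.1 differ in the last bit from the value parsed from the CSV, so the filter can match no rows at all, and the mean over an empty subset comes back as NA. The sketch below shows the effect and a tolerant comparison using dplyr's near(). Whether this is the only cause of the NAs in the data above is an assumption; the totals vector is a stand-in for the Total column.

```r
library(dplyr)

# Stand-in for the Total values as they would be parsed from the CSV
totals <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)

# Exact comparison against computed multiples of 0.10:
# some entries are FALSE because of floating-point rounding
exact_match <- totals == (1:9) * 0.10

# Tolerant comparison: near() allows for a small numerical tolerance
near_match <- near(totals, (1:9) * 0.10)

print(exact_match)
print(near_match)
```

In the loop, filter(raw, near(Total, i * 0.10)) would be one way to apply the tolerant comparison.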