It's harder to answer that question in general than when you know what the data looks like, so I'll cheat, as I've read your other question.
Let's start with 2.
apply() takes a matrix or data frame (or similar), and applies a function to its rows or columns. Here we use apply(..., 2, ...), so the function is applied to each column. For example:
X <- data.frame(x1 = 1:3,
x2 = 4:6)
X
#> x1 x2
#> 1 1 4
#> 2 2 5
#> 3 3 6
apply(X, 2, min)
#> x1 x2
#> 1 4
apply(X, 2, max)
#> x1 x2
#> 3 6
Created on 2023-07-12 with reprex v2.0.2
So apply() is a way to write a loop. In other words, apply(X, 2, max) means "take X, and for each column of X, take the max".
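To make the "loop" reading concrete, here is a minimal sketch of the same computation written as an explicit for loop. The name res is only for illustration, and this is not how apply() is implemented internally, just an equivalent way to get the same values:

```r
X <- data.frame(x1 = 1:3, x2 = 4:6)

# Explicit loop version of apply(X, 2, max)
res <- numeric(ncol(X))
names(res) <- colnames(X)
for (j in seq_len(ncol(X))) {
  res[j] <- max(X[, j])  # the max of column j
}

res
# Compare with the apply() version: same values, same names
all.equal(res, apply(X, 2, max))
```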
Here we have:
apply(sdat[,-1], 2, e.function, seq=sdat[, 1])
That translates to "take sdat[,-1], and for each column of sdat[,-1] call e.function()". But, as we'll see in a second, e.function() requires two parameters, x and seq. So x will be each column of sdat[,-1], but we also need to provide seq. We can give it as the 4th argument, seq = sdat[,1], because any extra arguments given to apply() are passed on to the function. Here that is the first column of sdat, which is Sequence.
So, for each column of sdat except the first, this passes that column as x and the first column as seq, and calls e.function().
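Passing an extra argument through apply() can be tried on the small X from earlier. This is only a sketch: weighted_sum() and its w parameter are made up for the illustration and have nothing to do with your data.

```r
X <- data.frame(x1 = 1:3, x2 = 4:6)

# Hypothetical helper: needs a second argument `w` besides the column `x`
weighted_sum <- function(x, w) sum(x * w)

# Each column of X becomes `x`; `w` is forwarded unchanged on every call
out <- apply(X, 2, weighted_sum, w = c(1, 0, 2))
out
```

The same w is reused for every column, just as the same seq is reused for every column of sdat[,-1].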
Now let's go to 1. and the definition of e.function(). I should say that tapply() can be used in many ways, and can be very confusing. Here, we have the simple case where both of its inputs are vectors (each a single column of sdat).
tapply() takes an argument X, a data vector, and INDEX, a grouping factor. It uses the grouping factor to "split" the data, and applies a function to each of the groups:
x <- 1:7
fac <- list(c("a","a","a","a","b","b","b"))
tapply(x, fac, min)
#> a b
#> 1 5
tapply(x, fac, max)
#> a b
#> 4 7
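If the "split, then apply" description feels abstract, the two steps can be written out by hand with split() and sapply(). This is a sketch of an equivalent computation for this simple one-factor case, not a description of tapply() internals; groups and by_hand are illustrative names:

```r
x <- 1:7
fac <- c("a", "a", "a", "a", "b", "b", "b")

# Step 1: split the data into one piece per group
groups <- split(x, fac)

# Step 2: apply the function to each piece
by_hand <- sapply(groups, max)

by_hand
# Compare with the one-step version
tapply(x, fac, max)
```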
Finally, let's put it back together:
e.function <- function(x, seq) tapply(x, seq, median)
temp <- apply(sdat[,-1], 2, e.function, seq=sdat[, 1])
What this does is take sdat and separate the first column, which holds the protein sequences, from the other columns, which contain the data. Then, for each data column, it takes the median by peptide, that is, the median of all rows that share the same sequence.
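Since I don't have your real data here, this is a toy run with a made-up sdat. The sequences "AAK" and "LLR", the column names s1 and s2, and all the values are invented for the illustration:

```r
# Made-up data: sequences, column names and values are illustrative only
sdat <- data.frame(
  Sequence = c("AAK", "AAK", "AAK", "LLR", "LLR"),
  s1 = c(1, 2, 6, 10, 20),
  s2 = c(3, 5, 4, 7, 9)
)

e.function <- function(x, seq) tapply(x, seq, median)

# One call to e.function() per data column, always with the same grouping
temp <- apply(sdat[, -1], 2, e.function, seq = sdat[, 1])
temp
```

With this input, temp should be a matrix with one row per sequence and one column per data column, for example a median of 2 for "AAK" in s1 and 15 for "LLR" in s1.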