Help debugging my R code for gene expression tercile analysis

Hi everyone,

I'm a master's student in biochemistry and currently struggling with an R exercise.
I'm a complete beginner in R and just need an experienced eye to tell me what I'm doing wrong (my teacher won’t give any feedback).

Here’s the question:

Load the gene expression dataset "dataset155.Rdata" and answer the following.

  • A gene is considered "active" if its expression level is strictly above its theoretical upper tercile.
  • A gene is considered "inactive" if its expression level is strictly below its theoretical lower tercile.
  1. Determine how many genes have a standard deviation ≤ 0.25.
  2. Among those genes, determine how many are active in at least 58 experiments.
  3. Among the genes identified in question 2, determine how many are inactive in fewer than 44 experiments.

And here’s my code:

Q1

sum(apply(dataset, 2, sd) <= 0.25)

Q2

donnéesok <- which(apply(dataset, 2, sd) <= 0.25)
donnéesok <- dataset[, donnéesok]
moyemp <- apply(donnéesok, 2, mean)
sdemp <- apply(donnéesok, 2, sd)
terciletheosup <- qnorm(2/3, mean = moyemp, sd = sdemp)
comparaison <- sweep(donnéesok, 2, terciletheosup, ">")
sum(apply(comparaison, 2, sum) >= 58)

Q3

indice <- apply(comparaison, 2, sum) >= 58
donnéesok <- donnéesok[, indice]
moyemp <- apply(donnéesok, 2, mean)
sdemp <- apply(donnéesok, 2, sd)
terciletheoinf <- qnorm(1/3, mean = moyemp, sd = sdemp)
comparaison <- sweep(donnéesok, 2, terciletheoinf, "<")
sum(apply(comparaison, 2, sum) < 44)

At least one of my answers is wrong, but I don’t know which step causes the divergence.
Thanks for any help!

Hi, welcome to the forum.

I know nothing about the subject, but I think that to help we need some sample data and an explanation of where or what the error is. If the dataset is not generally available, a handy way to supply data is the dput() function: run dput(mydata), where "mydata" is the name of your dataset. For really large datasets, dput(head(mydata, 1000)) is probably enough. Paste the output here between

```

```
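
For example, here is a minimal sketch of what that looks like. The matrix is made up for illustration (it is not the real dataset), and the gene names are placeholders:

```r
# Hypothetical stand-in for the real data: 5 experiments (rows) x 3 genes (columns)
set.seed(1)
mydata <- matrix(round(rnorm(15), 2), nrow = 5,
                 dimnames = list(NULL, c("geneA", "geneB", "geneC")))

# dput() prints R code that rebuilds the object; this is the text to paste in a post
dput(head(mydata, 3))

# Round trip: capture the dput() text, parse it back, and compare with the original
txt <- capture.output(dput(mydata))
rebuilt <- eval(parse(text = txt))
identical(rebuilt, mydata)
```

Whoever reads the post can assign the pasted output to a name and work with the same object.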

Are you getting an actual error message or something just does not look right?

Can you copy all your code and paste it here between

```

```

This gives us formatted code that we can copy, paste and run.

Thanks.

Hi, thanks for your answer! It turns out the issue was an error on the teacher's side; the code was right... Always doubt the education system.

Glad to hear it. Good luck with the rest of the course.

Can you send me your code and some sample data anyway?

I think your code is too verbose and I might be able to make it a bit simpler.
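
Something along these lines, as a rough sketch only. The data here is simulated because I don't have the real file, and I'm assuming genes are in columns and experiments in rows, as your apply(..., 2, ...) calls suggest. The thresholds 0.25, 58 and 44 come from the exercise; every variable name is my own. The main saving is that the per-gene mean and sd do not change when you drop other genes, so they only need computing once:

```r
# Simulated stand-in for dataset155.Rdata (hypothetical: 100 experiments x 50 genes)
set.seed(42)
n_exp <- 100
n_genes <- 50
gene_sd <- runif(n_genes, 0.1, 0.5)
dataset <- sapply(gene_sd, function(s) rnorm(n_exp, mean = 5, sd = s))

# Q1: genes (columns) with standard deviation <= 0.25
sds  <- apply(dataset, 2, sd)
keep <- sds <= 0.25
q1   <- sum(keep)

# Theoretical terciles of a normal distribution fitted to each kept gene
sub   <- dataset[, keep, drop = FALSE]
m     <- colMeans(sub)
s     <- sds[keep]
upper <- qnorm(2/3, mean = m, sd = s)
lower <- qnorm(1/3, mean = m, sd = s)

# Q2: active (strictly above the upper tercile) in at least 58 experiments
n_active <- colSums(sweep(sub, 2, upper, ">"))
q2_genes <- n_active >= 58
q2 <- sum(q2_genes)

# Q3: of those, inactive (strictly below the lower tercile) in fewer than 44
n_inactive <- colSums(sweep(sub, 2, lower, "<"))
q3 <- sum(q2_genes & n_inactive < 44)

c(Q1 = q1, Q2 = q2, Q3 = q3)
```

On random normal data like this, Q2 will usually be very small, since only about a third of the values are expected above the upper tercile; the real dataset will behave differently.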

Thanks.

Unfortunately the exercise is closed, I don't have access to the data anymore.

Oh dear. Thanks anyway.