This is simple enough not to absolutely require a cut-and-paste reprex (see the FAQ), but it's a good idea to cut friction as much as possible.
This is a concise way to do it, with little syntax to master. I'll unpack it below.
# fake data created by random sampling
# without a seed, so the values will
# differ each time the matrix is
# created with this snippet
m <- matrix(
  c(1:50,
    sample(20:100, 50, replace = TRUE),
    sample(20:100, 50, replace = TRUE),
    sample(20:100, 50, replace = TRUE)),
  nrow = 50,
  ncol = 4
)
colnames(m) <- c("plate","m1","m2","m3")
head(m)
#> plate m1 m2 m3
#> [1,] 1 32 72 51
#> [2,] 2 47 60 89
#> [3,] 3 53 84 23
#> [4,] 4 94 98 78
#> [5,] 5 50 72 100
#> [6,] 6 98 61 47
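# replace any value below 60 with NA, keep everything else
mark_na <- function(x) ifelse(x < 60, NA, x)
# apply mark_na to columns 2 through 4, one column at a time
m[, 2:4] <- apply(m[, 2:4], 2, mark_na)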
head(m)
#> plate m1 m2 m3
#> [1,] 1 NA 72 NA
#> [2,] 2 NA 60 89
#> [3,] 3 NA 84 NA
#> [4,] 4 94 98 78
#> [5,] 5 NA 72 100
#> [6,] 6 98 61 NA
Created on 2023-04-05 with reprex v2.0.2
My paradigm for using `R` is school algebra: f(x) = y. Here x is an object that needs some transformation, y is the object containing the result of the transformation, and f is the function object that does the transformation. Each of these may be, and usually is, composite.
The object chosen for x has a big influence on f. I've used a `matrix` because all the contents to be subjected to f are numeric. A `matrix` must hold a single type, for example all character or all numeric. Internally it is one atomic vector with dimensions, so both columns and rows come out as vectors. A data frame, which is where incoming data usually lands, can mix character and numeric types. However, it is internally a list of columns, and a row extracted from it is still a list. This is an important difference because a matrix can be treated as a single object and transformed more simply.
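To make the matrix versus data frame difference concrete, here is a small sketch. The objects `mat` and `dat` are made up for illustration and are not part of the reprex above.

```r
# A matrix is one atomic vector with a dim attribute.
mat <- matrix(1:6, nrow = 2)
is.atomic(mat)        # TRUE
is.list(mat)          # FALSE
is.numeric(mat[1, ])  # TRUE: a row is a plain vector

# A data frame is a list of columns, so types can differ by column.
dat <- data.frame(id = 1:2, label = c("a", "b"))
is.list(dat)          # TRUE
is.list(dat[1, ])     # TRUE: a row is still a one-row data frame
is.list(dat[, 1])     # FALSE: one column taken with the comma form is a vector
```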
The function `mark_na` isolates the logical condition to be tested, whether a value is less than 60, because those are the values to be replaced with `NA`.
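A quick way to check the condition is to run the function on a short vector that straddles the cutoff. The input values here are chosen only for illustration.

```r
mark_na <- function(x) ifelse(x < 60, NA, x)

# 59 is below the cutoff, 60 and 61 are not
mark_na(c(59, 60, 61))
#> [1] NA 60 61
```

Because the test is `x < 60`, a value of exactly 60 is kept, which matches the second row of the output above.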
The `matrix` object, `m`, has rows and columns. Here `m[,2:4]` means all rows of `m` (because the row position in the brackets is empty) and columns 2:4 (if we wanted only the second and fourth columns, it would be `m[,c(2,4)]`). Think row/column, row/column. With a data frame `d`, columns alone can be selected with the shorthand `d[2:3]`, but on a matrix `m[2:3]` returns the second and third elements rather than columns, so the comma is required here. When we want to change only some rows, it would be `m[1:7,2:3]`. I find it helpful to always have the comma: one less thing to keep track of.
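The subsetting forms can be tried on a small stand-in matrix. This sketch deliberately overwrites `m` with a 3 x 4 toy version (an assumption for illustration) so the results are easy to read.

```r
# Small stand-in for m: 3 rows, 4 columns, filled column by column.
m <- matrix(1:12, nrow = 3,
            dimnames = list(NULL, c("plate", "m1", "m2", "m3")))

m[, 2:4]      # all rows, columns 2 through 4
m[, c(2, 4)]  # all rows, columns 2 and 4 only
m[1:2, 2:3]   # rows 1 and 2, columns 2 and 3
m[2:3]        # no comma: elements 2 and 3 of the underlying vector
#> [1] 2 3
```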
At this point, we know what we are changing (every value less than 60 in `m`, except in the first column, `plate`, becomes `NA`) and we know how. Now we do that in a single pass by applying our function to the target columns, column by column. That's what `apply` does: the `2` in the call selects columns. (A `1` would go row-wise, but `apply` then returns the result transposed.)
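Here is a minimal sketch of the column-wise and row-wise calls side by side, using a made-up 3-row matrix rather than the random data above. The names `by_col` and `by_row` are mine.

```r
mark_na <- function(x) ifelse(x < 60, NA, x)
m <- cbind(plate = 1:3, m1 = c(59, 60, 61), m2 = c(100, 20, 75))

# MARGIN = 2: mark_na() receives one column at a time.
by_col <- apply(m[, 2:3], 2, mark_na)

# MARGIN = 1: mark_na() receives one row at a time; the result comes back
# with rows and columns swapped, so t() restores the original orientation.
by_row <- t(apply(m[, 2:3], 1, mark_na))

# Write the marked values back, leaving the plate column untouched.
m[, 2:3] <- by_col
m
```

Both routes should give the same values here; the column-wise form just avoids the extra `t()`.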
As far as variable dimensions, `dim()` works like the subset operator, row/column. For `m` it is
> dim(m)
[1] 50 4
The script will work for any number of rows. Some wand waving is required for a variable number of columns. Come back with a reprex if you need help with that case.
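In the meantime, one possible sketch for the variable-column case, assuming `plate` is always the first column and every other column is a measurement. The helper name `mark_measurements` and the toy matrices are hypothetical, and your real data may need something different.

```r
mark_na <- function(x) ifelse(x < 60, NA, x)

# Hypothetical helper: treats every column except the first as a measurement.
mark_measurements <- function(m) {
  cols <- seq_len(ncol(m))[-1]  # 2, 3, ..., ncol(m)
  # drop = FALSE keeps a matrix even when only one measurement column exists
  m[, cols] <- apply(m[, cols, drop = FALSE], 2, mark_na)
  m
}

narrow <- cbind(plate = 1:3, m1 = c(59, 60, 61))
wide   <- cbind(plate = 1:3, m1 = c(59, 60, 61),
                m2 = c(100, 20, 75), m3 = c(10, 99, 60))

mark_measurements(narrow)
mark_measurements(wide)
```

The column positions are computed from `ncol(m)` instead of being typed in as `2:4`, so the same call covers both matrices.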