Counting values from a table based on two columns

np13 · October 6, 2020, 10:39pm

Hi all,

I'm new to RStudio, so this may be a dumb question.

I have imported a table and now I want to count the number of times that two columns have a certain value.

For example, the first column has values ranging from 1 to 4 and the second column has values ranging from 1 to 3. I want to count the number of rows where the first column has a 1 and the second column has a 2.

I imagine the answer to this is quite trivial so again sorry if this is a dumb question.

technocrat · October 6, 2020, 11:49pm

There are other ways to do this, but here is one of the simplest:

dat <- data.frame(col1 = sample(1:3, 100, replace = TRUE), col2 = sample(1:4, 100, replace = TRUE))

find_length <- function(x) {
  nrow(x[which(x[,1] == 1 & x[,2] == 2),])
}

find_length(dat)
#> [1] 10

Created on 2020-10-06 by the reprex package (v0.3.0.9001)
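If the target values change, the same idea can be parameterised. This is only a sketch: `count_matches`, `v1`, `v2` and the `toy` data frame are hypothetical names made up for illustration, and the data is fixed (not random) so the count can be checked by hand.

```r
# Hypothetical helper (not part of the reprex above): count the rows where
# the first column equals v1 and the second column equals v2.
count_matches <- function(x, v1, v2) {
  nrow(x[which(x[, 1] == v1 & x[, 2] == v2), ])
}

# Small fixed data frame: first column in 1..4, second column in 1..3
toy <- data.frame(col1 = c(1, 4, 1, 3, 1), col2 = c(2, 2, 2, 1, 3))

count_matches(toy, 1, 2)
#> [1] 2
```

Rows 1 and 3 of `toy` are the only ones with a 1 in the first column and a 2 in the second, hence the 2.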

This illustrates an underlying approach that works well in approaching a problem in R: f(x) = y.

f, x and y are three objects with different properties.

x is the object in hand, dat in the reprex. Most objects can be composite, and x has two interior objects, the vectors in the two columns, call them x_i and x_j.

y is the object of desire; in this case it is the number of rows of x that meet the condition that col1 has a value of 1 and col2 a value of 2.

f is the function that transforms x to y. Like most other objects, it can be a composite of other functions.

The function find_length is a composite function. Read it from the inside out.

& is the logical operator and; an operator is a special kind of function that doesn't require the usual parentheses.

== is the comparison operator equals; it tests two values for equality and returns a logical value.

Putting those two together the subexpression

a == b & c == d

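evaluates to TRUE or FALSE, depending on the values. When the operands are vectors, such as the columns of a data frame, the comparison is made element by element and the result is a logical vector with one TRUE or FALSE per row.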
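A minimal sketch of how that subexpression behaves on whole vectors. The vectors `a` and `b` are toy values invented for illustration, not data from the reprex.

```r
# Toy vectors (assumed values, chosen so the result is easy to check)
a <- c(1, 1, 2, 1)
b <- c(2, 3, 2, 2)

# == compares element by element; & combines the two logical vectors
both <- a == 1 & b == 2
both
#> [1]  TRUE FALSE FALSE  TRUE
```

Only positions 1 and 4 satisfy both tests, so only those are TRUE.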

The bracket operator [ ] subsets a vector by position and a data frame by row, column. So, for a vector

v <- c(1,2)
v[2]

evaluates to 2

For a data frame

dat[1,1]
[1] 3

Leaving the row index empty selects all rows, here of the first column

dat[,1]
  [1] 3 1 2 2 2 2 3 1 3 1 3 2 1 1 2 2 1 2 2 3 1 1 3 3 2 2 1 1 2 1 1 1 1 2
 [35] 1 1 2 1 1 3 2 1 3 1 2 2 2 2 1 3 3 1 3 2 1 3 3 2 2 3 1 1 3 1 1 2 3 2
 [69] 1 2 1 2 2 2 2 2 3 1 3 2 1 1 2 1 2 2 1 3 3 2 2 1 3 3 3 2 1 1 2 2
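Because dat is random, here is the same bracket subsetting on a tiny fixed data frame where the results can be verified by eye. The name `toy` and its values are assumptions for illustration only.

```r
# Tiny fixed data frame (hypothetical values)
toy <- data.frame(col1 = c(1, 2, 1), col2 = c(2, 2, 3))

toy[2, 1]    # row 2, column 1
#> [1] 2

toy[, 2]     # all rows, column 2
#> [1] 2 2 3

# A logical vector in the row position keeps the rows marked TRUE
kept <- toy[c(TRUE, FALSE, TRUE), ]
nrow(kept)
#> [1] 2
```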

which takes the logical vector produced by the test and returns the integer positions of the elements that are TRUE, here the numbers of the rows that pass.

Subsetting x by those positions extracts the rows meeting the condition, and nrow counts them.
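Putting the pieces together step by step, again on a small made-up data frame (`toy` and `hits` are hypothetical names). The last line shows one of the shorter alternatives: since TRUE counts as 1 and FALSE as 0, sum on the logical vector gives the same count.

```r
# Small fixed data frame (assumed values for illustration)
toy <- data.frame(col1 = c(1, 2, 1, 1), col2 = c(2, 2, 3, 2))

# Step 1: the logical test, one TRUE/FALSE per row
hits <- toy$col1 == 1 & toy$col2 == 2

# Step 2: which() turns the logical vector into row positions
which(hits)
#> [1] 1 4

# Step 3: subset by those positions and count the rows
nrow(toy[which(hits), ])
#> [1] 2

# Shorter alternative: count the TRUE values directly
sum(hits)
#> [1] 2
```

One difference to keep in mind: which drops NA values silently, whereas sum(hits) returns NA if either column contains missing values, unless na.rm = TRUE is supplied.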
