Counting values from a table based on two columns

Hi all,

I'm new to R Studio so this may be a dumb question.

I have imported a table and now I want to count the number of times that two columns have a certain value.

For example, the first column has values ranging from 1 to 4 and the second column has values ranging from 1 to 3. I want to count the number of rows where the first column has a 1 and the second column has a 2.

I imagine the answer to this is quite trivial so again sorry if this is a dumb question.

There are other ways to do this, but here is one of the simplest:

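# Simulated data: 100 rows; no seed is set, so the count below varies between runs
dat <- data.frame(col1 = sample(1:3, 100, replace = TRUE), col2 = sample(1:4, 100, replace = TRUE))

# Count the rows where column 1 equals 1 and column 2 equals 2
find_length <- function(x) {
  nrow(x[which(x[, 1] == 1 & x[, 2] == 2), ])
}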

find_length(dat)
#> [1] 10

Created on 2020-10-06 by the reprex package (v0.3.0.9001)
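
For comparison, here is a rough sketch of two shorter base R ways to get the same kind of count. The data frame toy and its values are made up for illustration (they are not from the post above) and are small enough to check by eye.

```r
# Hypothetical toy data, chosen so the answer is easy to verify by hand
toy <- data.frame(col1 = c(1, 1, 2, 1, 3, 1),
                  col2 = c(2, 3, 2, 2, 1, 2))

# TRUE counts as 1 and FALSE as 0, so sum() counts the matching rows
sum(toy$col1 == 1 & toy$col2 == 2)
#> [1] 3

# table() cross-tabulates every combination of the two columns;
# indexing by label pulls out a single cell
counts <- table(toy$col1, toy$col2)
counts["1", "2"]
#> [1] 3
```

The table() version is handy when you want the counts for all combinations at once rather than a single pair.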

This illustrates a general way of approaching a problem in R that works well: f(x) = y.

f, x, and y are three objects with different properties.

x is the object in hand, dat in the reprex. Most objects can be composite, and x has two interior objects, the vectors in its two columns; call them x_i and x_j.

y is the object of desire. In this case it is the number of rows of x that meet the condition that col1 has a value of 1 and col2 a value of 2.

f is the function that transforms x to y. Like most other objects, it can be a composite of other functions.

The function find_length is a composite function. Read it from the inside out.

& is the logical operator and; an operator is a special kind of function that doesn't require the usual parentheses.

== is the comparison operator equals; it tests whether two values are equal.

Putting those two together, the subexpression

a == b & c == d

evaluates to TRUE or FALSE for each pair of elements, depending on whether both comparisons hold.
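
A small sketch of that subexpression on two made-up vectors (a and b are illustrative names, not from the reprex), showing the result at each position:

```r
# Hypothetical vectors standing in for two data frame columns
a <- c(1, 1, 2, 1)
b <- c(2, 3, 2, 2)

# Each comparison is made element by element
a == 1
#> [1]  TRUE  TRUE FALSE  TRUE
b == 2
#> [1]  TRUE FALSE  TRUE  TRUE

# & combines the two results position by position
a == 1 & b == 2
#> [1]  TRUE FALSE FALSE  TRUE
```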

The bracket operators [ and ] subset an object: a vector by position, a data frame by row, column. So, for a vector

v <- c(1,2)
v[2]

evaluates to 2

For a data frame

dat[1,1]
[1] 3

All rows of the first column

dat[,1]
  [1] 3 1 2 2 2 2 3 1 3 1 3 2 1 1 2 2 1 2 2 3 1 1 3 3 2 2 1 1 2 1 1 1 1 2
 [35] 1 1 2 1 1 3 2 1 3 1 2 2 2 2 1 3 3 1 3 2 1 3 3 2 2 3 1 1 3 1 1 2 3 2
 [69] 1 2 1 2 2 2 2 2 3 1 3 2 1 1 2 1 2 2 1 3 3 2 2 1 3 3 3 2 1 1 2 2

which takes the result of a logical test and returns the positions of the elements that are TRUE.
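
A quick sketch of what which gives back, using made-up logical vectors (keep, flags, and v are illustrative names only). It also shows one reason to wrap a test in which: a missing value in the test is dropped instead of producing an NA element.

```r
# which() returns the positions of the TRUE elements
keep <- c(TRUE, FALSE, FALSE, TRUE)
which(keep)
#> [1] 1 4

# An NA in the test is skipped by which()
flags <- c(TRUE, NA, FALSE)
which(flags)
#> [1] 1

# Subsetting directly with the logical vector keeps an NA element;
# subsetting with which() does not
v <- c(10, 20, 30)
v[flags]
#> [1] 10 NA
v[which(flags)]
#> [1] 10
```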

Subsetting x by the value that which returns extracts the rows meeting the condition, and nrow counts them.
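
To see the inside-out reading in action, here is a sketch that runs the body of find_length one step at a time. The data frame small and the intermediate names hits, rows, and matched are made up for illustration; the reprex does everything in a single expression.

```r
# Hypothetical four-row data frame, small enough to follow by eye
small <- data.frame(col1 = c(1, 1, 2, 1),
                    col2 = c(2, 3, 2, 2))

# Step 1: the logical test, one TRUE/FALSE per row
hits <- small[, 1] == 1 & small[, 2] == 2
hits
#> [1]  TRUE FALSE FALSE  TRUE

# Step 2: which() turns the test into row numbers
rows <- which(hits)
rows
#> [1] 1 4

# Step 3: keep those rows and all columns
matched <- small[rows, ]
matched
#>   col1 col2
#> 1    1    2
#> 4    1    2

# Step 4: count the rows that are left
nrow(matched)
#> [1] 2
```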
