Mutate Column using Column Inputs and Functions

I am trying to create a new column from a calculation that takes its inputs from other columns in the same row. The calculation uses a function from another package (pwr.t.test() from pwr). It errors out, apparently because of NAs, and the function does not accept an na.rm = TRUE argument. The code below first shows the function working with explicitly provided input values (sample_size <- pwr.t.test(...)). When I move the same call into mutate() in the next attempt, it fails.

library(pwr)
library(tidyverse)
mast_reff <- data.frame(
  stringsAsFactors = FALSE,
                          ind = c("PctOverheadCover","PctBankOverheadCover",
                                  "VegComplexity","VegComplexityWoody",
                                  "VegComplexityUnderstoryGround","PctNoxiousWoodySpecies"),
                     mean_val = c(37.7796407185629,72.4149700598802,1.46291666666667,
                                  0.86375,0.80577380952381,0),
                       sd_val = c(32.5625783793838,24.7446784561184,0.410474173251117,
                                  0.352326208910194,0.33514050503087,0)
           )
## These variables are needed for the sample_size code below and ensure the calculations work
mean <- 38
std_dev <- 32
delta <- 0.20 
alpha <- 0.10
power <- 0.8


sample_size <- pwr.t.test(d = (delta * mean / std_dev), 
                          power = power, 
                          sig.level = alpha, 
                          type = "two.sample", 
                          alternative = "two.sided")$n

print(sample_size)
#> [1] 219.8835

## Replace the "mean" and "std_dev" inputs with values from the mast_reff data.frame ("mean_val" and "sd_val")
## and populate a new column "nSize" with the result
mast_reff |> 
  mutate(nSize = pwr.t.test(d = (delta*mean_val / sd_val), 
                            power = power, 
                            sig.level = alpha, 
                            type = "two.sample", 
                            alternative = "two.sided")$n, .by = ind
  )
#> Error in `mutate()`:
#> ℹ In argument: `nSize = `$`(...)`.
#> ℹ In group 6: `ind = "PctNoxiousWoodySpecies"`.
#> Caused by error in `uniroot()`:
#> ! f.lower = f(lower) is NA

Created on 2024-03-05 with reprex v2.1.0

Any feedback is greatly appreciated!

Rather than trying to "group" each row separately with .by, you can call rowwise() before mutate(), as in the following.

mast_reff[1:5, ] |> rowwise() |>
  mutate(nSize = pwr.t.test(d = (delta*mean_val / sd_val), 
                            power = power, 
                            sig.level = alpha, 
                            type = "two.sample", 
                            alternative = "two.sided")$n
  )

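I added rowwise() and took out the .by argument. Also note that I limited it to the first five rows of your data frame. In the last row both mean_val and sd_val are 0, so d = delta * 0 / 0 evaluates to NaN, and that appears to be what makes the sample size calculation fail inside uniroot().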
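If you would rather keep all six rows and get NA for the ones that cannot be computed, one option is to wrap the call in a small guard. This is only a sketch: safe_n() is a hypothetical helper name (it is not part of pwr), and the data frame is cut down to two rows from the question to keep it short.

```r
library(pwr)
library(dplyr)

delta <- 0.20
alpha <- 0.10
power <- 0.8

# Two rows copied from the question: one well-behaved, one with mean and sd of 0
mast_reff <- data.frame(
  ind      = c("PctOverheadCover", "PctNoxiousWoodySpecies"),
  mean_val = c(37.7796407185629, 0),
  sd_val   = c(32.5625783793838, 0)
)

# Hypothetical helper: return NA instead of erroring when the effect size
# is NaN, infinite, or exactly zero (no finite sample size exists for d = 0)
safe_n <- function(d) {
  if (!is.finite(d) || d == 0) return(NA_real_)
  pwr.t.test(d = d,
             power = power,
             sig.level = alpha,
             type = "two.sample",
             alternative = "two.sided")$n
}

res <- mast_reff |>
  rowwise() |>
  mutate(d = delta * mean_val / sd_val,  # 0 / 0 gives NaN for the second row
         nSize = safe_n(d)) |>
  ungroup()

print(res)
```

Keeping the intermediate d column makes it easy to see which rows produce an unusable effect size before pwr.t.test() is ever called.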


Well, I feel silly for posting that now. I finally realized that the NaN and 0 values were going to throw errors in the calculation. I removed the NAs with na.omit(), then removed any row with a value of 0, and was able to get it to work!
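For anyone who wants to do that filtering in code rather than by editing the data frame by hand, a minimal sketch could look like this. It assumes the same column names as above (mean_val, sd_val) and uses a three-row subset of the original data purely for illustration.

```r
library(dplyr)

# Three rows copied from the original data, including the all-zero indicator
mast_reff <- data.frame(
  ind      = c("PctOverheadCover", "VegComplexity", "PctNoxiousWoodySpecies"),
  mean_val = c(37.7796407185629, 1.46291666666667, 0),
  sd_val   = c(32.5625783793838, 0.410474173251117, 0)
)

mast_reff_f <- mast_reff |>
  na.omit() |>                        # drops rows containing NA or NaN
  filter(mean_val != 0, sd_val > 0)   # drops rows where d would be 0 or undefined

print(mast_reff_f)
```

Note that na.omit() on its own does not remove the zero row, since 0 is a valid number; the filter() step is what drops it.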

This example represents the "global" data, but within the global data I have groups, so next I will move on to producing the same results for subsets (groups) of the global data.

library(pwr)
library(tidyverse)
mast_reff <- data.frame(
  stringsAsFactors = FALSE,
                          ind = c("PctOverheadCover","PctBankOverheadCover",
                                  "VegComplexity","VegComplexityWoody",
                                  "VegComplexityUnderstoryGround"),
                     mean_val = c(37.7796407185629,72.4149700598802,1.46291666666667,
                                  0.86375,0.80577380952381),
                       sd_val = c(32.5625783793838,24.7446784561184,0.410474173251117,
                                  0.352326208910194,0.33514050503087)
           )
mast_reff_f <- na.omit(mast_reff)
## These variables are needed for the sample_size code below and ensure the calculations work
mean <- 38
std_dev <- 32
delta <- 0.20 
alpha <- 0.10
power <- 0.8


sample_size <- pwr.t.test(d = (delta * mean / std_dev), 
                          power = power, 
                          sig.level = alpha, 
                          type = "two.sample", 
                          alternative = "two.sided")$n

print(sample_size)
#> [1] 219.8835

## Replace the "mean" and "std_dev" inputs with values from the mast_reff data.frame ("mean_val" and "sd_val")
## and populate a new column "nSize" with the result
mast_reff_size <- mast_reff_f |> 
  mutate(nSize = pwr.t.test(d = (delta*mean_val / sd_val), 
                            power = power, 
                            sig.level = alpha, 
                            type = "two.sample", 
                            alternative = "two.sided")$n, .by = ind
  )
head(mast_reff_size)
#>                             ind   mean_val     sd_val     nSize
#> 1              PctOverheadCover 37.7796407 32.5625784 230.31411
#> 2          PctBankOverheadCover 72.4149701 24.7446785  36.78848
#> 3                 VegComplexity  1.4629167  0.4104742  25.04077
#> 4            VegComplexityWoody  0.8637500  0.3523262  52.12132
#> 5 VegComplexityUnderstoryGround  0.8057738  0.3351405  54.16336

Created on 2024-03-05 with reprex v2.1.0
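For the grouped follow-up, the same pattern should carry over by adding the grouping column to .by. The sketch below is an assumption about how that data might be laid out: the region column and every value in grouped_reff are made up for illustration and do not come from the real data set.

```r
library(pwr)
library(dplyr)

delta <- 0.20
alpha <- 0.10
power <- 0.8

# Made-up summary table: `region` and all numbers are illustrative only
grouped_reff <- data.frame(
  region   = c("A", "A", "B", "B"),
  ind      = c("VegComplexity", "VegComplexityWoody",
               "VegComplexity", "VegComplexityWoody"),
  mean_val = c(1.5, 0.9, 1.2, 0.8),
  sd_val   = c(0.4, 0.35, 0.5, 0.3)
)

grouped_size <- grouped_reff |>
  filter(mean_val != 0, sd_val > 0) |>   # same guard as for the global data
  mutate(nSize = pwr.t.test(d = delta * mean_val / sd_val,
                            power = power,
                            sig.level = alpha,
                            type = "two.sample",
                            alternative = "two.sided")$n,
         .by = c(region, ind))           # one row per region/indicator pair

print(grouped_size)
```

This only works if each region and indicator combination is a single row, because pwr.t.test() expects one effect size per call. If a combination can repeat, rowwise() is the safer choice.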
