Converting measurement units in a data frame

Is there a good way to convert measurement units within a data frame, using the unit values stored in the data frame itself? I could write a manual function to check for specific units, but ideally I would just use the units defined in the units package or somewhere else.

My non-working attempt:

library(tidyverse)
library(units)

# this works
# from https://cran.r-project.org/web/packages/units/vignettes/measurement_units_in_R.html
a <- set_units(5, "ug/L")
units(a) <- make_units(mg/L) # 0.005 [mg/L]

b <- set_units(10, "ug/L")
units(b) <- make_units(mg/L) # 0.01 [mg/L]

# test dataset ---------------------------------------------------
df1 <- tibble::tribble(
  ~CAS_RN,      ~CHEMICAL_NAME, ~CRITERIA, ~ACTION_LEVEL, ~CRITERIA_UNIT,
  "127-18-4", "Tetrachloroethene", "EPA MCL",             5,         "ug/L",
  "7440-38-2",           "Arsenic", "EPA MCL",            10,         "ug/L"
)

# this does not
df1 %>% 
  mutate(new = set_units(ACTION_LEVEL, CRITERIA_UNIT))


# function also does not work
row_units <- function(df, to_units){
  
  value <- df$action_level[1]
  meas_unit <- df$CRITERIA_UNIT[1]
  
  new <- as_units(value, meas_unit)
  units(new) <- make_units(to_units)
  df$new_units <- to_units
  
  return(df)
  
}

# with function
df1 %>% 
  pmap(tibble) %>%  # each row to tibble
  map2_dfr(~row_units(.x, "mg/L"))

## Error in as_mapper(.f, ...) : argument ".f" is missing, with no default

Setting mode = "standard" lets set_units() work inside mutate(), but the new column can only have one unit. The set_units() function takes only a single value for the unit, so I wrapped the column name in unique().

library(tidyverse)
library(units)
#> Warning: package 'units' was built under R version 4.3.2
#> udunits database from C:/Users/xxxxx/Documents/R/win-library/4.2/units/share/udunits/udunits2.xml

df1 <- tibble::tribble(
  ~CAS_RN,      ~CHEMICAL_NAME, ~CRITERIA, ~ACTION_LEVEL, ~CRITERIA_UNIT,
  "127-18-4", "Tetrachloroethene", "EPA MCL",             5,         "ug/L",
  "7440-38-2",           "Arsenic", "EPA MCL",            10,         "ug/L"
)

df1 %>%
  mutate(new = set_units(ACTION_LEVEL, unique(CRITERIA_UNIT), mode = "standard"))
#> # A tibble: 2 × 6
#>   CAS_RN    CHEMICAL_NAME     CRITERIA ACTION_LEVEL CRITERIA_UNIT    new
#>   <chr>     <chr>             <chr>           <dbl> <chr>         [ug/L]
#> 1 127-18-4  Tetrachloroethene EPA MCL             5 ug/L               5
#> 2 7440-38-2 Arsenic           EPA MCL            10 ug/L              10

# what if the units are mixed?
df2 <- tibble::tribble(
  ~CAS_RN,      ~CHEMICAL_NAME, ~CRITERIA, ~ACTION_LEVEL, ~CRITERIA_UNIT,
  "127-18-4", "Tetrachloroethene", "EPA MCL",             5,         "ug/L",
  "7440-38-2",           "Arsenic", "EPA MCL",            10,         "ng/L"
)
df2 |> group_by(CRITERIA_UNIT) |> 
  mutate(new = set_units(ACTION_LEVEL, unique(CRITERIA_UNIT), mode = "standard"))
#> # A tibble: 2 × 6
#> # Groups:   CRITERIA_UNIT [2]
#>   CAS_RN    CHEMICAL_NAME     CRITERIA ACTION_LEVEL CRITERIA_UNIT    new
#>   <chr>     <chr>             <chr>           <dbl> <chr>         [ng/L]
#> 1 127-18-4  Tetrachloroethene EPA MCL             5 ug/L            5000
#> 2 7440-38-2 Arsenic           EPA MCL            10 ng/L              10

Created on 2024-01-31 with reprex v2.0.2
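
If the goal is to bring every row to one chosen target unit (for example mg/L) no matter what CRITERIA_UNIT holds, a row-wise conversion is another option. This is only a minimal sketch, not the one right way: convert_value() and the ACTION_LEVEL_MG_L column are hypothetical names made up for illustration, and it assumes every string in CRITERIA_UNIT is recognised by udunits and is convertible to the target unit.

```r
library(dplyr)
library(purrr)
library(units)

df2 <- tibble::tribble(
  ~CHEMICAL_NAME,      ~ACTION_LEVEL, ~CRITERIA_UNIT,
  "Tetrachloroethene",             5,         "ug/L",
  "Arsenic",                      10,         "ng/L"
)

# Hypothetical helper: convert a single number from unit `from` to unit `to`
# and return it as a plain numeric value.
convert_value <- function(value, from, to) {
  x <- set_units(value, from, mode = "standard")  # attach the source unit
  x <- set_units(x, to, mode = "standard")        # convert to the target unit
  drop_units(x)                                   # strip the unit again
}

# Convert row by row, then attach the common target unit to the whole column.
converted <- map2_dbl(df2$ACTION_LEVEL, df2$CRITERIA_UNIT,
                      convert_value, to = "mg/L")

out <- df2 %>%
  mutate(ACTION_LEVEL_MG_L = set_units(converted, "mg/L", mode = "standard"))

out
```

Because each row is converted on its own, the target unit is chosen explicitly instead of depending on how the groups get combined. The trade-off is that map2_dbl() calls the conversion once per row, which may be slow on large data frames; grouping by unit as above does fewer conversions.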

Excellent. Thanks for that.
