I am dealing with a data frame containing several variables, and some of the values are non-measurable and reported as "<X", for example <10.
How can I convert <X values into X/2?
I tried the following, using pacman + rio + tidyverse, but it is very long and seems to give some issues:
df2 <- df %>% # CREATE NEW DATA FRAME WITH NO NON-DETECTS
  mutate(Benzene = ifelse(
    str_detect(string = Benzene, pattern = "<"),
    parse_number(gsub(pattern = "<", replacement = "", x = Benzene)) / 2,
    parse_number(Benzene)
  )) %>%
  mutate(C6_C10 = ifelse( # SUBSTITUTE C6-C10
    str_detect(string = C6_C10, pattern = "<"),
    parse_number(gsub(pattern = "<", replacement = "", x = C6_C10)) / 2,
    parse_number(C6_C10)
  )) # AND SO ON FOR ALL VARIABLES, WHICH ARE 20
Thank you so much for answering. It is appreciated very much.
I have tried your code many times, but I cannot make it work. This is the error I get:
Error: Problem with mutate() input ..1.
x is.character(x) is not TRUE
i Input ..1 is across(C6_C10:Sulphate, process_numbers).
Run rlang::last_error() to see where the error occurred.
Note that the 20 columns in my "df" are named C6_C10 through to Sulphate.
Are they all character columns, or a mix: some character columns with the artifacts you describe, and some purely numeric columns with no such artifacts? I think it's likely a matter of identifying which variables to process (and which not to), or of adding conditional tests to the process function so that it can work with either (i.e. the full set).
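To illustrate the second option (conditional tests inside the function), here is a minimal base R sketch. The body of process_numbers() is an assumption, since the version used earlier in the thread is not shown here. It also assumes the text values carry no thousands separators (e.g. "11000", not "11,000").

```r
# Hypothetical helper: accepts either a numeric or a character column.
process_numbers <- function(x) {
  # Numeric columns have no "<" artifacts, so return them untouched
  if (is.numeric(x)) {
    return(x)
  }
  x <- trimws(as.character(x))
  is_nd <- startsWith(x, "<")                # TRUE for non-detects, NA for missing values
  value <- as.numeric(sub("^<\\s*", "", x))  # drop the leading "<" and convert
  ifelse(is_nd, value / 2, value)            # halve only the non-detects
}

process_numbers(c("<10", "25", "<0.01"))  # character input: non-detects are halved
process_numbers(c(3.2, 4.1))              # numeric input: returned unchanged
```

Because the type check lives inside the function, it could be applied to the whole C6_C10:Sulphate range without first working out which columns are "chr" and which are "dbl".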
They are all values, ranging from 0.01 to 11,000. Some data points were non-detects and were reported as less than a certain value, for example <2 or <0.01.
However, some columns are "dbl" while others are "chr".
rlang::last_error()
<error/dplyr:::mutate_error>
Problem with mutate() input ..1.
x is.character(x) is not TRUE
i Input ..1 is across(C6_C10:Sulphate, process_numbers).
Backtrace:
Run rlang::last_trace() to see the full context.
I add this as an alternative because, although in this example you defined process_numbers() yourself, sometimes we may wish to apply a function we don't control in this way, and using where() adds that additional condition.
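As a rough sketch of that idea, the example below applies readr's parse_number(), a function we cannot edit and which expects character input, only to the character columns in the range. The sample data frame is made up for illustration. Note that this step only strips the "<"; it does not halve the non-detects.

```r
library(dplyr)
library(readr)

# Made-up sample: two "chr" columns with non-detects and one "dbl" column
df <- tibble(
  C6_C10   = c("<10", "25"),
  Benzene  = c("<2", "0.5"),
  Sulphate = c(3.2, 4.1)
)

# `& where(is.character)` restricts the C6_C10:Sulphate range to character
# columns, so parse_number() is never called on the numeric Sulphate column
df_num <- df %>%
  mutate(across(C6_C10:Sulphate & where(is.character), parse_number))

df_num
```

Without the where() condition, the same call would hand Sulphate to parse_number(), which is the likely source of the "is.character(x) is not TRUE" error reported above.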
Hi guys,
thanks for all your answers. It still does not work; it seems hard to convert the characters "<X" into X/2. I thought it would have been easy with R, but I might do it using Excel.
I still got:
Error: Problem with mutate() input ..1.
x object 'process_numbers' not found
i Input ..1 is across(C6_C10:Sulphate & where(is.character), process_numbers).
Run rlang::last_error() to see where the error occurred.
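The "object 'process_numbers' not found" message usually means the function definition has not been run in the current R session before the mutate() call. Below is a minimal end-to-end sketch in the required order. The function body and the sample data are assumptions made for illustration, not the code posted earlier in the thread.

```r
library(dplyr)

# Made-up sample standing in for the real data
df <- tibble(
  C6_C10   = c("<10", "25"),
  Sulphate = c(3.2, 4.1)
)

# Step 1: run the function definition first (hypothetical body)
process_numbers <- function(x) {
  value <- as.numeric(sub("<", "", x, fixed = TRUE))      # strip "<" and convert
  ifelse(grepl("<", x, fixed = TRUE), value / 2, value)   # halve the non-detects
}

# Step 2: confirm the function now exists in the session
stopifnot(exists("process_numbers", mode = "function"))

# Step 3: only then apply it to the character columns in the range
df2 <- df %>%
  mutate(across(C6_C10:Sulphate & where(is.character), process_numbers))

df2
```

If step 1 is skipped, or R has been restarted since it was run, step 3 fails with the same "not found" error.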