Hi, I am aware of the cut() function for binning data by value. However, some of my categories are defined by a range of numbers, while others are defined by a single number. For example:
groups <- c("A", "B", "C", "D", "E")
value <- c("9.9-13.9", "32.1-32.9", "0.9-6.9", 73, "14.8 AND 41.1")
data <- cbind(groups, value)
I intend to show the values as ranges, but I don't know how. And yes, the values in my data are numeric.
The problem is that some groups are defined by a range of values, some by a single number, and the last group by two different values.
I intend to replace the values in my original data with the groups defined in data, and then plot the frequency of each group.
How should I do this? Thank you in advance for any comments and suggestions!
Thank you so much for the help, @pieterjanvc!
I would like to go the extra mile with the analysis, since my final goal is to replace my original (numeric) data with the groups defined in data. I only have experience substituting single and multiple values, and have had no luck substituting ranges. Could someone suggest a good reference, and is there a way to substitute all groups, with their different kinds of definitions, at once?
I'm afraid I do not understand what your question is here. Are the code and output I provided already a step in the right direction?
Please write out a detailed example of a before and after dataset (like I did at the end of my code) so I can see what you are trying to accomplish and explain to me what filtering / substitutions are needed.
Are you trying to filter the data depending on whether a group's range (or values) contains a certain numeric value? For example: get all groups where the number 10.0 is in the range (that would be A).
It was as I thought then. Here is my implementation:
library(stringr)
library(dplyr)
options(stringsAsFactors = FALSE)

groups <- c("A", "B", "C", "D", "E")
value <- c("9.9-13.9", "32.1-32.9", "0.9-6.9", 73, "14.8 AND 41.1")
data <- cbind(groups, value)
data = data.frame(data)

# Convert each row into one or more rows with numeric start/end bounds
data = purrr::map_df(1:nrow(data), function(x){
  value = data$value[x]
  if(str_detect(value, "-")){
    # Range "a-b": start = a, end = b
    myRange = as.numeric(unlist(str_split(value, "-")))
    data.frame(groups = data$groups[x],
               start = myRange[1], end = myRange[2], info = "range")
  } else if(str_detect(value, "AND")){
    # Several single values: one row per value, with start = end
    myVals = as.numeric(unlist(str_split(value, "AND")))
    data.frame(groups = data$groups[x],
               start = myVals, end = myVals, info = "multiple")
  } else if(!is.na(suppressWarnings(as.numeric(value)))){
    # Single number: start = end
    myVals = as.numeric(value)
    data.frame(groups = data$groups[x],
               start = myVals, end = myVals, info = "single")
  } else {
    # Unrecognised format
    data.frame(groups = data$groups[x],
               start = NA, end = NA, info = "error")
  }
})
# New input
value2 <- c(73.0, 32.9, 10.0, 6.1, 14.8, 41.1)
sample <- 1:6
data2 <- cbind(sample, value2)
# Find groups for input
data2 = data.frame(data2)
data2 = data2 %>% mutate(groups = sapply(value2, function(x) {
  data %>% filter(start <= x, end >= x) %>% pull(groups)
}))
data2
  sample value2 groups
1      1   73.0      D
2      2   32.9      B
3      3   10.0      A
4      4    6.1      C
5      5   14.8      E
6      6   41.1      E
Note that this simple version assumes that only one group can match each value, i.e. the ranges in the first dataset do not overlap. If they did, there would be multiple possible groups per input and the code would need to be expanded (not that difficult).
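For the overlapping case mentioned above, one possible expansion is sketched below. It is only an illustration in base R, not the code from the post: the lookup table is typed out by hand in the start/end form, group F is a hypothetical extra group added purely to create an overlap, and the helper name find_groups and the input values are made up for the example. Every matching group is collapsed into one string, and a value that matches nothing gets NA.

```r
# Lookup table in the start/end form; F is a hypothetical group that
# overlaps A and E, added only to demonstrate multiple matches.
lookup <- data.frame(
  groups = c("A", "B", "C", "D", "E", "E", "F"),
  start  = c(9.9, 32.1, 0.9, 73, 14.8, 41.1, 12.0),
  end    = c(13.9, 32.9, 6.9, 73, 14.8, 41.1, 20.0),
  stringsAsFactors = FALSE
)

# Return all groups whose interval contains x as one comma-separated string,
# or NA when no interval contains x.
find_groups <- function(x, lookup) {
  hits <- lookup$groups[lookup$start <= x & lookup$end >= x]
  if (length(hits) == 0) NA_character_ else paste(unique(hits), collapse = ",")
}

# Illustrative input: 13.0 and 14.8 each fall in two groups, 50.0 in none.
value2 <- c(73.0, 32.9, 10.0, 13.0, 14.8, 50.0)
result <- data.frame(
  sample = seq_along(value2),
  value2 = value2,
  groups = vapply(value2, find_groups, character(1), lookup = lookup),
  stringsAsFactors = FALSE
)
result

# Frequency of each assigned group, counting unmatched values as NA
freq <- table(result$groups, useNA = "ifany")
freq
```

Because vapply always returns a character vector of the same length as the input, the groups column stays a plain column even when a value matches several groups or none. The freq table could then be passed to barplot() for the frequency plot asked about in the original question.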
Hi,
Yes, I did install dplyr. I also tried reinstalling it and rerunning the code after a restart, but the problem is still not resolved...
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
data %>% dplyr::filter(start <= x, end >= x) %>% pull(groups)
#> Error in UseMethod("filter_"): no applicable method for 'filter_' applied to an object of class "function"
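One possible reading of this error, offered as a guess rather than a confirmed diagnosis: when no object called data has been created in the current session, the name data resolves to the utils::data() function, so filter() is handed a function instead of a data frame. The check below is a small sketch of how to see what a name currently points to; the name lookup is a hypothetical replacement chosen only to avoid the clash.

```r
# Without a user-defined `data`, the name refers to the function in utils.
is.function(utils::data)

# A name that does not clash with an existing function (hypothetical: `lookup`)
# makes it harder to pass the wrong object by accident.
lookup <- data.frame(groups = "A", start = 9.9, end = 13.9)
is.data.frame(lookup)

# Quick check before filtering: stop early if the object is not a data frame.
stopifnot(is.data.frame(lookup))
```

Note also that x in the line above only exists inside the function passed to sapply(), so running that line on its own would need x to be defined first.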
Hi,
Thank you for everything! I put the problem aside and continued with other parts of my analysis, and somehow it now works without any error (the only possible reason I can think of is that I updated some other packages in RStudio, but dplyr was not one of them). Thank you again for helping me out!