I am working with the R programming language. I am trying to optimize a function that can accept "numerical" and "factor" inputs. For the optimization, I am using the "GA" library. Here are the references I am using:
- Demo: A quick tour of GA
- Actual Library: CRAN - Package GA
- Specific Function I am Using: ga function - RDocumentation
Suppose I have a function that looks like this:
my_function <- function(r1, r2) {
  # r1: character vector of allowed col_1 values, e.g. c("a", "d")
  # r2: single numeric threshold for col_2
  # (this selection could also be done using "dplyr" or SQL)
  part1 <- my_data[my_data$col_1 %in% r1 & my_data$col_2 > r2, ]
  part2 <- mean(part1$col_3)
  part2
}
In this example:
- "r1" can take any subset ("group") of the values "a, b, c, d" (factor variable). For example, "r1 = a", "r1 = a, d", "r1 = b, c, a", "r1 = c", "r1 = a, b, c, d", etc.
- "r2" can take a single value between 1 and 100 (numeric variable).
- "my_data" is a dataset that has 3 columns: col_1 (factor variable, can only take the values "a, b, c, d"), col_2 (numeric variable), and col_3 (numeric variable).
- "my_data" will be "subsetted" according to "r1" and "r2".
- the "mean" of col_3 is the value that "my_function" will return, given a choice of "r1" and "r2".
- the "mean" of col_3 is the value that I am trying to optimize over the choice of "r1" and "r2".
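To make the setup above concrete, here is a minimal sketch using a base R subset for the selection step. The contents of "my_data" are randomly generated and purely hypothetical; only the column names and value ranges come from the description above.

```r
set.seed(123)

# Hypothetical example data: 100 rows with the three columns described above
my_data <- data.frame(
  col_1 = factor(sample(c("a", "b", "c", "d"), 100, replace = TRUE),
                 levels = c("a", "b", "c", "d")),
  col_2 = runif(100, min = 1, max = 100),
  col_3 = rnorm(100)
)

# r1: character vector of allowed col_1 values; r2: numeric threshold for col_2
my_function <- function(r1, r2) {
  part1 <- my_data[my_data$col_1 %in% r1 & my_data$col_2 > r2, ]
  mean(part1$col_3)
}

# Rows where col_1 is "a" or "d" and col_2 > 30
my_function(c("a", "d"), 30)

# All groups and a threshold below every col_2 value: same as the overall mean
my_function(c("a", "b", "c", "d"), 0)
```

Note that a choice of "r1" and "r2" that matches no rows makes mean() return NaN, which any optimizer would need to handle.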
Problem: Currently, I am trying to optimize "my_function" using the "ga" function in R:
library(GA)
GA <- ga(type = "real-valued",
         fitness = function(x) my_function(x[1], x[2]),
         lower = c(c("a", "b", "c", "d"), 1), upper = c(c("a", "b", "c", "d"), 100),
         popSize = 50, maxiter = 1000, run = 100)
But I am not sure how to set this up correctly. I am not sure how to correctly define "my_function", and I am not sure how to correctly define "GA".
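One possible encoding, sketched here under assumptions rather than taken from the GA documentation: represent "r1" as four real values in [0, 1], one per group, where a value above 0.5 means the group is included, and keep "r2" as a fifth real value in [1, 100]. The data, the 0.5 cutoff, the penalty for empty subsets, and the names "groups" and "fitness_fn" are all hypothetical choices made for illustration.

```r
library(GA)

set.seed(1)
groups <- c("a", "b", "c", "d")

# Hypothetical example data with the three columns described in the question
my_data <- data.frame(
  col_1 = factor(sample(groups, 200, replace = TRUE), levels = groups),
  col_2 = runif(200, min = 1, max = 100),
  col_3 = rnorm(200, mean = 50, sd = 10)
)

my_function <- function(r1, r2) {
  part1 <- my_data[my_data$col_1 %in% r1 & my_data$col_2 > r2, ]
  mean(part1$col_3)
}

# x[1:4]: inclusion flags for a, b, c, d (above 0.5 means "include")
# x[5]:   the numeric threshold r2
fitness_fn <- function(x) {
  r1 <- groups[x[1:4] > 0.5]
  value <- my_function(r1, x[5])
  # An empty subset gives NaN; replace it with a finite penalty (assumption)
  if (is.nan(value)) min(my_data$col_3) - 1 else value
}

# ga() maximizes the fitness function
result <- ga(type = "real-valued",
             fitness = fitness_fn,
             lower = c(0, 0, 0, 0, 1),
             upper = c(1, 1, 1, 1, 100),
             popSize = 50, maxiter = 100, run = 30,
             seed = 1, monitor = FALSE)

# Decode the best solution back into r1 and r2
best <- result@solution[1, ]
best_r1 <- groups[best[1:4] > 0.5]
best_r2 <- unname(best[5])
```

With this encoding all inputs are numeric, so "lower" and "upper" no longer need to contain character values. Because high thresholds leave very few rows, the maximum may come from a tiny subset; a minimum row count could be added to the penalty if that is undesirable.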
Can someone please show me how to do this?
Thanks