I am working with the R programming language.
As a learning exercise, I generated a dataset of 20 random points from a Normal Distribution, created the Maximum Likelihood Function corresponding to these 20 points, and then tried to optimize this function to find out the mean (mu) and the standard deviation (sigma).
First, I generated the random data:
y <- rnorm(20,5,5)
Then, I defined the maximum likelihood function:
my_function <- function(x) {
n = 20
a = -n/2*log(2*pi)
b = -n/2*log(x[2]^2)
c = -1/(2*x[2]^2)
d = sum ((y-x[1])^2)
return( a + b + c* d)
}
Then, I tried to optimize this function three different ways - but all of these ways failed:
Method 1:
#load all libraries
library(pracma)
library(stats)
library(GA)
steep_descent(c(1, 1), my_function, maxiter = 10000)
$xmin
[1] NA
$fmin
[1] NA
$niter
[1] 10001
Warning message:
In steep_descent(c(1, 1), my_function, maxiter = 10000) :
Maximum number of iterations reached -- not converged.
Method 2:
optim(c(1,1), my_function)
$par
[1] -8.487500e+00 1.776357e-16
$value
[1] -5.559631e+34
$counts
function gradient
501 NA
$convergence
[1] 1
$message
NULL
Method 3:
# I redefined the function for the GA Library
my_function <- function(x_1, x_2) {
n = 20
a = -n/2*log(2*pi)
b = -n/2*log(x_2^2)
c = -1/(2*x_2^2)
d = sum ((y-x_1)^2)
return( a + b + c* d)
}
GA <- ga(type = "real-valued",
fitness = function(x) my_function(x[1], x[2]),
lower = c(-20,20), upper = c(-20,20),
popSize = 50, maxiter = 1000, run = 100)
GA@solution
x1 x2
[1,] -20 20
In all 3 methods, the right correct answer is never found - as a matter of fact, none of the 3 methods find anything close to the correct answer.
As per statistical theory, the correct answer (based on the randomly generate data for this question) should be:
> mu = mean(y)
> sigma = sd(y)
> mu
[1] 3.937513
> sigma
[1] 4.707227
Can someone please show me what I am doing wrong and how can I fix this? (i.e. have the optimization methods return answers that are closer to the correct answers)
Thanks!