Bootstrap Confidence Intervals for the difference of means

Maisaa · May 29, 2022, 5:14pm

Hello,
I'm trying to calculate the coverage probability of bootstrap confidence intervals for the difference in means, but I always get unreasonable results.
I want to create two examples with simulated data: in one, the coverage probability should be around 0.95, and in the other it should be far from 0.95. For the first example I sampled both X and Y from exponential(rate=3), and for the second example I sampled X from normal(mu=4, sd=7) and Y from exponential(rate=1).
To calculate one bootstrap CI, I created a function that takes the simulated data (X and Y) and their difference of means (denoted T_stat) as arguments and returns 1 if the CI includes T_stat and 0 otherwise.
Then, to compute the CP, I used replicate to create B=1000 CI indicators and took their mean.
I believe the problem is in the function, which I called "Boot_func", but I can't figure out what it is.

The attached photo shows the code for example 2 (samples from different distributions). For example 1 (samples from the same distribution) I used the same code; the only difference is where I sample X and Y for the first time.

CI - 2

library(dplyr)  # assumed source of between(), which is used in Boot_func

set.seed(4)
n <- 200
m <- 300
B <- 1000

X <- rnorm(n, mean = 4, sd = 7)
Y <- rexp(n, rate = 1)

X_bar <- mean(X)
Y_bar <- mean(Y)
T_stat <- X_bar-Y_bar

#Bootstrap

Boot_func <- function(X,Y){
  n <- length(X)
  m <- length(Y)
  joint <- c(X,Y)
  X_star <- sample(joint, n, replace=T)
  Y_star <- sample(joint, m, replace=T)
  X_star_bar <- mean(X_star)
  Y_star_bar <- mean(Y_star)
  Diff <- X_star - Y_star
  CI <- quantile(Diff, c(0.025,0.975))
  Indicate <- between(T_stat,CI[1],CI[2])
  return(Indicate)
}

CP2 <- mean(replicate(B, Boot_func(X,Y)))
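
For comparison, here is a minimal sketch of how a coverage simulation for a percentile bootstrap interval is commonly set up. It is not the code from the post: the helper names `boot_ci_diff` and `coverage`, the replication counts `R` and `n_sim`, and the use of m = 300 for Y are assumptions made for illustration. It differs from the code above in three ways: X and Y are resampled separately instead of from the pooled vector, the interval comes from the quantiles of many bootstrap differences of means rather than from elementwise differences within a single resample, and every coverage replicate draws fresh data and checks the interval against the true difference of means (4 - 1 = 3 for the normal/exponential example, 0 when both samples are exponential(rate=3)).

```r
# Percentile bootstrap CI for mean(X) - mean(Y).
# X and Y are resampled separately, each keeping its own sample size.
boot_ci_diff <- function(X, Y, R = 500, level = 0.95) {
  diffs <- replicate(R, {
    mean(sample(X, replace = TRUE)) - mean(sample(Y, replace = TRUE))
  })
  alpha <- 1 - level
  quantile(diffs, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

# Coverage: simulate fresh data in every replicate and record whether
# the interval contains the true difference of means.
# rX and rY are zero-argument functions that draw one sample each.
coverage <- function(rX, rY, true_diff, n_sim = 200, R = 500) {
  hits <- replicate(n_sim, {
    X <- rX()
    Y <- rY()
    ci <- boot_ci_diff(X, Y, R = R)
    ci[1] <= true_diff && true_diff <= ci[2]
  })
  mean(hits)
}

set.seed(4)
n <- 200
m <- 300

# Example 2 from the post: Normal(4, 7) vs Exponential(rate = 1),
# so the true difference of means is 4 - 1 = 3.
cp2 <- coverage(function() rnorm(n, mean = 4, sd = 7),
                function() rexp(m, rate = 1),
                true_diff = 3)

# Example 1 from the post: both samples from Exponential(rate = 3),
# so the true difference of means is 0.
cp1 <- coverage(function() rexp(n, rate = 3),
                function() rexp(m, rate = 3),
                true_diff = 0)

c(cp1 = cp1, cp2 = cp2)
```

With samples of this size, both estimates would be expected to land fairly close to the nominal 0.95. Coverage that is clearly off usually needs much smaller samples or a heavily skewed distribution, so shrinking `n` and `m` is one way to explore the second scenario.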

williaml · May 29, 2022, 10:28pm

Hi, can you post your actual code, rather than a screenshot of it?

FAQ: How to do a minimal reproducible example (reprex) for beginners (Guides & FAQs)

