I have a random variable X/n, where X is a binomial random variable with p = 1/2 and n = 10, 100, 1000. The goal is to plot the cumulative distribution function (CDF) of X/n.
I'm wondering why the first argument of the command pbinom is x*n.
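# Grid of points in [0, 1] at which the CDF of X/n is evaluated
delta = 1/1000
x = seq(0, 1, by=delta)
# Empty plot; the three curves are added with lines()
plot(x=NULL, y=NULL, xlim=c(0,1), ylim=c(0,1), xlab='x', ylab='CDF')
# n = 10 (red)
n = 10
y = pbinom(x*n, n, 1/2)
lines(x, y, type='s', col='red')
# n = 100 (blue)
n = 100
y = pbinom(x*n, n, 1/2)
lines(x, y, type='s', col='blue')
# n = 1000 (default colour)
n = 1000
y = pbinom(x*n, n, 1/2)
lines(x, y, type='s')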
The CDF of X/n at a point x is P(X/n <= x) = P(X <= x * n), which is exactly what pbinom(x*n, n, 1/2) computes. Think of X as the number of 1s among n trials that each come out 0 or 1; then x is the fraction of the n trials that came out 1. When x = 0, none of the trials was 1, and when x = 1, all n of them were. The other 999 values of x on the grid are intermediate fractions. When n = 10, almost all of the values of x * n are not integers, so they do not correspond to realizable counts. The CDF stays constant between consecutive integers and jumps only at the next integer threshold, which gives the plot its stair-step appearance.
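A quick numerical check may help here. The sketch below uses x = 0.37 and n = 10, both picked only for illustration, and compares three ways of computing P(X/n <= 0.37). It relies on pbinom effectively rounding a non-integer first argument down to the nearest integer, which is the behaviour I would expect from R's implementation. The names via_pbinom, via_floor and via_sum are my own.

```r
n <- 10
p <- 1/2

# P(X/n <= 0.37) is the same event as P(X <= 3.7), i.e. X is 0, 1, 2 or 3.
via_pbinom <- pbinom(0.37 * n, n, p)   # non-integer first argument, as in the question
via_floor  <- pbinom(3, n, p)          # the same argument rounded down
via_sum    <- sum(dbinom(0:3, n, p))   # adding up the probability mass directly

print(c(via_pbinom, via_floor, via_sum))

# On the 1001-point grid from the question, count how many distinct
# CDF values actually occur when n = 10.
x <- seq(0, 1, by = 1/1000)
n_levels <- length(unique(pbinom(x * n, n, p)))
print(n_levels)
```

The three probabilities should agree, and the number of distinct levels should be n + 1 rather than 1001, which is why the n = 10 curve has such wide steps.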