Monte Carlo simulation

simulated_games <- sample(c("lose","win"), 4, replace = TRUE, prob = c(0.6, 0.4))

B <- 10000

set.seed(1)

Create an object called celtic_wins that first replicates the sample code generating the variable called simulated_games for B iterations and then tallies the number of simulated series that contain at least one win for the Celtics.

 celtic_wins <- replicate(B, {simulated_games <- sample(c("lose","win"), 4, replace = TRUE, prob = c(0.6, 0.4))) any(simulated_games == c("win"))

If simulated_games is already defined, why repeat it inside the replicate function? Why not just pass in the simulated_games variable instead? Other than that, am I close to getting this simulation correct?

Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

Right now the best way to install reprex is:

# install.packages("devtools")
devtools::install_github("tidyverse/reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ, linked to below.

Paul, I have no idea what you are trying to do. I'd love to help you, but I can't make sense of your post.

Can you please start with, "I am trying to ..."

Somewhere in there, also include "my desired output would be ..."


As to why the code is repeated, there are two important differences:

  1. When simulated_games is defined in the Global Environment (i.e. the first code block in your post), its value is a character vector of length 4. replicate needs code that does something, not a static value. You could define simulated_games to be a function that calls sample with the given parameters, and then call simulated_games() inside replicate, but that's not what you've done.
  2. In the first code block you don't have the step where you check if there's "at least one win".
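
To make the first point concrete, here is a minimal sketch of the difference between handing replicate a stored value and handing it code to re-run. The names static_copies, fresh_series, and simulate_series are made up for illustration and are not from the original post.

```r
set.seed(1)

# sample() runs once here; what is stored is just a character vector of length 4.
simulated_games <- sample(c("lose", "win"), 4, replace = TRUE, prob = c(0.6, 0.4))
is.character(simulated_games)  # TRUE
length(simulated_games)        # 4

# Passing the stored vector gives replicate() nothing to re-run:
# every column of the result is a copy of the same four games.
static_copies <- replicate(3, simulated_games)

# Hypothetical helper: wrapping the call in a function means that
# each call draws a new four-game series.
simulate_series <- function() {
  sample(c("lose", "win"), 4, replace = TRUE, prob = c(0.6, 0.4))
}

# The expression simulate_series() is re-evaluated on every iteration.
fresh_series <- replicate(3, simulate_series())
```

Both results are 4 x 3 character matrices, but only fresh_series holds three independent draws.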

As to whether or not you've structured the simulation correctly: it looks fine to me after a cursory glance and according to my interpretation of what you're trying to do, but, as others have commented, it'd be helpful if you were more explicit about your goal.


Is this done in RStudio only?

I couldn't quite figure out the whole reprex thing beyond just installing it in RStudio. I hope my screenshot is enough for now. Thanks.
I'm having trouble with line four. Do I need == or %in%?

Line 4 looks blank to me. At the very least can you please copy and paste the code and surround it by backticks and r? i.e.

```r
code here
```

The reprex::reprex() function runs your code and generates a nicely formatted chunk of R Markdown that includes both your code and its output, ready for pasting into forums like this one. If you have written a truly self-contained example, that means people trying to help you can easily copy-paste your code into their R to run it themselves — but they can also get an idea of what's going on just by reading through what you posted, since they don't have to run the code to see the output. That's the superpower of using reprex to post a reproducible example.

The next best thing is to post your code, formatted as code (as explained by @mara), so even if your helpers can't see the output, they can still copy-paste it into their R and run it themselves.

A screenshot is only helpful if your question is about something visual/graphical.

All this aside, your original code example actually was self-contained. The problem was that you hadn't explained enough of the context of what you're trying to do for helpers to know how to answer your "am I close to getting this simulation correct?" question.

"Monte Carlo simulation" is a broad category that includes many specific methods and implementations. It sounds like maybe you're trying to work through examples to learn about this concept? If so, is there a reference you're using that you could link to or excerpt from to explain what criteria your simulation is trying to meet?

== vs %in%

There's not much point to using %in% when your compare-to vector has only one element, so I'm wondering:

  • are you uncertain about the difference between == and %in%, in general? (If so, you might want to start by reading the docs: ==; %in%)
  • are you getting a result from your code that's different from what you expected (which has led you to question your use of ==)?
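
In case it helps with the first question, here is a small sketch of where the two operators agree and where they differ. The vectors games, outcomes, and with_missing are made-up illustration data, not part of the original simulation.

```r
games <- c("lose", "win", "lose", "win")

# With a single value on the right-hand side, the two give the same answer.
games == "win"    # FALSE TRUE FALSE TRUE
games %in% "win"  # FALSE TRUE FALSE TRUE

# With several values they differ: == compares element by element
# (recycling the shorter vector), while %in% tests membership.
outcomes <- c("win", "draw")
games == outcomes    # FALSE FALSE FALSE FALSE
games %in% outcomes  # FALSE TRUE FALSE TRUE

# They also treat missing values differently.
with_missing <- c("win", NA)
with_missing == "win"    # TRUE NA
with_missing %in% "win"  # TRUE FALSE
```

For the simulation in this thread there is only one value to compare against and no missing data, so either operator should give the same result inside any().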

By "line four" I'm going to assume you mean the fourth line of meaningful code (for future reference, there are line numbers on the left of the screen and it's standard practice to use those to refer to specific lines of code. It's fine here, but your method would make referring to the 253rd line of code impossible).

You had two small syntax errors. Here's a version that ran on my machine:

celtic_wins <- replicate(B, {
  simulated_games <- sample(c("lose","win"), 4, replace = TRUE, prob = c(0.6, 0.4));
  any(simulated_games == c("win"))
})

The first syntax error was a lack of a closing curly-bracket } and then a misplaced closing paren. The second was a lack of newline and/or a semicolon between the sample and any functions in the statement. R doesn't know a priori that those are separate statements without one or both of those characters.

Do I copy and paste the script into the console? If so, then what?

# This line of sample code simulates four random games where the Celtics either lose or win. Each game is independent of other games.
simulated_games <- sample(c("lose","win"), 4, replace = TRUE, prob = c(0.6, 0.4))


# The variable 'B' specifies the number of times I want the simulation to run. I will run the Monte Carlo simulation 10,000 times.
B <- 10000


# Use the `set.seed` function to make sure my answer matches the expected result after random sampling.
set.seed(1)


# Create an object called `celtic_wins` that first replicates the sample code generating the variable called `simulated_games` for `B` iterations and then tallies the number of simulated series that contain at least one win for the Celtics.

celtic_wins <- replicate(B, {simulated_games <- sample(c("lose","win"), 4, replace = TRUE, prob = c(0.6, 0.4)) any(simulated_games %in% "win")})



# Calculate the frequency out of B iterations that the Celtics won at least one game.  
mean(celtic_wins) 
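
One way to sanity-check the result, assuming the four games really are independent with a 0.4 chance of a win each: the Celtics fail to win any game with probability 0.6^4, so the exact chance of at least one win is 1 - 0.6^4 = 0.8704. The sketch below reruns the simulation with the statements on separate lines and compares the two numbers; the names simulated and exact are just for illustration.

```r
B <- 10000
set.seed(1)

# Each iteration draws a fresh four-game series and records whether
# it contains at least one win.
celtic_wins <- replicate(B, {
  simulated_games <- sample(c("lose", "win"), 4, replace = TRUE, prob = c(0.6, 0.4))
  any(simulated_games == "win")
})

simulated <- mean(celtic_wins)  # Monte Carlo estimate
exact <- 1 - 0.6^4              # 0.8704, from the complement rule

# The two should be close; the gap shrinks as B grows.
abs(simulated - exact)
```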

My backticks did not show up.

This worked for me. None of the examples I saw online ever showed or mentioned the use of a semicolon between the replicate / sample function and the any function. Thank you. Now I need to try to do the whole reprex thing.

What does "a priori" mean? I have some idea, but it is a new term to me as far as R and programming go.

A Priori Probability


Semicolons can be used to separate statements that are on the same line. They generally aren't used because a new line does much the same thing and there's no particular cost to having multiple lines of code. But in the case where you do have multiple expressions (in this case two separate function calls), you need one or the other. Otherwise R thinks they are all part of the same expression and then gets confused because it's not.

So, for example:

x <- 3; y <- 4; x * y

is the same as:

x <- 3
y <- 4
x * y

is the same as:

x <- 3;
y <- 4;
x * y
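
If you want to see this for yourself without running anything, parse() reads text as R code without evaluating it. This is only a rough sketch, and the variable names are made up for the illustration:

```r
# All three layouts parse into the same three expressions.
one_line      <- parse(text = "x <- 3; y <- 4; x * y")
multi_line    <- parse(text = "x <- 3\ny <- 4\nx * y")
with_trailing <- parse(text = "x <- 3;\ny <- 4;\nx * y")

length(one_line)       # 3
length(multi_line)     # 3
length(with_trailing)  # 3

# With neither a semicolon nor a newline, the parser cannot tell where
# one statement ends and the next begins, so it raises an error.
no_separator <- tryCatch(
  parse(text = "x <- 3 y <- 4"),
  error = function(e) "parse error"
)
no_separator  # "parse error"
```

That last case is the same situation as the sample(...) any(...) line in the original post.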