How to generate random numbers to correlate with others?

Hey, I know how to generate a set of random numbers, either with rnorm() (or rexp()/rbinom()/runif() etc.) or good ol' sample(), and then test how they correlate with others. What I'm interested in is generating a set of random numbers that correlates with a fixed set I already have (or generating both sets and just specifying my desired r).

I will be both surprised and happy to find a way to do so!

mvrnorm() from the MASS package lets you feed in whatever variances and covariances you like.

Hey, thanks, but it doesn't answer my need.

Let's say I want to create a table of weight and height, and I want them to correlate with each other so that all BMIs are between 20 and 30.
For that I need to draw from rnorm() with the mean weight and SD (easy).
But then, not only do I need to draw from rnorm() for height, it also needs to correlate to a certain degree with the weight variable. Do you understand my question?

If only I could tell R to produce a sample of 100 weight values and then another sample of 100 heights that ends up correlating at some value (let's say 0.4) with the previous vector.

mvrnorm() will let you create two vectors with any correlation you want, including 0.4. But perhaps you are looking for something different from the technical term "correlation coefficient."
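As a minimal sketch of that suggestion: the means and SDs below are made-up illustrative values (they are not from this thread), and the covariance matrix is built from them and the target correlation. Setting empirical = TRUE asks mvrnorm() to make the sample means and covariances match the requested ones, so the sample correlation comes out at 0.4 up to floating-point error; with the default empirical = FALSE it would only be close to 0.4.

```r
library(MASS)  # MASS ships with standard R installations

set.seed(42)   # arbitrary seed, only for reproducibility

# Hypothetical parameters chosen for illustration
mu_weight <- 70;   sd_weight <- 10     # kg
mu_height <- 1.70; sd_height <- 0.08   # m
r <- 0.4                               # desired correlation

# Covariance matrix: variances on the diagonal,
# r * sd_weight * sd_height off the diagonal
Sigma <- matrix(c(sd_weight^2,               r * sd_weight * sd_height,
                  r * sd_weight * sd_height, sd_height^2),
                nrow = 2)

# empirical = TRUE: mu and Sigma describe the sample, not the population
xy <- mvrnorm(n = 100, mu = c(mu_weight, mu_height), Sigma = Sigma,
              empirical = TRUE)
colnames(xy) <- c("weight", "height")

cor(xy[, "weight"], xy[, "height"])
```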

Note that if weight and height are both normal, then you can't guarantee the ratio will be between 20 and 30, since both the numerator and the denominator can take on any value.

Depending on what you really need, you could just generate values for weight and height and throw away any pairs where the ratio is outside the range you want.
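A rough sketch of that "generate and throw away" idea, again with made-up parameters (weight in kg, height in m, and BMI computed as weight / height^2). It oversamples, drops pairs outside 20 to 30, and keeps the first 100 survivors. One caveat worth checking on your own data: filtering changes the distribution, so the correlation among the retained pairs will generally differ from the 0.4 that was requested.

```r
library(MASS)

set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical parameters chosen for illustration
sd_weight <- 10; sd_height <- 0.08; r <- 0.4
Sigma <- matrix(c(sd_weight^2,               r * sd_weight * sd_height,
                  r * sd_weight * sd_height, sd_height^2),
                nrow = 2)

# Draw more pairs than needed, because some will be discarded
draws <- mvrnorm(n = 1000, mu = c(70, 1.70), Sigma = Sigma)
dat <- data.frame(weight = draws[, 1], height = draws[, 2])
dat$bmi <- dat$weight / dat$height^2

# Keep only pairs whose BMI is in range, then take the first 100
kept <- dat[dat$bmi >= 20 & dat$bmi <= 30, ]
kept <- head(kept, 100)

range(kept$bmi)
cor(kept$weight, kept$height)  # compare against the requested 0.4
```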

To answer the question in the main thread, here's a very popular Stack Overflow question:

This should solve the original question.
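For the case where one vector is already fixed, here is a sketch of one common construction; it may or may not be the approach taken in the linked question. The idea is to draw fresh noise, remove the part of it that is linearly related to the fixed vector, and then mix the two pieces so that the sample correlation equals the target. The vector x and the height scale below are placeholders for illustration.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

x   <- rnorm(100, mean = 70, sd = 10)  # stands in for the fixed vector you already have
rho <- 0.4                             # desired sample correlation

z      <- rnorm(length(x))             # fresh random noise
z_perp <- residuals(lm(z ~ x))         # the part of z with zero sample correlation to x

# Weighted mix of x and the orthogonal noise; the weights make cor(x, y) equal rho
y <- rho * sd(z_perp) * x + sqrt(1 - rho^2) * sd(x) * z_perp

cor(x, y)

# A linear rescaling with positive slope leaves the correlation unchanged,
# so y can be moved to any mean and SD (hypothetical height scale here)
height <- 1.70 + 0.08 * as.numeric(scale(y))
cor(x, height)
```

Unlike mvrnorm(), this leaves x exactly as it was and only constructs the second vector.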

I will do more reading on this one. At first sight I couldn't figure out the arguments, but Dr. Google should help me at my own pace.
Cheers! That is actually what I wanted to find.

Merci! Very specific indeed. I will read this through.

Reading through that post blew my mind, to be honest. It's another level of mathematics and engineering.

The {faux} package has the handy rnorm_multi() function.

Learn more in this vignette Simulate Correlated Variables:

The rnorm_multi() function makes multiple normally distributed vectors with specified parameters and relationships.

Quick example

For example, the following creates a sample that has 100 observations of 3 variables, drawn from a population where A has a mean of 0 and SD of 1, while B and C have means of 20 and SDs of 5. A correlates with B and C with r = 0.5, and B and C correlate with r = 0.25.
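The code for this example did not survive in the post, so the call below is a reconstruction from the description above rather than a verbatim copy of the vignette. It assumes the faux package is installed, and that r is given as the upper-triangle correlations in the order A-B, A-C, B-C.

```r
library(faux)  # install.packages("faux") if it is not installed

dat <- rnorm_multi(n = 100,
                   mu = c(0, 20, 20),        # means of A, B, C
                   sd = c(1, 5, 5),          # SDs of A, B, C
                   r = c(0.5, 0.5, 0.25),    # A-B, A-C, B-C
                   varnames = c("A", "B", "C"),
                   empirical = FALSE)        # parameters describe the population

# Sample correlations will be near, but not exactly, the requested values
cor(dat)
```

With empirical = FALSE the requested values describe the population being sampled from; setting empirical = TRUE instead should make the sample itself match them.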
