See the FAQ: How to do a minimal reproducible example `reprex` for beginners. It would look something like the snippet below, which was generated with the `sample` function. Because the values are just random, the data would not be helpful for actually comparing score groups based on smoking.
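For illustration, random data of this shape could be generated roughly as follows. The seed, the object name `sketch_data`, and the equal sampling probabilities are assumptions, so this will not reproduce the exact values in the reprex.

```r
# A minimal sketch of random example data built with sample().
# The seed and equal probabilities are assumptions made for illustration.
set.seed(123)
n <- 100

sketch_data <- data.frame(
  smoked = sample(0:1, n, replace = TRUE),  # 1 = smoker, 0 = non-smoker
  scored = sample(1:6, n, replace = TRUE)   # score group from 1 to 6
)

head(sketch_data)
```

Because both columns are drawn independently, any apparent association between `smoked` and `scored` in such data is due to chance alone.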

A `p-value` is the probability, assuming the null hypothesis is true, that a calculated *test statistic* would be at least as extreme as the one actually observed. A "small" p-value is used to evaluate what is termed the "null" hypothesis at a given level of the unfortunately-named "significance." Do not confuse statistically significant with meaningful. Honesty to one's self requires choosing a level of significance, called \alpha and conventionally set at `0.05`, *before* running the test.

An \alpha of `0.05` is some evidence and, given the nature of the particular data, may be as strict a threshold as the associations permit. I call it *passing the laugh test: ok, maybe there's something here*. But reflect. Take four 5-shot revolvers (20 chambers in all), load a single bullet into one chamber of one of them, and place them on a table. Have someone re-arrange them out of your sight. Pick one up and put it to your head. Would you pull the trigger knowing that there is *only* a single chance in 20 that you **won't** live to tell the tale?
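To see what the one-in-20 figure means in practice, here is a small simulation sketch. It assumes two groups drawn from the same normal distribution, so the null hypothesis is true by construction. The seed, group size, and number of repetitions are arbitrary choices for illustration.

```r
# Simulate many t-tests where the null hypothesis is true by construction.
set.seed(42)      # arbitrary seed, for reproducibility only
alpha  <- 0.05
n_sims <- 2000    # number of simulated experiments

# Both groups come from the same distribution, so any "difference" is chance.
p_values <- replicate(n_sims, t.test(rnorm(30), rnorm(30))$p.value)

# Share of experiments that would be declared "significant" anyway
false_positive_rate <- mean(p_values < alpha)
false_positive_rate
```

The share of "significant" results should land close to \alpha, which is the point: with a threshold of `0.05`, roughly one test in 20 raises a false alarm even when nothing is going on.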

OK, so you run a test and get a test statistic with a p-value of 0.02, for example. Because that is below an \alpha of 0.05, one is said to *reject the null hypothesis* in favor of the *alternative hypothesis* (i.e., the opposite of the null hypothesis). But if the result is 0.08, for example, that tells you that you cannot reject the null hypothesis. (We'll get to that.) That's called *failing to reject the null hypothesis*.
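The conventional decision rule compares the p-value with the \alpha chosen in advance. A minimal sketch, using the two example p-values above and an assumed \alpha of `0.05`:

```r
alpha    <- 0.05            # chosen before running the test
p_values <- c(0.02, 0.08)   # the two example results discussed above

# Reject the null only when the p-value falls below alpha
decision <- ifelse(p_values < alpha,
                   "reject the null hypothesis",
                   "fail to reject the null hypothesis")

data.frame(p_value = p_values, decision = decision)
```

Note that failing to reject the null is not the same as showing the null is true; it only means the data did not provide enough evidence against it at the chosen level.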

OK, so different statistical tests use different test statistics and null hypotheses. A simple one is the `t.test`, shown below. The null hypothesis in the example is that there is no difference between the proportion of smokers (the mean of `smoked`) in the `scored == 1` group and the `scored == 2` group. How do you interpret the output?

Here's a recent article on selecting statistical tests

```
my_data <- data.frame(
smoked =
c(1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1),
scored =
c(4, 1, 3, 3, 4, 5, 3, 1, 3, 1, 5, 6, 2, 6, 3, 6, 6, 4, 1, 1, 3, 3, 5, 5, 4, 4, 6, 3, 3, 5, 2, 4, 1, 2, 3, 3, 1, 1, 2, 6, 1, 3, 6, 1, 1, 6, 1, 4, 4, 2, 1, 1, 5, 5, 2, 2, 3, 5, 1, 2, 5, 3, 6, 3, 2, 4, 4, 3, 1, 1, 1, 6, 5, 4, 6, 3, 1, 2, 1, 3, 5, 1, 3, 5, 1, 3, 2, 6, 6, 3, 1, 3, 2, 1, 3, 1, 4, 4, 3, 6))
head(my_data)
#>   smoked scored
#> 1      1      4
#> 2      0      1
#> 3      0      3
#> 4      1      3
#> 5      1      4
#> 6      0      5
# Welch two-sample t-test: mean of smoked in the scored == 1 group vs. the scored == 2 group
with(my_data, t.test(smoked[scored == 1], smoked[scored == 2]))
#>
#>  Welch Two Sample t-test
#>
#> data:  smoked[scored == 1] and smoked[scored == 2]
#> t = -0.77865, df = 20.573, p-value = 0.445
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -0.5143832  0.2343832
#> sample estimates:
#> mean of x mean of y
#>      0.36      0.50
```
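One way to read this output, sketched under the assumption that \alpha was fixed at `0.05` beforehand. The numbers are copied from the printed result above; the variable names are only illustrative.

```r
alpha <- 0.05   # assumed significance level, chosen before the test

# Values copied from the printed t.test output
t_stat <- -0.77865
dof    <- 20.573
ci     <- c(-0.5143832, 0.2343832)

# Two-sided p-value: probability of a t statistic at least this extreme
# in either tail, if the null hypothesis were true
p_value <- 2 * pt(-abs(t_stat), dof)

reject_null      <- p_value < alpha        # is the evidence strong enough?
ci_contains_zero <- ci[1] < 0 && ci[2] > 0 # is "no difference" still plausible?

c(p_value = p_value, reject_null = reject_null, ci_contains_zero = ci_contains_zero)
```

The recomputed p-value matches the reported 0.445, which is well above 0.05, and the 95 percent confidence interval spans zero. Both say the same thing: we fail to reject the null hypothesis of no difference between the two groups, which is what one would expect from randomly generated data.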