Apparently, I didn't pay enough attention in categorical analysis class. I'm headed for calculating odds, odds ratios, and eventually (but not here) the Cochran-Mantel-Haenszel Test. My problem is that the `odds()`

function does not work as expected.

Some definitions:

*Odds* is a way of expressing probabilities, with \text{odds} = \frac{\pi}{1-\pi}. That means p=\frac{1}{2} gives odds of 1, i.e. odds are "even", "50-50".

An *odds ratio* \theta is exactly what it sounds like: the ratio of two odds. It measures the odds that an outcome will occur under a condition compared to the odds of the outcome occurring without the condition. For example, some joint injuries as young adults increase the chances of developing arthritis as an older adult. The odds ratio for this situation would be written

So I'm using the Whickham data set, which looked at long-term survival based on smoking. Researchers recorded a person's age and yes/no smoke levels. Twenty years later, they found the same people and (after doing proper epidemiological adjustments, etc.) recorded whether they were alive or dead. It's in the `mosaicData`

package. I'm going to create a factor variable that groups the ages 18–34, 35–64, and 64+, and store it in `WhickTable`

.

```
library(mosaicData) ## Use mosaicData::Whickham to avoid loading packages
Whickham$AgeRange <- cut(Whickham$age, c(18, 35, 65, Inf), include.lowest = TRUE, labels = c("18 – 34", "35 – 64", "65+"))
WhickTable <- xtabs( ~ smoker + outcome + AgeRange, data=Whickham)
```

The 18–34 group in `WhickTable`

is

```
, , AgeRange = 18 – 34
outcome
smoker Alive Dead
No 236 7
Yes 182 5
```

To compute the odds for being Alive in the 18–34 group, calculate \frac{\text{Number present}}{\text{Number absent}} in the group:

The odds of being Alive + No Smoke is \frac{36}{7} = 33.7143.

The odds of being Alive + Smoke is \frac{182}{5} = 36.4. (I know this seems counterintuitive, but this is also a great Simpson's Paradox example).

Therefore the odds ratio is

and for the epidemiologists our there, the log odds ratio:

...and we could do that for each of the three groups. Note that the odds are in the 30s and the odds ratio is 0.926217.

Tons of good categorical stuff is in Michael Friendly's `vcd`

package, including a function to compute the stuff above

```
library(vcd)
oddsratio(WhickTable, log=FALSE)
# odds ratios for smoker and outcome by AgeRange
#
# 18 – 34 35 – 64 65+
#0.9262166 1.2647636 1.1559140
```

Huzzah! We calculated the odds ratio correctly for the 18–34 group!

According to the `vcd::lodds()`

documentation, I should be able to calculate the odds (33.7143 and 36.4) using the `odds()`

command.

```
> odds(WhickTable, log=FALSE)
# odds for smoker by outcome, AgeRange
#
# AgeRange
#outcome 18 – 34 35 – 64 65+
# Alive 1.296703 0.9450980 4.166667
# Dead 1.400000 0.7472527 3.604651
```

Nope. maybe I should try the `lodds()`

command, intended for calculating log odds, but set `log = FALSE`

```
lodds(WhickTable, log=FALSE)
# odds for smoker by outcome, AgeRange
#
# AgeRange
#outcome 18 – 34 35 – 64 65+
# Alive 1.296703 0.9450980 4.166667
# Dead 1.400000 0.7472527 3.604651
```

Looks like `odds()`

and `lodds()`

do the same thing.

Any idea why `odds()`

isn't giving me the non-logged odds? I'll bet something's staring me in the face, right there in the docs, but I'm not seeing it.