I want to perform statistical inference on a two-way table of frequencies (i.e., a contingency table), so I'd like to use the chi-squared test of independence. I've found a very nice streamlined procedure with the `{infer}` package from `tidymodels`.

However, I cannot follow the tutorial's example given my own data. The tutorial assumes that the starting point is a dataset with two categorical columns (i.e., of type `factor`):

```
library(dplyr, warn.conflicts = FALSE)
data(ad_data, package = "modeldata")
ad_data_gen_class <- ad_data |>
  select(Genotype, Class)
ad_data_gen_class
#> # A tibble: 333 x 2
#>    Genotype Class
#>    <fct>    <fct>
#>  1 E3E3     Control
#>  2 E3E4     Control
#>  3 E3E4     Control
#>  4 E3E4     Control
#>  5 E3E3     Control
#>  6 E4E4     Impaired
#>  7 E2E3     Control
#>  8 E2E3     Control
#>  9 E3E3     Control
#> 10 E2E3     Impaired
#> # ... with 323 more rows
```
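For reference, this is roughly the workflow I'd like to reproduce. It is a sketch based on my reading of the `{infer}` documentation, not a verbatim copy of the tutorial; the number of permutations (`reps = 500`) and the seed are arbitrary choices of mine:

```
library(dplyr, warn.conflicts = FALSE)
library(infer)

data(ad_data, package = "modeldata")
ad_data_gen_class <- ad_data |>
  select(Genotype, Class)

# Observed chi-squared statistic for Genotype vs. Class
obs_stat <- ad_data_gen_class |>
  specify(Genotype ~ Class) |>
  calculate(stat = "Chisq")

# Null distribution by permutation (reps and seed are arbitrary)
set.seed(1)
null_dist <- ad_data_gen_class |>
  specify(Genotype ~ Class) |>
  hypothesize(null = "independence") |>
  generate(reps = 500, type = "permute") |>
  calculate(stat = "Chisq")

# Share of permuted statistics at least as large as the observed one
p_value <- null_dist |>
  get_p_value(obs_stat = obs_stat, direction = "greater")
```

Every step starts from `specify()` on a data frame with one row per observation, which is exactly the shape I don't have.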

However, *my* starting point is already a contingency table. In other words, imagine that the following `ad_data_xtab` is a *given*:

```
ad_data_xtab <-
  ad_data_gen_class |>
  table()
ad_data_xtab
#>         Class
#> Genotype Impaired Control
#>     E2E2        0       2
#>     E2E3        7      30
#>     E2E4        1       7
#>     E3E3       34     133
#>     E3E4       41      65
#>     E4E4        8       5
```

**My question:** given `ad_data_xtab` as the starting point of my analysis, how can I nevertheless use the `{infer}` procedure as demonstrated in the tutorial?

One way, I guess, would be to somehow "untable" `ad_data_xtab` back into `ad_data_gen_class`. This has at least two limitations:

- When "un-table-ing" `ad_data_xtab`, is it guaranteed that we get exactly `ad_data_gen_class`?
- Unlike `ad_data_xtab`, my real data's contingency table has much larger values for counts. If I am to "un-table" it, it would result in a *combinatorial explosion* of millions of rows, eating up my computer's memory (likely crashing it), for apparently no good reason.
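To make the "untable" idea concrete, here is a minimal base R sketch. It assumes the counts printed above, which I retype by hand so the snippet runs on its own; `long_counts` and `untabled` are names I made up for illustration:

```
# Counts copied from the printed ad_data_xtab (column-wise: Impaired, then Control)
ad_data_xtab <- as.table(matrix(
  c(0, 7, 1, 34, 41, 8,
    2, 30, 7, 133, 65, 5),
  ncol = 2,
  dimnames = list(
    Genotype = c("E2E2", "E2E3", "E2E4", "E3E3", "E3E4", "E4E4"),
    Class = c("Impaired", "Control")
  )
))

# Long format: one row per cell, with columns Genotype, Class, Freq
long_counts <- as.data.frame(ad_data_xtab)

# Repeat each cell's row Freq times; cells with a zero count produce no rows
row_index <- rep(seq_len(nrow(long_counts)), times = long_counts$Freq)
untabled <- long_counts[row_index, c("Genotype", "Class")]
rownames(untabled) <- NULL

# Cross-tabulating the expanded rows should reproduce the original counts
all(table(untabled) == ad_data_xtab)
```

As far as I can tell, this recovers the same cross-tabulation but not the original row order, so it is not literally `ad_data_gen_class`. It also allocates one row per observation, which is the memory problem described in the second bullet.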

What else can I do?