1) Dealing with NAs, and 2) designating subgroups for analyses, e.g., dunnTest

soleri · June 11, 2020, 9:36pm

library(reprex)
library(FSA)

FSA v0.8.30. See citation('FSA') if used in publication.

Run fishR() for related website and fishR('IFAR') for related book.

#the minimally representative data set
View(test2)
test2

# A tibble: 20 x 5
     REG   COM   NUM   AGE EDU
 1     1     1   101    63 2
 2     1     1   102    56 6
 3     1     1   103    33 3
 4     1     1   104    49 6
 5     1     1   105    65 3
 6     1     2   201    73 6
 7     1     2   202    70 5
 8     1     2   203    32 6
 9     1     2   204    33 6
10     1     2   205    49 6
11     2     5   501    67 3
12     2     5   502    81 3
13     2     5   503    42 6
14     2     5   504    55 Na
15     2     5   505    82 Na
16     2     6   601    75 1
17     2     6   602    62 0
18     2     6   603    63 0
19     2     6   604    67 2
20     2     6   605    79 0

#Note that there are two regions (REG), and across these there are four communities (COM). Each row is a household, identified by an individual number (NUM).
#The goal is to be able to conduct analyses at different levels: between REG, between all COM, and between COM within a region. This last designation is what I can't seem to get right.
#1st question: because of the NAs, I can't convert EDU from character to integer. How can I address the NA issue here?
EDU <- as.numeric(test2$EDU)
Warning message:
NAs introduced by coercion
class(test2$EDU)
[1] "character"
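
One possible way to handle this, sketched under assumptions. The warning suggests the missing values in EDU were entered as the text "Na", which R does not recognise as missing. The converted values are also being stored in a separate object called EDU rather than back in test2, which would explain why class(test2$EDU) still reports "character". The vector `edu_raw` below is a hypothetical stand-in for the column; the idea is to recode the text to a real NA first and then convert.

```r
# Hypothetical stand-in for test2$EDU: a character vector in which
# missing values were typed as the text "Na".
edu_raw <- c("2", "6", "3", "Na", "Na", "1", "0")

# Recode the text placeholders to a real missing value, then convert.
edu_clean <- edu_raw
edu_clean[edu_clean %in% c("Na", "NA", "")] <- NA
edu_num <- as.integer(edu_clean)   # no coercion warning now

class(edu_num)        # "integer"
sum(is.na(edu_num))   # 2

# For the real data, the result would need to be assigned back into
# the column rather than to a separate object, for example:
# test2$EDU <- edu_num
```

As far as I know, kruskal.test() with a formula drops rows containing NA under the default na.action, so the two households with missing EDU would simply be left out of that test.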

#I'll give examples using AGE, which is an integer
AGE <- as.numeric(test2$AGE)
class(test2$AGE)
[1] "integer"

```
kruskal.test(AGE ~ REG, data = test2)
```
Kruskal-Wallis rank sum test

data: AGE by REG
Kruskal-Wallis chi-squared = 4.025, df = 1, p-value = 0.04483

#these work
```
kruskal.test(AGE ~ COM, data = test2)
```
Kruskal-Wallis rank sum test

data: AGE by COM
Kruskal-Wallis chi-squared = 4.0894, df = 3, p-value = 0.252

#below is the same variable, but among COM within REG 1

```
kruskal.test(AGE ~ COM, REG == 1, data = test2)
```
Kruskal-Wallis rank sum test

data: AGE by COM
Kruskal-Wallis chi-squared = 0.011043, df = 1, p-value = 0.9163

#that works, as the df of 1 shows.
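
A sketch of why this call probably works: the formula method of kruskal.test() has a `subset` argument, and the unnamed REG == 1 appears to be matched to it by position. Naming the argument makes the intent explicit. The data frame below is rebuilt from the REG, COM and AGE columns shown in the post; `res_named` and `res_pos` are names introduced only for this illustration.

```r
# Rebuild the relevant columns of test2 from the table in the post.
test2 <- data.frame(
  REG = rep(c(1, 2), each = 10),
  COM = rep(c(1, 2, 5, 6), each = 5),
  AGE = c(63, 56, 33, 49, 65, 73, 70, 32, 33, 49,
          67, 81, 42, 55, 82, 75, 62, 63, 67, 79)
)

# Explicitly named subset: compare communities within region 1 only.
res_named <- kruskal.test(AGE ~ COM, data = test2, subset = REG == 1)

# The positional form used in the post, for comparison.
res_pos <- kruskal.test(AGE ~ COM, REG == 1, data = test2)

res_named$statistic   # matches the 0.011043 reported above
res_named$parameter   # df = 1 (two communities in region 1)
```

The same pattern with `subset = REG == 2` would compare communities 5 and 6.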
#2nd question: How do I apply the same restriction in other analyses? For example, I want to run Dunn's test among COM within REG 1. I tried the same designation, REG == 1, but it doesn't work.
```
DT <- dunnTest(AGE ~ COM, REG == 1,
               data = test2,
               method = "bh")
```

Warning messages:
1: COM was coerced to a factor.
2: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
3: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
4: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
5: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
6: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
7: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
8: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
9: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
10: In if (altp == FALSE) { :
the condition has length > 1 and only the first element will be used
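
A hedged sketch of one workaround. As far as I can tell from the FSA documentation, dunnTest() has no `subset` argument, so the unnamed REG == 1 is probably being matched to `two.sided`, which would account for the "condition has length > 1" warnings. Subsetting the data before the call avoids this. The names `reg1` and `DT` are illustrative, the data frame is rebuilt from the table in the post, and the dunnTest() call only runs if FSA is installed. Because each region contains only two communities, I am not certain how every FSA version handles a single pairwise comparison, so the call is wrapped in tryCatch().

```r
# Rebuild the relevant columns of test2 from the table in the post.
test2 <- data.frame(
  REG = rep(c(1, 2), each = 10),
  COM = rep(c(1, 2, 5, 6), each = 5),
  AGE = c(63, 56, 33, 49, 65, 73, 70, 32, 33, 49,
          67, 81, 42, 55, 82, 75, 62, 63, 67, 79)
)

# Subset first, then make COM a factor so that only the communities
# present in region 1 remain as levels.
reg1 <- subset(test2, REG == 1)
reg1$COM <- factor(reg1$COM)

nrow(reg1)          # households in region 1
levels(reg1$COM)    # communities in region 1

# Run Dunn's test on the subset, only if FSA is available.
if (requireNamespace("FSA", quietly = TRUE)) {
  DT <- tryCatch(
    FSA::dunnTest(AGE ~ COM, data = reg1, method = "bh"),
    error = function(e) conditionMessage(e)
  )
  print(DT)
}
```

Converting COM to a factor up front should also avoid the "COM was coerced to a factor" warning.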

#I'm brand new to RStudio, obviously! Your suggestions would be appreciated.
