Why do tableone and gtsummary give me 2 different p-values?

I have some doubts about how the p-value is calculated, so I made this reprex to show them.

I have a 100x3 tibble. Here is the head of the table:

# A tibble: 6 × 3
  A     B     C    
  <fct> <fct> <fct>
1 NA    car   TRUE 
2 NA    bike  FALSE
3 Red   bike  TRUE 
4 NA    bike  FALSE
5 Blue  bike  FALSE
6 Blue  car   TRUE 

The value counts for each column are:

$A
.
Blue  Red <NA> 
  36   29   35 

$B
.
bike  car <NA> 
  34   31   35 

$C
.
FALSE  TRUE  <NA> 
   58    42     0 

I need to know how to get the p-value excluding the NAs. The problem is that I don't know how the packages calculate it. I use 2 packages, tableone (CreateTableOne()) and gtsummary (tbl_summary() with add_p()), and neither says which test it uses.

Also, the 2 packages give me different p-values, and the strange thing is that if I exclude the NAs myself they still give me 2 different p-values.

Like this:

Not excluding the NAs:

With tableone:

library(dplyr)     # %>% and filter()
library(tableone)  # CreateTableOne()

base %>%
  CreateTableOne(vars = c("A", "B"),
                 strata = "C",
                 includeNA = FALSE,
                 addOverall = TRUE) %>%
  print(showAllLevels = TRUE,
        explain = FALSE)


 level Overall     FALSE      TRUE       p      test
  n       100         58         42                    
  A Blue   36 (55.4)  22 (57.9)  14 (51.9)   0.818     
    Red    29 (44.6)  16 (42.1)  13 (48.1)             
  B bike   34 (52.3)  17 (48.6)  17 (56.7)   0.687     
    car    31 (47.7)  18 (51.4)  13 (43.3) 


With gtsummary:

library(gtsummary)  # tbl_summary(), add_p(), add_overall()

base %>%
  tbl_summary(missing_text = "(Missing)",
              missing = "always",
              by = "C") %>%
  add_p() %>%
  add_overall() %>%
  as_tibble()  # as.tibble() is deprecated


# A tibble: 8 × 5
  level     `**Overall**, N = 100` `**FALSE**, N = 58` `**TRUE**, N = 42` `**p-value**`
  <chr>     <chr>                  <chr>               <chr>              <chr>        
1 A         NA                     NA                  NA                 0.6          
2 Blue      36 (55%)               22 (58%)            14 (52%)           NA           
3 Red       29 (45%)               16 (42%)            13 (48%)           NA           
4 (Missing) 35                     20                  15                 NA           
5 B         NA                     NA                  NA                 0.5          
6 bike      34 (52%)               17 (49%)            17 (57%)           NA           
7 car       31 (48%)               18 (51%)            13 (43%)           NA           
8 (Missing) 35                     23                  12                 NA     

And now excluding the NAs:

With tableone:

base %>%
  filter(!is.na(A) & !is.na(B)) %>%  # keep only rows complete in both A and B
  CreateTableOne(vars = c("A", "B"),
                 strata = "C",
                 addOverall = TRUE) %>%
  print(showAllLevels = TRUE,
        explain = FALSE)

 level Overall    FALSE      TRUE       p      test
  n       38         20         18                    
  A Blue  22 (57.9)  12 (60.0)  10 (55.6)   1.000     
    Red   16 (42.1)   8 (40.0)   8 (44.4)             
  B bike  16 (42.1)   6 (30.0)  10 (55.6)   0.206     
    car   22 (57.9)  14 (70.0)   8 (44.4)  

With gtsummary:

base %>%
  filter(!is.na(A) & !is.na(B)) %>%  # keep only rows complete in both A and B
  tbl_summary(missing_text = "(Missing)",
              missing = "always",
              by = "C") %>%
  add_p() %>%
  add_overall() %>%
  as_tibble()  # as.tibble() is deprecated


# A tibble: 8 × 5
  level     `**Overall**, N = 38` `**FALSE**, N = 20` `**TRUE**, N = 18` `**p-value**`
  <chr>     <chr>                 <chr>               <chr>              <chr>        
1 A         NA                    NA                  NA                 0.8          
2 Blue      22 (58%)              12 (60%)            10 (56%)           NA           
3 Red       16 (42%)              8 (40%)             8 (44%)            NA           
4 (Missing) 0                     0                   0                  NA           
5 B         NA                    NA                  NA                 0.11         
6 bike      16 (42%)              6 (30%)             10 (56%)           NA           
7 car       22 (58%)              14 (70%)            8 (44%)            NA           
8 (Missing) 0                     0                   0                  NA    

Why do the two packages give me two different p-values? And why do they still differ after I exclude the NAs? Could someone help me?
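
One way to check what is going on, using only the counts printed in the tables above: as far as I understand the defaults, tableone runs chisq.test() with the continuity (Yates) correction for categorical variables, while gtsummary's add_p() runs it without the correction when the expected cell counts are large enough. The sketch below is only a base R check of that assumption. The helper make_tab() and the table names are made up for illustration. Note also that in the "not excluding" runs the percentages are out of 65, so the NAs already seem to be dropped per variable, whereas filter() keeps only the 38 rows that are complete in both A and B.

```r
# Hypothetical helper: build a 2x2 table, rows = levels, columns = C.
make_tab <- function(counts, levels) {
  matrix(counts, nrow = 2, byrow = TRUE,
         dimnames = list(level = levels, C = c("FALSE", "TRUE")))
}

# Counts copied from the printed tables in this post.
tabs <- list(
  A_all      = make_tab(c(22, 14, 16, 13), c("Blue", "Red")),  # NAs in A dropped (n = 65)
  B_all      = make_tab(c(17, 17, 18, 13), c("bike", "car")),  # NAs in B dropped (n = 65)
  A_complete = make_tab(c(12, 10,  8,  8), c("Blue", "Red")),  # after filter() (n = 38)
  B_complete = make_tab(c( 6, 10, 14,  8), c("bike", "car"))   # after filter() (n = 38)
)

# Chi-squared test with and without the Yates continuity correction.
p_values <- t(sapply(tabs, function(tab) {
  c(yates    = chisq.test(tab, correct = TRUE)$p.value,
    no_yates = chisq.test(tab, correct = FALSE)$p.value)
}))

round(p_values, 3)
```

If the assumption holds, the yates column should reproduce the tableone values (0.818, 0.687, 1.000, 0.206) and the no_yates column, once rounded the way gtsummary displays it, should reproduce 0.6, 0.5, 0.8 and 0.11. So the two packages would be testing the same counts with two variants of the same test, and the p-values change after filter() because the sample itself changes from 65 rows per variable to 38 complete rows.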
