Why do I get two different p-values?

juandmaz · April 27, 2023, 8:44pm

I have some questions about p-values, and I made this reprex to illustrate them.

I have a 100 × 3 tibble. Here is the head of the table:

# A tibble: 6 × 3
  A     B     C    
  <fct> <fct> <fct>
1 NA    car   TRUE 
2 NA    bike  FALSE
3 Red   bike  TRUE 
4 NA    bike  FALSE
5 Blue  bike  FALSE
6 Blue  car   TRUE

The value counts for each column are:

$A
.
Blue  Red <NA> 
  36   29   35 

$B
.
bike  car <NA> 
  34   31   35 

$C
.
FALSE  TRUE  <NA> 
   58    42     0

I need to know how to get the p-value excluding the NAs. The problem is that I don't know how the packages calculate it. I use two packages, tableone and gtsummary (tbl_summary()), and neither specifies how NAs are handled.

Also, the two packages give me different p-values, and the strange thing is that even when I exclude the NAs they still give me different p-values.

Like this:

Not excluding the NAs:

With tableone:

library(dplyr)
library(tableone)

# `base` is the 100 x 3 tibble shown above
base %>%
  CreateTableOne(vars = c("A", "B"),
                 strata = "C",
                 includeNA = FALSE,
                 addOverall = TRUE) %>%
  print(showAllLevels = TRUE,
        explain = FALSE)


 level Overall     FALSE      TRUE       p      test
  n       100         58         42                    
  A Blue   36 (55.4)  22 (57.9)  14 (51.9)   0.818     
    Red    29 (44.6)  16 (42.1)  13 (48.1)             
  B bike   34 (52.3)  17 (48.6)  17 (56.7)   0.687     
    car    31 (47.7)  18 (51.4)  13 (43.3)

With gtsummary:

library(dplyr)
library(gtsummary)

# `base` is the 100 x 3 tibble shown above
base %>%
  tbl_summary(missing_text = "(Missing)",
              missing = "always",
              by = "C") %>%
  add_p() %>%
  add_overall() %>%
  as_tibble()


# A tibble: 8 × 5
  level     `**Overall**, N = 100` `**FALSE**, N = 58` `**TRUE**, N = 42` `**p-value**`
  <chr>     <chr>                  <chr>               <chr>              <chr>        
1 A         NA                     NA                  NA                 0.6          
2 Blue      36 (55%)               22 (58%)            14 (52%)           NA           
3 Red       29 (45%)               16 (42%)            13 (48%)           NA           
4 (Missing) 35                     20                  15                 NA           
5 B         NA                     NA                  NA                 0.5          
6 bike      34 (52%)               17 (49%)            17 (57%)           NA           
7 car       31 (48%)               18 (51%)            13 (43%)           NA           
8 (Missing) 35                     23                  12                 NA
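
A quick way to check whether the missing rows take part in the test is to rebuild columns A and C from the counts printed above and run the test in base R. This is only a sketch: the vectors are reconstructed from the table margins (row order is arbitrary), and it assumes the p-value comes from a Pearson chi-squared test, which may not be what either package actually does in every case.

```r
# Rebuild A and C from the counts above: Blue 22/14, Red 16/13, missing 20/15.
A <- factor(c(rep("Blue", 36), rep("Red", 29), rep(NA, 35)))
C <- factor(c(rep(c("FALSE", "TRUE"), times = c(22, 14)),   # rows where A is Blue
              rep(c("FALSE", "TRUE"), times = c(16, 13)),   # rows where A is Red
              rep(c("FALSE", "TRUE"), times = c(20, 15))))  # rows where A is NA

# table() drops NA by default, so only the 65 complete rows are counted.
tab <- table(A, C)
n_used <- sum(tab)

# chisq.test() on the raw vectors also keeps only complete cases,
# so both calls give the same p-value.
p_vectors <- chisq.test(A, C, correct = FALSE)$p.value
p_table   <- chisq.test(tab, correct = FALSE)$p.value

print(n_used)
print(round(c(p_vectors, p_table), 3))
```

The uncorrected p-value here is about 0.63, which rounds to the 0.6 shown by gtsummary. The percentages in both tables (for example 36/65 = 55.4%) also use 65 as the denominator, which suggests that the 35 missing rows are already left out of both the percentages and the test.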

And now excluding the NAs:

With tableone:

library(dplyr)
library(tableone)

# Keep only rows where both A and B are observed
base %>%
  filter(!is.na(A) & !is.na(B)) %>%
  CreateTableOne(vars = c("A", "B"),
                 strata = "C",
                 addOverall = TRUE) %>%
  print(showAllLevels = TRUE,
        explain = FALSE)

 level Overall    FALSE      TRUE       p      test
  n       38         20         18                    
  A Blue  22 (57.9)  12 (60.0)  10 (55.6)   1.000     
    Red   16 (42.1)   8 (40.0)   8 (44.4)             
  B bike  16 (42.1)   6 (30.0)  10 (55.6)   0.206     
    car   22 (57.9)  14 (70.0)   8 (44.4)

With gtsummary:

library(dplyr)
library(gtsummary)

# Keep only rows where both A and B are observed
base %>%
  filter(!is.na(A) & !is.na(B)) %>%
  tbl_summary(missing_text = "(Missing)",
              missing = "always",
              by = "C") %>%
  add_p() %>%
  add_overall() %>%
  as_tibble()


# A tibble: 8 × 5
  level     `**Overall**, N = 38` `**FALSE**, N = 20` `**TRUE**, N = 18` `**p-value**`
  <chr>     <chr>                 <chr>               <chr>              <chr>        
1 A         NA                    NA                  NA                 0.8          
2 Blue      22 (58%)              12 (60%)            10 (56%)           NA           
3 Red       16 (42%)              8 (40%)             8 (44%)            NA           
4 (Missing) 0                     0                   0                  NA           
5 B         NA                    NA                  NA                 0.11         
6 bike      16 (42%)              6 (30%)             10 (56%)           NA           
7 car       22 (58%)              14 (70%)            8 (44%)            NA           
8 (Missing) 0                     0                   0                  NA

Why do the two packages give me different p-values? And why do they still differ when I exclude the NAs? Could someone help me?
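
One possible explanation can be checked in base R using only the counts from the filtered tables above. The sketch assumes that tableone's default test for categorical variables is chisq.test() with the continuity correction, and that gtsummary's add_p() defaults to a chi-squared test without the correction when expected cell counts are large enough. Both assumptions should be confirmed against the package documentation. The helper p_both() is a hypothetical name introduced here for illustration.

```r
# 2 x 2 counts copied from the filtered tables (rows: levels, columns: C).
tab_A <- matrix(c(12, 8, 10, 8), nrow = 2,
                dimnames = list(A = c("Blue", "Red"), C = c("FALSE", "TRUE")))
tab_B <- matrix(c(6, 14, 10, 8), nrow = 2,
                dimnames = list(B = c("bike", "car"), C = c("FALSE", "TRUE")))

# Hypothetical helper: chi-squared p-value with and without Yates' correction.
p_both <- function(tab) {
  c(corrected   = chisq.test(tab, correct = TRUE)$p.value,
    uncorrected = chisq.test(tab, correct = FALSE)$p.value)
}

p_A <- p_both(tab_A)
p_B <- p_both(tab_B)
print(round(p_A, 3))
print(round(p_B, 3))
```

Under these assumptions, the corrected values (1.000 and 0.206) match the tableone output and the uncorrected values (about 0.78 and 0.11) match the gtsummary output, so the gap would come from the continuity correction rather than from NA handling. The p-values also change after filtering because filter(!is.na(A) & !is.na(B)) keeps only the 38 rows where both columns are observed, whereas without the filter each variable is tested on its own 65 non-missing rows.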
