Merging Rows within one Table

Hello,
I have one very large table, and within it are multiple rows of the same ID number all with different data in columns (some have multiple data in the same columns that need to be added together). I need to merge all of each ID number into one row for each one, with the sum of all data in each column going across.
I have tried using merge() and that has not worked. I also tried the code try1 <- table8 %>% group_by('ID') %>% summarize_all(is.numeric, sum, .groups="drop") and that just added all values into one row without separating out the different ID's.

Any help would be severely appreciated .
Thank you for your time!

try: group_by(ID) I think adding the `` confused it.

# prepare a sample data set
# 7:21 PM Monday, October 9, 2023
set.seed(1111)
num_pool <- seq_len(10)
id_pool <- c(4226300,
             4226046,
             4225702,
             4223923,
             4225365,
             4224713,
             4223881,
             4223464)

foster <- sample(num_pool, 10, replace = TRUE)
aunt <- sample(num_pool, 10, replace = TRUE)
father <- sample(num_pool, 10, replace = TRUE)
mother <- sample(num_pool, 10, replace = TRUE)
id <- sample(id_pool, 10, replace = TRUE)

sample_dataset <-
  data.frame(id,
             foster,
             aunt,
             father,
             mother) |>
  dplyr::mutate(code = "withdrew") |>
  dplyr::relocate(code, .before = foster) |> 
  dplyr::arrange(id)

print(sample_dataset)
#>         id     code foster aunt father mother
#> 1  4223464 withdrew     10    2      2     10
#> 2  4223464 withdrew      1    4      2      9
#> 3  4223881 withdrew      1    7     10      4
#> 4  4223923 withdrew     10    5      3      7
#> 5  4224713 withdrew      4    2      5      7
#> 6  4225365 withdrew      6    5      1     10
#> 7  4226046 withdrew      6    1      4      4
#> 8  4226300 withdrew      6    6      2      8
#> 9  4226300 withdrew      2    8      6      1
#> 10 4226300 withdrew      7    4     10      8

# merge rows into a single row by ID
sample_dataset |>
  dplyr::group_by(id, code) |> 
  dplyr::summarise_if(is.numeric, sum)
#> # A tibble: 7 × 6
#> # Groups:   id [7]
#>        id code     foster  aunt father mother
#>     <dbl> <chr>     <int> <int>  <int>  <int>
#> 1 4223464 withdrew     11     6      4     19
#> 2 4223881 withdrew      1     7     10      4
#> 3 4223923 withdrew     10     5      3      7
#> 4 4224713 withdrew      4     2      5      7
#> 5 4225365 withdrew      6     5      1     10
#> 6 4226046 withdrew      6     1      4      4
#> 7 4226300 withdrew     15    18     18     17

Created on 2023-10-09 with reprex v2.0.2

@fletcj3 I am creating a reproducible example here to demonstrate my solution to your question. I hope you find this helpful. If have any further questions, please let me know.

Hi! Thank you for your help, I really appreciate it. Sadly the codes did not do anything. I have attached a screenshot to show you the response I got from R. I copied your codes exactly. Any further advice?
Thank you!

Hi!
When I do not use the ' ' to highlight a variable, I get the error code that R cannot find the variable. Is this not a normal error?
Thanks!

R is case senstive. so id is different from ID

Hi,
Thank you for catching that mistake! I fixed the capitols in the variable codes. It is still not doing what I need though.
Thank you for your help!

sum is functioning as designed, because what is the sum of 1 , 2 , NA.
it is NA, because you don't know , so you dont know...
if you want ignore the NA's you pass the na.rm=TRUE argument
either

summarise_if(is.numeric,sum,na.rm=TRUE)

or perhaps more explicitly

summarise_if(is.numeric,\(x)sum(x,na.rm=TRUE))

Hi!
Thank you! It worked!

Hi!
Thank you guys for all your help! You saved me so many tears of frustration! Thank you!

This topic was automatically closed 42 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.