Create a Cluster based on an ID

Hello everyone,
Thanks in advance for your help. I'm not very experienced in R. I need to create a Cluster based on an ID.

I have a df like this:
[Screenshot 2021-10-08 at 11.59.07]

I would like to get this:
[Screenshot 2021-10-08 at 11.59.14]

I tried doing something like this:

df %>%
  group_by(ID) %>%
  summarise(Cluster = paste(sort(unique(Cluster)), collapse = ", "))

But the result is this:
[Screenshot 2021-10-08 at 12.04.12]

What am I doing wrong?

Thank you

Your code is fine and produces the expected result (at least on my setup):

library(tidyverse)
df <- tibble(ID=c(1,1,1,4,5,6,6,5,1),
                 Cluster=c(1,1,4,8,2,4,7,8,7))

df%>% 
  group_by(ID) %>% 
  summarise(Cluster=paste0(sort(unique(Cluster)),
                           collapse=", "))
# A tibble: 4 x 2
     ID Cluster
  <dbl> <chr>  
1     1 1, 4, 7
2     4 8      
3     5 2, 8   
4     6 4, 7

I don't understand what happened. The code worked before, and now it doesn't work anymore! Could it be a library problem?

When I type group_by, the editor offers me the dplyr library and not tidyverse.

tidyverse is a way to load dplyr and several related packages all at once; it's just for convenience. dplyr::group_by is correct.
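
A minimal sketch of that relationship, assuming the tidyverse package is installed. The variable names `attached` and `same_function` are only for illustration:

```r
# Attaching tidyverse also attaches dplyr, so both spellings reach the same function.
suppressPackageStartupMessages(library(tidyverse))

# dplyr should now be on the search path.
attached <- "package:dplyr" %in% search()

# With nothing else masking it, the bare name and the namespaced name
# refer to the same object.
same_function <- identical(group_by, dplyr::group_by)

print(attached)
print(same_function)
```

If `same_function` comes back FALSE in a given session, some other attached package is supplying its own group_by.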

With this code:

library(tidyverse)

df <- tibble(ID=c(1,1,1,4,5,6,6,5,1), Cluster=c(1,1,4,8,2,4,7,8,7))

df %>%
  group_by(ID) %>%
  summarise(Cluster = paste(sort(unique(Cluster)), collapse = ", "))

In my configuration I get this:

  
# A tibble: 1 × 1
  Cluster      
  <chr>        
1 1, 2, 4, 7, 8

Could it be caused by a different variable type?

What different variable type are you referring to?
I can only reproduce your one-row result if I omit group_by from the code entirely. Otherwise, it groups as expected.

What do you see when you type:
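
The difference between the two cases can be sketched as follows, using the same data as above. The names `grouped` and `ungrouped` are illustrative, and the functions are called with an explicit dplyr:: prefix so that nothing else in the session can interfere:

```r
suppressPackageStartupMessages(library(dplyr))

df <- tibble(ID = c(1, 1, 1, 4, 5, 6, 6, 5, 1),
             Cluster = c(1, 1, 4, 8, 2, 4, 7, 8, 7))

# With group_by: one summary row per distinct ID.
grouped <- df %>%
  dplyr::group_by(ID) %>%
  dplyr::summarise(Cluster = paste(sort(unique(Cluster)), collapse = ", "))

# Without group_by: every Cluster value collapses into a single row.
ungrouped <- df %>%
  dplyr::summarise(Cluster = paste(sort(unique(Cluster)), collapse = ", "))

print(dim(grouped))
print(dim(ungrouped))
print(ungrouped$Cluster)
```

A 1 x 1 result therefore suggests that the grouping is being lost or ignored somewhere between group_by and summarise.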

getAnywhere(group_by)

I see this:


> getAnywhere(group_by)
A single object matching ‘group_by’ was found
It was found in the following places
  package:dplyr
  namespace:dplyr
with value

function (.data, ..., .add = FALSE, .drop = group_by_drop_default(.data)) 
{
    UseMethod("group_by")
}
<bytecode: 0x7fd68f036bc8>
<environment: namespace:dplyr>

When I create the df (as a tibble), the variable types are different from yours.


# A tibble: 9 × 2
     ID Cluster
  <dbl>   <dbl>
1     1       1
2     1       1
3     1       4
4     4       8
5     5       2
6     6       4
7     6       7
8     5       8
9     1       7

That seems entirely fine.
What do you see when you run:

df%>%
  group_by(ID)

I get

# A tibble: 9 x 2
# Groups:   ID [4]
     ID Cluster
  <dbl>   <dbl>
1     1       1
2     1       1
3     1       4
4     4       8
5     5       2
6     6       4
7     6       7
8     5       8
9     1       7

which shows Groups: ID [4], as expected.

That's not different from mine. Mine are both dbl, i.e. double, as well.
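
The same check can be done without reading the printed header. This is a small sketch using dplyr's grouping helpers; the name `by_id` is illustrative:

```r
suppressPackageStartupMessages(library(dplyr))

df <- tibble(ID = c(1, 1, 1, 4, 5, 6, 6, 5, 1),
             Cluster = c(1, 1, 4, 8, 2, 4, 7, 8, 7))

by_id <- df %>% group_by(ID)

# Is the result a grouped data frame, on which column, and with how many groups?
print(is_grouped_df(by_id))
print(group_vars(by_id))
print(n_groups(by_id))
```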

Me too. When I run

df%>% 
  group_by(ID)

I see:


# A tibble: 9 × 2
# Groups:   ID [4]
     ID Cluster
  <dbl>   <dbl>
1     1       1
2     1       1
3     1       4
4     4       8
5     5       2
6     6       4
7     6       7
8     5       8
9     1       7

But if I run the whole code:

tibble(df%>% 
  group_by(ID) %>% 
  summarise(Cluster=paste(sort(unique(Cluster)),
                           collapse=", ")))

I see:

# A tibble: 1 × 1
  Cluster      
  <chr>        
1 1, 2, 4, 7, 8

You get a 4x2 tibble and I get a 1x1.

Your code is substantially different from mine, because the placement of brackets is extremely important.
If you want df to be a tibble, close the bracket before the first pipe.
Doing elaborate manipulations inside a tibble creation call can produce anomalous outcomes.

I do not understand. I have now used the same brackets you used, but the result is still different. My configuration produces a 1x1 df and does not show the ID.


library(tidyverse)
df <- tibble(ID=c(1,1,1,4,5,6,6,5,1),
             Cluster=c(1,1,4,8,2,4,7,8,7))

df%>% 
  group_by(ID) %>% 
  summarise(Cluster=paste0(sort(unique(Cluster)),
                           collapse=", "))
     Cluster
1 1, 2, 4, 7, 8

Now try restarting your PC.

I SOLVED THE PROBLEM.
For anyone who reads this thread: I solved it by restarting the computer. I noticed that the software asked me to choose a system setting (in my case I set UTF-8). After the reboot everything worked.
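
The thread does not establish why the restart helped; a restart does at least give a fresh R session with a clean set of attached packages. One situation that can give this kind of ungrouped, one-row result is another attached package masking summarise (plyr, for example, exports its own summarise, which as far as I know does not use dplyr groups). Whether that happened here is only an assumption. A minimal sketch for checking where summarise comes from, and for calling dplyr's version explicitly; the names `summarise_source` and `result` are illustrative:

```r
suppressPackageStartupMessages(library(dplyr))

df <- tibble(ID = c(1, 1, 1, 4, 5, 6, 6, 5, 1),
             Cluster = c(1, 1, 4, 8, 2, 4, 7, 8, 7))

# Which package does the bare name summarise resolve to in this session?
# In a fresh session with only dplyr attached this is "dplyr".
summarise_source <- environmentName(environment(summarise))
print(summarise_source)

# Namespaced calls use dplyr's functions regardless of what else is attached.
result <- df %>%
  dplyr::group_by(ID) %>%
  dplyr::summarise(Cluster = paste(sort(unique(Cluster)), collapse = ", "))
print(result)
```

If `summarise_source` is anything other than "dplyr", writing dplyr::summarise explicitly, or restarting R and attaching only the packages that are needed, are two ways to get the grouped result back.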

Thanks to #nirgrahamuk for helping me with this problem. I am available if anyone needs anything.
