Hi,

I am using the distinctive() function to remove duplicate rows in my dataframe. However, when I use a similar function in Excel, it removes a different number of duplicates than in RStudio. Does anyone know why this is happening, or which program is more accurate to use in this scenario?

Welcome to the community!

If I want to remove duplicate rows from a `dataframe` in R, I use `base::unique`.
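As a minimal sketch of what that looks like (the data frame `df` and its columns are made up for illustration, not taken from the real data set):

```r
# Made-up example data: the third row is an exact duplicate of the first.
df <- data.frame(
  id   = c(1, 2, 1, 3),
  name = c("a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# unique() on a data frame keeps the first occurrence of each distinct row.
df_unique <- unique(df)

nrow(df)         # number of rows before removing duplicates
nrow(df_unique)  # number of rows after removing duplicates

# duplicated() flags the rows that unique() drops, which is handy for inspection.
df[duplicated(df), ]
```

Note that a row only counts as a duplicate here if every column matches exactly, including letter case and any stray whitespace in text columns.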

I'm not aware of a function called `distinctive` in base R, and a quick Google search was in vain. Can you please mention which package this function comes from?

It would also be very helpful if you could share a small part of the data set (say `df`), and the different results you obtain (say `df_excel` and `df_R`), in a copy-paste friendly format.

In case you don't know how to do it, there are many options, which include:
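For instance, one option I'm sure exists in base R is `dput()`. This is only a sketch with a made-up `df`; in practice you would call it on a few rows of your own data:

```r
# Made-up example data standing in for the real data set.
df <- data.frame(
  id   = c(1, 2, 1),
  name = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object, so others can paste it
# straight into their own session. head() limits it to the first few rows.
dput(head(df, 3))
```

Whatever `dput()` prints can be pasted into a post as is, and readers can assign it to a variable to get the same data frame back.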

Thank you for your reply!

Distinctive comes from the tidyverse package. However, I just used base::unique to remove duplicate rows, and it gave the same result as distinctive.

Unfortunately, I cannot share a small part of the data set. If it helps, though, I am working with a very large data set of around 127000 rows and 44 columns. When I remove the duplicates in Excel, it leaves me with 101275 rows, whereas when I do it in RStudio, it leaves me with 101636 rows.

I cannot work out which one is correct. I understand if it is too difficult to help without the data set.

Thanks

Not quite true: the function is called `distinct()`, not "distinctive", and it comes from the `dplyr` package, which is part of the `tidyverse`.
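A minimal sketch of `distinct()`, assuming `dplyr` is installed and using a made-up `df`:

```r
library(dplyr)

# Made-up example data: the third row is an exact duplicate of the first.
df <- data.frame(
  id   = c(1, 2, 1, 3),
  name = c("a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# With no columns named, distinct() considers all columns,
# keeping the first occurrence of each distinct row.
df_distinct <- distinct(df)

# For this data the row count matches what base::unique() returns.
nrow(df_distinct) == nrow(unique(df))
```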

About your issue: if I were you, I would import the unique values from Excel and perform an anti-join with the result of `distinct()` in R. That way I could take a look at the difference.
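Roughly, that could look like the sketch below. The data frames `df_R` and `df_excel` and their columns are made-up stand-ins; in practice `df_R` would be the output of `distinct()` on your data, and `df_excel` would be the de-duplicated sheet imported from Excel (for example with `readxl::read_excel()`).

```r
library(dplyr)

# Hypothetical stand-in for the result of distinct() in R.
df_R <- data.frame(
  id   = c(1, 2, 3, 4),
  name = c("a", "b", "c", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the de-duplicated rows imported from Excel.
df_excel <- data.frame(
  id   = c(1, 2, 3),
  name = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Rows that R kept but that are missing from the Excel result.
only_in_R <- anti_join(df_R, df_excel, by = c("id", "name"))

# Rows that Excel kept but that are missing from the R result.
only_in_excel <- anti_join(df_excel, df_R, by = c("id", "name"))

only_in_R
only_in_excel
```

Listing every column in `by` makes the comparison row-wise over the whole table. Looking at the rows in `only_in_R` should show what the two programs treat differently.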

Sorry, I am new to R, so I am still trying to understand all the terms and everything!

Thanks so much for your help! Using anti-join worked perfectly and I worked out where I went wrong.

Thanks again!

If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems. Here’s how to do it:
