Winsorizing many variables at once

Hello friends,

To winsorize a vector at the 2nd and 98th percentiles, I used:
library(DescTools)
d1 <- 1:100
Winsorize(d1, minval = NULL, maxval = NULL, probs = c(0.02, 0.98),
          na.rm = TRUE, type = 1)
That worked fine,

but I couldn't figure out how to winsorize multiple columns in a data frame at the same time.

df <- diamonds
I want to winsorize columns 5 through 8. I tried the code below,
but it didn't work for me:

for (i in 5:8) {
Winsorize(df[,i], minval = NULL, maxval = NULL, probs = c(0.02, 0.98),
na.rm = T, type = 1)
}

Any suggestions?
Thank you.

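One option is to use dplyr::across() inside mutate() to apply Winsorize() to several columns with one command. Note that the selection depth:z used here covers columns 5 through 10 of diamonds; to limit it to columns 5 through 8, select depth:x (or 5:8) instead.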

library(DescTools)
library(dplyr)
library(ggplot2)

diamonds %>% 
  mutate(across(depth:z, ~ Winsorize(.x,
                                     minval = NULL,
                                     maxval = NULL,
                                     probs = c(0.02, 0.98),
                                     na.rm = TRUE,
                                     type = 1)))
#> # A tibble: 53,940 × 10
#>    carat cut       color clarity depth table price     x     y     z
#>    <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>
#>  1  0.23 Ideal     E     SI2      61.5    55   463  4.14  4.16  2.55
#>  2  0.21 Premium   E     SI1      59.8    61   463  4.14  4.16  2.55
#>  3  0.23 Good      E     VS1      58.4    63   463  4.14  4.16  2.55
#>  4  0.29 Premium   I     VS2      62.4    58   463  4.2   4.23  2.63
#>  5  0.31 Good      J     SI2      63.3    58   463  4.34  4.35  2.75
#>  6  0.24 Very Good J     VVS2     62.8    57   463  4.14  4.16  2.55
#>  7  0.24 Very Good I     VVS1     62.3    57   463  4.14  4.16  2.55
#>  8  0.26 Very Good H     SI1      61.9    55   463  4.14  4.16  2.55
#>  9  0.22 Fair      E     VS2      64.7    61   463  4.14  4.16  2.55
#> 10  0.23 Very Good H     VS1      59.4    61   463  4.14  4.16  2.55
#> # … with 53,930 more rows

Created on 2023-02-23 with reprex v2.0.2
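
As for why the original for loop appeared to do nothing: it computes the winsorized values but never assigns them back to df, so the result is discarded on every iteration. In addition, diamonds is a tibble, so df[, i] returns a one-column tibble rather than a plain vector; df[[i]] extracts the vector. Here is a minimal base R sketch of the corrected loop. To keep it self-contained it does not call DescTools at all: winsorize_vec() is a hypothetical stand-in helper built on quantile(), and the small data frame is made up for illustration.

```r
# Hypothetical helper (not part of DescTools): clamp values to the
# lower and upper quantiles given in `probs`, using base R only.
winsorize_vec <- function(x, probs = c(0.02, 0.98)) {
  limits <- stats::quantile(x, probs = probs, na.rm = TRUE, type = 1)
  pmin(pmax(x, limits[[1]]), limits[[2]])
}

# Made-up data frame standing in for the real data.
df <- data.frame(id = 1:100, a = 1:100, b = (1:100)^2)

# Assign the result back to the column; without the assignment the
# loop computes the values and throws them away.
for (i in 2:3) {
  df[[i]] <- winsorize_vec(df[[i]])
}

# Loop-free alternative with the same effect on the selected columns:
# df[2:3] <- lapply(df[2:3], winsorize_vec)

range(df$a)
```

The same pattern should work with Winsorize() in place of the helper, as long as each column is extracted with df[[i]] and the result is assigned back.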
