Sum if variable name

Hi, I have following simple df:


source <- data.frame(
             stringsAsFactors = FALSE,
                  Employee.ID = c("aaa",
                                  "bbb","sss","ccc","fff","ffg","gedd"),
                    Age.Group = c("20-30",
                                  "30-40","20-30","40-50","40-50","20-30",
                                  "20-30"),
            Action_Positivity = c(100, 0, 100, 0, 0, 100, 100),
        Boundaries_Positivity = c(0, 0, 100, 100, 100, 0, 100),
                       Growth = c(3, 4, 2, 4, 5, 5, 5),
  Career.Wellbeing_Positivity = c(100, 0, 0, 0, 0, 100, 100),
              eSat_Positivity = c(0, 0, 100, 100, 100, 100, 100),
                   Technolory = c(2, 5, 3, 4, 4, 1, 5)
)

I am trying to create a new variable (Index) where all variables ending with "Positivity" would sum up so aaa should get 200, bbb 0...gedd 400. How can I do that? I know I should include

ends_with(match = "Positivity")

and

 mutate_if

but I don't know how :frowning:
Can you help?

Hi @Slavek. Below is one approach that uses pivot_longer() and pivot_wider() from the tidyverse.

library(tidyverse)

source_new = source |>
  pivot_longer(cols = contains('_Positivity')) |>
  group_by(Employee.ID) |>
  mutate(Index = sum(value)) |>
  ungroup() |>
  pivot_wider(names_from = name) |>
  relocate(Index, .after = eSat_Positivity)

glimpse(source_new)
#> Rows: 7
#> Columns: 9
#> $ Employee.ID                 <chr> "aaa", "bbb", "sss", "ccc", "fff", "ffg", …
#> $ Age.Group                   <chr> "20-30", "30-40", "20-30", "40-50", "40-50…
#> $ Growth                      <dbl> 3, 4, 2, 4, 5, 5, 5
#> $ Technolory                  <dbl> 2, 5, 3, 4, 4, 1, 5
#> $ Action_Positivity           <dbl> 100, 0, 100, 0, 0, 100, 100
#> $ Boundaries_Positivity       <dbl> 0, 0, 100, 100, 100, 0, 100
#> $ Career.Wellbeing_Positivity <dbl> 100, 0, 0, 0, 0, 100, 100
#> $ eSat_Positivity             <dbl> 0, 0, 100, 100, 100, 100, 100
#> $ Index                       <dbl> 200, 0, 300, 200, 200, 300, 400

Created on 2024-01-20 with reprex v2.0.2

Here is a base R version

source$Index <- rowSums(source[ , grepl(pattern = "_Positivity", x = names(source))])
  • rowSums - sums the rows of
  • source[ , grepl(pattern = "_Positivity", x = names(source))] - the columns which have a "_Positivity" string in them.
1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.