add a new column to a dataset based on the values of a character variable

Rustam · July 15, 2022, 12:22pm

Dear R-experts,

I have a dataset “df” with a character variable “inst” (shown below as an excerpt from "df", in the form of just one column).

I also have two vectors under the "Values" section in my RStudio named “object1” and “object2”, each containing 10 different numeric values. Let's say “object1” contains values from 91 to 100, and “object2” has values from 101 to 110 (in reality the values are different; this is only an example). I want to create a new column in “df”, in which the values from “object1” correspond to the first 10 rows of "West", and are then applied again to the next 10 rows of "West". The values from “object2” have to correspond to “Atlantic constrained” in “inst”. So the final output has to look like this:

Could you guys suggest the code to do it just in one go?

AlexisW · July 15, 2022, 11:09pm

If df is already ordered, you can simply do:

df$new_column <- c(object1, object1, object2)

Or in a tidyverse way,

df <- df |>
  add_column(new_column = c(object1, object1, object2))

Here is a reprex:

object1 <- 1:10
object2 <- 101:110
df <- data.frame(New_column = c(rep("West", 20), rep("Atlantic constrained", 10)))

df$inst1 <- c(object1, object1, object2)

library(tibble)

df <- df |>
  add_column(inst2 = c(object1, object1, object2))

df
#>              New_column inst1 inst2
#> 1                  West     1     1
#> 2                  West     2     2
#> 3                  West     3     3
#> 4                  West     4     4
#> 5                  West     5     5
#> 6                  West     6     6
#> 7                  West     7     7
#> 8                  West     8     8
#> 9                  West     9     9
#> 10                 West    10    10
#> 11                 West     1     1
#> 12                 West     2     2
#> 13                 West     3     3
#> 14                 West     4     4
#> 15                 West     5     5
#> 16                 West     6     6
#> 17                 West     7     7
#> 18                 West     8     8
#> 19                 West     9     9
#> 20                 West    10    10
#> 21 Atlantic constrained   101   101
#> 22 Atlantic constrained   102   102
#> 23 Atlantic constrained   103   103
#> 24 Atlantic constrained   104   104
#> 25 Atlantic constrained   105   105
#> 26 Atlantic constrained   106   106
#> 27 Atlantic constrained   107   107
#> 28 Atlantic constrained   108   108
#> 29 Atlantic constrained   109   109
#> 30 Atlantic constrained   110   110

Created on 2022-07-15 by the reprex package (v2.0.1)

Rustam · July 18, 2022, 7:48am

Thanks, got it! But what if "West" has not 20 rows, but a hundred or two hundred? In that case it would be impractical to write

df$new_column <- c(object1, object1, object1...)

and so on, repeating object1 20 times.

nirgrahamuk · July 18, 2022, 9:35am

If there is some principled 'reason' behind combining your objects, I would lean into that and write a program that combines things based on that reason. The way you have presented what you want to do seems rather arbitrary, so the code written to implement it will by necessity be quite arbitrary too.
That said, if you wish to hack on in this vein, you can at least reduce it to manually writing out a pattern of numbers.

library(tidyverse)
object1 <- 1:10
object2 <- 101:110
df <- data.frame(New_column = c(rep("West", 20), rep("Atlantic constrained", 10)))
df2 <- bind_rows(df,df) #extending the example to be more interesting in how we cover it with repetitions

# construct a pattern programmatically
(onums <- rep(c(1,1:2),2)) #use rep() for easy repetition of fixed patterns
(onames <- paste0("object",onums))
(ovals <- mget(onames))

df2 <- df2 |>
  add_column(inst3 = unlist(ovals))

Rustam · July 18, 2022, 10:53am

Thank you very much! You are really helpful!

AlexisW · July 18, 2022, 12:52pm

There's a function for that: rep(object1, times = 20)
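To make that concrete, here is a minimal sketch, assuming the `df` / `inst` layout from the original question and a `df` that is already ordered with all "West" rows first. The toy values and the 200-row count are made up for illustration. It counts the "West" rows rather than hard-coding the number of repetitions:

```r
# Toy data: 200 "West" rows followed by 10 "Atlantic constrained" rows (assumed sizes)
object1 <- 91:100
object2 <- 101:110
df <- data.frame(inst = c(rep("West", 200), rep("Atlantic constrained", 10)))

# Count the "West" rows instead of hard-coding how many times to repeat
n_west <- sum(df$inst == "West")

# length.out recycles object1 until it fills exactly n_west rows
df$new_column <- c(rep(object1, length.out = n_west), object2)

head(df, 12)
```

Using `length.out` instead of `times` also covers the case where the number of "West" rows is not an exact multiple of 10.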

Then, as nirgrahamuk says, if you can lean on the reason for that pattern, it's much better. I'm particularly worried about assuming that the order of the df matches the order within the objects: one of these days one vector could be sorted differently, and you wouldn't even notice that all your results are wrong.
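One possible way to lean on the grouping rather than on row order is sketched below. It assumes the only "reason" is that each value of `inst` has its own vector; the `lookup` list is a hypothetical helper, not something from the original data. The values are assigned per group, so the groups do not need to be contiguous in `df`:

```r
object1 <- 91:100
object2 <- 101:110

# Rows deliberately interleaved so the groups are not contiguous
df <- data.frame(inst = rep(c("West", "Atlantic constrained", "West"), times = 10))

# Hypothetical lookup: which vector belongs to which value of inst
lookup <- list("West" = object1, "Atlantic constrained" = object2)

# Fail loudly if some group has no vector assigned to it
stopifnot(all(df$inst %in% names(lookup)))

df$new_column <- NA_integer_
for (g in names(lookup)) {
  idx <- which(df$inst == g)  # rows of this group, in row order
  df$new_column[idx] <- rep(lookup[[g]], length.out = length(idx))
}

head(df)
```

Within each group the values are still matched by position, so this only removes the dependence on the order of the groups. If the data has a real key (an id, a year, a site), joining on that key would be safer still.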

Rustam · July 19, 2022, 6:18am

Got it! You are absolutely right about the order in df and objects.
I do appreciate your help!
