Create a new category

Hi everyone,

I'm new to the group. Please help me with this task, because I am very confused.

I have a file "sales" with a category "Year" (1980, ..., 2020).

I need to create a new category "Period" with these conditions: the years 1980-2018 are named "Past", the year 2019 is named "Present" and the year 2020 is named "Future".

There are some questions that need to be answered with the new category "Period", but removing the category "Year" is not allowed.

Some people suggested ifelse() or mapvalues(), but I could not get either of them to work.

The problem is that Year is numeric, Period is non-numeric, and the levels cover ranges of different lengths.

Could you help me with that? I am lost.

Many thanks,
Mary

Welcome and congratulations on your first question :slightly_smiling_face:

You have three conditions, Year < 2019, Year == 2019 and Year > 2019, so what you need is a nested ifelse() like so:

# Load libraries
library('tidyverse')

# Since we will be sampling some sales, let's do so reproducibly
set.seed(448063)

# Create sales data
d = tibble(Year = seq(1980, 2020),
           Sales = runif(41, 10000, 100000))

# Create periods variable using a nested ifelse()
d = d %>%
  mutate(Period = ifelse(Year < 2019, "Past",
                         ifelse(Year == 2019, "Present", "Future")))
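
To check the result, it can help to count the rows per label and, since Period is text rather than a number, to store it as a factor with an explicit level order. The following is a minimal sketch that rebuilds the example data so it runs on its own; the object name `d_checked` is made up for illustration.

```r
library(dplyr)

# Rebuild the example data (same shape as above; the Sales values are random)
set.seed(448063)
d <- tibble(Year = seq(1980, 2020),
            Sales = runif(41, 10000, 100000))

# Nested ifelse(), then convert the labels to a factor so that they
# sort as Past, Present, Future instead of alphabetically
d_checked <- d %>%
  mutate(Period = ifelse(Year < 2019, "Past",
                         ifelse(Year == 2019, "Present", "Future")),
         Period = factor(Period, levels = c("Past", "Present", "Future")))

# Count the rows per label; the Year column is still in the data
count(d_checked, Period)
```

With the years 1980 to 2020 this should give 39 rows for "Past" and one row each for "Present" and "Future".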

Based on your question, I recommend heading over to R4DS (R for Data Science) to read more.

Hope it helps :slightly_smiling_face:

Many thanks for your help!!!

I will try it right now. I will check out the recommended book too.

Best regards,
Mary

Hi Leon

I have another problem, this time for a specific exercise.

I have a file named "videogamesales".

I need to create a new category "generation" that groups the "sales" by ranges of the column "year", with these conditions for assigning a generation:

- years from 1980 to 1989 as "Gold"
- years from 1990 to 1999 as "VideoConsole"
- years from 2000 to 2009 as "NewGen"
- years from 2010 to 2020 as "NextGen"

It is not possible to use the < and > signs for these year intervals, because the generation category is not numeric, so the signs are irrelevant for selecting it.

Is it possible to use ifelse() for that? What would it look like?

I think it is different from the last case.

Many thanks,
Mary

It isn't. As soon as you have more than one condition, and therefore more than two categories depending on them, you'll need a nested ifelse(). Here is the thing: I could do this for you, but then you would have learned little to nothing. So work on extending the example code I gave you earlier and come back here with your best attempt :slightly_smiling_face:

The case_when() function is a pretty sweet replacement for nested ifelse() statements.

# Original:
# Create periods variable using a nested ifelse()
d = d %>%
  mutate(Period = ifelse(Year < 2019, "Past",
                         ifelse(Year == 2019, "Present", "Future")))

# New:
# Create periods using case_when()
d = d %>%
  mutate(Period = case_when(Year < 2019 ~ 'Past',
                            Year == 2019 ~ 'Present',
                            TRUE ~ 'Future'))
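
The same pattern extends to more than three labels. The sketch below applies it to the decade grouping asked about earlier in the thread. The comparisons are made on the numeric `year` column, and the new column only receives the resulting text labels. The data frame `games` and its columns `year` and `sales` are assumptions standing in for the real "videogamesales" file, whose layout is not shown in this thread.

```r
library(dplyr)

# Hypothetical stand-in for the "videogamesales" data
games <- tibble(year  = c(1985, 1994, 2003, 2015, 2020),
                sales = c(1.2, 3.4, 5.6, 7.8, 9.0))

# Compare the numeric year, assign a text label per interval
games <- games %>%
  mutate(generation = case_when(
    year >= 1980 & year <= 1989 ~ "Gold",
    year >= 1990 & year <= 1999 ~ "VideoConsole",
    year >= 2000 & year <= 2009 ~ "NewGen",
    year >= 2010 & year <= 2020 ~ "NextGen"
    # years outside 1980-2020 match no condition and become NA
  ))

# Total sales per generation; the year column is kept
games %>%
  group_by(generation) %>%
  summarise(total_sales = sum(sales))
```

Writing both bounds of each interval makes every condition readable on its own. Because case_when() uses the first condition that matches, the lower bounds could also be dropped, at the cost of making the order of the lines matter.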

Many thanks Col!!!

Best regards,
Mary

Yup, that also works, but if you are new to R, it has the caveat that it hides the if/else part, which IMHO is essential for understanding how conditions work and interact :slightly_smiling_face:

Basically the fish versus fishing rod thingy :+1:
