New to R - do not know how to use If

Tine · September 11, 2019, 1:22pm

I have a datafile with sales data that I want to clean. I want to make a new EndUser column based on an existing column.
I am used to working in SPSS.

What is wrong with this command?

if (sales$EndUser ="B") {sales$EndUserB <- "B2"}

FJCC · September 11, 2019, 2:10pm

You want to use the vectorised version of if, ifelse(). I used NA to fill rows where EndUser != "B", but you may want to do something else.
By the way, notice that the operator for comparing two values is ==, not =.

Df <- data.frame(EndUser = sample(c("A", "B", "C"),8, replace = TRUE ))
Df
#>   EndUser
#> 1       C
#> 2       B
#> 3       A
#> 4       C
#> 5       A
#> 6       C
#> 7       B
#> 8       A
Df$EndUserB <- ifelse(Df$EndUser == "B", "B2", NA)
Df
#>   EndUser EndUserB
#> 1       C     <NA>
#> 2       B       B2
#> 3       A     <NA>
#> 4       C     <NA>
#> 5       A     <NA>
#> 6       C     <NA>
#> 7       B       B2
#> 8       A     <NA>

Created on 2019-09-11 by the reprex package (v0.2.1)

WillP · September 11, 2019, 2:11pm

Hi there,
The equality operator is actually '=='. So try this...
if (sales$EndUser =="B") {sales$EndUserB <- "B2"}

Also, you could change your line of code somewhat, to assign in a different way...
sales$EndUserB <- if (sales$EndUser =="B") {"B2"} else {"something else"}

Regards,
Will

WillP · September 11, 2019, 2:13pm

FJCC's answer is definitely better in this case. 'ifelse' is much faster!

Will

stkrog · September 11, 2019, 2:24pm

You don't even have to use if:

# just to make sure Df$EndUserB exists
Df$EndUserB <- NA 
# Set all rows where EndUser == "B" to "B2"
Df$EndUserB[Df$EndUser=="B"] <- "B2"

Tine · September 11, 2019, 2:29pm

Thanks a lot.
I tried == (I had it before as well) but get this error message:
Error in if (sales$EndUser == "B") { :
missing value where TRUE/FALSE needed
In addition: Warning message:
In if (sales$EndUser == "B") { :
the condition has length > 1 and only the first element will be used

I first used the if else statement, and made else empty ("").
However, when I then ran the second line (I have a whole list; for instance, "CC" needs to become "C", else empty), it overwrote the result of my first command.
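A minimal sketch of what the error message above is saying, using a made-up three-element vector rather than the real sales data. `if` expects a single TRUE or FALSE, while `sales$EndUser == "B"` returns one value per row. The "missing value" part suggests that the first element of the comparison is NA, which would happen if the first row of EndUser is missing (an inference from the message, not something checked against the actual file). `ifelse()` evaluates every element instead.

```r
# Hypothetical stand-in for sales$EndUser; the first value is missing
EndUser <- c(NA, "B", "CC")

# The comparison is vectorised: one result per element
cond <- EndUser == "B"
length(cond)      # more than one element, but if () wants exactly one TRUE/FALSE
is.na(cond[1])    # comparing NA with "B" gives NA, not FALSE

# ifelse() works element by element, so the length is not a problem.
# Rows where the condition is NA stay NA in the result.
result <- ifelse(cond, "B2", "")
result
```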

FJCC · September 11, 2019, 2:52pm

If you have many value pairs, make a vector that maps the values as shown below.

Df <- data.frame(EndUser = c("B", "A", "C", "A", "C", "B", "A"))
Df
#>   EndUser
#> 1       B
#> 2       A
#> 3       C
#> 4       A
#> 5       C
#> 6       B
#> 7       A
MapValues <- c("A" = "AZ", "B" = "B2", "C" = "CC")

Df$EndUserB <- MapValues[Df$EndUser]

Df
#>   EndUser EndUserB
#> 1       B       B2
#> 2       A       AZ
#> 3       C       CC
#> 4       A       AZ
#> 5       C       CC
#> 6       B       B2
#> 7       A       AZ

Created on 2019-09-11 by the reprex package (v0.2.1)
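One caveat with the lookup above, offered as a sketch rather than a correction. Before R 4.0, data.frame() turned strings into factors by default, and indexing with a factor uses its integer codes, not its labels. The example works either way because the levels A, B, C happen to sort in the same order as MapValues. Wrapping the column in as.character() makes the lookup go by name, and values with no entry in the map come back as NA. The extra codes below ("CC", "Z") and the second factor are made up for illustration.

```r
# Hypothetical data with a missing value and a code that is not in the map
Df <- data.frame(EndUser = c("B", "A", "CC", NA, "Z"),
                 stringsAsFactors = FALSE)
MapValues <- c("A" = "AZ", "B" = "B2", "CC" = "C")

# Look up by name; unname() drops the names carried over from MapValues
Df$EndUserB <- unname(MapValues[as.character(Df$EndUser)])
Df

# Why as.character() matters: a factor whose levels are not in map order
f <- factor(c("B", "A"), levels = c("B", "A"))
by_code <- unname(MapValues[f])                # indexes by position 1, 2
by_name <- unname(MapValues[as.character(f)])  # indexes by label "B", "A"
by_code
by_name
```

If unmapped rows should keep their original value rather than become NA, the result can be patched afterwards with the logical indexing stkrog showed.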

Tine · September 11, 2019, 3:06pm

Thank you Stkrog.

When I run this I get
"Error in Df$EndUserB <- NA : object 'Df' not found"

And I would need EndUserB in my original datafile, to cross it to other columns later.

This is really frustrating. What kind of course or book would you recommend?

stkrog · September 11, 2019, 3:19pm

I left out the part where you populate the data.frame with your original data. I can see from your original post that the df is named "sales". Just replace "Df" in my post with "sales" and you should be home safe.

But the mapping solution from FJCC is actually much nicer as it handles NAs without initialization of sales$EndUserB.

As for a book, google "R for Data Science" by Hadley Wickham and Garrett Grolemund. I learned it from "An Introduction to R" by Venables & Smith and "Introductory Statistics with R" by Dalgaard, but that was ages ago.

Tine · September 11, 2019, 3:29pm

Thanks. It now worked, yeah!

FJCC's solution requires something after else, and I do not know what to put there. I tried leaving it empty, but then when I ran the next line (C needs to become C2, or else empty), the B2 rows were empty again.

I will look into the books.
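On the overwriting problem: each `ifelse()` call rewrites the whole column, so an empty else wipes out what the previous line filled in. One way around it, sketched here on a made-up `sales` data frame (the codes and replacement values are only examples), is to pass the column's current contents as the else value. `%in%` is used instead of `==` so that missing values count as "no match" rather than producing NA.

```r
# Hypothetical stand-in for the real sales data
sales <- data.frame(EndUser = c("B", "CC", "A", NA, "B"),
                    stringsAsFactors = FALSE)

# Start with an empty column, then fill it one rule at a time.
# The third argument keeps whatever is already in the column.
sales$EndUserB <- ""
sales$EndUserB <- ifelse(sales$EndUser %in% "B",  "B2", sales$EndUserB)
sales$EndUserB <- ifelse(sales$EndUser %in% "CC", "C",  sales$EndUserB)
sales
```

With a long list of codes, the named lookup vector from FJCC's second post does the same job in one step.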

jcblum · September 15, 2019, 8:19am

We have a vintage-but-still-great thread here where people contributed lots of great books and other resources for learning R: What's your favorite intro to R?

There's at least one book out there that specifically targets people transitioning from SPSS:

Tine · September 16, 2019, 9:15am

Thanks a lot! I will look into it.
