I have four types of classifications (reference numbers) that I wish to group into a dummy variable. Points 1-3 should be labelled 1 and Point 4 should be labelled 2.
Dummy variable = 1: Internal recruitment
1) References starting with letters "MH" followed by a five-digit unique identification number (e.g. MH12345, MH45678, MH98743 etc.)
2) References starting with letters "TM" followed by a five-digit unique identification number (e.g. TM12345, TM45678, TM98743 etc.)
3) References that are purely numeric, containing seven digits (e.g. 1234567, 4657893, 5480238 etc.)
Dummy variable = 2: External recruitment
4) References starting with letters "FB" followed by a five-digit unique identification number (e.g. FB12345, FB45678, FB98743 etc.)
Any ideas on how to do this based on the reference number?
Many Thanks,
Naja
Hi Andrercs,
How do I then merge this new dummy variable into the dataframe? I get the results for each row as an output when I run the code, but I need the new dummy variable to be part of my existing dataset in order to run tests on it. Is there an easy way to do this?
Many Thanks,
Naja
dplyr doesn't perform changes in-place; it creates a new data frame instead. If you want to overwrite the original data frame, you have to assign the result explicitly back to the original name.
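As a minimal sketch of that assignment (not necessarily the code from the earlier reply), the example below builds the dummy variable with `mutate()` and `case_when()` and stores the result back in the same object. The data frame name `df`, the column names `reference` and `recruitment`, and the sample values are assumptions made up for illustration; the regular expressions follow the four reference formats described in the question.

```r
library(dplyr)

# Hypothetical example data: `df` and `reference` are assumed names
df <- data.frame(
  reference = c("MH12345", "TM45678", "1234567", "FB98743"),
  stringsAsFactors = FALSE
)

# Assigning back to `df` keeps the new column; without `df <-`
# the result is only printed and the original data frame is unchanged
df <- df %>%
  mutate(
    recruitment = case_when(
      grepl("^(MH|TM)[0-9]{5}$", reference) ~ 1,  # MH or TM + five digits: internal
      grepl("^[0-9]{7}$", reference)        ~ 1,  # seven digits only: internal
      grepl("^FB[0-9]{5}$", reference)      ~ 2,  # FB + five digits: external
      TRUE                                  ~ NA_real_  # anything else is left missing
    )
  )
```

After this, `df$recruitment` is an ordinary column of `df`, so it can be used directly in later tests. References that match none of the patterns get `NA`, which makes unexpected formats easy to spot.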