I need to run four models (logistic regression, classification tree, bagging, and random forest) on the attached dataset and find the best model using cross-validation.
Name | Type | Value |
---|---|---|
mod_9 | list [30] (S3: glm, lm) | List of length 30 |
coefficients | double [31] | 0.631667 -0.185302 0.021356 -0.000246 -1.128435 -1.024064 ... |
residuals | double [9037] | -1.24 -1.09 -1.24 -1.08 -1.00 -1.06 ... |
fitted.values | double [9037] | 1.92e-01 7.90e-02 1.95e-01 7.73e-02 1.54e-08 5.77e-02 ... |
effects | double [9037] | 40.591 -0.124 -16.170 -1.065 1.435 0.572 ... |
R | double [31 x 31] | -3.49e+01 0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00 -1.60e+01 1.74e+01 ... |
rank | integer [1] | 31 |
qr | list [5] (S3: qr) | List of length 5 |
family | list [12] (S3: family) | List of length 12 |
linear.predictors | double [9037] | -1.44 -2.46 -1.42 -2.48 -17.99 -2.79 ... |
deviance | double [1] | 7526.07 |
aic | double [1] | 7588.07 |
null.deviance | double [1] | 8694.365 |
iter | integer [1] | 17 |
weights | double [9037] | 1.55e-01 7.28e-02 1.57e-01 7.13e-02 4.18e-08 5.43e-02 ... |
prior.weights | double [9037] | 1 1 1 1 1 1 ... |
df.residual | integer [1] | 9006 |
df.null | integer [1] | 9036 |
y | double [9037] | 0 0 0 0 0 0 ... |
converged | logical [1] | TRUE |
boundary | logical [1] | FALSE |
model | list [9037 x 12] (S3: data.frame) | A data.frame with 9037 rows and 12 columns |
call | language | glm(formula = Status ~ DOJ Extended + Notice period + `Duration to accept of ... |
formula | formula | Status ~ DOJ Extended + Notice period + Duration to accept offer + `O ... |
terms | formula | Status ~ DOJ Extended + Notice period + Duration to accept offer + `O ... |
data | list [9037 x 15] (S3: tbl_df, tbl, data.frame) | A tibble with 9037 rows and 15 columns |
offset | NULL | Pairlist of length 0 |
control | list [3] | List of length 3 |
method | character [1] | 'glm.fit' |
contrasts | list [7] | List of length 7 |
xlevels | list [7] | List of length 7 |
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
  Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Workspace loaded from ~/R project/.RData]
> library(readxl)
> HR_Data_Final_ <- read_excel("~/Documents/HR Data Final .xlsx")
> View(HR_Data_Final_)
> H <- HR_Data_Final_
> View(H)
> names(H)
 [1] "Candidate Ref"             "DOJ Extended"              "Duration to accept offer"
 [4] "Notice period"             "Offered band"              "Percent difference CTC"
 [7] "Joining Bonus"             "Candidate relocate actual" "Gender"
[10] "Candidate Source"          "Rex in Yrs"                "LOB"
[13] "Location"                  "Age"                       "Status"
> dim(H)
[1] 9037   15
> attach(h)
Error in attach(h) : object 'h' not found
> attach(H)
> str(H)
tibble [9,037 × 15] (S3: tbl_df/tbl/data.frame)
 $ Candidate Ref            : num [1:9037] 2110407 2112635 2112838 2115021 2115125 ...
 $ DOJ Extended             : chr [1:9037] "Yes" "No" "No" "No" ...
 $ Duration to accept offer : num [1:9037] 14 18 3 26 1 17 37 16 1 6 ...
 $ Notice period            : num [1:9037] 30 30 45 30 120 30 30 0 30 30 ...
 $ Offered band             : chr [1:9037] "E2" "E2" "E2" "E2" ...
 $ Percent difference CTC   : num [1:9037] 42.9 180 0 0 0 ...
 $ Joining Bonus            : chr [1:9037] "No" "No" "No" "No" ...
 $ Candidate relocate actual: chr [1:9037] "No" "No" "No" "No" ...
 $ Gender                   : chr [1:9037] "Female" "Male" "Male" "Male" ...
 $ Candidate Source         : chr [1:9037] "Agency" "Employee Referral" "Agency" "Employee Referral" ...
 $ Rex in Yrs               : num [1:9037] 7 8 4 4 6 2 7 8 3 3 ...
 $ LOB                      : chr [1:9037] "ERS" "INFRA" "INFRA" "INFRA" ...
 $ Location                 : chr [1:9037] "Noida" "Chennai" "Noida" "Noida" ...
 $ Age                      : num [1:9037] 34 34 27 34 34 34 32 34 26 34 ...
 $ Status                   : chr [1:9037] "Joined" "Joined" "Joined" "Joined" ...
> H$`DOJ Extended`<-as.factor(H$`DOJ Extended`)
> H$`Joining Bonus`<-as.factor(H$`Joining Bonus`)
> H$`Candidate relocate actual`<-as.factor(H$`Candidate relocate actual`)
> H$Gender<-as.factor(H$Gender)
> H$`Candidate Source`<-as.factor(H$`Candidate Source`)
> H$LOB<-as.factor(H$LOB)
> H$Status<-as.factor(H$Status)
> H$Location<-as.factor(H$Location)
> str(H)
tibble [9,037 × 15] (S3: tbl_df/tbl/data.frame)
 $ Candidate Ref            : num [1:9037] 2110407 2112635 2112838 2115021 2115125 ...
 $ DOJ Extended             : Factor w/ 2 levels "No","Yes": 2 1 1 1 2 2 2 2 1 1 ...
 $ Duration to accept offer : num [1:9037] 14 18 3 26 1 17 37 16 1 6 ...
 $ Notice period            : num [1:9037] 30 30 45 30 120 30 30 0 30 30 ...
 $ Offered band             : chr [1:9037] "E2" "E2" "E2" "E2" ...
 $ Percent difference CTC   : num [1:9037] 42.9 180 0 0 0 ...
 $ Joining Bonus            : Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ Candidate relocate actual: Factor w/ 2 levels "No","Yes": 1 1 1 1 2 1 1 1 1 1 ...
 $ Gender                   : Factor w/ 2 levels "Female","Male": 1 2 2 2 2 2 2 1 1 2 ...
 $ Candidate Source         : Factor w/ 3 levels "Agency","Direct",..: 1 3 1 3 3 3 3 2 3 3 ...
 $ Rex in Yrs               : num [1:9037] 7 8 4 4 6 2 7 8 3 3 ...
 $ LOB                      : Factor w/ 9 levels "AXON","BFSI",..: 5 8 8 8 8 8 8 7 2 3 ...
 $ Location                 : Factor w/ 11 levels "Ahmedabad","Bangalore",..: 9 3 9 9 9 9 9 9 5 3 ...
 $ Age                      : num [1:9037] 34 34 27 34 34 34 32 34 26 34 ...
 $ Status                   : Factor w/ 2 levels "Joined","Not Joined": 1 1 1 1 1 1 1 1 1 1 ...
> attach(H)
The following objects are masked from H (pos = 3):
    Age, Candidate Ref, Candidate relocate actual, Candidate Source, DOJ Extended,
    Duration to accept offer, Gender, Joining Bonus, LOB, Location, Notice period,
    Offered band, Percent difference CTC, Rex in Yrs, Status
> `Candidate Ref`=ifelse(Status= joined,1,0)
Error in ifelse(Status = joined, 1, 0) : unused argument (Status = joined)
> Candidate Ref=ifelse(Status= joined,"1"","0")
Error: unexpected symbol in "Candidate Ref"
> Candidate Ref=ifelse(Status= joined,"1"","0")
Error: unexpected symbol in "Candidate Ref"
> Candidate Ref=ifelse(Status= joined,"1"","0",data= H)
Error: unexpected symbol in "Candidate Ref"
> H=ifelse(Status=joined, "1","0")
Error in ifelse(Status = joined, "1", "0") : unused argument (Status = joined)
> joined <-ifelse(df$status==joined , 1, 0 )
Error in df$status : object of type 'closure' is not subsettable
> joined <-ifelse(Status=joined,"1","0")
Error in ifelse(Status = joined, "1", "0") : unused argument (Status = joined)
> joined <-ifelse(Status=joined,1,0)
Error in ifelse(Status = joined, 1, 0) : unused argument (Status = joined)
> mod_1=lm(Status~`DOJ Extended`+`Duration to accept offer`+`Notice period`,data = H)
Warning messages:
1: In model.response(mf, "numeric") :
  using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
> dim(H)
[1] 9037   15
> set.seed(2)
> train=sample(1: 9037, 4518)
> H.test=H[-train,]
> tree.H=H(Status~`DOJ Extended`,`Percent difference CTC`,subset=train)
Error in H(Status ~ `DOJ Extended`, `Percent difference CTC`, subset = train) :
  could not find function "H"
> tree.H=tree::(Status~`DOJ Extended`,`Percent difference CTC`,subset=train)
Error: unexpected '(' in "tree.H=tree::("
> tree.H=tree::Status~`DOJ Extended`,`Percent difference CTC`,subset=train
Error: unexpected ',' in "tree.H=tree::Status~`DOJ Extended`,"
> Status=ifelse(joined=1 ,0)
Error in ifelse(joined = 1, 0) : unused argument (joined = 1)
> model <- glm(Status ~`DOJ Extended`+`Notice period`,data = H ,family="binomial")
> mod_1<- glm(Status ~`DOJ Extended`+`Notice period`+`Duration to accept offer`+`Offered band`+`Joining Bonus`+`Candidate relocate actual`+`Percent difference CTC`+Gender+`Candidate Source`+`Rex in Yrs`+LOB+Location+Age,data = H ,family="binomial")
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
>
> mod_1<- glm(Status ~`DOJ Extended`+`Notice period`+`Duration to accept offer`+`Offered band`+`Joining Bonus`+`Candidate relocate actual`+`Percent difference CTC`+Gender+`Candidate Source`+`Rex in Yrs`+LOB+Location+Age,data = H ,family="binomial")
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
> mod_1<- glm(Status ~`DOJ Extended`+`Notice period`+`Duration to accept offer`+`Offered band`+`Joining Bonus`+`Candidate relocate actual`+`Percent difference CTC`+`Candidate Source`+`Rex in Yrs`+LOB+Location+Age,data = H ,family="binomial")
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
> View(HR_Data_Final_)
> attach(HR_Data_Final_)
The following objects are masked from H (pos = 3):
    Age, Candidate Ref, Candidate relocate actual, Candidate Source, DOJ Extended,
    Duration to accept offer, Gender, Joining Bonus, LOB, Location, Notice period,
    Offered band, Percent difference CTC, Rex in Yrs, Status
The following objects are masked from H (pos = 4):
    Age, Candidate Ref, Candidate relocate actual, Candidate Source, DOJ Extended,
    Duration to accept offer, Gender, Joining Bonus, LOB, Location, Notice period,
    Offered band, Percent difference CTC, Rex in Yrs, Status
> str(HR_Data_Final_)
tibble [9,037 × 15] (S3: tbl_df/tbl/data.frame)
 $ Candidate Ref            : num [1:9037] 2110407 2112635 2112838 2115021 2115125 ...
 $ DOJ Extended             : chr [1:9037] "Yes" "No" "No" "No" ...
 $ Duration to accept offer : num [1:9037] 14 18 3 26 1 17 37 16 1 6 ...
 $ Notice period            : num [1:9037] 30 30 45 30 120 30 30 0 30 30 ...
 $ Offered band             : chr [1:9037] "E2" "E2" "E2" "E2" ...
 $ Percent difference CTC   : num [1:9037] 42.9 180 0 0 0 ...
 $ Joining Bonus            : chr [1:9037] "No" "No" "No" "No" ...
 $ Candidate relocate actual: chr [1:9037] "No" "No" "No" "No" ...
 $ Gender                   : chr [1:9037] "Female" "Male" "Male" "Male" ...
 $ Candidate Source         : chr [1:9037] "Agency" "Employee Referral" "Agency" "Employee Referral" ...
 $ Rex in Yrs               : num [1:9037] 7 8 4 4 6 2 7 8 3 3 ...
 $ LOB                      : chr [1:9037] "ERS" "INFRA" "INFRA" "INFRA" ...
 $ Location                 : chr [1:9037] "Noida" "Chennai" "Noida" "Noida" ...
 $ Age                      : num [1:9037] 34 34 27 34 34 34 32 34 26 34 ...
 $ Status                   : chr [1:9037] "Joined" "Joined" "Joined" "Joined" ...
> mod_1<- glm(Status ~`DOJ Extended`+`Notice period`+`Duration to accept offer`+`Offered band`+`Joining Bonus`+`Candidate relocate actual`+`Percent difference CTC`+Gender+`Candidate Source`+`Rex in Yrs`+LOB+Location+Age,data = H ,family="binomial")
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
>
> mod_1<- glm(Status ~`DOJ Extended`+`Notice period`+`Duration to accept offer`+`Offered band`+`Joining Bonus`+`Candidate relocate actual`+`Percent difference CTC`+Gender+`Candidate Source`+`Rex in Yrs`+LOB+Location+Age,data = HR_Data_Final_)
Error in y - mu : non-numeric argument to binary operator
> mod_1<- glm(Status ~`DOJ Extended`+`Notice period`+`Duration to accept offer`+`Offered band`+`Joining Bonus`+`Candidate relocate actual`+`Percent difference CTC`+Gender+`Candidate Source`+`Rex in Yrs`+LOB+Location+Age,data = HR_Data_Final_,family="binomial")
Error in eval(family$initialize) : y values must be 0 <= y <= 1
> HR_Data_Final_$StatusStatus<-as.factor(HR_Data_Final_$Status)
> >
Error: unexpected '>' in ">"
> HR_Data_Final_$StatusStatus<-as.factor(HR_Data_Final_$Status)
> str(HR_Data_Final_)
tibble [9,037 × 16] (S3: tbl_df/tbl/data.frame)
 $ Candidate Ref            : num [1:9037] 2110407 2112635 2112838 2115021 2115125 ...
 $ DOJ Extended             : chr [1:9037] "Yes" "No" "No" "No" ...
 $ Duration to accept offer : num [1:9037] 14 18 3 26 1 17 37 16 1 6 ...
 $ Notice period            : num [1:9037] 30 30 45 30 120 30 30 0 30 30 ...
 $ Offered band             : chr [1:9037] "E2" "E2" "E2" "E2" ...
 $ Percent difference CTC   : num [1:9037] 42.9 180 0 0 0 ...
 $ Joining Bonus            : chr [1:9037] "No" "No" "No" "No" ...
 $ Candidate relocate actual: chr [1:9037] "No" "No" "No" "No" ...
 $ Gender                   : chr [1:9037] "Female" "Male" "Male" "Male" ...
 $ Candidate Source         : chr [1:9037] "Agency" "Employee Referral" "Agency" "Employee Referral" ...
 $ Rex in Yrs               : num [1:9037] 7 8 4 4 6 2 7 8 3 3 ...
 $ LOB                      : chr [1:9037] "ERS" "INFRA" "INFRA" "INFRA" ...
 $ Location                 : chr [1:9037] "Noida" "Chennai" "Noida" "Noida" ...
 $ Age                      : num [1:9037] 34 34 27 34 34 34 32 34 26 34 ...
 $ Status                   : chr [1:9037] "Joined" "Joined" "Joined" "Joined" ...
 $ StatusStatus             : Factor w/ 2 levels "Joined","Not Joined": 1 1 1 1 1 1 1 1 1 1 ...
> attach(HR_Data_Final_)
When I try to run the logistic regression, I get the following error:
mod_1=lm(Status ~ `DOJ Extended`
+`Notice period`
+`Duration to accept offer`
+`Offered band`
+`Joining Bonus`
+`Candidate relocate actual`
+`Percent difference CTC`
+Gender+`Candidate Source`
+`Rex in Yrs`
+LOB+Location+Age,data = HR_Data_Final_)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
fix(HR_Data_Final_)
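
Two things in the transcript seem to explain most of the errors. Inside a function call, `Status = joined` is read as a named argument rather than a comparison, hence "unused argument"; a comparison needs `==` and the level as a quoted string. And `lm()` expects a numeric response, so a character `Status` is coerced to `NA`. Below is a minimal sketch of one way this could be set up, not a definitive solution. It runs on a small invented data frame (`toy`) rather than the real `HR_Data_Final_`, the column name `not_joined` is hypothetical, and the 5-fold split and 0.5 cut-off are assumptions chosen only for illustration.

```r
# Toy stand-in for HR_Data_Final_; all values are invented for illustration.
set.seed(2)
n <- 200
toy <- data.frame(
  `DOJ Extended`   = sample(c("Yes", "No"), n, replace = TRUE),
  `Notice period`  = sample(c(0, 30, 45, 120), n, replace = TRUE),
  Age              = sample(22:45, n, replace = TRUE),
  check.names      = FALSE,   # keep the spaces in the column names
  stringsAsFactors = FALSE
)
toy$Status <- sample(c("Joined", "Not Joined"), n, replace = TRUE,
                     prob = c(0.8, 0.2))

# Character predictor to factor; response to 0/1 (note `==` and the quotes).
toy$`DOJ Extended` <- as.factor(toy$`DOJ Extended`)
toy$not_joined <- ifelse(toy$Status == "Not Joined", 1, 0)

# Logistic regression: glm() with a binomial family instead of lm().
# Column names containing spaces must be wrapped in backticks.
fit <- glm(not_joined ~ `DOJ Extended` + `Notice period` + Age,
           data = toy, family = binomial)

# 5-fold cross-validated accuracy at a 0.5 cut-off (both are assumptions).
k <- 5
fold <- sample(rep(1:k, length.out = n))
acc <- sapply(1:k, function(i) {
  m <- glm(not_joined ~ `DOJ Extended` + `Notice period` + Age,
           data = toy[fold != i, ], family = binomial)
  p <- predict(m, newdata = toy[fold == i, ], type = "response")
  mean((p > 0.5) == (toy$not_joined[fold == i] == 1))
})
mean(acc)   # average held-out accuracy across the folds
```

The same fold assignment could be reused for the tree, bagging, and random forest fits so that all four models are compared on identical held-out rows.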