The question is: "The thickness of the oil gasket with the product number "45231-3B660" is normal if it is in the range of 20 to 30; anything outside that range is considered defective. Find which process variables affect product defects."
The process variables are: a_speed, b_speed, separation, s_separation, rate_terms, mpa, load_time, highpressure_time
The condition: use only the lm() function.
What I've tried:
q6 = mydata[mydata$prod_no == "45231-3B660",]
q6$z = ifelse((q6$c_thickness < 20) | (q6$c_thickness > 30), 1, 0)
# 1 = defective, 0 = normal
m2 = lm(z ~ a_speed + b_speed + separation + s_separation + rate_terms + mpa + load_time + highpressure_time, data=q6)
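A minimal sketch of how a fit like this could be inspected, using simulated stand-in data rather than the real table. The `sim` data frame, the choice of three predictors, and the made-up effect of `mpa` on thickness are all assumptions for illustration only. It fits the same kind of 0/1 model with lm() and reads the per-variable t-tests from the coefficient table:

```r
# Simulated stand-in data: all values and effect sizes here are hypothetical.
set.seed(1)
n <- 200
sim <- data.frame(
  a_speed   = rnorm(n, mean = 0.6,  sd = 0.01),
  mpa       = rnorm(n, mean = 78,   sd = 0.5),
  load_time = rnorm(n, mean = 18.1, sd = 0.1)
)

# Assumed relationship (illustration only): thickness shifts with mpa plus noise.
sim$c_thickness <- 25 + 4 * (sim$mpa - 78) + rnorm(n, mean = 0, sd = 2)

# Same labeling rule as in the question: 1 = defective, 0 = normal.
sim$z <- ifelse(sim$c_thickness < 20 | sim$c_thickness > 30, 1, 0)

# Linear model on the 0/1 outcome, as required by the "lm() only" condition.
fit <- lm(z ~ a_speed + mpa + load_time, data = sim)

# Coefficient table: estimate, standard error, t value, p-value per variable.
coefs <- summary(fit)$coefficients
print(round(coefs, 4))

# Predictors whose t-test p-value is below 0.05 (intercept excluded).
sig <- setdiff(rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05], "(Intercept)")
print(sig)
```

Because a part counts as defective when it is either too thin or too thick, a variable that pushes thickness in one direction can have a weak linear relationship with z, so an empty `sig` does not necessarily mean that no variable matters.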
Is this approach wrong? Any suggestions would be appreciated.
Data:
dput(head(mydata, 10))
structure(list(prod_date = c("2014-05-01 오전 8:28:56", "2014-05-01 오전 8:27:29",
"2014-05-01 오전 8:26:04", "2014-05-01 오전 8:24:37", "2014-05-01 오전 8:23:11",
"2014-05-01 오전 8:21:46", "2014-05-01 오전 8:20:18", "2014-05-01 오전 8:18:51",
"2014-05-01 오전 8:17:25", "2014-05-01 오전 8:15:59"), prod_no = c("90784-76001",
"90784-76001", "90784-76001", "90784-76001", "90784-76001", "90784-76001",
"90784-76001", "90784-76001", "90784-76001", "90784-76001"),
prod_name = c("Oil Gasket", "Oil Gasket", "Oil Gasket", "Oil Gasket",
"Oil Gasket", "Oil Gasket", "Oil Gasket", "Oil Gasket", "Oil Gasket",
"Oil Gasket"), degree = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
mold = c("생산대기", "생산대기", "생산대기", "생산대기",
"생산대기", "생산대기", "생산대기", "생산대기", "생산대기",
"생산대기"), prod = c("생산", "생산", "생산", "생산", "생산",
"생산", "생산", "생산", "생산", "생산"), s_no = c(892890,
892889, 892888, 892887, 892886, 892885, 892884, 892883, 892882,
892881), fix_time = c(85.5, 86.2, 86, 86.1, 86.1, 86.3, 86.5,
86.4, 86.3, 86), a_speed = c(0.611, 0.606, 0.609, 0.61, 0.603,
0.606, 0.606, 0.607, 0.604, 0.608), b_speed = c(1.715, 1.708,
1.715, 1.718, 1.704, 1.707, 1.701, 1.707, 1.711, 1.696),
separation = c(242, 244.7, 242.7, 241.9, 242.5, 244.5, 243.1,
243.1, 245.2, 248), s_separation = c(657.6, 657.1, 657.5,
657.3, 657.3, 656.9, 656.9, 657.3, 656.9, 657.3), rate_terms = c(95,
95, 95, 95, 95, 95, 95, 95, 95, 95), mpa = c(78.2, 77.9,
78, 78.2, 77.9, 77.9, 78.2, 77.5, 77.8, 77.5), load_time = c(18.1,
18.2, 18.1, 18.1, 18.2, 18, 18.1, 18.1, 18, 18.1), highpressure_time = c(58,
58, 82, 74, 56, 78, 55, 57, 50, 60), c_thickness = c(24.7,
22.5, 24.1, 25.1, 24.5, 22.9, 24.3, 23.9, 22.2, 19)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
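As a quick check of the labeling rule, the c_thickness values from the ten sample rows above can be flagged with the same ifelse() condition. This is only a sketch on the sample, whose rows belong to product 90784-76001 rather than 45231-3B660; the standalone vector `c_thickness` is copied from the dput output:

```r
# c_thickness values copied from the ten sample rows in the dput output.
c_thickness <- c(24.7, 22.5, 24.1, 25.1, 24.5, 22.9, 24.3, 23.9, 22.2, 19)

# 1 = defective (outside 20 to 30), 0 = normal.
z <- ifelse(c_thickness < 20 | c_thickness > 30, 1, 0)

print(z)
# [1] 0 0 0 0 0 0 0 0 0 1

# Count of normal vs. defective rows.
print(table(z))
```

Only the last row (thickness 19) falls outside the 20 to 30 range, so it is the only one flagged as defective.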
Note: I know it would be better to use glm().