I have a problem with my R code. I am trying to build a data frame with the number of people in each work class (industry), split into those who earn at most 50K and those who earn more than 50K, but the counts come out wrong. First I replaced all unknown values with the mode, then combined the categories into 4 groups, but for some reason every count comes back as NA. The same process works fine on other factors, and the dataset itself is fine. What is the problem?
levels(adult$workclass)[1] <- 'Unknown'
adult$workclass <- gsub(' Federal-gov', 'Government', adult$workclass)
adult$workclass <- gsub(' Local-gov', 'Government', adult$workclass)
adult$workclass <- gsub(' State-gov', 'Government', adult$workclass)
adult$workclass <- gsub(' Self-emp-inc', 'Self-Employed', adult$workclass)
adult$workclass <- gsub(' Self-emp-not-inc', 'Self-Employed', adult$workclass)
adult$workclass <- gsub(' Never-worked', 'Other/Unknown', adult$workclass)
adult$workclass <- gsub(' Without-pay', 'Other/Unknown', adult$workclass)
adult$workclass <- gsub(' Other', 'Other/Unknown', adult$workclass)
adult$workclass <- gsub(' Unknown', 'Other/Unknown', adult$workclass)
adult$workclass <- as.factor(adult$workclass)
summary(adult$workclass)
count <- table(adult[adult$workclass == 'Government',]$income_class)["<=50K"]
count <- c(count, table(adult[adult$workclass == 'Government',]$income_class)[">50K"])
count <- c(count, table(adult[adult$workclass == 'Other/Unknown',]$income_class)["<=50K"])
count <- c(count, table(adult[adult$workclass == 'Other/Unknown',]$income_class)[">50K"])
count <- c(count, table(adult[adult$workclass == 'Private',]$income_class)["<=50K"])
count <- c(count, table(adult[adult$workclass == 'Private',]$income_class)[">50K"])
count <- c(count, table(adult[adult$workclass == 'Self-Employed',]$income_class)["<=50K"])
count <- c(count, table(adult[adult$workclass == 'Self-Employed',]$income_class)[">50K"])
count <- as.numeric(count)
industry <- rep(levels(adult$workclass), each = 2)
income <- rep(c('<=50K', '>50K'), 4)
idf <- data.frame(industry, income, count)
idf
After running the code, I get this on the console:
       industry income count
1       Private  <=50K    NA
2       Private   >50K    NA
3    Government  <=50K    NA
4    Government   >50K    NA
5 Other/Unknown  <=50K    NA
6 Other/Unknown   >50K    NA
7 Self-Employed  <=50K    NA
8 Self-Employed   >50K    NA
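
One possible cause worth checking, offered as a sketch rather than a confirmed diagnosis: the gsub() patterns above all start with a space (' Federal-gov', ' Local-gov', ...), which suggests the labels in this data carry a leading space. If income_class does too, its table names would be " <=50K" and " >50K", and indexing a table by a name it does not have returns NA instead of raising an error. The post does not show levels(adult$income_class), so this is an assumption. The toy data frame adult_demo and the result idf_demo below are hypothetical stand-ins, not the real dataset.

```r
# Toy stand-in for the adult data. The leading spaces in the labels are an
# ASSUMPTION about how the real file was read; the post does not show them.
adult_demo <- data.frame(
  workclass    = c(" Private", " Private", " State-gov", " Local-gov", " Self-emp-inc"),
  income_class = c(" <=50K", " >50K", " <=50K", " >50K", " >50K"),
  stringsAsFactors = TRUE
)

# Indexing a table by a name it does not contain yields NA, not an error.
tab <- table(adult_demo$income_class)
names(tab)       # shows the labels exactly as stored
tab["<=50K"]     # NA here, because the stored name is " <=50K"

# Strip surrounding whitespace before recoding or comparing.
adult_demo$workclass    <- trimws(as.character(adult_demo$workclass))
adult_demo$income_class <- trimws(as.character(adult_demo$income_class))

# Recode detailed classes into broader groups with %in% instead of
# repeated gsub() calls.
gov  <- c("Federal-gov", "Local-gov", "State-gov")
self <- c("Self-emp-inc", "Self-emp-not-inc")
adult_demo$workclass[adult_demo$workclass %in% gov]  <- "Government"
adult_demo$workclass[adult_demo$workclass %in% self] <- "Self-Employed"

# One cross-tabulation, converted to long format: one row per
# industry/income pair, with labels and counts paired by construction.
idf_demo <- as.data.frame(
  table(industry = adult_demo$workclass, income = adult_demo$income_class),
  responseName = "count"
)
idf_demo
```

Building the frame from a single table() call also keeps labels and counts aligned. In the original code the counts are collected in the order Government, Other/Unknown, Private, Self-Employed, while the printed output lists Private first, so the hand-built rep(levels(adult$workclass), each = 2) vector would pair counts with the wrong industry even once the NA values are gone.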