Facing an issue changing the data in a column

gain_10 · March 1, 2023, 5:47am

I am trying to change the values in a column of the dataset, but they are not changing. I am attaching the code and the dataset. Please help me out.

library(ggplot2)
library(plotly)
library(dplyr)
library(data.table)
census_data=read.csv("census.csv")
View(census_data)
#column renaming
colnames(census_data)[1:6]=c("Slno","Age","Working Class", "Citizen Score","Education","Education Number")
colnames(census_data)[7:11]=c("Martial Status","Occupation","Relationship","Race","Sex")
colnames(census_data)[12:16]=c("Capital Gain","Capital Loss","Hrs/Week","Native","Avg Salary")
str(census_data)
#frequency Distribution
table(census_data$`Working Class`)

#mean age
tapply(census_data$Age,census_data$`Working Class`, mean)
tapply(census_data$Age,census_data$`Working Class`, min)
tapply(census_data$Age,census_data$`Working Class`, max)

table(census_data$Education)
#Categorise
census_data$Education=as.character(census_data$Education)

census_data$Education[census_data$Education == "1st-4th"]="Primary School"
census_data$Education[census_data$Education == "5th-6th"] ="Primary School"
census_data$Education[census_data$Education == "7th-8th"] ="Secondary School"
census_data$Education[census_data$Education == "9th"] ="Secondary School"
census_data$Education[census_data$Education == "10th"] ="Secondary School"
census_data$Education[census_data$Education == "11th"] ="Higher Secondary School"
census_data$Education[census_data$Education == "12th"] ="Higher Secondary School"
View(census_data)

table(census_data$Education)

Dataset link: https://www.kaggle.com/datasets/palashgain/census-dataset

FJCC · March 1, 2023, 3:13pm

You can make it much easier to help you by posting the data here. To do that, run your code down to the line

census_data$Education=as.character(census_data$Education)

Then make a data frame that contains only the Education column and one other column. I'll assume the other column is LastName but you should use whatever column actually exists in the data set.

ForForum <- census_data[1:25, c("LastName", "Education")]

Post the output of

dput(ForForum)

gain_10 · March 1, 2023, 3:22pm

Sorry, sir, I don't understand what you are asking for.

FJCC · March 1, 2023, 3:32pm

Run this code

census_data=read.csv("census.csv")
View(census_data)
#column renaming
colnames(census_data)[1:6]=c("Slno","Age","Working Class", "Citizen Score","Education","Education Number")
colnames(census_data)[7:11]=c("Martial Status","Occupation","Relationship","Race","Sex")
colnames(census_data)[12:16]=c("Capital Gain","Capital Loss","Hrs/Week","Native","Avg Salary")
str(census_data)
#frequency Distribution
table(census_data$`Working Class`)

#mean age
tapply(census_data$Age,census_data$`Working Class`, mean)
tapply(census_data$Age,census_data$`Working Class`, min)
tapply(census_data$Age,census_data$`Working Class`, max)

table(census_data$Education)
#Categorise
census_data$Education=as.character(census_data$Education)

ForForum <- census_data[1:25, c("Age", "Education")] 
dput(ForForum)

Post the output of dput(ForForum) as a response on this forum thread.
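
In case it helps to see why dput() is useful: it prints R code that recreates the object, so anyone reading the thread can paste it into their own session. Here is a minimal sketch with a made-up two-row data frame (the name toy and its values are only for illustration, they are not from your file):

```r
# Made-up stand-in for a small slice of the census data
toy <- data.frame(Age = c(39L, 50L),
                  Education = c(" Bachelors", " 9th"),
                  stringsAsFactors = FALSE)

# Prints a structure(...) call that rebuilds the data frame
dput(toy)

# Evaluating that printed text gives the data frame back,
# which is what a reader does when they paste it into R
rebuilt <- eval(parse(text = paste(capture.output(dput(toy)), collapse = "\n")))
all.equal(toy, rebuilt)
```

Note that the printed text keeps the quotes around each string, so any stray spaces inside the values stay visible.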

gain_10 · March 1, 2023, 3:48pm

nirgrahamuk · March 1, 2023, 3:56pm

Your code checks for "9th", but your data contains values like " 9th", with a leading space.
You can either adjust the strings you compare against, or wrap the variable you want to match in trimws() so that leading and trailing whitespace is ignored.
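
A minimal sketch of the trimws() option, assuming a small made-up vector that stands in for census_data$Education (the values are illustrative, not copied from the real file):

```r
# Toy stand-in for the Education column; note the leading spaces
education <- c(" 9th", " 10th", " Bachelors", " 9th")

# The exact comparison matches nothing because of the leading space
sum(education == "9th")          # 0

# Comparing the trimmed values matches both " 9th" entries
sum(trimws(education) == "9th")  # 2
```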

gain_10 · March 1, 2023, 6:34pm

I have also tried trimws(), but it is still not working.

gain_10 · March 1, 2023, 6:34pm

nirgrahamuk · March 1, 2023, 7:48pm

The result of your call to trimws() was not assigned to anything. You should use <- to assign the result in the normal way.

Also, please avoid using screenshots; they are hard to read, and we cannot copy and paste from them.
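
Roughly what I mean, sketched on a made-up miniature of census_data (the rows here are assumptions for illustration only):

```r
# Hypothetical miniature of the census data
census_data <- data.frame(Age = c(39, 50, 38),
                          Education = c(" 9th", " 11th", " HS-grad"),
                          stringsAsFactors = FALSE)

# This prints trimmed values but throws them away; the column is unchanged
trimws(census_data$Education)

# Assigning the result back is what actually updates the column
census_data$Education <- trimws(census_data$Education)

# Now the exact comparisons find their matches
census_data$Education[census_data$Education == "9th"] <- "Secondary School"
census_data$Education[census_data$Education == "11th"] <- "Higher Secondary School"
table(census_data$Education)
```

The same pattern should apply to the other replacement lines in your script, as long as the trimws() assignment runs before them.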

gain_10 · March 2, 2023, 9:11pm

Thanks for the help. I am a beginner in this field.
I appreciate your guidance.
