Hi jcblum,
Thanks very much for your response! natenvir is a factor in the original data frame, gss. When the barplot causing the problem is called, I am trying to refer to a table created with prop.table.
Thanks for referring me to the style guide as well. My assignment is complete and runs fine in R (I think); the issue seems to be with the knitting process in RStudio.
We are required to knit to .html in order to submit the final project for the course. There shouldn't be any issues with the homework policy since the assignment is complete and I am just looking for help with RStudio. I suspect the issue I am having is beyond the scope of what the instructors expect from the students. The support forum for the Coursera course is not always super-responsive, hence my attempt to find help here, which I greatly appreciate. The full code for the assignment is below. Thanks!
#load packages
library(ggplot2)
library(dplyr)
library(statsr)
#load data
load("gss.Rdata")
# creating a barplot for factor relig:
# sets the margins
par(mar = c(11, 11, 5, 2) + 0.1)
# plots the barchart and adjusts the y axis scale
barplot(table(gss$relig), ylim = c(0, 35000), ylab = " ", las = 2)
# yaxis label and position
mtext(text = "Number of Respondents", side = 2, line = 4)
# x axis label and position
mtext(text = "Religious Affiliation", side = 1, line = 7)
# creating a barplot for factor natenvir:
# adjusts the margins to accommodate axis labels
par(mar = c(11, 7, 5, 2) + 0.1)
# produces the barplot and adjusts the y-axis scale
barplot(table(gss$natenvir), ylim = c(0, 20000), ylab = " ", las = 2)
# y axis label
mtext(text = "Number of Respondents", side = 2, line = 4)
# x axis label
mtext(text = "Spending on the Environment", side = 1, line = 6)
#Determining the levels of factor relig:
levels(gss$relig)
#Determining the levels of factor natenvir:
levels(gss$natenvir)
#Determine the structure of the factors:
str(gss$relig)
str(gss$natenvir)
#Create a table for the relig variable:
relig_table <- table(gss$relig)
relig_table
#Create a table for the natenvir variable:
natenvir_table <- table(gss$natenvir)
natenvir_table
#Display proportions for the relig variable table:
prop.table(relig_table)
#Display proportions for the natenvir table:
prop.table(natenvir_table)
# cell counts for relig and natenvir:
cell_counts_table <- table(gss$relig, gss$natenvir)
cell_counts_table
# Removing levels of the factor relig with <5 in one or more cells for natenvir:
# removing the level "Hinduism" from relig
y<-data.frame(subset(gss[gss$relig!="Hinduism",]))
# removing the level "Buddhism" from relig
y2<-data.frame(subset(y[y$relig!="Buddhism",]))
# removing the level "Other Eastern" from relig
y3<-data.frame(subset(y2[y2$relig!="Other Eastern",]))
# removing the level "Native American" from relig
y4<-data.frame(subset(y3[y3$relig!="Native American",]))
# removing the level "Inter-Nondenominational" from relig
y5<-data.frame(subset(y4[y4$relig!="Inter-Nondenominational",]))
# removing the level "Orthodox-Christian" from relig
y5<-data.frame(subset(y5[y5$relig!="Orthodox-Christian",]))
# making sure relig is converted back into a factor
y5$relig <- factor(y5$relig)
# chi-square test for independence
chisq.test(y5$relig, y5$natenvir)
# Manipulate data to present in a table and then produce a barplot:
# make a table using just the relig and natenvir factors originally from gss (now housed in y5)
s <- table(y5$relig, y5$natenvir)
# make a table of proportions for relig and natenvir factors
s2 <- prop.table(s, 1)
# transpose the table of proportions so that natenvir is the dependent variable and relig is the independent variable in the barplot
s3 <- t(s2)
# displays the transposed table of proportions showing responses to current spending on the environment (natenvir) within each level of religious affiliation (relig)
s3
# Steps for generating a barplot showing proportions of relig relative to natenvir (proportion of respondents in each religion that answered one of three ways):
# sets the margins around the barplot
par(mar = c(11, 11, 5, 2) + 0.1, las = 2)
# draws barplot using the table (s3) with a legend in the bottom right
barplot(s3, legend = levels(unique(natenvir)), args.legend = list(x = 'bottomright'))
# y axis label
mtext(text = "Proportion of Respondents", side = 2, line = 4, las = 3)
# x axis label
mtext(text = "Religious Affiliation", side = 1, line = 8, las = 1)
# count and print the number of respondents for each combination of relig and natenvir (includes the "Too Much" counts)
sf <- y5 %>% count(relig, natenvir)
print(sf, n = 33)
# conduct test of two different proportions
jp <- prop.test(x = c(38, 1852), n = c(633, 18778))
jp
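
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: the final barplot call passes levels(unique(natenvir)), which refers to natenvir as a bare name instead of a column of gss or y5. Knitting runs the document in a fresh R session, so a bare name only resolves if the document itself creates that object. The example below uses a small made-up data frame in place of y5 (the values are hypothetical, not taken from gss) and takes the legend labels from the row names of the transposed proportion table, so no free-standing natenvir object is needed.

```r
# Toy stand-in for y5 (hypothetical values; the real gss data is not used here)
y5 <- data.frame(
  relig = factor(c("Protestant", "Catholic", "None",
                   "Protestant", "Catholic", "None")),
  natenvir = factor(c("Too Little", "About Right", "Too Much",
                      "Too Much", "Too Little", "Too Little"),
                    levels = c("Too Little", "About Right", "Too Much"))
)

# Same steps as in the script: counts, row proportions, then transpose
s  <- table(y5$relig, y5$natenvir)
s3 <- t(prop.table(s, 1))

# Legend labels come from the table itself (its rows are the natenvir levels)
legend_labels <- rownames(s3)

par(mar = c(11, 11, 5, 2) + 0.1, las = 2)
barplot(s3, legend.text = legend_labels, args.legend = list(x = "bottomright"))
```

Using levels(y5$natenvir) in place of rownames(s3) should give the same labels here, since table() keeps every factor level.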