I am a researcher running a binomial regression (and coding, and doing statistics) for the first time ever for work. It's been an experience! I took over this project midway through, so I did not write the initial code myself. I had never coded before, so I've been learning R as I go. My apologies if I haven't laid out the issue below as I should have, or have left out any critical information; I'm very much a novice at all of this.
The problem: I've had to expand the dataset R is pulling from, and I'm now getting errors due to an apparent mismatch in the number of rows. I can't figure out what I need to do next to fix this.
The initial dataset was 1,276 individuals (rows), each responding to a selection from 188 questions (columns). I have since been asked to add responses to 8 further questions to this initial dataset, giving 196 questions (columns) in the final dataset. Overall, there have only ever been 9 columns, and that remains unchanged. However, I am having trouble adjusting my code to account for the new question columns.
Any ideas welcome with respect to what might be causing the mismatch of rows!
The details:
For example, my original code, which ran without errors:
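library(readxl)      # read_xlsx()
library(tidyr)       # pivot_longer() and the %>% pipe
library(data.table)  # data.table()

# Answers: one row per individual, ID in column A plus 188 question columns
Ans_Data = read_xlsx("DSM Data 15.2.23 IB v4.xlsx",
                     sheet = "CHANGED Tab 2 - AR weighted",
                     range = "A12:GG1290", col_names = FALSE,
                     col_types = c("text", rep("numeric", 188)))

# Question information: read rows 1-10 and transpose so each question is a row
Question_Data = t(read_xlsx("DSM Data 15.2.23 IB v4.xlsx",
                            sheet = "CHANGED Tab 2 - AR weighted",
                            range = "A1:GG10", col_names = TRUE))
colnames(Question_Data) = Question_Data[1, ]
Question_Data = Question_Data[-1, ]
Question_Data = data.table(Question_Data)

# Long format: one row per individual per question
Ans_Data_2 = Ans_Data %>%
  pivot_longer(cols = colnames(Ans_Data)[2:189])

# Stack copies of Question_Data so it can be bound to the long answers
for (i in 1:1278) {
  if (i == 1) {
    Question_Data_2 = rbind(Question_Data, Question_Data)
  } else {
    Question_Data_2 = rbind(Question_Data_2, Question_Data)
  }
}

Ans_Data_3 = cbind(Ans_Data_2, Question_Data_2)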
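To make the stacking loop easier to reason about, here is a minimal sketch on made-up toy data (the `question_meta` data frame, the `stacked` name and the small bound `n` are illustrative assumptions, not part of the real workbook). It suggests that a loop written this way yields one more copy than its upper bound, because the first pass binds two copies at once. Under that reading, `1:1278` gives 1,279 copies, which matches the 1,279 rows covered by the range `A12:GG1290`.

```r
# Toy stand-in for Question_Data: 3 questions, one information column.
# (Hypothetical values, for illustration only.)
question_meta <- data.frame(question = c("Q1", "Q2", "Q3"))

# Same loop shape as in the code above, with a small upper bound.
n <- 4
for (i in 1:n) {
  if (i == 1) {
    stacked <- rbind(question_meta, question_meta)  # first pass: 2 copies
  } else {
    stacked <- rbind(stacked, question_meta)        # later passes: +1 copy
  }
}

# Number of copies of question_meta that ended up in the stacked table
copies <- nrow(stacked) / nrow(question_meta)
print(copies)    # n + 1

# Rows covered by the original answer range "A12:GG1290" (inclusive)
rows_v4 <- 1290 - 12 + 1
print(rows_v4)   # equals 1278 + 1
```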
However, my updated code:
Ans_Data = read_xlsx("DSM Data 15.2.23 DP v5.xlsx",
                     sheet = "CHANGED Tab 2 - AR weighted",
                     range = "A12:GO1287", col_names = FALSE,
                     col_types = c("text", rep("numeric", 196)))

Question_Data = t(read_xlsx("DSM Data 15.2.23 DP v5.xlsx",
                            sheet = "CHANGED Tab 2 - AR weighted",
                            range = "A1:GO10", col_names = TRUE))
colnames(Question_Data) = Question_Data[1, ]
Question_Data = Question_Data[-1, ]
Question_Data = data.table(Question_Data)

Ans_Data_2 = Ans_Data %>%
  pivot_longer(cols = colnames(Ans_Data)[2:197])

for (i in 1:1278) {
  if (i == 1) {
    Question_Data_2 = rbind(Question_Data, Question_Data)
  } else {
    Question_Data_2 = rbind(Question_Data_2, Question_Data)
  }
}

Ans_Data_3 = cbind(Ans_Data_2, Question_Data_2)
produces the following error:
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 250096, 250684
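As a rough check of where the two numbers in the error might come from, the arithmetic can be sketched as below. This assumes `pivot_longer` keeps every individual-by-question pair (its default), that the range `A12:GO1287` is inclusive of both end rows, and that the `1:1278` loop stacks 1,279 copies because its first pass binds two copies at once. The helper `replicate_rows` is hypothetical, shown only as one possible way to size the stacked table from the number of individuals instead of a hard-coded loop bound.

```r
n_questions <- 196

# Rows read by the updated answer range "A12:GO1287" (inclusive)
rows_v5 <- 1287 - 12 + 1

# Copies stacked by the loop 1:1278 (first pass adds two, the rest add one)
copies <- 1278 + 1

long_rows <- rows_v5 * n_questions  # expected rows in Ans_Data_2
meta_rows <- copies * n_questions   # expected rows in Question_Data_2
print(c(long_rows, meta_rows))      # compare with the two numbers in the error

# Hypothetical helper: repeat a whole table `times` times, block after block,
# so the number of copies can follow the data, e.g. nrow(Ans_Data).
replicate_rows <- function(df, times) {
  df[rep(seq_len(nrow(df)), times = times), , drop = FALSE]
}

# Toy usage with made-up values: 3 questions repeated for 2 individuals
toy_meta <- data.frame(question = c("Q1", "Q2", "Q3"))
toy_rep  <- replicate_rows(toy_meta, times = 2)
print(nrow(toy_rep))
```

If these assumptions hold, the second number is larger by exactly three sets of 196 questions, which would point at the hard-coded `1278` no longer matching the number of rows in the new range.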