Erratic data upload from google sheet to R markdown

I am using R-markdown to knit documents in word format. i have uploaded the data form google sheet by using googlesheets4 package. the data in the table is showing correct values in the columns. however, in the values section in the environment some of the columns are showing Null and those columns on knitting are blank in word document when knitted. when i am using the csv file as source of the data i do not face this problem. however, if i am able to work with google sheet i will save a lot of time. Please help.

Hi @cymarose
Welcome to the Posit/RStudio Community Forum.

Can you post a link to a small example Google Sheets file that causes the problem? Also, in order to help you effectively, we will need to see a short reproducible piece of code that you are using to read that file (see the posting guide on how to make a 'reprex' (reproducible example).

Nice to hear from you..
Here is link to the entire Rmarkdown code Loading Google Docs
And here is the link to the google sheet: https://docs.google.com/spreadsheets/d/1SHKMDwot6Oum1LnMYvF-0xLiTUhngsRzgdPh27ZVlsc/edit?usp=sharing

i would request to explain in simple terms as i am new to R and being from a medical background i lack in depth understanding of coding or computers.
Thanks.

few of the columns showing null values are: Allergy, arms, ASDASCRP, BRM, Cat, CE, Dose1, Dose2, Drug1, Drug2

Thanks.

Hi @cymarose
We cannot access the script file or the Google Sheet unless you give them "public access". Note that you might not want to do that if the data requires privacy. In which case, maybe you can provide some dummy data that can be used to demonstrate the problem.

Hi @cymarose
I copied the crucial part of your Rmarkdown text into a simple R script. With some modifications it now works. There were several problems with inconsistent column names that needed fixing.
The code for the table creation was "not echoed" in the .docx file so we can't reproduce it here. Try copying this code to replace the relevant section in your Rmarkdown file, and then knit. Alternatively, just copy this code into a new script file and run to check that its working for you.

library(dplyr)
library(kableExtra)
library(googlesheets4)

library(janitor)  # You may need to install this package

# Read the data - this works fine
# Now interprets string "NA" as missing value (NA in R)
SPA <- read_sheet("https://docs.google.com/spreadsheets/d/1SHKMDwot6Oum1LnMYvF-0xLiTUhngsRzgdPh27ZVlsc/edit?usp=sharing",
                  na="NA")

# Check the structure and size of the tibble (data frame)
str(SPA)
dim(SPA)
names(SPA)
table(SPA$Opinion)  # Frequency table of the levels of "Opinion"

# Fix the column names so that spaces are replaced with "." (makes the code below work correctly)
SPA <- clean_names(SPA )
names(SPA)

# This filters the dataframe so that only those lines with opinion = 19 will remain. 
PartII_info <- dplyr::filter(SPA, opinion=="19")

# Dataframe columns are copied into individual vectors (not clear why this is being done)
# Names on RHS have been manually modified to match those "cleaned"
Name <- PartII_info$name
Age <- PartII_info$age
Gender <- PartII_info$gender
Disability <- PartII_info$disability
Onset <- PartII_info$onset
PresentCat <- PartII_info$present_category
LoA <- PartII_info$location_of_onset
IP1 <- PartII_info$initial_presenting_feature_1
IP2 <- PartII_info$initial_presenting_feature_2
PastIBP <- PartII_info$past_ibp
PastPA <- PartII_info$past_peripheral_arthritis
PastAAU <- PartII_info$past_aau
PastSacro <- PartII_info$sacroiliitis_on_xray
PastSacroMRI <- PartII_info$sacroiliitis_on_mri
PastHLA <- PartII_info$hla_b27_status
DMARD <- PartII_info$cs_dmar_ds
BRM <- PartII_info$biologic_dmard
Allergy <- PartII_info$any_allergy_or_intoerance
PC <- PartII_info$present_problem
Pulse <- PartII_info$pulse
BP <- PartII_info$bp
Pallor <- PartII_info$pallor
Icterus <- PartII_info$icterus
PE <- PartII_info$pedal_edema
LN <- PartII_info$lymph_nodes
Club <- PartII_info$clubbing
Skin <- PartII_info$skin
Eyes <- PartII_info$eyes
Gait <- PartII_info$gait
Arms <- PartII_info$ul_arthritis_deformity
Legs <- PartII_info$ll_arthritis_deformity
Enthesitis <- PartII_info$enthesitis
Dactylitis <- PartII_info$dactylitis
Spine <- PartII_info$spine_sij_tenderness
MS <- PartII_info$modified_schobers_test
CE <- PartII_info$chest_expansion
TTW <- PartII_info$tragus_to_wall_distance
LF <- PartII_info$lateral_flexion
IMD <- PartII_info$inermalleolar_distance
CR <- PartII_info$cervical_rotation
Chest <- PartII_info$chest
CVS <- PartII_info$cvs
CNS <- PartII_info$cns
Abdo <- PartII_info$abdomen
Hb <- PartII_info$hb
TLC <- PartII_info$tlc
Plt <- PartII_info$platelets
ESR <- PartII_info$esr
CRP <- PartII_info$crp
Cr <- PartII_info$creatinine
AST <- PartII_info$ast
ALT <- PartII_info$alt
Xpelvis <- PartII_info$xray_pelvis_sapa_view
XLspine <- PartII_info$xray_lumbar_spine
XCspine <- PartII_info$xray_cervical_spine
Xhip <- PartII_info$xray_hip_joints
MRISIJ <- PartII_info$mri_si_joints
MRIHip <- PartII_info$mri_hip_joints
HLA <- PartII_info$hla_b27
OtherTest <- PartII_info$any_other_tests
BASDAI <- PartII_info$basdai
BASFI <- PartII_info$basfi
ASDASESR <- PartII_info$asdas_esr
ASDASCRP <- PartII_info$asdas_crp
Diag <- PartII_info$diagnosis
Impr1 <- PartII_info$impession_1_disease_activity
Impr2 <- PartII_info$impression_2_deformities
Impr3 <- PartII_info$impression_3_function
EAM <- PartII_info$eam
NSAID <- PartII_info$nsaid
Drug1 <- PartII_info$drug_1
Drug2 <- PartII_info$drug_2
Dose1 <- PartII_info$dose_1
Dose2 <- PartII_info$dose_2
Cat <- PartII_info$medical_class

# OP said, "few of the columns showing null values are: Allergy, arms, ASDASCRP, BRM, 
# Cat, CE, Dose1, Dose2, Drug1, Drug2"

# Probably due to column name mis-match (should now be fixed)
# Examples:
Allergy
Drug1
Dose1

# Some variables have NA = "not available" = "missing value"
BRM

Thanks a ton.. i copied the code chunk and its working now.. i am trying to understand how did you do it.. :slight_smile:

So thankful to you.. its working so nicely .. and have made my job so easy... as i understood you used the function clean_names to clean the column names and then used the function names(SPA) to get the cleaned names. then these names were copied into the object names i have used. and it works now.

I could not understand what did you do in the following section:

Check the structure and size of the tibble (data frame)

str(SPA)
dim(SPA)
names(SPA)
table(SPA$Opinion) # Frequency table of the levels of "Opinion"

Will you be kind enough to make me understand?
Thanks and regards..

1 Like

Hi @cymarose
I'm pleased I was able to help.
Those four lines of code are just providing "information" about the data that had been read in - they are not actually changing anything; just allowing me to check what's happening.
You can get help on any function by running
E.g. help("dim")

This topic was automatically closed 42 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.