@vedoa sorry, I created a synthetic file but did not run it in R. The issue still exists, though: the resulting data frame has 33 columns, many of them filled with NA
values, instead of the expected 7 columns. Example below:
dput(finalDF)
structure(list(gene_name = c("gene1", "protein1ac", "metabolite123",
"compound20", "ABCD", "XYZ", "gene34"), brief_summary = c("gene1 (kinase 1) is a gene that encodes a protein kinase involved in endocytosis, a process by which cells internalize molecules and particles from their environment. The gene1 protein regulates the binding of the AP2 complex to clathrin and the plasma membrane, which is crucial for the formation of clathrin-coated vesicles. This process is important for the internalization of proteins and receptors from the cell surface and their subsequent recycling or degradation.",
"protein1ac, also known as Protein 1 (xx), is a member of the superfamily of ATP-binding cassette (ABC) transporters. These transporters play a crucial role in the cellular efflux of various substances, including drugs, lipids, and xenobiotics. protein1ac specifically functions as a transporter protein that can pump a variety of drugs out of cells, contributing to multidrug resistance in cancer chemotherapy. It is also involved in the transport of leukotriene C4 and other glutathione conjugates out of cells, which is significant for detoxification processes.",
"The metabolite123 gene encodes a member of the (ABC) transporter superfamily, known as multidrug resistance 5 (xx). It functions as a transporter for various molecules, including cyclic nucleotides, antiviral drugs, and other substrates, across cellular membranes. metabolite123 plays a role in cellular detoxification and contributes to the chemoresistance of cancer cells.",
NA, "ABCD (Containing 10) is a Protein Coding gene. The ABCD gene encodes a member of the alpha/beta hydrolase superfamily. This gene has not been characterized thoroughly and its specific functions are still under investigation. It may play a role in metabolic processes, including the metabolism of drugs and other xenobiotics in the body. The expression and regulation of this gene in different tissues, including the airway epithelium, and its involvement in various cellular processes may provide insights into its biological functions.",
"XYZ, or 'Domain Containing 2,' is a gene that encodes an enzyme belonging to the superfamily. This enzyme is involved in various biological processes, including lipid metabolism, signal transduction, and possibly the regulation of the androgen receptor. The exact functions of XYZ in airway epithelium and its role in the context of influenza virus infection are subjects of research and not fully characterized as of the current knowledge cutoff date.",
"ATP lyase (gene34) is an enzyme that plays a crucial role in cellular energy metabolism by converting citrate to acetyl-CoA and oxaloacetate, thus linking the metabolism of carbohydrates and fats with the production of cholesterol and fatty acids."
), `evidence_scores.This gene is specifically associated with the biology of the airway epithelium` = c(1L,
NA, NA, NA, NA, NA, 1L), `evidence_scores.This gene is specifically associated with the biology of bsl cells within the airway epithelium` = c(1L,
NA, NA, NA, NA, NA, 1L), `evidence_scores.This gene is specifically associated with the biology of cla cells within airway epithelium` = c(1L,
NA, NA, NA, NA, NA, 1L), `evidence_scores.This gene is involved in mediating influenza virus egress` = c(3L,
NA, NA, NA, NA, NA, 3L), `evidence_scores.This gene is involved in mediating the initiation or priming of the adaptive immune response` = c(3L,
NA, NA, NA, NA, NA, 3L), `scores.This gene is specifically associated with the biology of the airway epithelium` = c(NA,
1L, NA, NA, NA, NA, NA), `scores.This gene is specifically associated with the biology of bsl cells within the airway epithelium` = c(NA,
1L, NA, NA, NA, NA, NA), `scores.This gene is specifically associated with the biology of cla cells within airway epithelium` = c(NA,
1L, NA, NA, NA, NA, NA), `scores.This gene is involved in mediating influenza virus egress` = c(NA,
3L, NA, NA, NA, NA, NA), `scores.This gene is involved in mediating the initiation or priming of the adaptive immune response` = c(NA,
3L, NA, NA, NA, NA, NA), `associations.This gene is specifically associated with the biology of the airway epithelium` = c(NA,
NA, 1L, NA, NA, NA, NA), `associations.This gene is specifically associated with the biology of bsl cells within the airway epithelium` = c(NA,
NA, 1L, NA, NA, NA, NA), `associations.This gene is specifically associated with the biology of cla cells within airway epithelium` = c(NA,
NA, 1L, NA, NA, NA, NA), `associations.This gene is involved in mediating influenza virus egress` = c(NA,
NA, 3L, NA, NA, NA, NA), `associations.This gene is involved in mediating the initiation or priming of the adaptive immune response` = c(NA,
NA, 3L, NA, NA, NA, NA), summary = c(NA, NA, NA, "compound20, also known as ATP 9, is a gene that encodes for the protein. This protein is part of the ATP-sensitive potassium (KATP) channel complex, which plays a key role in coupling metabolic status to cellular excitability in various tissues, including cardiac and skeletal muscle, neurons, and pancreatic beta cells. The protein encoded by this gene is involved in the regulation of insulin secretion, myocardial contractility, and vascular tone. Mutations in compound20 can cause dilated cardiomyopathy and are associated with Cantu syndrome, a rare disorder characterized by hypertrichosis, osteochondrodysplasia, and cardiomegaly.",
NA, NA, NA), `statements.This gene is specifically associated with the biology of the airway epithelium` = c(NA,
NA, NA, 1L, NA, NA, NA), `statements.This gene is specifically associated with the biology of bsl cells within the airway epithelium` = c(NA,
NA, NA, 1L, NA, NA, NA), `statements.This gene is specifically associated with the biology of cla cells within airway epithelium` = c(NA,
NA, NA, 1L, NA, NA, NA), `statements.This gene is involved in mediating influenza virus egress` = c(NA,
NA, NA, 3L, NA, NA, NA), `statements.This gene is involved in mediating the initiation or priming of the adaptive immune response` = c(NA,
NA, NA, 3L, NA, NA, NA), `gene_associations.This gene is specifically associated with the biology of the airway epithelium` = c(NA,
NA, NA, NA, 1L, NA, NA), `gene_associations.This gene is specifically associated with the biology of bsl cells within the airway epithelium` = c(NA,
NA, NA, NA, 1L, NA, NA), `gene_associations.This gene is specifically associated with the biology of cla cells within airway epithelium` = c(NA,
NA, NA, NA, 1L, NA, NA), `gene_associations.This gene is involved in mediating influenza virus egress` = c(NA,
NA, NA, NA, 3L, NA, NA), `gene_associations.This gene is involved in mediating the initiation or priming of the adaptive immune response` = c(NA,
NA, NA, NA, 3L, NA, NA), `statements_scores.This gene is specifically associated with the biology of the airway epithelium` = c(NA,
NA, NA, NA, NA, 1L, NA), `statements_scores.This gene is specifically associated with the biology of bsl cells within the airway epithelium` = c(NA,
NA, NA, NA, NA, 1L, NA), `statements_scores.This gene is specifically associated with the biology of cla cells within airway epithelium` = c(NA,
NA, NA, NA, NA, 1L, NA), `statements_scores.This gene is involved in mediating influenza virus egress` = c(NA,
NA, NA, NA, NA, 3L, NA), `statements_scores.This gene is involved in mediating the initiation or priming of the adaptive immune response` = c(NA,
NA, NA, NA, NA, 3L, NA)), row.names = c(NA, 7L), class = "data.frame")
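
As a possible workaround until the root cause is found, the prefixed columns can be collapsed after the fact. This is only a minimal sketch in base R, assuming that each row has values under exactly one prefix (`evidence_scores.`, `scores.`, `associations.`, ...) and that the text after the first dot identifies the statement. The toy data frame, its shortened column names (`airway`, `egress`), and the helpers `coalesce_cols()` and `collapse_prefixes()` are hypothetical names made up for illustration, not part of any package.

```r
# Toy data frame mimicking the structure above (shortened, hypothetical names).
toy <- data.frame(
  gene_name = c("gene1", "protein1ac", "metabolite123"),
  brief_summary = c("text a", "text b", NA),
  summary = c(NA, NA, "text c"),
  `evidence_scores.airway` = c(1L, NA, NA),
  `evidence_scores.egress` = c(3L, NA, NA),
  `scores.airway` = c(NA, 1L, NA),
  `scores.egress` = c(NA, 3L, NA),
  `associations.airway` = c(NA, NA, 1L),
  `associations.egress` = c(NA, NA, 3L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Return, for each row, the first non-NA value found across `cols`.
coalesce_cols <- function(df, cols) {
  out <- df[[cols[1]]]
  for (col in cols[-1]) {
    missing <- is.na(out)
    out[missing] <- df[[col]][missing]
  }
  out
}

# Merge the two summary columns and collapse every "<prefix>.<statement>"
# column into a single column per statement.
collapse_prefixes <- function(df,
                              id_cols = "gene_name",
                              summary_cols = c("brief_summary", "summary")) {
  score_cols <- setdiff(names(df), c(id_cols, summary_cols))
  # Drop everything up to and including the first dot to get the statement.
  statements <- sub("^[^.]*\\.", "", score_cols)

  out <- df[id_cols]
  out$brief_summary <- coalesce_cols(df, intersect(summary_cols, names(df)))
  for (s in unique(statements)) {
    out[[s]] <- coalesce_cols(df, score_cols[statements == s])
  }
  out
}

tidy <- collapse_prefixes(toy)
print(names(tidy))
```

If the assumptions hold, `collapse_prefixes(finalDF)` should reduce the 33 columns above to 7 (`gene_name`, `brief_summary`, and one column per statement). It only patches the symptom, though; the varying prefixes presumably come from inconsistent key names in the input, so fixing them there would be cleaner.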