I am compiling a report. Most of the report is knitted programmatically to a PDF from an R Markdown script.
I need to append one or more additional PDF files, obtained from a separate source, to the end of the R Markdown report. I don't want to scrape the PDFs; I need to append either the original PDF files or images of their pages, as long as the page order is maintained. Some of these PDFs have multiple pages.
There has to be a way to do this programmatically, but I'm not finding it.
You can do this programmatically within an R Markdown document using the LaTeX pdfpages package. Here is a sample R Markdown document that includes the whole PDF file (adapted from this Stack Overflow answer):
---
title: "My Title"
output: pdf_document
header-includes:
- \usepackage{pdfpages}
---
## My rmarkdown document
This is an R Markdown document.
## External PDF file is included below
\includepdf[pages={-}]{my_pdf.pdf}
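If you want to include only specific pages, pass a page list to the pages option, for example (the page numbers here are only illustrative):
\includepdf[pages={1-3,5}]{my_pdf.pdf}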
I wanted to add an addendum to my post describing the final solution to this problem.
While I didn't use any of the suggested methods specifically, the answer to my problem became clear while I was investigating all three of them. It turned out to be pretty straightforward.
In a script file, I first knitted the R Markdown document with rmarkdown::render() and saved the resulting PDF to the same directory as the second PDF. In this development case, that happened to be the master project directory. If you need to specify a different directory, you can build the paths with the here package (here::here()).
Then I used pdftools::pdf_combine() to combine the two PDFs and save the output to a specific directory (again, in this example, the master project directory).
# Knit the R Markdown document to PDF
rmarkdown::render("name_of_markdown_doc.Rmd",
                  output_file = "name_of_markdown_doc.pdf")

# Merge the "name_of_markdown_doc" PDF with the "second_PDF" PDF
# and write the result to "two_joined_pdfs.pdf"
# https://www.r-bloggers.com/join-split-and-compress-pdf-files-with-pdftools/
library(pdftools)
pdf_combine(c("name_of_markdown_doc.pdf", "second_PDF.pdf"),
            output = "two_joined_pdfs.pdf")
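
Since the original question asked about appending one or more PDFs, here is a minimal sketch of how the same approach might be generalized. The function name append_pdfs and all file names are hypothetical, not part of my actual script. It calls qpdf::pdf_combine() directly (pdftools re-exports this function from the qpdf package) and assumes the rmarkdown and qpdf packages plus a working LaTeX installation are available.

```r
# Sketch only: render an R Markdown report and append extra PDFs in order.
# append_pdfs() and every file name used with it are hypothetical.
append_pdfs <- function(rmd, extra_pdfs, output) {
  # Fail early, before knitting, if any PDF to append is missing
  missing_files <- extra_pdfs[!file.exists(extra_pdfs)]
  if (length(missing_files) > 0) {
    stop("Missing PDF(s): ", paste(missing_files, collapse = ", "))
  }

  # render() returns the path of the knitted output file
  report_pdf <- rmarkdown::render(rmd, output_format = "pdf_document",
                                  quiet = TRUE)

  # pdf_combine() joins the inputs in the order given, so the report
  # comes first and the appended PDFs keep their listed order
  inputs <- c(report_pdf, extra_pdfs)
  qpdf::pdf_combine(input = inputs, output = output)

  # Sanity check: the combined file should have as many pages as the
  # inputs put together
  pages_in <- sum(vapply(inputs, qpdf::pdf_length, numeric(1)))
  stopifnot(qpdf::pdf_length(output) == pages_in)

  output
}
```

A call might look like append_pdfs("name_of_markdown_doc.Rmd", c("second_PDF.pdf", "third_PDF.pdf"), "joined.pdf"), where "third_PDF.pdf" is a made-up name. The page-count check is just a cheap way to confirm that nothing was dropped; it does not verify the page contents.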