I am building a Docker image from the rocker/verse:4.0.4 image and installing additional LaTeX packages with tlmgr. I expect these packages to be installed at build time. But when I use knitr to knit my Rmd document to PDF, the Docker logs show LaTeX packages being installed that should already be present.
Here is my Dockerfile:
FROM rocker/verse:4.0.4
LABEL author.email="email@domain.com"
LABEL author.name="Balazs Kisfali"
RUN apt-get update -qq && apt-get install -y \
    git-core \
    libssl-dev \
    libcurl4-gnutls-dev
# Install Python
RUN apt-get install -y python3 python3-pip
# Copy first just the reqs to leverage Docker cache
COPY ./requirements.txt ./requirements.txt
# Install dependencies
RUN pip3 install -r requirements.txt
RUN R -e "install.packages(c('plumber', 'rmarkdown', 'purrr', 'fs', 'reticulate','knitr', 'googleCloudStorageR'))"
# install some packages
RUN tlmgr init-usertree
RUN tlmgr update --self --all && \
    tlmgr install fancyhdr cleveref multirow listings \
    xcolor grffile titling amsmath kvsetkeys etoolbox \
    pdftexcmds infwarerr geometry fancyvrb framed booktabs \
    mdwtools epstopdf-pkg kvoptions ltxcmds auxhook bigintcalc \
    bitset etexcmds gettitlestring hycolor hyperref intcalc kvdefinekeys \
    letltxmacro pdfescape refcount rerunfilecheck stringenc uniquecounter \
    zapfding
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./
ENV PORT=5012
EXPOSE $PORT
CMD ["Rscript", "Main.R"]
For instance, the package kvoptions gets installed again even though it was installed when the container image was built. Besides this one, there are at least another 15 packages that were already installed. This makes my plumber API very slow on the first request that kicks off a PDF conversion: it takes about 4 minutes, while the next one takes only a few seconds because no packages are missing anymore. But once I deploy the container to Cloud Run on GCP, it happens all the time, I guess because new workers are spun up.
How can I set up my installation so that no missing packages have to be installed at runtime? How can I know exactly which packages I need if I have the following in my .Rmd file:
output:
  pdf_document:
    pandoc_args: ["--extract-media", "."]
    fig_caption: true
    toc: yes
    number_sections: true
    extra_dependencies:
      inputenc: ["utf8"]
      fancyhdr: null
      cleveref: null
      multirow: null
      listings: null
      # cmbright: null  # this caused a lot of problems and the package
      # was not installed anyway
      fontenc: ["T1"]
      array: null
      xcolor: null
      grffile: null  # recognise multiple dots in image file names
and some more in header-includes:
header-includes:
  - \usepackage{titling}
  - \usepackage{graphicx}
  - \pretitle{\begin{center}
    \includegraphics[width=2in,height=2in]{shapemaker_logo_long.png}\LARGE\\}
  - \setlength\headheight{33pt}
  - \posttitle{\end{center}}
  - \usepackage{fancyhdr}
  - \pagestyle{fancy}
  - \fancyfoot{} # get rid of the default page number
  - \fancyhead{} # get rid of the default header
  - \fancyfoot[R]{| Page \thepage}
  - \fancyfoot[L]{© 2021. Shapemaker AS.}
  - \fancyhead[L,C]{}
  - \fancyhead[L]{Test site \\ Second }
  - \fancyhead[C]{007 \\ line }
  - \fancyhead[R]{\includegraphics[height=0.8cm]{shapemaker_logo_long.png}}
Since RStudio smartly installs all missing packages when I knit my document, I don't know exactly which packages were installed.
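One possible way to get at the list of required files, as a rough sketch only: the LaTeX log of a successful compile names every .sty and .cls file that was loaded, so those names can be pulled out of the log. The function name list_tex_files and the log path used below are hypothetical, and the sketch assumes the .log file from the LaTeX run has been kept rather than cleaned up after knitting.

```shell
#!/bin/sh
# Hypothetical helper: print the unique .sty/.cls file names mentioned
# as paths in a LaTeX log file (given as the first argument).
list_tex_files() {
    # LaTeX wraps long log lines, so join all lines before matching;
    # otherwise a path split across two lines would be missed.
    tr -d '\n' < "$1" |
        grep -oE '/[A-Za-z0-9_-]+\.(sty|cls)' |  # last path component only
        sed 's|^/||' |                           # drop the leading slash
        sort -u                                  # de-duplicate
}

# Example call (assumed file name):
# list_tex_files report.log
```

A file name is not always the same as the TeX Live package name, so each name would still need to be mapped to its package, for example with `tlmgr search --global --file "/kvoptions.sty"`, before adding it to the `tlmgr install` line in the Dockerfile.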