Best way to convert part of a large project into a package

Is there a standard or suggested workflow to create an R package from only part of a project, while the rest of the project is used for other things such as data exploration and archival?

We are working on a reasonably large R-based project with the following setup:

  • .Rproj, git, renv, lintr and Air config files.
  • Multithreading for computationally intensive and parallelizable functions.
  • R/ directory with the main R code to be packaged, consisting of ~50 .R files sorted in subfolders.
  • Directories such as utility-scripts/ and archived-data/ that are not to be packaged but are nonetheless essential for development and exploration.

Our question about the workflow is: should the package be outside the project, or should the project itself become a package?

  • If the package should be outside the project, what is the best way to automate its creation and management as it grows with the project?
  • If the project should become a package, how should we handle the work that we want to keep for development and archival but exclude from the package?

Any suggestions on how to do this cleanly, or things to avoid, are welcome.

Only you can answer that question depending on your needs. However, based on your first sentence, it seems that using an external project for the package is appropriate.

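If the package stays outside the project, automate its creation and management with usethis, devtools and roxygen2.

If the project becomes a package, keep the development work out of the public interface simply by not exporting the internal objects/functions, although they will remain accessible via the ::: operator. If you have sensitive code that you want to keep private, you should use a separate project for your package.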
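As a rough illustration of the export distinction, here is a minimal sketch of one exported function and one internal helper, using roxygen2 tags. The file name and both function names are hypothetical and not taken from the project in question.

```r
# R/summarise_runs.R (hypothetical file name)

#' Summarise a numeric vector
#'
#' @param x A numeric vector, possibly containing NA.
#' @return A named numeric vector with the mean and standard deviation.
#' @export
summarise_runs <- function(x) {
  x <- drop_missing(x)
  c(mean = mean(x), sd = stats::sd(x))
}

#' Drop missing values (internal helper, no @export tag)
#'
#' @param x A vector.
#' @return `x` without its NA entries.
#' @noRd
drop_missing <- function(x) {
  x[!is.na(x)]
}

# After devtools::document(), NAMESPACE would list export(summarise_runs) only.
# drop_missing() would still be reachable as mypkg:::drop_missing(),
# where "mypkg" stands in for the real package name.
```

Only functions tagged with @export become part of the package's public interface; everything else stays internal but is still shipped with the package.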

Right now, we set up the package outside the project and deploy it in production. Building the package from the project is somewhat messy, and we would like the project and the package to be "less apart" from each other than they are right now. Many thanks for your reply to my vague question!

1. Organize your workspace like this:

my-project/
├── analysis/ # data exploration, Rmd
├── archived-data/ # raw/completed datasets
├── R/ # core functions for package
├── data-raw/ # scripts to download/clean data
├── inst/ # non-exported resources
├── tests/
├── vignettes/ # tutorial documents
└── myProject.Rproj

  • R/ holds only the code you’ll package.
  • analysis/, archived-data/ stay out of the package.
  • Keep the renv, lintr, and test configuration files in the project root.

2. Develop with Load & Check

  • Use devtools::load_all() to reload the package code into your session while you iterate, without building or installing it.
  • Use R CMD check / devtools::check() before release.
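The loop above could be wrapped in a small helper, sketched below. The function name `dev_cycle` is hypothetical, and it assumes devtools is installed and that `pkg` points at a directory containing a DESCRIPTION file.

```r
# Hypothetical helper that runs one development iteration.
# Assumes devtools is installed and `pkg` is the package root.
dev_cycle <- function(pkg = ".", full_check = FALSE) {
  devtools::load_all(pkg)   # load the code in R/ into the session
  devtools::document(pkg)   # regenerate NAMESPACE and man/ from roxygen2 comments
  devtools::test(pkg)       # run the tests under tests/
  if (full_check) {
    devtools::check(pkg)    # full R CMD check, typically before a release
  }
  invisible(pkg)
}
```

Calling `dev_cycle()` after each change and `dev_cycle(full_check = TRUE)` before a release keeps the quick loop separate from the slower full check.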

3. Package as Part of Project

Keep the package and exploration in the same repo:

  • Exclude analysis folders using .Rbuildignore.
  • Keep docs and scripts in analysis/, not packaged.
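To make the .Rbuildignore step more concrete, here is a small base R sketch that mimics how its patterns are applied: each line is a Perl-style regular expression matched case-insensitively against paths relative to the package root. The pattern list and the folder names are assumptions based on the layout discussed in this thread, not a complete or definitive configuration.

```r
# Patterns that could go into .Rbuildignore, one per line.
# usethis::use_build_ignore(c("analysis", "archived-data", "utility-scripts"))
# would write the anchored, escaped forms of the folder names for you.
ignore_patterns <- c(
  "^analysis$",
  "^archived-data$",
  "^utility-scripts$",
  "^renv$",
  "^renv\\.lock$",
  "^.*\\.Rproj$",
  "^\\.lintr$"
)

# Hypothetical top-level entries of the combined project/package.
top_level <- c(
  "analysis", "archived-data", "utility-scripts",
  "R", "tests", "vignettes", "DESCRIPTION",
  "renv", "renv.lock", "myProject.Rproj", ".lintr"
)

# TRUE if any pattern matches the path, as R CMD build would treat it.
is_ignored <- function(path, patterns) {
  any(vapply(
    patterns,
    function(p) grepl(p, path, perl = TRUE, ignore.case = TRUE),
    logical(1)
  ))
}

ignored <- vapply(top_level, is_ignored, logical(1), patterns = ignore_patterns)

# Entries that would still end up in the built package.
kept <- top_level[!ignored]
print(kept)
```

With these patterns, only the package sources remain in the build, while the exploration and archival folders stay in the repository for development.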

Here’s a clean modular workflow I use where the package lives in R/, while explorations, raw data, and archiving stay separate.

Developing an R Package from a Larger Project

I also create visualizations, such as flow charts, with the web tool Cloudairy Flow chart Maker, which has AI capabilities.