R package

Attribution: This content has been developed on the basis provided by Chapter 1: The Whole Game (R packages book by Hadley Wickham & Jenny Bryan, 2e) and the UBC course notes Reproducible and Trustworthy Workflows for Data Science by Tiffany Timbers, Joel Östblom, Florencia D’Andrea, and Rodolfo Lourenzutti.
We assume you have followed the installation instructions we shared before the workshop and have: registered for a GitHub account and installed git (more information here)

create_package()
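
As a minimal sketch of this step, the call below creates the package skeleton from the R Console. The path is an assumption (a package named eda placed in the Desktop folder, matching the rest of this walkthrough), and `pkg_path` is only an illustrative name; adjust both to your machine.

```r
# Assumption: the new package is called "eda" and will live in the Desktop
# folder. Change `pkg_path` (a hypothetical name) to suit your setup.
pkg_path <- file.path("~", "Desktop", "eda")

# Guarded so that nothing is created if this snippet is sourced
# non-interactively; in the workshop it is typed into the R Console.
if (interactive()) {
  usethis::create_package(pkg_path)
}
```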
create_package() will initialize our new package in a directory of our choice.
Use the Desktop folder for easier reference.

Don’ts when choosing your home directory
Do not nest it inside another RStudio project, R package, or Git repository.
Do not place it in an R package library (i.e., where we usually install other packages from CRAN).

Files created (note the ignore-type files):
.gitignore is used by Git and lists all “hidden” files created by R and RStudio that aren’t necessary for the repository.
.Rbuildignore contains all files created via R and RStudio that won’t be necessary when building our package (e.g., eda.Rproj).
DESCRIPTION contains the metadata and dependency installation instructions for our package.
eda.Rproj is the RStudio project file.
NAMESPACE contains the package’s functions to export along with imports from other packages.
R/ is the directory which will contain all the package’s functions as .R scripts.

use_git()
In eda.Rproj, we will initialize a Git repository via use_git().
This creates a .git directory in the folder {eda}.
In eda.Rproj, under the Git tab, click on the clock icon to check your commit history (note your GitHub user is shown in the Author column).

We will write a function that counts the observations per class in a data frame (e.g., mtcars) so we (and others) can reuse this code more easily in other projects.
count_classes() takes data_frame, a data frame or data frame extension (e.g., a tibble), along with class_col, an unquoted column name containing the class label.
Note the {{ }} in {{ class_col }}: the unquoted column name (class_col) needs extra support (via the curly brackets) because the global environment is not aware of the data frame column names.

package::function()
count_classes() includes four {dplyr} functions: group_by(), summarize(), n(), and rename().
Inside the package, we call them as dplyr::group_by(), dplyr::summarize(), dplyr::n(), and dplyr::rename().
count_classes <- function(data_frame, class_col) {
  # Fail early when the input is not a data frame (or an extension of one)
  if (!is.data.frame(data_frame)) {
    stop("`data_frame` should be a data frame or data frame extension (e.g. a tibble)")
  }

  # Count rows per class; {{ }} forwards the unquoted column name to dplyr
  data_frame |>
    dplyr::group_by({{ class_col }}) |>
    dplyr::summarize(count = dplyr::n()) |>
    dplyr::rename("class" = {{ class_col }})
}
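
To see what the function returns, here is a small self-contained sketch that repeats the definition and calls it on the built-in mtcars data set with the column cyl. It assumes {dplyr} is installed and R 4.1 or later (for the native pipe); the object names `cyl_counts` and `bad_input_message` are only illustrative.

```r
# Same definition as in the text, repeated so this snippet runs on its own.
count_classes <- function(data_frame, class_col) {
  if (!is.data.frame(data_frame)) {
    stop("`data_frame` should be a data frame or data frame extension (e.g. a tibble)")
  }
  data_frame |>
    dplyr::group_by({{ class_col }}) |>
    dplyr::summarize(count = dplyr::n()) |>
    dplyr::rename("class" = {{ class_col }})
}

# `cyl` is passed unquoted; {{ }} forwards it to the dplyr verbs.
cyl_counts <- count_classes(mtcars, cyl)
print(cyl_counts)
# mtcars holds 11, 7, and 14 cars with 4, 6, and 8 cylinders, respectively.

# A non-data-frame input triggers the error raised by stop().
bad_input_message <- tryCatch(
  count_classes(1:10, cyl),
  error = function(e) conditionMessage(e)
)
print(bad_input_message)
```

The result has one row per class, with the grouping column renamed to class regardless of what it was called in the input.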
use_r()
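We will save count_classes() in an .R script in the R/ subdirectory of {eda}.
use_r() creates the .R script count_classes.R.
The Git tab keeps track of all our changes in the repository after our initial commit.
To commit count_classes.R:
In the Git tab, check the box in column Staged.
Click on the Commit button.
Confirm the file is checked in the Stage column.
Enter the commit message Add count_classes().
Click on the Commit button.

load_all()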
Let’s try out count_classes().
load_all() from {devtools} makes function count_classes() available for experimentation.
We will use the data frame mtcars and column cyl as data_frame and class_col, respectively.

Create a remote repository on GitHub.com without a README (we will add the README.md file later).
In the Terminal tab of our RStudio session, we will paste the Git commands shown by GitHub.com from section ...or push an existing repository from the command line.

check()
We need to make sure that all parts of our R add-on package work correctly.
check() executes R CMD check in the shell (i.e., terminal).
Run check() from {devtools} via the R Console (note the license-related warning in the check() output).

DESCRIPTION
This is the initial DESCRIPTION file:

Package: eda
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R:
    person("First", "Last", , "first.last@example.com", role = c("aut", "cre"),
           comment = c(ORCID = "YOUR-ORCID-ID"))
Description: What the package does (one paragraph).
License: `use_mit_license()`, `use_gpl3_license()` or friends to pick a
    license
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1

Your DESCRIPTION template might look different if you followed the instructions in system setup from here: https://r-pkgs.org/setup.html#personal-startup-configuration
Edit the fields Title, Authors@R, and Description.
If you do not have an ORCID, you can delete comment = c(ORCID = "YOUR-ORCID-ID").

Package: eda
Title: A Package for Data Wrangling
Version: 0.0.0.9000
Authors@R:
    person("G. Alexi", "Rodriguez-Arelis", , "alexrod@stat.ubc.ca", role = c("aut", "cre"))
Description: Provide data wrangling and summary functions to conduct a proper
    exploratory data analysis.
License: `use_mit_license()`, `use_gpl3_license()` or friends to pick a
    license
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1

After editing the DESCRIPTION file, we need to save our changes.
As we did with the count_classes() function, let’s locally commit these changes (use the commit message Edit DESCRIPTION).
In the Git tab in RStudio, we will remotely push our edits to our public repository on GitHub by clicking on the Push button.

use_mit_license()
As flagged by check(), we need to include a LICENSE.md.
Run use_mit_license() from {usethis} via the R Console.
What does LICENSE.md look like?
Note: More about license matters later on in this workshop.
Check the DESCRIPTION file! The License field gets updated as follows:

Package: eda
Title: A Package for Data Wrangling
Version: 0.0.0.9000
Authors@R:
    person("G. Alexi", "Rodriguez-Arelis", , "alexrod@stat.ubc.ca", role = c("aut", "cre"))
Description: Provide data wrangling and summary functions to conduct a proper
    exploratory data analysis.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1

Commit these changes (use the commit message Use MIT license).

document()
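We will document the count_classes() function via package {roxygen2}.
Open R/count_classes.R in the source editor.
Place the cursor inside the function and click on Code > Insert roxygen skeleton.
Fill in the roxygen header for count_classes():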
#' Count class observations
#'
#' Creates a new data frame with two columns,
#' listing the classes present in the input data frame,
#' and the number of observations for each class.
#'
#' @param data_frame A data frame or data frame extension (e.g. a tibble).
#' @param class_col Unquoted column name of column containing class labels.
#'
#' @return A data frame with two columns.
#' The first column (named class) lists the classes from the input data frame.
#' The second column (named count) lists the number of observations for each class from the input data frame.
#' It will have one row for each class present in input data frame.
#' @export
#'
#' @examples
#' count_classes(mtcars, cyl)

Save R/count_classes.R and commit the change (use the commit message Add roxygen header to document count_classes()).
Next, we use document() from {devtools}.
Run the document() function in the R Console.
This creates man/count_classes.Rd in {eda}, which is the help we get when typing ?count_classes in the R Console.
After running document(), commit the changes (use the commit message Run document()).

check() again

Now that we have LICENSE.md in {eda}, let’s use check() again in the R Console to ensure the license-related warning is gone.

install()
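
A minimal sketch of this step, assuming the working directory is the root of {eda} and that {devtools} and {dplyr} are installed. Once installed, the package can be attached like any other, and count_classes() is available because of the @export tag in its roxygen header.

```r
# Assumption: the working directory is the root of the {eda} package.
# Guarded so that nothing is installed if this snippet is sourced
# non-interactively; in the workshop it is typed into the R Console.
if (interactive()) {
  devtools::install()           # install the local package

  library(eda)                  # attach it like any installed package
  count_classes(mtcars, cyl)    # call the exported function
}
```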
Instead of using install.packages() as with any package on CRAN, we will use install() from {devtools}.
install() installs a local package in the current working directory, whereas install.packages() installs from a package repository.
Run install() in the R console.

use_package()
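
A minimal sketch of this step, assuming the working directory is the root of {eda}. It declares the {dplyr} dependency that count_classes() relies on, so that it is recorded in DESCRIPTION.

```r
# Assumption: the working directory is the root of the {eda} package.
# Guarded so that DESCRIPTION is not modified if this snippet is sourced
# non-interactively.
if (interactive()) {
  usethis::use_package("dplyr")   # `type = "Imports"` is the default
}

# Afterwards, DESCRIPTION is expected to gain a field along these lines:
# Imports:
#     dplyr
```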
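count_classes() uses functions from package {dplyr}.
We declare this dependency with use_package() from {usethis}.
This adds {dplyr} to DESCRIPTION, more specifically in Imports.
Run it in the R console and commit the change (use the commit message Import dplyr).

use_readme_rmd()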
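
A minimal sketch of this step, assuming the working directory is the root of {eda}. The template has to be filled out by hand between the two calls.

```r
# Assumption: the working directory is the root of the {eda} package.
# Guarded so that no files are written if this snippet is sourced
# non-interactively.
if (interactive()) {
  usethis::use_readme_rmd()   # create README.Rmd from a template

  # ... fill out README.Rmd in the editor, then render it:
  devtools::build_readme()    # render README.Rmd to README.md
}
```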
We need a README.md file describing the package, installation, and usage.
We will use use_readme_rmd() from {usethis}.
This creates an .Rmd template, which we have to fill out.
To render the .Rmd file to .md, use build_readme(), commit, and push these changes to the remote repository (use the commit message Write README.Rmd and render).

check() and install()

We have built our first R package!
Run check() (to ensure all warnings are gone!), and then re-build via install().
Let’s review the workflow (test() will be covered later on) via the below diagram from Chapter 1: The Whole Game (R packages book by Hadley Wickham & Jenny Bryan, 2e).