Roxygen Guide

mb706 edited this page Aug 18, 2019 · 18 revisions

See the mlr3 Roxygen Guide for most rules.

Example Pipeop Doc

Comments on the template are enclosed in <!-- -->.

#' @title PipeOpClassName <!--the name of the class-->
#'
#' @usage NULL
#' @name mlr_pipeops_defaultid <!--'defaultid' is the 
#'   same as in > mlr_pipeops$get("defaultid")-->
#' @format [`R6Class`] object inheriting from 
#' [`PipeOpTaskPreprocSimple`]/[`PipeOpTaskPreproc`]/[`PipeOp`].
#' <!--The whole inheritance chain-->
#'
#' @description
#' Long form description. This gives a good overview what the
#' [`PipeOp`] is good for. Here, like everywhere, we set links
#' to other concepts such as [`Learner`][mlr3::Learner]s, 
#' [`Task`][mlr3::Task]s, or [`Filter`][mlr3filters::Filter]s if
#' they come up.
#'
#' @section Construction:
#' ```
#' PipeOpClassName$new(argument, id = "defaultid", param_vals = list())
#' ```
#'
#' * `argument` :: `numeric(1)`\cr <!--example argument-->
#'   Example argument, type is `numeric(1)`.
#' * `id` :: `character(1)`\cr  <!--this is always present-->
#'   Identifier of resulting object, default `"defaultid"`.
#'   <!--Exceptions are things like PipeOpLearner or PipeOpFilter,
#'     see there-->
#' * `param_vals` :: named `list`\cr  <!--this is always present-->
#'   List of hyperparameter settings, overwriting the
#'   hyperparameter settings that would otherwise be set during
#'   construction. Default `list()`.
#'
#' @section Input and Output Channels:
  1. no inheritance, or behaviour different from inheritance:
    #' [`PipeOpClassName`] has one input channel named `"input"`,
    #' taking a [`Task`][mlr3::Task] both during training and prediction.
    #'
    #' [`PipeOpClassName`] has multiple output channels depending on the
    #' `outnum` construction argument, named `"output1"`, `"output2"`, ...;
    #' producing `NULL` during training and a [`Prediction`][mlr3::Prediction]
    #' during prediction.
    #'
    #' The output is the <!-- describe how the output is computed,
    #' during train and predict -->
  2. inheritance, with minimal change
    #' Input and output channels are inherited from [`PipeOpTaskPreproc`]
    #' <!-- Don't mention PipeOpTaskPreprocSimple because it does not
    #'   introduce any changes to channels! -->
    #' <!--optional, if applicable: --> Instead of a [`Task`][mlr3::Task],
    #' a [`TaskClassif`][mlr3::TaskClassif] is used as input and output
    #' during training and prediction.
    #'
    #' The output is the <!-- describe how the output is computed,
    #' during train and predict -->
#'
#' @section State:
  1. no state, no inheritance
    #' The `$state` is left empty (`list()`).
  2. no inheritance, state contains information
    #' The `$state` is a named `list` with the following elements:
    #' * `scores` :: named `numeric`\cr <!--example slot-->
    #'   Scores calculated for all features of the training
    #'   [`Task`][mlr3::Task] which are being used...
  3. inheriting, but nothing is in the state. note we just mention PipeOpTaskPreproc, not PipeOp, because PipeOp does nothing to the $state:
    #' The `$state` is a named `list` with the `$state` elements
    #'   inherited from [`PipeOpTaskPreproc`].
  4. inheriting, with additional state items:
    #' The `$state` is a named `list` with the `$state` elements
    #'   inherited from [`PipeOpTaskPreproc`], as well as:
    #' * `scores` :: named `numeric`\cr <!--example slot-->
    #'   Scores calculated for all features of the training
    #'   [`Task`][mlr3::Task] which are being used...
#'
#' @section Parameters:
  1. No parameters:
    #' [`PipeOpClassName`] has no parameters.
  2. Parameters, no inheritance:
    #' * `selection` :: `numeric(1)` | `character(1)`\cr
    #'   Selection of branching path to take. Is a `ParamInt`
    #'   if the `options` parameter...
    #'   Initialized to -1. <!-- do not say "Default" for parameters
    #'   that are merely initialized to a value. -->
  3. Inheritance, no additional parameters; only mention classes that actually introduce parameters:
    #' The parameters are the parameters inherited from
    #' [`PipeOpSuperclass`]/[`PipeOpSuperSuperSuperclass`].
  4. Inheritance, as well as additional parameters; only mention classes that actually introduce parameters:
    #' The parameters are the parameters inherited from
    #' [`PipeOpTaskPreproc`], as well as:
    #' * `selection` :: `numeric(1)` | `character(1)`\cr
    #'   Selection of branching path to take. Is a `ParamInt`
    #'   if the `options` parameter...
    #'   Initialized to 0.
#'
#' @section Internals:
#' Some internal notes. Interesting to developers working on
#' the class, or power users encountering edge cases.
#'
#' @section Fields:
  1. No additional fields; mention only superclasses that introduce fields:
    #' Only fields inherited from [`PipeOpTaskPreproc`]/[`PipeOp`].
  2. new fields introduced:
    #' Fields inherited from [`PipeOpTaskPreproc`]/[`PipeOp`], as well as:
    #' * `learner`  :: [`Learner`][mlr3::Learner]\cr <!--example field-->
    #'   [`Learner`][mlr3::Learner] that is being wrapped. Read-only.
#'
#' @section Methods:
  1. no additional methods:
    #' Methods inherited from [`PipeOpTaskPreproc`]/[`PipeOp`].
  2. additional methods introduced:
    #' Methods inherited from [`PipeOp`], as well as:
    #' * `weighted_avg_prediction(inputs, weights, row_ids, truth)`\cr
    #'   (`list` of [`Prediction`][mlr3::Prediction], `numeric`,
    #'     `integer` | `character`, `list`) -> `NULL`\cr
    #'   Create [`Prediction`][mlr3::Prediction]s that correspond to
    #'   the weighted average...
#'
#' @examples
#' # The following copies the output of 'scale' automatically to both
#' # Some examples. Make sure to attach packages with library() or use :: if you use mlr3 things.
#' @family PipeOps
#' @include PipeOp.R <!--may instead be PipeOpTaskPreproc.R if
#'   inheriting from there-->
#' @export

mlr3pipelines-Specific Slots

Some mlr3pipelines-specific rules when documenting PipeOps:

  • @section Internals: documents behaviour relevant mostly for development or very technical edge-case behaviour
  • @section State: documents the $state slot of a PipeOp. Should be either "The `$state` is left empty (`list()`)." or "The `$state` is set to a named `list` with these members:", followed by the members in slot description format.
  • @section Parameters: documents the $param_set, in slot description format.
  • @section Input and Output Channels: documents the input and output channels (types, numbers, training, predict).
  • @name should always correspond to mlr_pipeops_<id>, where <id> is the key of the mlr_pipeops-dictionary by which the PipeOp can be constructed. For most PipeOps this should also be the default ID.
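
As a rough illustration of the @name rule, the following base-R sketch derives the help-topic name from a dictionary key. The helper `pipeop_topic_name()` is hypothetical and not part of mlr3pipelines; it only spells out the naming convention described above.

```r
# Hypothetical helper (NOT part of mlr3pipelines): builds the value that
# the @name tag of a PipeOp's documentation should have, given the key
# under which the PipeOp is registered in the mlr_pipeops dictionary.
pipeop_topic_name <- function(id) {
  # The key must be a single, non-missing, non-empty string.
  stopifnot(is.character(id), length(id) == 1L, !is.na(id), nzchar(id))
  paste0("mlr_pipeops_", id)
}

# A PipeOp with the key "pca" is documented under this topic name.
pipeop_topic_name("pca")
```

The result is what goes after @name in the roxygen block, so that the help page name matches the key users pass to `mlr_pipeops$get()`.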

Roxygen Guide Amendments

Some amendments and clarifications:

  • Do not forget the colon (":") after @section titles!

Typename Format (<TYPENAME>)

  • S3 / R6 etc. classes are named in backticks, the topmost class only. Link all classes that are not in base R.
    [`Graph`]
    `matrix`
    `NULL`
    
  • For atomic types of length n, this is followed by (n), otherwise not.
    `character(2)`
    `numeric`
    
  • Numbers that should be integer numbers should be integer/integer(n) even if integer numbers of type numeric are accepted.
    `integer`
    `integer(7)`
    
  • Atomic types and lists can be prefixed with "named"
    named `list`
    named `character(2)`
    
  • Lists can be postfixed with of <TYPENAME>
    `list` of `PipeOp`
    
  • data.table / data.frame with defined columns should have these columns named and typename given.
    [`data.table`] with columns `name` (`character`), `constructor` (`list` of `R6ClassGenerator`)
    
  • Alternatives are separated by vertical bars | and possibly put in parentheses if necessary to avoid confusion.
    `numeric` | `NULL`
    
    But:
    `list` of (`character(1)` | `NULL`)
    
  • Special type any for no type specification
    `any`
    

Slot Description Format

Used for slots / active bindings of R6 classes, for arguments of S3 or package level functions, and for parameter descriptions. Always listed in an unnumbered list. Description should end with a full stop.

Format:

* `NAME` :: <TYPENAME>\cr
  Description.

Example:

* `id` :: `character(1)`\cr
  Name of the task.
* `backend` :: (`data.table` | `DataBackend`)\cr
  Backend to use.
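
To show how the typename rules and the slot description format combine inside an actual roxygen block, here is a minimal sketch of a `Fields` section. The field names `ids`, `graph` and `scores` and their descriptions are made up for illustration and do not describe any existing class.

```r
#' @section Fields:
#' * `ids` :: named `character(2)`\cr
#'   Hypothetical field: IDs of the two wrapped objects.
#' * `graph` :: [`Graph`] | `NULL`\cr
#'   Hypothetical field: the wrapped [`Graph`], or `NULL` if none is set.
#' * `scores` :: [`data.table`] with columns `feature` (`character`), `score` (`numeric`)\cr
#'   Hypothetical field: one score per feature.
NULL
```

Each entry is one list item: name and typename on the first line, terminated by `\cr`, then the description ending with a full stop.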

Method Description Format

Used for methods of R6 classes. Always listed in an unnumbered list. First line: function header (name, argument list including defaults). Second line: In-type tuple, followed by ->, followed by return type. Following lines: function description, ending with a full stop.

Functions without input ("nullary functions") have () in-type, functions that return invisible(NULL) (e.g. print, plot) have `NULL` return type. Methods that modify the R6 object itself (mutators) return self.

* `nop()`\cr
  () -> `NULL`\cr
  Do nothing.
* `non_nop(id, names = c("Bert", "Ruediger"), attributes = list())`\cr
  (`character(1)` | `NULL`, `character`, named `list`) -> [`data.table`] with columns `id` (`character`), `state` (`list` of `any`)\cr
  Does something and returns the `data.table` of the results.
* `reset()`\cr
  () -> `self`\cr
  Resets the object.

Description Text Format

This is the format of text describing slots, methods, the @description section, etc.

  • R6 / S3 class names: Backtick-quoted, linked if not in base R or in any R default package.
    Many [`Graph`]s
    The [`Graph`]'s size
    This is a [`data.table`], not a `data.frame`
    
  • Field / slot names: Backtick-quoted, prepended by dollar sign.
    The `Graph`'s `$id`
    Resets the `$state` to `NULL`
    
  • Method names: Backtick-quoted, prepended by dollar sign, followed by ().
    Use `$add_pipeop()` to...
    
  • Literal strings: Backtick- and double-quoted.
    The default ID is `"pca"`.
    
  • Functions that are not R6 object methods: Followed by (), and in brackets ([]), but not in backticks since roxygen prints them in monospace typeface automatically.
    Use [print()] to...
    This is equivalent to [data.table::copy()].
    

Linking

  • Simple links: [link].
  • Links to things in monospace typeface: [`link`], except functions (happens automatically): [link()]
  • Links to entities in other packages: [`package::link`], [package::link()], [`link`][package::link], [link()][package::link]
  • Linking to packages themselves: [mlr3][mlr3::mlr3-package]

@family

The @family tag creates a group of documentation pages that mutually link each other. Writing @family <TEXT> will create the line "Other <TEXT>: [link] [link] [link]". The following rules apply:

  • <TEXT> should be short but is allowed to, and should probably, contain spaces. It should make a natural sentence when written as "Other <TEXT>:". This means if it is a noun (e.g. "PipeOps") it should probably be plural.
  • A page can be member of multiple families if that is natural.
  • Do not create families with only one member.
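
A minimal sketch of two documentation pages that share a family, so that each page lists the other under "Other PipeOps:". The class names `PipeOpFirst` / `PipeOpSecond` and the topic names are placeholders invented for this example.

```r
# Two hypothetical help pages in the same family. Both use the plural
# noun "PipeOps" so the generated line reads naturally as "Other PipeOps:".
#' @title PipeOpFirst
#' @name mlr_pipeops_first
#' @family PipeOps
NULL

#' @title PipeOpSecond
#' @name mlr_pipeops_second
#' @family PipeOps
NULL
```

With only one of these blocks present the family would have a single member, which the last rule above says to avoid.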