--- title: "YAML: an Overview" output: prettydoc::html_pretty: theme: tactile vignette: | %\VignetteIndexEntry{YAML: an Overview} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( echo = FALSE, message = FALSE, collapse = TRUE, comment = "", warning = FALSE ) ``` ```{r setup} library(ymlthis) oldoption <- options(devtools.name = "Malcolm Barrett", crayon.enabled = FALSE) ``` ```{r pandoc_check, echo=FALSE} if (!rmarkdown::pandoc_available()) { cat("pandoc is required to use ymlthis. Please visit https://pandoc.org/ for more information.") knitr::knit_exit() } ``` # Basic syntax The basic syntax of YAML is to use key-value pairs in the format `key: value`. A YAML code block should be fenced in with `---` before and after (you can also use `...` to end the YAML block, but this is not very common in **R Markdown**). ```{r} yml(date = FALSE) %>% asis_yaml_output() ``` In R, the equivalent structure is a list with named character vector: `list(author = "Malcolm Barrett")`. In fact, you can call this list in **R Markdown** using the `metadata` object; in this case, `metadata$author` will return `"Malcolm Barrett"` In YAML, spaces are used to indicate nesting. When we want to specify the output function `pdf_document(toc = TRUE)`, we need to nest it under the `output` field. We also need to nest `toc` under `pdf_document` so that it gets passed to that function correctly. ```{r, warning = FALSE} yml_empty() %>% yml_output(pdf_document(toc = TRUE)) %>% asis_yaml_output() ``` In R, the equivalent structure is a nested list, each with a name: `list(output = list(pdf_document = list(toc = TRUE)))`. Similarly, you can call this in R Markdown using the `metadata` object, e.g. `metadata$output$pdf_document$toc`. The hierarchical structure (which you can see with `draw_yml_tree()`) looks like this: ```{r, warning = FALSE} yml_empty() %>% yml_output(pdf_document(toc = TRUE)) %>% draw_yml_tree() ``` Without the extra indents, YAML doesn't know `toc` is connected to `pdf_document` and thinks the value of `pdf_document` is `NULL`. YAML that looks like this: ```yaml --- output: pdf_document: toc: true --- ``` has a hierarchy that looks like this: ```{r} list(output = list(pdf_document = NULL), toc = TRUE) %>% as_yml() %>% draw_yml_tree() ``` If you use output functions without additional arguments, the value of `output` can simply be the name of the function. ```{r} yml_empty() %>% yml_output(html_document()) %>% asis_yaml_output() ``` However, if you're specifying more than one output type, you must use the nesting syntax. If you don't want to include additional arguments, use `"default"` as the function's value. ```{r, warning = FALSE} yml_empty() %>% yml_output(html_document(), pdf_document(toc = TRUE)) %>% asis_yaml_output() ``` Some YAML fields take unnamed vectors as their value. You can specify an element of the vector by adding a new line and `-` (note that the values are not indented below `category` here). ```{r} yml_empty() %>% yml_category(c("R", "Reprodicible Research")) %>% asis_yaml_output() ``` In R, the equivalent structure is a list with a named vector: `list(categories = c("R", "Reprodicible Research"))`. `metadata$category` will return `c("R", "Reprodicible Research")`. Another way to specify vectors is to use `[]` with each object separated by a column, as in the syntax for `c()`. This YAML is equivalent to the YAML above: ```yaml --- category: [R, Reprodicible Research] --- ``` By default, **ymlthis** uses the `-` syntax for vectors.`-` is also used to group elements together. For instance, in the `params` field for parameterized reports, we group parameter information together by using `-`. The first line is the name and value of the parameter, while all the lines until the next `-` are extra information about the parameter. While you can use `metadata` to call objects in `params`, `params` has it's own object you can call directly: `params$a` and `params$data` will return the values of `a` and `data`. ```{r} yml_empty() %>% yml_params( list(a = 1, input = "numeric"), list(data = "data.csv", input = "text") ) %>% asis_yaml_output() ``` In R, the equivalent structure is a nested list that contains a list of unnamed lists: `list(param = list(list(a = 1, input = numeric), list(data = "data.csv", input = "file")))`. The inner-most lists group items together, e.g. `list(a = 1, input = numeric)` groups `a` and `input`. ```{r} yml_empty() %>% yml_params( list(a = 1, input = "numeric"), list(data = "data.csv", input = "text") ) %>% draw_yml_tree() ``` # Types in YAML You may have noticed that strings in YAML don't always need to be quoted. However, it can be useful to explicitly wrap strings in quotes when they contain special characters like `:` and `@`. ```{r} yml_empty() %>% yml_title("R Markdown: An Introduction") %>% asis_yaml_output() ``` R code can be written as inline expressions `` `r knitr::inline_expr('expr')` ``. `yml_code()` will capture R code for you and put it in a valid format. R code in `params` needs to be slightly different: use `!r` (e.g. `!r expr`) to call an R object. ```yaml author: '`r knitr::inline_expr('whoami::fullname()')`' params: date: !r Sys.Date() ``` Logical values in YAML are unusual: `true/false`, `yes/no`, and `on/off` are all equivalent to `TRUE/FALSE` in R. Any of these turn on the table of contents: ```yaml toc: true toc: yes toc: on ``` By default, **ymlthis** uses `true/false`. If you want to use any of these values literally (e.g. you want a string equal to `"yes"`), you need to wrap them in quotation marks: ```{r} yml_empty() %>% yml_params(x = "yes") %>% asis_yaml_output() ``` `NULL` can be specified using `null` or `~`. By default, **ymlthis** uses `null`. If you want to specify an empty vector, use `[]`, e.g. `category: []`. For an empty string, just use empty quotation marks (`""`). # Sources of YAML Where do the YAML fields you use in **R Markdown** come from? Many YAML fields that we use come from Pandoc from the **rmarkdown** package. These both use YAML to specify the build of the document and to pass information to be printed in a template. Pandoc templates can also be customized to add new YAML. The most common sources of YAML are: 1. Pandoc 1. **R Markdown** 1. Output functions (such as `rmarkdown::pdf_document()`) 1. Custom Pandoc templates 1. **R Markdown** extension packages (such as **blogdown**) 1. Hugo (in the case of **blogdown**) Because YAML is an extensible approach to metadata, and there is often no way to validate that your YAML is correct. YAML will often fail silently if you, for instance, make a typo in the field name or misspecify the nesting between fields. For more information on the fields available in R Markdown and friends, see the [YAML Fieldguide](yaml-fieldguide.html). ```{r, include=FALSE} options(oldoption) ```