# text-crossrefs: getting references to page and note numbers in Pandoc
This filter extends Pandoc's cross-referencing
capabilities by enabling automatic references to any piece of text
by either its page or, where applicable, its note number. It
currently supports the following target formats:
* context
* docx
* latex
* odt
* opendocument
It cannot refer to targets in other files: if you want to do this, use text-extrefs.
## Format-specific preliminary notices
### DOCX and ODT/Opendocument
When you first open a file produced by Pandoc with text-crossrefs, you will probably have to refresh the fields to get the correct values. In LibreOffice, press `F9`; in Word, a dialog box should appear when the file opens.
### TeX-based formats
All references are wrapped in a macro named `\tcrfenum`. It has two optional arguments: the first one is the reference type, the second specifies whether the prefix (e.g. “p. ”) should be printed or not (can be set to `withprefix`, `noprefix`, `yes` or `no`). The default values for these arguments should match those of `tcrf-default-reftype` and `tcrf-default-prefixref` (`page` and `yes`, i.e. `withprefix`, respectively). The mandatory argument of `\tcrfenum` is a group containing a list of groups. Each of these contains a reference (either a single reference or a range). Here are some valid invocations:
* `\tcrfenum[note][withprefix]{{lblone}{lbltwo}{lblthree}}`
* `\tcrfenum[page][noprefix]{{lblone}{lbltwo}{lblthree}}`
* `\tcrfenum[noprefix]{{lblone}{lbltwo}{lblthree}}` (the first argument defaults to `page`)
* `\tcrfenum{{lblone}{lbltwo}{lblthree}}` (the second argument defaults to `withprefix`)
* `\tcrfenum{{only-one}}` (even if the enumeration is limited to one item, it must be inside its own group)
* `\tcrfenum{{lblone to lbltwo}{lblthree}}` (the first reference points to a range)
It is up to you to define `\tcrfenum` in your preamble. If your target format is LaTeX, it should be possible to define it as a wrapper for the `\zcref` macro provided by [the zref-clever package](https://ctan.org/pkg/zref-clever). Alternatively, you can use [my implementation](TODO), which supports ConTeXt, LaTeX and other formats. Here are some hints about the implementation:
* [The `\tcrfenum` macro is supposed to output the numbers along with the prefixes and delimiters (e.g. “p. ” and “–”)]{#prefixes-tex};
* In ConTeXt, unlike in LaTeX, there is no way to retrieve the note number from a `\reference` or a `\pagereference` contained in the note. To work around this, footnotes are labelled automatically with the first identifier attached to a span in the note, suffixed with `_note`.
## Usage
### Basics
Mark the span of text you want to refer to later with an
identifier composed of alphanumeric characters, periods, colons, underscores and hyphens:
``` markdown
Émile Gaboriau published [_L'Affaire Lerouge_ in
1866]{#publication}.[^1]
[^1]: It is a very [fine piece of literature]{#my-evaluation}.
[It was very popular.]{#reception}
```
You can refer to it using another span with class `tcrf` containing
the target's identifier. If the targeted span is part of a
footnote, you can refer to it either by page or by note number according to
the value of the `reftype` attribute (defaults to `page`). For instance, this:
``` markdown
See [publication]{.tcrf} for the publication date. I gave my
opinion in [my-evaluation]{.tcrf reftype=note}, [my-evaluation]{.tcrf}.
```
will render in ConTeXt or LaTeX output:
``` tex
See \tcrfenum{{publication}} for the publication date. I gave my
opinion in \tcrfenum[note]{{my-evaluation}}, \tcrfenum{{my-evaluation}}.
```
If you want to give a reference by note and page number like in the example above, you can also use the following shorthand:
``` markdown
[my-evaluation]{.tcrf reftype=pagenote}
```
You can also refer to headers, using either explicit or automatically generated identifiers (see the Pandoc User's Guide).
To suppress the prefixes (e.g. “p. ”), you can set the `prefixref` attribute to `no` (defaults to `yes`). It can be useful, for instance, for small manually formatted indexes[^1]:
``` markdown
Gaboriau: [publication, my-evaluation, reception]{.tcrf prefixref=no}
```
[^1]: About the comma-delimited syntax used in this example, see [the section on enumerations below](#enums).
### Page ranges
You can refer to a page range like this:
``` markdown
If you want to know more about _L'Affaire Lerouge_, see [publication>reception]{.tcrf}.
```
The separator (here `>`) can be set to any string composed of characters other than alphanumeric, period, colon, underscore, hyphen and space.
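As a rough illustration of this rule, the following Python sketch checks whether a candidate range separator avoids the characters reserved for identifiers and the space. It is not the filter's own code (the filter is written in Lua); the function name `is_valid_range_separator` is hypothetical, and identifiers are assumed to be ASCII.

```python
import re

# Characters that may appear in an identifier, plus the space.
# Assumption: identifiers are ASCII-only.
_FORBIDDEN = re.compile(r"[A-Za-z0-9.:_\- ]")


def is_valid_range_separator(separator: str) -> bool:
    """Return True if `separator` is non-empty and contains no
    alphanumeric character, period, colon, underscore, hyphen or space."""
    return bool(separator) and _FORBIDDEN.search(separator) is None


print(is_valid_range_separator(">"))     # True
print(is_valid_range_separator("->"))    # False: contains a hyphen
print(is_valid_range_separator(" to "))  # False: contains spaces and letters
```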
In LaTeX and ConTeXt output, the above-mentioned `\tcrfenum` macro should be defined so that the range is printed as a simple page reference if the page numbers are identical. The syntax of a range is:
``` tex
\tcrfenum{{publication to reception}}
```
In DOCX and ODT/Opendocument output, the same result can be achieved in a word processor by automatically searching and replacing duplicates with regular expressions and/or macros.
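The collapsing rule itself is simple, whatever tool applies it. Here is a minimal Python sketch of the logic, given only as an illustration: the function name `format_page_range` is hypothetical, the prefixes mirror the `p. ` and `pp. ` defaults documented below, and the en dash used as range separator is an assumption.

```python
def format_page_range(first: int, last: int,
                      page_prefix: str = "p. ",
                      pages_prefix: str = "pp. ",
                      range_separator: str = "–") -> str:
    """Format a page range, collapsing it to a single page reference
    when both ends fall on the same page."""
    if first == last:
        # Both targets are on the same page: print a simple reference.
        return f"{page_prefix}{first}"
    return f"{pages_prefix}{first}{range_separator}{last}"


print(format_page_range(5, 5))  # p. 5
print(format_page_range(5, 7))  # pp. 5–7
```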
### Enumerations {#enums}
You can enumerate several references as a comma-delimited list, for instance:
``` markdown
[ref-one, ref-two>ref-three, ref-four]{.tcrf}
```
In DOCX and ODT/Opendocument output, all these references will be printed, potentially resulting in unnecessary repetitions.
In TeX-based output formats, they will be wrapped in `\tcrfenum` like this:
``` tex
\tcrfenum{{ref-one}{ref-two to ref-three}{ref-four}}
```
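To make the mapping from a reference span to its TeX form concrete, here is a small Python sketch of the transformation. It only illustrates the syntax described above and is not the filter's actual implementation: the function name `to_tcrfenum` and its parameters are hypothetical, the default separators are those documented under Customization, and the optional `reftype` and `prefixref` arguments are left out.

```python
def to_tcrfenum(span_text: str,
                enum_separator: str = ", ",
                range_separator: str = ">",
                range_delim: str = " to ") -> str:
    r"""Turn the content of a reference span into a \tcrfenum call."""
    groups = []
    for item in span_text.split(enum_separator):
        # A single reference yields one label, a range yields two.
        labels = [label.strip() for label in item.split(range_separator)]
        groups.append("{" + range_delim.join(labels) + "}")
    return r"\tcrfenum{" + "".join(groups) + "}"


print(to_tcrfenum("ref-one, ref-two>ref-three, ref-four"))
# \tcrfenum{{ref-one}{ref-two to ref-three}{ref-four}}
```

Note that even a single reference ends up inside its own group, as `\tcrfenum` requires.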
## Customization
### Common options
The following metadata fields can be set as strings:
* `tcrf-references-enum-separator`:
    * the string between two references in an enumeration in a reference span; can be composed of any characters not authorized in an identifier;
    * defaults to `, ` (with a space after the comma).
* `tcrf-references-range-separator`:
    * the string used to separate the two references of a range in a reference span; can be composed of any characters not authorized in an identifier;
    * defaults to `>`.
* `tcrf-only-explicit-labels`:
    * set it to `true` if you want text-crossrefs to handle only spans with class `label`;
    * defaults to `false`.
* `tcrf-default-prefixref`:
    * default value for the `prefixref` attribute;
    * defaults to `yes`.
* `tcrf-default-reftype`:
    * default value for the `reftype` attribute;
    * defaults to `page`.
* `tcrf-filelabel-ref-separator`:
    * only useful in conjunction with the text-extrefs filter;
    * separator between external files' labels and references;
    * defaults to `::`.
### Options specific to DOCX and ODT/Opendocument
Here are some metadata fields only useful in conjunction with `docx`, `odt` and `opendocument` formats (see [above](#prefixes-tex) why they are ignored with `context` and `latex`):
* `tcrf-page-prefix`:
    * “page” prefix;
    * defaults to `p. `.
* `tcrf-pages-prefix`:
    * “pages” prefix;
    * defaults to `pp. `.
* `tcrf-note-prefix`:
    * “note” prefix;
    * defaults to `n. `.
* `tcrf-notes-prefix`:
    * “notes” prefix;
    * defaults to `nn. `.
* `tcrf-pagenote-separator`:
    * the separator between the references when `reftype` is set to `pagenote`;
    * defaults to `, `.
* `tcrf-pagenote-at-end`:
    * the string printed at the end of a pagenote reference;
    * defaults to the empty string, can be used to achieve something like *n. 3 (p. 5)*.
* `tcrf-pagenote-factorize-first-prefix-in-enum`:
    * defines if the prefixes of the type printed first in a reference to page and note should be repeated (e.g. “p. 6, n. 1 and p. 9, n. 3”) or expressed globally at the beginning of the enumeration (e.g. “pp. 6, n. 1 and 9, n. 3”);
    * defaults to `no`, can be set to `yes`.
* `tcrf-pagenote-first-type`:
    * the information that is printed first in references to page and note;
    * defaults to `page`, can be set to `note`.
* `tcrf-range-separator`:
    * the string inserted between the page numbers in a range;
    * defaults to ``.
* `tcrf-references-enum-separator`:
    * the string used to separate the elements of an enumeration in a reference span;
    * defaults to a comma followed by a space.
* `tcrf-multiple-delimiter`:
    * the string inserted between two elements (except the last two) in an enumeration;
    * defaults to a comma followed by a space.
* `tcrf-multiple-before-last`:
    * the string inserted between the last two elements in an enumeration;
    * defaults to `and` surrounded with spaces.
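To show how these options combine, here is a short Python sketch that assembles the printed text of a `pagenote` reference and of an enumeration. It is an illustration under stated assumptions, not the filter's code: the function names `format_pagenote` and `join_enumeration` are hypothetical, and plain numbers stand in for the fields that the word processor resolves in real output.

```python
def format_pagenote(page, note, first_type="page", separator=", ",
                    at_end="", page_prefix="p. ", note_prefix="n. "):
    """Build a page-and-note reference from the pagenote options."""
    page_part = f"{page_prefix}{page}"
    note_part = f"{note_prefix}{note}"
    if first_type == "page":
        first, second = page_part, note_part
    else:
        first, second = note_part, page_part
    return first + separator + second + at_end


def join_enumeration(items, delimiter=", ", before_last=" and "):
    """Join elements with `delimiter`, except the last two,
    which are joined with `before_last`."""
    if len(items) <= 1:
        return "".join(items)
    return delimiter.join(items[:-1]) + before_last + items[-1]


print(format_pagenote(5, 3))
# p. 5, n. 3
print(format_pagenote(5, 3, first_type="note", separator=" (", at_end=")"))
# n. 3 (p. 5)
print(join_enumeration([format_pagenote(6, 1), format_pagenote(9, 3)]))
# p. 6, n. 1 and p. 9, n. 3
```

The second call corresponds to setting `tcrf-pagenote-first-type` to `note`, `tcrf-pagenote-separator` to ` (` and `tcrf-pagenote-at-end` to `)`.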
### Options specific to the formats based on TeX
Since TeX is extensible, you may wish to support types other than `page`, `note` and `pagenote` for ConTeXt and LaTeX output. `tcrf-additional-types` can be provided with a list of supplementary accepted types, e.g.:
``` yaml
tcrf-additional-types:
- line
- figure
```
In addition, the following metadata field can be used to control the rendering of ranges of labels in `\tcrfenum`:
* `tcrf-range-delim-tcrfenum`:
* the delimiter between the labels of a range in the list of references passed to `\tcrfenum`;
* defaults to ` to ` (mind the spaces).
## Compatibility with other filters
Text-crossrefs must be run after all other filters that can create, delete or move
footnotes, like citeproc.
In order to give an identifier to a note produced by a citation inside square brackets, the span should not include the citation key, the locator or the `;`
delimiter. If it is placed immediately after the locator, the locator should be surrounded by curly brackets. So this should work:
``` markdown
[@Jones1973, p. 5-70; @Doe2004[]{#jones-doe}]
[@Jones1973, p. 5-70; [it was elaborated upon]{#further-elaboration} by @Doe2004]
[@Jones1973, {p. 5-70}[]{#ref-to-jones}; @Doe2004]
```
but not this:
``` markdown
[[@Jones1973, p. 5-70]{#ref-to-jones}; @Doe2004]
[[@Jones1973, p. 5-70; @Doe2004]{#jones-doe}]
[@Jones1973, p. 5-70[]{#ref-to-jones}; @Doe2004]
```
You can add classes and attributes to your spans besides those defined by text-crossrefs (for instance `[some text]{#to-be-referred-to .highlighted color=red}` or `[reference]{.tcrf color=red}`). No span is removed.
Text-crossrefs is fully compatible with text-extrefs. Whenever possible, when a metadata field is not set for text-extrefs, its value is taken from its text-crossrefs equivalent, so that you don't need to duplicate similar variables.