text-crossrefs: getting references to page and note numbers in Pandoc
This filter aims to extend Pandoc's cross-referencing capabilities by enabling automatic references to any piece of text by either its page or, where applicable, its note number. It currently supports the following target formats:
- context
- docx
- latex
- odt
- opendocument
It cannot refer to targets located in other files: if you want to do this, use text-extrefs.
Format-specific preliminary notices
DOCX and ODT/Opendocument
The first time you open a file produced by Pandoc with text-crossrefs, you will probably have to refresh the fields in order to get the correct values. In LibreOffice, press F9; in Word, a dialog box should appear when the file opens.
TeX-based formats
All references are wrapped in a macro named \tcrfenum. It has two optional arguments: the first one is the reference type, the second specifies whether the prefix (e.g. “p. ”) should be printed or not (it can be set to withprefix, noprefix, yes or no). The default values for these arguments should match those of tcrf-default-reftype and tcrf-default-prefixref (respectively page and yes, i.e. withprefix). The mandatory argument of \tcrfenum is a group containing a list of groups, each of which contains a reference (either a single reference or a range). Here are some valid invocations:
- \tcrfenum[note][withprefix]{{lblone}{lbltwo}{lblthree}}
- \tcrfenum[page][noprefix]{{lblone}{lbltwo}{lblthree}}
- \tcrfenum[noprefix]{{lblone}{lbltwo}{lblthree}} (the first argument defaults to page)
- \tcrfenum{{lblone}{lbltwo}{lblthree}} (the second argument defaults to withprefix)
- \tcrfenum{{only-one}} (even if the enumeration is limited to one item, it must be inside its own group)
- \tcrfenum{{lblone to lbltwo}{lblthree}} (the first reference points to a range)
It is up to you to define \tcrfenum in your preamble. If your target format is LaTeX, it should be possible to define it as a wrapper for the \zcref macro provided by the zref-clever package. Alternatively, you can use my implementation, which supports ConTeXt, LaTeX and other formats. Here are some hints about the implementation:
- The \tcrfenum macro is supposed to output the numbers along with the prefixes and delimiters (e.g. “p. ” and “–”);
- In ConTeXt, there is no way to retrieve the note number from a \reference or a \pagereference contained in the note, as is customary in LaTeX. To work around this, footnotes are labelled automatically with the first identifier attached to a span in the note, suffixed with _note.
Usage
Basics
Mark the span of text you want to refer to later with an identifier composed of alphanumeric characters, periods, colons, underscores and hyphens:
Émile Gaboriau published [_L'Affaire Lerouge_ in
1866]{#publication}.[^1]
[^1]: It is a very [fine piece of literature]{#my-evaluation}.
[It was very popular.]{#reception}
You can refer to it using another span with class tcrf containing the target's identifier. If the targeted span is part of a footnote, you can refer to it either by page or by note number according to the value of the reftype attribute (defaults to page). For instance, this:
See [publication]{.tcrf} for the publication date. I gave my
opinion in [my-evaluation]{.tcrf reftype=note}, [my-evaluation]{.tcrf}.
will render in ConTeXt or LaTeX output:
See \tcrfenum{{publication}} for the publication date. I gave my
opinion in \tcrfenum[note]{{my-evaluation}}, \tcrfenum{{my-evaluation}}.
If you want to give a reference by both note and page number, as in the example above, you can also use the following shorthand:
[my-evaluation]{.tcrf reftype=pagenote}
You can refer to headers as well using either explicit or automatically generated identifiers (see Pandoc user’s guide).
To suppress the prefixes (e.g. “p. ”), you can set the prefixref attribute to no (defaults to yes). This can be useful, for instance, for small manually formatted indexes [1]:
Gaboriau: [publication, my-evaluation, reception]{.tcrf prefixref=no}
Page ranges
You can refer to a page range like this:
If you want to know more about _L'Affaire Lerouge_, see [publication>reception]{.tcrf}.
The separator (here >) can be set to any string composed of characters other than alphanumeric characters, periods, colons, underscores, hyphens and spaces.
In LaTeX and ConTeXt output, the above-mentioned \tcrfenum macro should be defined so that the range is printed as a simple page reference if the page numbers are identical. The syntax of a range is:
\tcrfenum{{publication to reception}}
In DOCX and ODT/Opendocument output, the same result can be achieved in a word processor by automatically searching and replacing duplicates with regular expressions and/or macros.
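The collapsing rule described above (a range whose two ends fall on the same page is printed as a single page reference) can be sketched in Python as follows. This is only an illustration of the expected behaviour, not code taken from the filter or from any \tcrfenum implementation; the function name format_page_range and its default prefix strings are assumptions made for the example.

```python
def format_page_range(first_page, last_page,
                      page_prefix="p. ", pages_prefix="pp. ",
                      range_sep="–", with_prefix=True):
    """Render a page range, collapsing it when both ends share a page.

    Hypothetical helper: the prefix strings and the separator are
    illustrative defaults, not values read from the filter.
    """
    if first_page == last_page:
        # Both labels are on the same page: print a simple page reference.
        prefix = page_prefix if with_prefix else ""
        return f"{prefix}{first_page}"
    # Distinct pages: print a real range with the plural prefix.
    prefix = pages_prefix if with_prefix else ""
    return f"{prefix}{first_page}{range_sep}{last_page}"


print(format_page_range(12, 12))
print(format_page_range(12, 15))
```

The same test on page numbers is what a search-and-replace or macro in a word processor would have to reproduce for DOCX and ODT output.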
Enumerations
You can enumerate several references as a comma-delimited list, for instance:
[ref-one, ref-two>ref-three, ref-four]{.tcrf}
In DOCX and ODT/Opendocument output, all these references will be printed, potentially resulting in unnecessary repetitions.
In TeX-based output formats, they will be wrapped in \tcrfenum like this:
\tcrfenum{{ref-one}{ref-two to ref-three}{ref-four}}
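To make the mapping from a reference span to a \tcrfenum call more concrete, here is a minimal Python sketch. It is not the filter's actual code: the function to_tcrfenum and its parameter names are hypothetical, and it assumes the default separators described in this document (a comma followed by a space for enumerations, > for ranges, and “ to ” between the labels of a range).

```python
def to_tcrfenum(span_text, reftype="page", prefixref="yes",
                enum_sep=", ", range_sep=">", range_delim=" to "):
    """Build a \\tcrfenum call from the content of a reference span.

    Hypothetical helper written for illustration only.
    """
    groups = []
    for item in span_text.split(enum_sep):
        # "ref-two>ref-three" becomes "ref-two to ref-three".
        ends = [label.strip() for label in item.split(range_sep)]
        groups.append("{" + range_delim.join(ends) + "}")
    options = ""
    if reftype != "page":
        # Only print the type when it differs from the default.
        options += "[" + reftype + "]"
    if prefixref == "no":
        options += "[noprefix]"
    return "\\tcrfenum" + options + "{" + "".join(groups) + "}"


print(to_tcrfenum("ref-one, ref-two>ref-three, ref-four"))
```

Every reference, even a single one, ends up in its own group, which matches the invocation syntax expected by \tcrfenum.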
Customization
Common options
The following metadata fields can be set as strings:
tcrf-references-enum-separator:
- the string between two references in an enumeration in a reference span; can be composed of any characters not authorized in an identifier;
- defaults to a comma followed by a space.
tcrf-references-range-separator:
- the string used to separate the two ends of a range in a reference span; can be composed of any characters not authorized in an identifier;
- defaults to >.
tcrf-only-explicit-labels:
- set it to true if you want tcrf to handle only spans with class label;
- defaults to false.
tcrf-default-prefixref:
- default value for the prefixref attribute;
- defaults to yes.
tcrf-default-reftype:
- default value for the reftype attribute;
- defaults to page.
tcrf-filelabel-ref-separator:
- only useful in conjunction with the text-extrefs filter;
- separator between external files' labels and references;
- defaults to ::.
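The constraint that enumeration and range separators must not contain characters authorized in an identifier can be checked with a small Python sketch like the one below. The helper names are hypothetical, and the sketch assumes that “alphanumeric” means ASCII letters and digits; the filter itself may be more permissive.

```python
import re

# Characters allowed in an identifier according to this document:
# alphanumeric characters, periods, colons, underscores and hyphens.
# Assumption: "alphanumeric" is restricted to ASCII here.
IDENTIFIER_CHARS = r"[A-Za-z0-9.:_-]"
IDENTIFIER_RE = re.compile(IDENTIFIER_CHARS + "+")


def is_valid_identifier(label):
    """Return True if the label only uses identifier characters."""
    return IDENTIFIER_RE.fullmatch(label) is not None


def is_valid_separator(separator):
    """Return True if the separator is non-empty and shares no
    character with the identifier alphabet."""
    return bool(separator) and re.search(IDENTIFIER_CHARS, separator) is None


print(is_valid_identifier("my-evaluation"))
print(is_valid_separator(", "))
print(is_valid_separator("to"))
```

A separator that borrowed identifier characters (such as the word to) could not be told apart from the labels it is supposed to delimit, which is why the two alphabets are kept disjoint.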
Options specific to DOCX and ODT/Opendocument
Here are some metadata fields only useful in conjunction with the docx, odt and opendocument formats (see the hints about \tcrfenum above for why they are ignored with context and latex):
tcrf-page-prefix:
- “page” prefix;
- defaults to “p.”.
tcrf-pages-prefix:
- “pages” prefix;
- defaults to “pp.”.
tcrf-note-prefix:
- “note” prefix;
- defaults to “n.”.
tcrf-notes-prefix:
- “notes” prefix;
- defaults to “nn.”.
tcrf-pagenote-separator:
- the separator between the references when reftype is set to pagenote;
- defaults to “,”.
tcrf-pagenote-at-end:
- the string printed at the end of a pagenote reference;
- defaults to the empty string, can be used to achieve something like n. 3 (p. 5).
tcrf-pagenote-factorize-first-prefix-in-enum:
- defines whether the prefix of the type printed first in references to page and note should be repeated (e.g. “p. 6, n. 1 and p. 9, n. 3”) or stated once at the beginning of the enumeration (e.g. “pp. 6, n. 1 and 9, n. 3”);
- defaults to no, can be set to yes.
tcrf-pagenote-first-type:
- the information that is printed first in references to page and note;
- defaults to page, can be set to note.
tcrf-range-separator:
- the string inserted between the page numbers in a range;
- defaults to “–”.
tcrf-references-enum-separator:
- the string used to separate the elements of an enumeration in a reference span;
- defaults to a comma followed by a space.
tcrf-multiple-delimiter:
- the string inserted between two elements of an enumeration (except the last two);
- defaults to a comma followed by a space.
tcrf-multiple-before-last:
- the string inserted between the last two elements of an enumeration;
- defaults to “and” surrounded by spaces.
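The interplay between some of these options can be illustrated with a short Python sketch. It is not the filter's code: join_enum and format_pagenote are hypothetical helpers, and the spacing after the prefixes and after the pagenote separator is an assumption chosen to reproduce the examples quoted in the list above.

```python
def join_enum(items, delimiter=", ", before_last=" and "):
    """Join already formatted references, mimicking the roles of
    tcrf-multiple-delimiter and tcrf-multiple-before-last."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)
    return delimiter.join(items[:-1]) + before_last + items[-1]


def format_pagenote(page, note, first_type="page", separator=", ",
                    at_end="", page_prefix="p. ", note_prefix="n. "):
    """Format a reference to both page and note, mimicking the roles of
    tcrf-pagenote-first-type, tcrf-pagenote-separator and
    tcrf-pagenote-at-end (spacing is an assumption)."""
    page_part = f"{page_prefix}{page}"
    note_part = f"{note_prefix}{note}"
    if first_type == "page":
        first, second = page_part, note_part
    else:
        first, second = note_part, page_part
    return first + separator + second + at_end


print(join_enum([format_pagenote(6, 1), format_pagenote(9, 3)]))
print(format_pagenote(5, 3, first_type="note", separator=" (", at_end=")"))
```

The second call shows how a separator of “ (” combined with an end string of “)” and note as the first type would yield the n. 3 (p. 5) form mentioned for tcrf-pagenote-at-end.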
Options specific to the formats based on TeX
Since TeX is extensible, you may wish to support types other than page, note and pagenote for ConTeXt and LaTeX output. tcrf-additional-types can be provided with a list of supplementary accepted types, e.g.:
tcrf-additional-types:
- line
- figure
In addition, the following metadata field can be used to control the rendering of ranges of labels in \tcrfenum:
tcrf-range-delim-tcrfenum:
- the delimiter between the labels of a range in the list of references passed to \tcrfenum;
- defaults to “ to ” (mind the spaces).
Compatibility with other filters
Text-crossrefs must be run after all other filters that can create, delete or move footnotes, like citeproc.
In order to give an identifier to a note produced by a citation inside square brackets, the span should not include the citation key, the locator or the ; delimiter. If it is placed immediately after the locator, the locator should be surrounded by curly brackets. So this should work:
[@Jones1973, p. 5-70; @Doe2004[]{#jones-doe}]
[@Jones1973, p. 5-70; [it was elaborated upon]{#further-elaboration} by @Doe2004]
[@Jones1973, {p. 5-70}[]{#ref-to-jones}; @Doe2004]
but not this:
[[@Jones1973, p. 5-70]{#ref-to-jones}; @Doe2004]
[[@Jones1973, p. 5-70; @Doe2004]{#jones-doe}]
[@Jones1973, p. 5-70[]{#ref-to-jones}; @Doe2004]
You can add classes and attributes other than those defined by text-crossrefs to your spans (for instance [some text]{#to-be-referred-to .highlighted color=red} or [reference]{.tcrf color=red}). No span is removed.
Text-crossrefs is fully compatible with text-extrefs. Whenever possible, when a metadata field is not set for text-extrefs, its value is taken from its text-crossrefs equivalent, so that you don't need to duplicate similar variables.
[1] About the comma-delimited syntax used in this example, see the section on enumerations below.