
text-crossrefs: getting references to page and note numbers in Pandoc

This filter aims to extend Pandoc's cross-referencing capabilities by enabling automatic references to any piece of text by its page number or, where applicable, its note number. It currently supports the following target formats:

  • context
  • docx
  • latex
  • odt
  • opendocument

It cannot refer to targets in other files; for that, use text-extrefs.

Format-specific preliminary notices

DOCX and ODT/Opendocument

The first time you open a file produced by Pandoc with text-crossrefs, you will probably have to refresh the fields to get the correct values. In LibreOffice, press F9; in Word, a dialog box should appear when the file opens.

TeX-based formats

All references are wrapped in a macro named \tcrfenum. It has two optional arguments: the first is the reference type; the second specifies whether the prefix (e.g. “p. ”) should be printed (it can be set to withprefix, noprefix, yes or no). The default values for these arguments should match those of tcrf-default-reftype and tcrf-default-prefixref (respectively page and yes, i.e. withprefix). The mandatory argument of \tcrfenum is a group containing a list of groups, each of which contains one reference (either a single label or a range). Here are some valid invocations:

  • \tcrfenum[note][withprefix]{{lblone}{lbltwo}{lblthree}}
  • \tcrfenum[page][noprefix]{{lblone}{lbltwo}{lblthree}}
  • \tcrfenum[noprefix]{{lblone}{lbltwo}{lblthree}} (the first argument defaults to page)
  • \tcrfenum{{lblone}{lbltwo}{lblthree}} (the second argument defaults to withprefix)
  • \tcrfenum{{only-one}} (even if the enumeration is limited to one item, it must be inside its own group)
  • \tcrfenum{{lblone to lbltwo}{lblthree}} (the first reference points to a range)

It is up to you to define \tcrfenum in your preamble. If your target format is LaTeX, it should be possible to define it as a wrapper for the \zcref macro provided by the zref-clever package. Alternatively, you can use my implementation, which supports ConTeXt, LaTeX and other formats. Here are some hints about the implementation:

  • [The \tcrfenum macro is supposed to output the numbers along with the prefixes and delimiters (e.g. “p. ” and “–”)]{#prefixes-tex};
  • In ConTeXt, there is no way to retrieve the note number from a \reference or a \pagereference contained in the note, as is customary in LaTeX. To work around this, footnotes are labelled automatically with the first identifier attached to a span in the note, prefixed with note:. Contrary to the usual ConTeXt syntax, this label is placed after the footnote content, which implies redefining the \footnote macro. If your template includes the header-includes metadata variable, as the default template does, this redefinition happens automatically. Otherwise, you can copy and paste the following code into your preamble:
\catcode`\@=11
\let\origfootnote\footnote
\def\footnote#1#2{%
  \def\tcrf@secondArg{#2}%
  \ifx\tcrf@secondArg\tcrf@bracket
    \def\tcrf@todo{\tcrf@footnote@withlabel{#1}#2}%
  \else
    \def\tcrf@todo{\origfootnote{#1}#2}%
  \fi
  \tcrf@todo
}
\def\tcrf@bracket{[}
\def\tcrf@footnote@withlabel#1[#2]{\origfootnote[#2]{#1}}
\catcode`\@=13

Usage

Basics

Mark the span of text you want to refer to later with an identifier composed of alphanumeric characters, periods, colons, underscores and hyphens:

Émile Gaboriau published [_L'Affaire Lerouge_ in
1866]{#publication}.[^1]

[^1]: It is a very [fine piece of literature]{#my-evaluation}.

[It was very popular.]{#reception}

You can refer to it using another span with the class tcrf containing the target's identifier. If the targeted span is part of a footnote, you can refer to it either by page or by note number, according to the value of the reftype attribute (defaults to page). For instance, this:

See [publication]{.tcrf} for the publication date. I gave my
opinion in [my-evaluation]{.tcrf reftype=note}, [my-evaluation]{.tcrf}.

will render in ConTeXt or LaTeX output:

See \tcrfenum{{publication}} for the publication date. I gave my
opinion in \tcrfenum[note]{{my-evaluation}},
\tcrfenum{{my-evaluation}}.

If you want to give a reference by note and page number like in the example above, you can also use the following shorthand:

[my-evaluation]{.tcrf reftype=pagenote}

You can refer to headers as well, using either explicit or automatically generated identifiers (see the Pandoc User's Guide).

To suppress the prefixes (e.g. “p. ”), you can set the prefixref attribute to no (defaults to yes). This can be useful, for instance, for small manually formatted indexes [1]:

Gaboriau: [publication, my-evaluation, reception]{.tcrf prefixref=no}

Page ranges

You can refer to a page range like this:

If you want to know more about _L'Affaire Lerouge_, see [publication>reception]{.tcrf}.

The separator (here >) can be set to any string that contains no alphanumeric characters, periods, colons, underscores, hyphens or spaces.

In LaTeX and ConTeXt output, the above-mentioned \tcrfenum macro should be defined so that the range is printed as a simple page reference if the page numbers are identical. The syntax of a range is:

\tcrfenum{{publication to reception}}

In DOCX and ODT/Opendocument output, the same result can be achieved in a word processor by automatically searching and replacing duplicates with regular expressions and/or macros.

Enumerations

You can enumerate several references as a comma-delimited list, for instance:

[ref-one, ref-two>ref-three, ref-four]{.tcrf}

In DOCX and ODT/Opendocument output, all these references will be printed, potentially resulting in unnecessary repetitions. In TeX-based output formats, they will be wrapped in \tcrfenum like this:

\tcrfenum{{ref-one}{ref-two to ref-three}{ref-four}}

Customization

Common options

The following metadata fields can be set as strings:

  • tcrf-references-enum-separator:
    • the string between two references in an enumeration in a reference span; can be composed of any characters not authorized in an identifier;
    • defaults to , (with a space after the comma).
  • tcrf-references-range-separator:
    • the string used to separate the two ends of a range in a reference span; can be composed of any characters not authorized in an identifier;
    • defaults to >.
  • tcrf-only-explicit-labels:
    • set it to true if you want text-crossrefs to handle only spans with the class label;
    • defaults to false.
  • tcrf-default-prefixref:
    • default value for the prefixref attribute;
    • defaults to yes.
  • tcrf-default-reftype:
    • default value for the reftype attribute;
    • defaults to page.
  • tcrf-filelabel-ref-separator:
    • only useful in conjunction with the text-extrefs filter;
    • separator between external files' labels and references;
    • defaults to ::.
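
As an illustration, some of the common options could be set in the document's YAML metadata block, as sketched below. The field names are the ones listed above; the values are arbitrary choices for the example, and the quoting is an assumption meant to keep each value a plain string.

```yaml
---
# Illustrative values only; every field is optional.
tcrf-default-reftype: note             # refer by note number unless reftype says otherwise
tcrf-default-prefixref: "no"           # hypothetical choice: omit prefixes such as "p. " by default
tcrf-references-range-separator: "=>"  # written between the two ends of a range
---
```

With these settings, a range would be written [publication=>reception]{.tcrf}, and [my-evaluation]{.tcrf} would be expected to give the note number rather than the page. For TeX-based output, the defaults of your own \tcrfenum definition would need to be changed to match.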

Options specific to DOCX and ODT/Opendocument

Here are some metadata fields only useful in conjunction with docx, odt and opendocument formats (see above why they are ignored with context and latex):

  • tcrf-page-prefix:
    • “page” prefix;
    • defaults to “p. ”.
  • tcrf-pages-prefix:
    • “pages” prefix;
    • defaults to “pp. ”.
  • tcrf-note-prefix:
    • “note” prefix;
    • defaults to “n. ”.
  • tcrf-notes-prefix:
    • “notes” prefix;
    • defaults to “nn. ”.
  • tcrf-pagenote-separator:
    • the separator between the references when reftype is set to pagenote;
    • defaults to “, ”.
  • tcrf-pagenote-at-end:
    • the string printed at the end of a pagenote reference;
    • defaults to the empty string, can be used to achieve something like n. 3 (p. 5).
  • tcrf-pagenote-factorize-first-prefix-in-enum:
    • defines if the prefixes of the type printed first in a reference to page and note should be repeated (e.g. “p. 6, n. 1 and p. 9, n. 3”) or expressed globally at the beginning of the enumeration (e.g. “pp. 6, n. 1 and 9, n. 3”);
    • defaults to no, can be set to yes.
  • tcrf-pagenote-first-type:
    • the information that is printed first in references to page and note;
    • defaults to page, can be set to note.
  • tcrf-range-separator:
    • the string inserted between the page numbers in a range;
    • defaults to .
  • tcrf-references-enum-separator:
    • the string used to separate the elements of an enumeration in a reference span;
    • defaults to a comma followed by a space.
  • tcrf-multiple-delimiter:
    • the string inserted between consecutive elements of an enumeration, except the last two;
    • defaults to a comma followed by a space.
  • tcrf-multiple-before-last:
    • the string inserted between the last two elements of an enumeration;
    • defaults to “and” surrounded by spaces.
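
The description of tcrf-pagenote-at-end mentions output such as n. 3 (p. 5). One combination of the fields above that might produce it is sketched here. It is untested and assumes that the leading space of the quoted separator survives Pandoc's metadata parsing.

```yaml
---
# Sketch for docx, odt and opendocument output: note first, page in parentheses.
tcrf-pagenote-first-type: note
tcrf-pagenote-separator: " ("   # assumption: the leading space is preserved
tcrf-pagenote-at-end: ")"
---
```

With the default prefixes, [my-evaluation]{.tcrf reftype=pagenote} pointing to note 3 on page 5 would then be expected to read n. 3 (p. 5).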

Options specific to the formats based on TeX

Since TeX is extensible, you may wish to support types other than page, note and pagenote for ConTeXt and LaTeX output. tcrf-additional-types can be provided with a list of supplementary accepted types, e.g.:

tcrf-additional-types:
- line
- figure

In addition, the following metadata field can be used to control the rendering of ranges of labels in \tcrfenum:

  • tcrf-range-delim-tcrfenum:
    • the delimiter between the labels of a range in the list of references passed to \tcrfenum;
    • defaults to “ to ” (mind the spaces).
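
The two TeX-specific fields could be combined as in the following sketch. The delimiter >> is a hypothetical value chosen for the example, not one recommended by the filter.

```yaml
---
tcrf-additional-types:
- line                           # extra value accepted for the reftype attribute
tcrf-range-delim-tcrfenum: ">>"  # hypothetical delimiter replacing the default
---
```

Under these assumptions, [publication>reception]{.tcrf reftype=line} would presumably be emitted as \tcrfenum[line]{{publication>>reception}}, so a hand-written \tcrfenum would have to support the line type and split ranges on the same delimiter.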

Compatibility with other filters

Text-crossrefs must be run after all other filters that can create, delete or move footnotes, like citeproc.

To give an identifier to a note produced by a citation inside square brackets, the span should not include the citation key, the locator or the ; delimiter. If the span is placed immediately after the locator, the locator should be surrounded by curly brackets. So these should work:

[@Jones1973, p. 5-70; @Doe2004[]{#jones-doe}]

[@Jones1973, p. 5-70; [it was elaborated upon]{#further-elaboration} by @Doe2004]

[@Jones1973, {p. 5-70}[]{#ref-to-jones}; @Doe2004]

but not these:

[[@Jones1973, p. 5-70]{#ref-to-jones}; @Doe2004]

[[@Jones1973, p. 5-70; @Doe2004]{#jones-doe}]

[@Jones1973, p. 5-70[]{#ref-to-jones}; @Doe2004]

Your spans can carry classes and attributes other than those defined by text-crossrefs (for instance [some text]{#to-be-referred-to .highlighted color=red} or [reference]{.tcrf color=red}). No span is removed.

Text-crossrefs is fully compatible with text-extrefs. Whenever possible, when a metadata field is not set for text-extrefs, its value is taken from its text-crossrefs equivalent, so that you don't need to duplicate similar variables.


  1. About the comma-delimited syntax used in this example, see the section on enumerations.