Mathematics (MathML)

htmlbag has a native MathML reader. <math> elements embedded directly in the HTML source are picked up as foreign content, parsed, and typeset by the OpenType-MATH engine as boxes and glue. No Pandoc, no JavaScript, no external converter: all you need is <math>…</math> in your markup and a math font.

<p>The Pythagorean relation
<math>
  <msup><mi>a</mi><mn>2</mn></msup><mo>+</mo>
  <msup><mi>b</mi><mn>2</mn></msup><mo>=</mo>
  <msup><mi>c</mi><mn>2</mn></msup>
</math>
holds for every right triangle.</p>

The formula flows inline with the surrounding text; the line-break and spacing passes treat it as one opaque box.

A math font is required

The OpenType-MATH engine works on glyphs from a font that carries an OpenType MATH table (Latin Modern Math, STIX Two Math, XITS Math, Cambria Math, …). Point the <math> element at such a font with a font-family rule — the engine reads the font off the element’s own resolved style:

<style>
  @font-face {
    font-family: "Latin Modern Math";
    src: url("latinmodern-math.otf");
  }
  math { font-family: "Latin Modern Math"; font-size: 12pt; }
</style>

A font without a MATH table is rejected: the engine needs the table’s constants (axis height, fraction rule thickness, script shifts, …) and the variant/assembly data for stretchy delimiters.

Inline and display mode

The root element’s display attribute selects the layout style, exactly as in a browser:

<!-- inline: tighter script shifts, smaller operators -->
<math><mfrac><mn>1</mn><mn>2</mn></mfrac></math>

<!-- display: larger operators, limits above/below big operators -->
<math display="block"><munderover>
  <mo>&#x2211;</mo>
  <mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow>
  <mi>n</mi>
</munderover></math>

In display mode a big operator like ∑ places its bounds above and below (limits); in inline mode they become a subscript/superscript pair.
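For comparison, here is the same sum written without display="block" and placed inside a paragraph. It is a sketch based on the behaviour described above, not verified output: the bounds should attach to the ∑ as a subscript/superscript pair instead of sitting above and below it. The summand <msub><mi>a</mi><mi>k</mi></msub> and the surrounding sentence are added purely for illustration.

```html
<!-- inline: no display attribute, so the bounds are set as scripts -->
<p>The partial sum
<math>
  <munderover>
    <mo>&#x2211;</mo>
    <mrow><mi>k</mi><mo>=</mo><mn>0</mn></mrow>
    <mi>n</mi>
  </munderover>
  <msub><mi>a</mi><mi>k</mi></msub>
</math>
appears in the running text.</p>
```

Adding display="block" to the <math> element is the only change needed to switch this formula back to limits.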

Supported elements

Category                  Elements
Containers (transparent)  <math>, <mrow>, <mstyle>, <mpadded>, <mphantom>, <semantics>
Token leaves              <mi>, <mn>, <mo>
Fractions / radicals      <mfrac> (linethickness="0" → binomial), <msqrt>, <mroot>
Scripts                   <msup>, <msub>, <msubsup>
Limits / accents          <munder>, <mover>, <munderover> (accent="true" → accent placement)
Annotations               <semantics> is transparent; <annotation> / <annotation-xml> are skipped
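
A few of the listed constructs, combined into one sketch. The markup uses only elements and attributes named in the table; the formulas themselves are illustrative, and the behaviour noted in the comments is inferred from the table, not from tested output.

```html
<!-- binomial coefficient: linethickness="0" drops the fraction rule -->
<math display="block">
  <mrow>
    <mo>(</mo>
    <mfrac linethickness="0"><mi>n</mi><mi>k</mi></mfrac>
    <mo>)</mo>
  </mrow>
</math>

<!-- radicals: <msqrt> wraps the radicand, <mroot> takes base then index -->
<math>
  <msqrt><mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow></msqrt>
  <mo>+</mo>
  <mroot><mi>x</mi><mn>3</mn></mroot>
</math>

<!-- accent="true" requests accent placement for the circumflex -->
<math><mover accent="true"><mi>x</mi><mo>&#x5E;</mo></mover></math>

<!-- <semantics> is transparent; the <annotation> child is skipped -->
<math>
  <semantics>
    <msup><mi>x</mi><mn>2</mn></msup>
    <annotation encoding="application/x-tex">x^2</annotation>
  </semantics>
</math>
```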

Unknown elements are treated transparently — their children are processed in place rather than raising an error — so a document with a stray unsupported element still renders the parts the engine understands.
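
As an illustration of that fallback, take <mfenced>, a deprecated MathML 3 element that is not in the list of supported elements. Assuming htmlbag treats it as unknown (this is an assumption, not documented behaviour), the first formula below would typeset only the children, with no parentheses and no separating comma. The second formula spells the fences out with <mo> and does not depend on the fallback.

```html
<!-- assumption: <mfenced> is unknown to the reader, so only its
     children are typeset: no parentheses, no separator -->
<math><mi>f</mi><mfenced><mi>x</mi><mi>y</mi></mfenced></math>

<!-- explicit form: fences and separator written as <mo> tokens -->
<math>
  <mi>f</mi>
  <mrow>
    <mo>(</mo><mi>x</mi><mo>,</mo><mi>y</mi><mo>)</mo>
  </mrow>
</math>
```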

mathvariant defaults

The reader follows the MathML convention for identifier styling:

  • <mi>x</mi> (single character) defaults to italic, and is auto-mapped to the Mathematical Italic alphabet (U+1D44E ff), so the font renders the expected 𝑥 rather than the upright ASCII glyph.
  • <mi>sin</mi> (multiple characters) defaults to upright — the standard convention for function names.
  • <mi mathvariant="normal">x</mi> forces upright on a single character.
  • Special case: <mi>h</mi> maps to U+210E (Planck constant ℎ) because the Mathematical Italic h slot (U+1D455) is reserved in Unicode.
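
The four rules side by side, as a small sketch. The formulas are illustrative; the code points in the comments follow from the Mathematical Italic block starting at U+1D44E for a.

```html
<math>
  <!-- single character: italic by default, mapped to U+1D465 -->
  <mi>x</mi>
  <mo>=</mo>
  <!-- multiple characters: upright function name -->
  <mi>sin</mi>
  <mi>t</mi>
</math>

<math>
  <!-- mathvariant="normal" keeps the single-character d upright -->
  <mi mathvariant="normal">d</mi><mi>x</mi>
</math>

<math>
  <!-- h is the special case: mapped to U+210E, not U+1D455 -->
  <mi>E</mi><mo>=</mo><mi>h</mi><mi>f</mi>
</math>
```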

Accessibility

Under PDF/UA each formula is tagged as a Formula structure element with plain-text /Alt fallback, and under PDF/UA-2 the original MathML is embedded as an associated file so assistive technology can read the math semantically. See PDF/UA tagging — Mathematics.

Scope

The reader covers presentation MathML for the common cases — fractions, radicals, scripts, limits, accents, and the operator dictionary for spacing classes. Not yet implemented:

  • <mtable>, <mtr>, <mtd> (matrices and aligned equations)
  • <mtext>, <mspace> (text runs, explicit spacing)
  • <menclose> (boxed/struck expressions)
  • mathvariant values beyond italic / normal (bold, double-struck, fraktur, sans-serif, …)
  • Greek-letter italic remapping (only ASCII a–z and A–Z are remapped)
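
One possible workaround for the last item, offered as an untested assumption: because only ASCII letters are remapped, a Greek identifier can be written with its Mathematical Italic code point directly, for example U+1D6FC for italic α. This presumes that the reader passes non-ASCII characters in <mi> through to the font unchanged and that the math font covers the Mathematical Alphanumeric Symbols block.

```html
<!-- plain Greek alpha: expected to stay upright, since it is not remapped -->
<math><mi>&#x3B1;</mi></math>

<!-- assumption: a pre-styled code point is passed through as-is;
     U+1D6FC is MATHEMATICAL ITALIC SMALL ALPHA -->
<math><mi>&#x1D6FC;</mi></math>
```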

TeX-style math input (writing $x^2$ and getting the same render) is a separate, deferred project; the MathML path is the supported entry point today.