OpenType MATH in HarfBuzz
TL;DR:

•
Work is in progress to add OpenType MATH support in HarfBuzz and will be instrumental for many math rendering engines relying on that library, including browsers.

•
For stretchy operators, an efficient way to determine the required number of glyphs and their overlaps has been implemented and is described here.
In the context of Igalia browser team effort to implement MathML support using TeX rules and OpenType features, I have started implementation of OpenType MATH support in HarfBuzz. This table from the OpenType standard is made of three subtables:

•
The MathConstants table, which contains layout constants. For example, the thickness of the fraction bar of $\frac{a}{b}$.

•
The MathGlyphInfo table, which contains glyph properties. For instance, the italic correction indicating how slanted an integral is e.g. to properly place the subscript in $\displaystyle\displaystyle\int_{D}$.

•
The MathVariants table, which provides larger size variants for a base glyph or data to build a glyph assembly. For example, either a larger parenthesis or a assembly of U+239B, U+239C, U+239D to write something like:
$\left(\frac{\frac{\frac{a}{b}}{\frac{c}{d}}}{\frac{\frac{e}{f}}{\frac{g}{h}}}\right.$
Code to parse this table was added to Gecko and WebKit two years ago. The existing code to build glyph assembly in these Web engines was adapted to use the MathVariants data instead of only private tables. However, as we will see below the MathVariants data to build glyph assembly is more general, with arbitrary number of glyphs or with additional constraints on glyph overlaps. Also there are various fallback mechanisms for old fonts and other bugs that I think we could get rid of when we move to OpenType MATH fonts only.
In order to add MathML support in Blink, it is very easy to import the OpenType MATH parsing code from WebKit. However, after discussions with some Google developers, it seems that the best option is to directly add support for this table in HarfBuzz. Since this library is used by Gecko, by WebKit (at least the GTK port) and by many other applications such as Servo, XeTeX or LibreOffice it make senses to share the implementation to improve math rendering everywhere.
The idea for HarfBuzz is to add an API to

1.
Expose data from the MathConstants and MathGlyphInfo.

2.
Shape stretchy operators to some target size with the help of the MathVariants.
It is then up to a higherlevel math rendering engine (e.g. TeX or MathML rendering engines) to beautifully display mathematical formulas using this API. The design choice for exposing MathConstants and MathGlyphInfo is almost obvious from the reading of the MATH table specification. The choice for the shaping API is a bit more complex and discussions is still in progress. For example because we want to accept stretching after glyphlevel mirroring (e.g. to draw RTL clockwise integrals) we should accept any glyph and not just an input Unicode strings as it is the case for other HarfBuzz shaping functions. This shaping also depends on a stretching direction (horizontal/vertical) or on a target size (and Gecko even currently has various ways to approximate that target size). Finally, we should also have a way to expose italic correction for a glyph assembly or to approximate preferred width for Web rendering engines.
As I mentioned at the beginning, the data and algorithm to build glyph assembly is the most complex part of the OpenType MATH and deserves a special interest. The idea is that you have a list of $n\geq 1$ glyphs available to build the assembly. For each $0\leq i\leq n1$, the glyph $g_{i}$ has advance $a_{i}$ in the stretch direction. Each $g_{i}$ has straight connector part at its start (of length $s_{i}$) and at its end (of length $e_{i}$) so that we can align the glyphs on the stretch axis and glue them together. Also, some of the glyphs are “extenders” which means that they can be repeated 0, 1 or more times to make the assembly as large as possible. Finally, the end/start connectors of consecutive glyphs must overlap by at least a fixed value $o_{\mathrm{min}}$ to avoid gaps at some resolutions but of course without exceeding the length of the corresponding connectors. This gives some flexibility to adjust the size of the assembly and get closer to the target size $t$.
To ensure that the width/height is distributed equally and the symmetry of the shape is preserved, the MATH table specification suggests the following iterative algorithm to determine the number of extenders and the connector overlaps to reach a minimal target size $t$:

1.
Assemble all parts by overlapping connectors by maximum amount, and removing all extenders. This gives the smallest possible result.

2.
Determine how much extra width/height can be distributed into all connections between neighboring parts. If that is enough to achieve the size goal, extend each connection equally by changing overlaps of connectors to finish the job.

3.
If all connections have been extended to minimum overlap and further growth is needed, add one of each extender, and repeat the process from the first step.
We note that at each step, each extender is repeated the same number of times $r\geq 0$. So if $I_{\mathrm{Ext}}$ (respectively $I_{\mathrm{NonExt}}$) is the set of indices $0\leq i\leq n1$ such that $g_{i}$ is an extender (respectively is not an extender) we have $r_{i}=r$ (respectively $r_{i}=1$). The size we can reach at step $r$ is at most the one obtained with the minimal connector overlap $o_{\mathrm{min}}$ that is
$\sum_{i=0}^{N1}\left(\sum_{j=1}^{r_{i}}{a_{i}o_{\mathrm{min}}}\right)+o_{ \mathrm{min}}=\left(\sum_{i\in I_{\mathrm{NonExt}}}{a_{i}o_{\mathrm{min}}} \right)+\left(\sum_{i\in I_{\mathrm{Ext}}}r{(a_{i}o_{\mathrm{min}})}\right)+o% _{\mathrm{min}}$ 
We let $N_{\mathrm{Ext}}={I_{\mathrm{Ext}}}$ and $N_{\mathrm{NonExt}}={I_{\mathrm{NonExt}}}$ be the number of extenders and nonextenders. We also let $S_{\mathrm{Ext}}=\sum_{i\in I_{\mathrm{Ext}}}a_{i}$ and $S_{\mathrm{NonExt}}=\sum_{i\in I_{\mathrm{NonExt}}}a_{i}$ be the sum of advances for extenders and nonextenders. If we want the advance of the glyph assembly to reach the minimal size $t$ then
${S_{\mathrm{NonExt}}o_{\mathrm{min}}\left(N_{\mathrm{NonExt}}1\right)}+{r% \left(S_{\mathrm{Ext}}o_{\mathrm{min}}N_{\mathrm{Ext}}\right)}\geq t$ 
We can assume $S_{\mathrm{Ext}}o_{\mathrm{min}}N_{\mathrm{Ext}}>0$ or otherwise we would have the extreme case where the overlap takes at least the full advance of each extender. Then we obtain
$r\geq r_{\mathrm{min}}=\max\left(0,\left\lceil\frac{t{S_{\mathrm{NonExt}}+o_{ \mathrm{min}}\left(N_{\mathrm{NonExt}}1\right)}}{S_{\mathrm{Ext}}o_{\mathrm{ min}}N_{\mathrm{Ext}}}\right\rceil\right)$ 
This provides a first simplification of the algorithm sketched in the MATH table specification: Directly start iteration at step $r_{\mathrm{min}}$. Note that at each step we start at possibly different maximum overlaps and decrease all of them by a same value. It is not clear what to do when one of the overlap reaches $o_{\mathrm{min}}$ while others can still be decreased. However, the sketched algorithm says all the connectors should reach minimum overlap before the next increment of $r$, which means the target size will indeed be reached at step $r_{\mathrm{min}}$.
One possible interpretation is to stop overlap decreasing for the adjacent connectors that reached minimum overlap and to continue uniform decreasing for the others until all the connectors reach minimum overlap. In that case we may lose equal distribution or symmetry. In practice, this should probably not matter much. So we propose instead the dual option which should behave more or less the same in most cases: Start with all overlaps set to $o_{\mathrm{min}}$ and increase them evenly to reach a same value $o$. By the same reasoning as above we want the inequality
${S_{\mathrm{NonExt}}o\left(N_{\mathrm{NonExt}}1\right)}+{r_{\mathrm{min}} \left(S_{\mathrm{Ext}}oN_{\mathrm{Ext}}\right)}\geq t$ 
which can be rewritten
$S_{\mathrm{NonExt}}+r_{\mathrm{min}}S_{\mathrm{Ext}}{o\left(N_{\mathrm{NonExt% }}+{r_{\mathrm{min}}N_{\mathrm{Ext}}}1\right)}\geq t$ 
We note that $N=N_{\mathrm{NonExt}}+{r_{\mathrm{min}}N_{\mathrm{Ext}}}$ is just the exact number of glyphs used in the assembly. If there is only a single glyph, then the overlap value is irrelevant so we can assume $N_{\mathrm{NonExt}}+{rN_{\mathrm{Ext}}}1=N1\geq 1$. This provides the greatest theorical value for the overlap $o$:
$o_{\mathrm{min}}\leq o\leq o_{\mathrm{max}}^{\mathrm{theorical}}=\frac{S_{ \mathrm{NonExt}}+r_{\mathrm{min}}S_{\mathrm{Ext}}t}{N_{\mathrm{NonExt}}+{r_{ \mathrm{min}}N_{\mathrm{Ext}}}1}$ 
Of course, we also have to take into account the limit imposed by the start and end connector lengths. So $o_{\mathrm{max}}$ must also be at most $\min{(e_{i},s_{i+1})}$ for $0\leq i\leq n2$. But if $r_{\mathrm{min}}\geq 2$ then extender copies are connected and so $o_{\mathrm{max}}$ must also be at most $\min{(e_{i},s_{i})}$ for $i\in I_{\mathrm{Ext}}$. To summarize, $o_{\mathrm{max}}$ is the minimum of $o_{\mathrm{max}}^{\mathrm{theorical}}$, of $e_{i}$ for $0\leq i\leq n2$, of $s_{i}$ $1\leq i\leq n1$ and possibly of $e_{0}$ (if $0\in I_{\mathrm{Ext}}$) and of of $s_{n1}$ (if ${n1}\in I_{\mathrm{Ext}}$).
With the algorithm described above $N_{\mathrm{Ext}}$, $N_{\mathrm{NonExt}}$, $S_{\mathrm{Ext}}$, $S_{\mathrm{NonExt}}$ and $r_{\mathrm{min}}$ and $o_{\mathrm{max}}$ can all be obtained using simple loops on the glyphs $g_{i}$ and so the complexity is $O(n)$. In practice $n$ is small: For existing fonts, assemblies are made of at most three nonextenders and two extenders that is $n\leq 5$ (incidentally, Gecko and WebKit do not currently support larger values of $n$). This means that all the operations described above can be considered to have constant complexity. This is much better than a naive implementation of the iterative algorithm sketched in the OpenType MATH table specification which seems to require at worst
$\sum_{r=0}^{r_{\mathrm{min}}1}{N_{\mathrm{NonExt}}+rN_{\mathrm{Ext}}}=N_{ \mathrm{NonExt}}r_{\mathrm{min}}+\frac{r_{\mathrm{min}}\left(r_{\mathrm{min}}% 1\right)}{2}N_{\mathrm{Ext}}={O(n\times r_{\mathrm{min}}^{2})}$ 
and at least $\Omega(r_{\mathrm{min}})$.
One of issue is that the number of extender repetitions $r_{\mathrm{min}}$ and the number of glyphs in the assembly $N$ can become arbitrary large since the target size $t$ can take large values e.g. if one writes \underbrace{\hspace{65535em}} in LaTeX. The improvement proposed here does not solve that issue since setting the coordinates of each glyph in the assembly and painting them require $\Theta(N)$ operations as well as (in the case of HarfBuzz) a glyph buffer of size $N$. However, such large stretchy operators do not happen in reallife mathematical formulas. Hence to avoid possible hangs in Web engines a solution is to impose a maximum limit $N_{\mathrm{max}}$ for the number of glyph in the assembly so that the complexity is limited by the size of the DOM tree. Currently, the proposal for HarfBuzz is $N_{\mathrm{max}}=128$. This means that if each assembly glyph is 1em large you won’t be able to draw stretchy operators of size more than 128em, which sounds a quite reasonable bound. With the above proposal, $r_{\mathrm{min}}$ and so $N$ can be determined very quickly and the cases $N\geq N_{\mathrm{max}}$ rejected, so that we avoid losing time with such edge cases…
Finally, because in our proposal we use the same overlap $o$ everywhere an alternative for HarfBuzz would be to set the output buffer size to $n$ (i.e. ignore $r1$ copies of each extender and only keep the first one). This will leave gaps that the client can fix by repeating extenders as long as $o$ is also provided. Then HarfBuzz math shaping can be done with a complexity in time and space of just $O(n)$ and it will be up to the client to optimize or limit the painting of extenders for large values of $N$…