To summarize the situation briefly, if we follow Intel’s approach then the
currently upstreamed Wayland code only
compiles for ChromeOS (i.e. target_os="chromeos" in your build config),
not for a standard Linux
build of Chrome. You can only have accelerated rendering at the price of
removing separation of UI/GPU processes (i.e. via the --in-process-gpu
switch). If you keep the default behavior of having the UI and GPU components
running in separate processes then accelerated rendering fails. A fallback
to software rendering is then attempted, but it fails in turn
in gpu_process_transport_factory.cc.
When Antonio joined Igalia, he
did some experiments on Ozone/Wayland and wrote a quick workaround
to make the software rendering fallback work when UI and GPU are in separate
processes. He was also able to run a standard Linux build of Chrome on Wayland
by upstreaming more code from Intel. He also noticed that in Google’s code,
only GL drawing is happening in the GPU component (i.e. Wayland objects are
owned by the UI, contrary to Intel’s approach) which is the reason why
accelerated rendering fails in separate-process mode.
After discussion with Google developers, however, it became clear that
they would prefer a different approach
that is consistent with two features they are working on:
Mojo, an IPC system, and Mus-ash, a project to separate
the window management and shell functionality of ash from the chrome browser
process.
During the project, we were able to get chrome running on Wayland using
the Mus code path, either on ChromeOS or on a standard Linux build.
For that code path, the GPU and UI components are
currently running in the same process so, as discussed above, accelerated rendering works for Wayland. However, the plan for mus is to move these two components into separate
processes and hence we need to adapt the Wayland code in
order to allow communication between the GPU (doing GL drawings) and the UI
(owning Wayland objects).
I’ll let Antonio describe more precisely on
his blog the work we have been
doing to get chrome --mash running on Wayland.
In this blog post, I’m aligning with Google’s goal and hence I’m focusing on
this Mus code path. I’m going to give a quick overview of the structure of
Ozone and more specifically what is used by the Wayland platform to perform
accelerated rendering.
Ozone Architecture
Ozone Platform
OzonePlatform
is the main Ozone interface used to instantiate and initialize an Ozone
platform (X11, Wayland, DRM/GBM…). It provides factory getters for helper
classes (SurfaceFactoryOzone, PlatformWindow…).
The goal is to have OzonePlatform::InitializeForUI and
OzonePlatform::InitializeForGPU called from different processes as well as two
different Mojo connectors for the UI and GPU services respectively.
However, at the time of writing, the two components are only in
different threads in the same process. There is also only one Mojo connector
available: the one for the service:ui service, passed to
OzonePlatform::InitializeForUI.
Two important implementations to consider in this project are OzonePlatformWayland (the one we work on) and OzonePlatformGbm (the one that seems the best maintained and tested upstream).
AcceleratedWidget is just a platform-specific object representing a surface on which compositors can paint pixels.
PlatformWindow represents a single window in the underlying platform windowing system with the usual properties: it can be minimized, maximized, closed, put in fullscreen, etc.
PlatformWindowDelegate. This is just a delegate to which the PlatformWindow sends events like e.g. OnBoundsChanged.
The implementations of PlatformWindow used for the Wayland and DRM/GBM platforms are respectively WaylandWindow and DrmWindow. At the moment, several features in WaylandWindow are not fully implemented but the minimal code to paint content into the window is present.
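The relationship between a window and its delegate can be sketched in a few lines of C++. This is only an illustration: the `Bounds` struct and the `Toy*` and `RecordingDelegate` classes are hypothetical simplifications introduced here, and the real Chromium interfaces have many more members and different signatures.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for a rectangle type.
struct Bounds {
  int x, y, width, height;
};

// Simplified delegate: the window reports events to it.
class ToyWindowDelegate {
 public:
  virtual ~ToyWindowDelegate() = default;
  virtual void OnBoundsChanged(const Bounds& new_bounds) = 0;
};

// Simplified window: stores its bounds and notifies the delegate on change.
class ToyPlatformWindow {
 public:
  explicit ToyPlatformWindow(ToyWindowDelegate* delegate)
      : delegate_(delegate), bounds_{0, 0, 0, 0} {}

  void SetBounds(const Bounds& bounds) {
    bounds_ = bounds;
    delegate_->OnBoundsChanged(bounds_);
  }

  const Bounds& bounds() const { return bounds_; }

 private:
  ToyWindowDelegate* delegate_;  // Not owned.
  Bounds bounds_;
};

// Example delegate that records every bounds change it receives.
class RecordingDelegate : public ToyWindowDelegate {
 public:
  void OnBoundsChanged(const Bounds& new_bounds) override {
    history.push_back(new_bounds);
  }
  std::vector<Bounds> history;
};
```

Calling `SetBounds` on a `ToyPlatformWindow` constructed with a `RecordingDelegate` appends one entry to the delegate's `history`, which is the kind of notification flow the real classes rely on.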
Accelerated drawing (GL path). This is done via a GLOzone instance returned by SurfaceFactoryOzone::GetGLOzone
Software Drawing (Skia). This is done via a SurfaceOzoneCanvas instance returned by SurfaceFactoryOzone::CreateCanvasForWidget.
In this project, we focus on accelerated rendering and on EGL, so we consider the GLOzoneEGL class. The following pure virtual functions must be implemented in derived classes:
GLOzoneEGL::LoadGLES2Bindings, performing the GL initialization. In general, the default works well.
GLOzoneEGL::CreateOffscreenGLSurface returning an offscreen GLSurface of the specified dimension. It seems that it is mostly needed to create a dummy zero-dimensional offscreen surface during initialization. Hence the generic SurfacelessEGL should work fine.
See Issue 2387063002.
GLOzoneEGL::GetNativeDisplay returns the EGL display connection to use.
GLOzoneEGL::CreateViewGLSurface returns a new GLSurface associated to a gfx::AcceleratedWidget.
SurfaceFactoryOzone also provides functions to create NativePixmap, which represents a buffer that can be directly imported via GL for rendering, or exported via dma-buf fds. The DRM/GBM platform implements it and uses GbmPixmap. For now, such pixmap objects are not needed by the Wayland platform so the SurfaceFactoryOzone function members are not implemented.
The instances of GLSurface returned by CreateViewGLSurface for the Wayland and DRM/GBM platforms are respectively GLSurfaceWayland and GbmSurface. GLSurfaceWayland is just a gl::NativeViewGLSurfaceEGL associated to a window created by wl_egl_window_create. GbmSurface instead provides surface-like
semantics by deriving from GbmSurfaceless, itself deriving from gl::SurfacelessEGL. Internally, a framebuffer is bound automatically for GL drawing in the GPU and the results are exported to the UI via pixmaps.
Wayland Platform
Wayland Connection
WaylandConnection is a class specific to the Wayland platform that helps to instantiate all the objects necessary to communicate with the Wayland display server.
It also manages a map from gfx::AcceleratedWidget instances to WaylandWindow instances that you can modify with the public GetWindow, AddWindow and RemoveWindow function members.
Wayland Platform Initialization
OzonePlatformWayland::InitializeUI is called from the UI thread. It creates
instances of WaylandConnection and WaylandSurfaceFactory as members of
OzonePlatformWayland. The Mojo connection of the service:ui service is
discarded.
OzonePlatformWayland::InitializeGPU is called in the same process but from a different
thread. The WaylandConnection and WaylandSurfaceFactory members previously
created are hence accessible from the GPU thread too. Nothing is done by
this initialization function.
Wayland Window
A WaylandWindow is tied to a WaylandConnection and takes care of registering and
unregistering itself to that WaylandConnection. It is merely a wrapper to
native Wayland surfaces (wl_surface and xdg_surface) with additional window
bounds. It also maintains communication with the PlatformWindowDelegate.
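The window map of the WaylandConnection and the way a window registers and unregisters itself can be sketched as follows. This is a minimal illustration under stated assumptions: the `Toy*` names are hypothetical, an integer stands in for gfx::AcceleratedWidget, and registration is done in the constructor and destructor for brevity, which may not match where the real classes do it.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for gfx::AcceleratedWidget.
using ToyAcceleratedWidget = std::uint32_t;

class ToyWaylandWindow;

// Simplified connection: only the widget-to-window map is modelled.
class ToyWaylandConnection {
 public:
  // Returns the window registered for |widget|, or nullptr if there is none.
  ToyWaylandWindow* GetWindow(ToyAcceleratedWidget widget) const {
    auto it = window_map_.find(widget);
    return it == window_map_.end() ? nullptr : it->second;
  }
  void AddWindow(ToyAcceleratedWidget widget, ToyWaylandWindow* window) {
    window_map_[widget] = window;
  }
  void RemoveWindow(ToyAcceleratedWidget widget) { window_map_.erase(widget); }

 private:
  std::map<ToyAcceleratedWidget, ToyWaylandWindow*> window_map_;  // Not owned.
};

// Simplified window: registers itself on creation, unregisters on destruction.
class ToyWaylandWindow {
 public:
  ToyWaylandWindow(ToyWaylandConnection* connection,
                   ToyAcceleratedWidget widget)
      : connection_(connection), widget_(widget) {
    connection_->AddWindow(widget_, this);
  }
  ~ToyWaylandWindow() { connection_->RemoveWindow(widget_); }

  // The connection stores a raw pointer to this object, so forbid copies.
  ToyWaylandWindow(const ToyWaylandWindow&) = delete;
  ToyWaylandWindow& operator=(const ToyWaylandWindow&) = delete;

  ToyAcceleratedWidget widget() const { return widget_; }

 private:
  ToyWaylandConnection* connection_;  // Not owned.
  ToyAcceleratedWidget widget_;
};
```

With this shape, code on the GL side only needs a widget handle and a call to `GetWindow` to find the window to draw on, which is exactly the dependency on the connection that the next section discusses.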
Wayland Surface and OpenGL
Currently, the constructor of
WaylandSurfaceFactory receives the
WaylandConnection. Because we want the UI process to hold the WaylandConnection
and the GPU process to use WaylandSurfaceFactory for the GL rendering, the
current setup will not work after mus is split. Instead, we should pass it the
Mojo connector of the service:gpu service in order to indirectly communicate with
the service:ui service (for testing purposes, we can for now just use the Mojo
connector of the service:ui service passed to OzonePlatformWayland::InitializeUI).
The WaylandSurfaceFactory::CreateCanvasForWidget function and the WaylandCanvasSurface instance created require the WaylandConnection. However, they are used for software rendering so we can ignore them for now.
GLOzoneEGL::LoadGLES2Bindings and GLOzoneEGL::CreateOffscreenGLSurface do not seem fundamental here and they do not need any Wayland-specific code. Hence we can probably just keep the current default implementations.
At the moment GLOzoneEGLWayland::GetNativeDisplay just returns a native display provided by the WaylandConnection. It seems that we will not be able to do so when the WaylandConnection is no longer available on the GPU side. Probably we should just do like GLOzoneEGLGbm and return EGL_DEFAULT_DISPLAY.
The remaining GLOzoneEGLWayland::CreateViewGLSurface uses WaylandConnection::GetWindow to retrieve the WaylandWindow associated to AcceleratedWidget to draw on.
This window provides a wl_surface and sizes that can be passed to wl_egl_window_create to create an egl_window. Then as said in the previous paragraph, a GLSurfaceWayland instance is created which is merely a gl::NativeViewGLSurfaceEGL associated to the egl_window.
Conclusion
In this blog post, an overview of the Ozone Architecture was provided, with
focus on the Wayland platform. It also contains
an analysis of how we could get accelerated rendering in a separate GPU process
aligned with Google’s goals (Mus+ash and Mojo) and hence have it
well-integrated with upstream code.
The main problem is that the GLOzoneEGLWayland code
is very tied to the Wayland native objects (connection, surface, window…).
We should instead only provide a Mojo connector to the GLOzoneEGLWayland
class and hence to the GLSurfaceWayland class it constructs.
Instances of WaylandWindow and WaylandConnection will only live on the UI side
while the GL classes will live on the GPU side.
The GLSurfaceWayland should be rewritten to derive from SurfacelessEGL instead
of NativeViewGLSurfaceEGL. We would then follow what is done in GbmSurface
to provide some surface-like semantics but without the need to have a real
egl_window object. Under the hood, the GL drawings will be performed on
a framebuffer.
We should then find a way to export those framebuffers to the UI component via
the Mojo connector. Again, following the DRM/GBM code with these NativePixmap
objects seems the right option. In the end, the UI would convert the buffers
into wl_buffers that can finally be attached to the wl_surface of the WaylandWindow.
During the past two months we have been maintaining the upstream Ozone code and
made the Wayland platform work again. Try bots now build and run tests for the
Wayland platform and we expect that such continuous testing will make the whole
thing more robust. We have also been working on making Linux desktop work
with the Mus+ash code path and started
encouraging experiments of Mojo communication between the UI and GPU
components of the Wayland platform.
It is really great to work with Antonio on this project and we are looking
forward to continuing the collaboration on this with Google and Ozone
developers. Last but not least, I would like to
thank Renesas for supporting Igalia in this work to add Wayland support to chromium.
Last week I travelled to
Galicia for one of the
regular gatherings organized by Igalia. It was a great
pleasure
to meet again all the Igalians and friends. Moreover, this time was a bit
special since we celebrated our 15th anniversary :-)
I also attended the third edition of the
Web Engines Hackfest,
sponsored by Igalia,
Collabora and Mozilla.
This year, we had various participants from the Web Platform including folks from
Apple, Collabora, Google, Igalia, Huawei, Mozilla or Red Hat.
For my first hackfest as an Igalian, I invited some experts on fonts &
math rendering to collaborate
on OpenType MATH support in HarfBuzz and its use in math rendering engines.
In this blog post, I am going to focus on the work I did
with Behdad Esfahbod and
Khaled Hosny. I think it was again a great and
productive hackfest and I am looking forward to attending the next edition!
Behdad gave a talk with a nice overview of the work accomplished in
HarfBuzz during ten years.
One thing appearing recently in HarfBuzz is the need for APIs to parse
OpenType tables on all platforms. As part of my job at Igalia, I had started
to experiment with adding support for the
MATH table
some months ago and it was
nice to have Behdad finally available to review, fix and improve commits.
When I talked to Mozilla employee
Karl Tomlinson, it became apparent that
the simple shaping API for stretchy operators proposed
in my blog post
would not cover all the special cases currently implemented in Gecko. Moreover,
this shaping API is also very similar to another one existing in HarfBuzz for
non-math scripts, so we would have to decide the best way to share the logic.
As a consequence, we decided for now to focus on providing an API to access all
the data of the MATH table. After the Web Engines Hackfest,
such a math API is now integrated into the development
repository of HarfBuzz and will be available in version 1.3.3 :-)
MathML in Web Rendering Engines
Currently, several math rendering engines have their own code to parse the data
of the OpenType MATH table. But many of them actually use HarfBuzz for normal
text shaping and hence could just rely on the new math API for the math rendering
too.
Before the hackfest, Khaled already had tested my work-in-progress branch with
libmathview and I had done the
same for Igalia’s Chromium MathML branch.
MathML test for OpenType MATH Fraction parameters in Gecko, Blink and WebKit.
Once the new API landed into HarfBuzz, Khaled was also able to use it for the
XeTeX typesetting system.
I also started to experiment with this
for Gecko and
WebKit.
This seems to work pretty well and we get consistent results for
Gecko, Blink and WebKit! Some random thoughts:
The math data is exposed through a hb_font_t which contains the text size.
This means that the various values are now directly resolved and returned as a
fixed-point number
which should allow us to avoid rounding errors we may currently have in
Gecko or WebKit when multiplying by float factors.
HarfBuzz has some magic to automatically handle invalid offsets and sizes
that greatly simplifies the code, compared to what exists in Gecko and WebKit.
Contrary to Gecko’s implementation, HarfBuzz does not cache the latest
result for glyph-specific data. Maybe we want to keep that?
The WebKit changes were tested on the GTK port, where HarfBuzz is enabled.
Other ports may still need to use the existing parsing code from the
WebKit tree. Perhaps
Apple should consider adding support for the OpenType MATH table to
CoreText?
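To illustrate the remark above about fixed-point values, here is a small sketch that converts a value expressed in font design units into a scaled fixed-point value using only integer arithmetic. The function name `ScaleDesignUnits` and its parameters are assumptions made for this example; this is not HarfBuzz's actual code, only the general technique.

```cpp
#include <cstdint>

// Scales |value| (in font design units) by |scale| / |units_per_em| using
// integer arithmetic only, rounding halves away from zero.
// |scale| is the font size in whatever fixed-point unit the caller has chosen.
std::int32_t ScaleDesignUnits(std::int16_t value,
                              std::int32_t scale,
                              std::uint16_t units_per_em) {
  std::int64_t scaled = static_cast<std::int64_t>(value) * scale;
  const std::int64_t half = units_per_em / 2;
  // Integer division truncates toward zero, so bias by half a unit in the
  // direction of the sign before dividing.
  scaled += (scaled >= 0) ? half : -half;
  return static_cast<std::int32_t>(scaled / units_per_em);
}
```

For instance, assuming a font with 1000 units per em and a 16px size expressed in 26.6 fixed point (16 × 64 = 1024), a constant of 250 design units maps to 256, i.e. exactly 4px, without any intermediate float multiplication.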
Gecko, WebKit and Chromium bundle their own copy of the Brotli, WOFF2
or OTS libraries in their source repositories. However:
We have to use more or less automated mechanisms to keep these bundled copies
up-to-date. This is especially annoying for Brotli and WOFF2 since they are
still in development and we must be sure to always integrate the latest security
fixes. Also, we get compiler warnings or coding style errors that do not exist
upstream and that must be disabled or patched until they are fixed upstream
and imported again.
This is obviously not an optimal sharing of system libraries and may increase
the size of binaries.
Using shared libraries is what maintainers of Linux (or other
FLOSS systems) generally ask and this was raised during the WebKitGTK+
session. Similarly, we should really use the system Brotli/WOFF2 bundled in
future releases of Apple’s operating systems.
There are several issues that make it hard for package maintainers to
provide these libraries: no released binaries or release tags, no proper build
system to generate shared libraries, use of git submodules to include the source code of one library
into another, etc. Things have
gotten a bit better for Brotli and I was able to
tweak the CMake script to produce shared libraries. For WOFF2,
issue 40 and
issue 49 have been inactive but
hopefully these will be addressed in the future…
In a previous blog post, I explained the work made by Igalia’s
web platform team to refactor WebKit’s
MathML layout classes. I stated that although some rendering improvements were
a nice side effect, the main goal of the first phase was really to clean the code up so
that it is easier for developers to work on MathML in the future. Indeed this
really made things easier to review: Quite unexpectedly to me,
the second phase
only took 4 days to be upstreamed… Kudos to
Brent Fulgham for having reviewed so many
patches in such a short period of time!
In this blog post, I am going to give an overview of the improvements made
during these two first phases taking changeset r203109 as a reference. The changes will be available in WebKitGTK+ 2.14 in September and are likely to be included this month in the next Safari Technology Preview.
There definitely remains more work to do such as
the third phase or other rendering improvements, but I believe we have already made a big
step forward!
Mathematical Fonts
Two years ago, basic support for operator stretching via the OpenType MATH
table was added to WebKit. During the refactoring, we improved that support
and also made use of more parameters to improve the math layout (see section
about OpenType MATH parameters below). While
Latin Modern Math will be used in most screenshots, the following one shows that you can
use any math fonts. By default
WebKit will try and use one of these fonts but if none are available or if you
force a non-math font-family then the rendering quality may not be good.
The following screenshot gives the rendering for various fonts.
For the last one
we used the value sans-serif to illustrate issues with non-math fonts
(displaystyle integral too small, mathvariant italic glyphs taken from another
font, missing italic correction etc).
This new feature is obvious: You can now create a hyperlink for any part
of a mathematical formula! Here is a screenshot of the MathML Torture Test 21
with additional links, as displayed in WebKit r203109. Click the image to load
the test case in your browser and test it.
The mathvariant Attribute
Unicode contains Mathematical Alphanumeric Symbols to convey special meaning such as double-struck or specific Arabic styles. Characters for these symbols are generally provided by
math fonts.
In MathML, mathematical variables are automatically rendered using the
italic characters from this Unicode block. One can also access these characters
via the mathvariant attribute and that attribute is actually used by many
LaTeX-to-MathML converters.
In the following screenshot, you can see that the letters f, x and y are now
drawn with these special mathematical italic glyphs and that WebKit uses the
conventional fraktur style for the Lie algebra g. Note that the prime is still too
small because WebKit does not make use of the ssty feature yet.
Homomorphism of Lie algebra
Top: Safari 9.1.1. Bottom: Safari r203109.
Operators and Spacing
As said in my previous blog post, the rendering of large and stretchy operators
has been largely rewritten and as a consequence the rendering has improved.
Also, I mentioned that the width of operators may depend on their height. This may cause accumulated approximations
during the computation of preferred widths. The old flexbox-based implementation
incorrectly forced layout during preferred width computation to avoid that but a quick
workaround for that security concern caused the approximate
preferred widths to be used for the logical widths. With our new implementation,
the logical width is now correctly calculated.
Finally, we added partial support for the mpadded element
which is often used to tweak spacing in mathematical formulas.
The screenshot below illustrates the fix for a
serious regression with large operators (summation symbol) as well as improvements in the rendering of
stretchy operators (horizontal braces). Note that the formula has a hack with
a zero-width mpadded element which used to cause improper spacing
(large gap between the group of a’s and the group of b’s).
Tests 21 and 22 from the MathML torture test
Left: Safari 9.1.1. Right: Safari r203109.
The following screenshot shows how incorrect width computations used to cause
excessive spacing after the stretchy radical and slash symbols:
Mathematical formulas can be integrated inside a paragraph of text (inline math
in TeX terminology) or displayed in their own horizontally centered paragraph
(display math in TeX terminology). In the latter case, the formula is in
displaystyle and does not have any restrictions on vertical spacing.
In the former case, the layout of the mathematical formula is modified a bit to
optimize this vertical spacing and to better integrate within the surrounding text.
The displaystyle property can also be set using the corresponding attribute or
can change automatically in subformulas (e.g. in fractions).
In the following screenshot the fix for the large operator regression is
obvious but you can also notice that the summation is now slightly different
for the definition of a Bézier curve (top) and for the one of
a rational Bézier curve
(bottom). For example, to save some vertical space in the fractions, the
sigma symbol is drawn smaller and the scripts attached to it are moved on
its right. However, the script size could still be improved when we implement
the scriptlevel property.
Use of the AxisHeight parameter to set vertical position of fractions,
tables and symmetric operators.
Use of layout constants for radicals (some of them were already used),
scripts and fractions. This improves math spacing and positioning and allows
us to adjust them according to the value of the displaystyle property discussed
in the previous section.
Use of the italic correction of large operator glyph to set the position of
subscripts.
The screenshots below illustrate some of these improvements. For the first one,
the use of AxisHeight better aligns the fraction bar with the plus
sign. For the second one, the use of layout constants for scripts as well
as the italic correction of the integral improves the placement of limits.
One of the advantages of the old flexbox-based implementation is that
right-to-left layout was available for free. This support has of course been
preserved in the new implementation. We also added a simple workaround to mirror
radicals using a scale transform
as shown in the screenshot below. However, general
glyph-level mirroring is still
missing.
Igalia’s web platform team has been able to follow the
MathML in HTML5 Implementation Note in order to significantly improve the rendering of mathematical
expressions in WebKit. More work remains to be done but
we will definitely appreciate any feedback that can
help improve native MathML support in web engines.
We are also excited to continue work and collaboration at the next
Web Engines Hackfest!
If you follow WebKit developments, you are certainly aware that
Igalia has been working on WebKit’s
MathML implementation for some time. More recently, effort has been made to
write a clean implementation
addressing issues reported by WebKit reviewers in the past.
After joining Igalia in March, I have been in charge of
getting this work reviewed and merged into WebKit’s development branch.
In the past four months, we have been successful in upstreaming the
first phase
of the refactoring and the work accomplished is described in this blog post.
Note that the focus was on code refactoring so the improvement may not be obvious
for non-developers. Nevertheless many issues have already been fixed as a
consequence of that work: math italic, position of scripts,
stretchy and large operators, rendering update and more.
More importantly, this preliminary step opens the way for
beautiful math rendering based on TeX rules and the OpenType MATH table. Rendering improvements and
implementation of new features have already started in the
next phase
of the refactoring, so stay tuned…
Design Issues
As explained in a previous report, the main design issues in the
flexbox-based implementation released in 2013
can essentially be summarized in three points:
WebKit’s code to stretch operators was not efficient at all and was limited to some basic fences buildable via Unicode characters.
WebKit’s MathML code violated many layout invariants, making the code unreliable.
WebKit’s MathML code relied heavily on the C++ renderer classes for flexboxes and had to manage too many anonymous renderers.
For the first point, the performance issue had been fixed by Igalia developers
right after the initial feedback from WebKit developers and
we improved that again
during our refactoring.
Partial support for the OpenType MATH table was added during
my crowdfunding project and allowed
to stretch more operators with the help of math fonts. For the second point,
the main issue was also
fixed right after the initial feedback. However one could still have some
doubts regarding the layout steps, given the complexity implied by the third
point. That last issue was still problematic so far and addressing it was the
main achievement of our refactoring.
Technically, the dependence on flexbox is unnecessary and the implementation
actually only used a limited set of flexbox features. Thus executing the whole
flexbox code was overkill. It can also be a burden for
people working on other places of the layout code. For example
Javi Fernández has worked on improving the
box alignments in the past
and he had a hard time fixing the MathML code impacted by his changes. This is probably the cause of the
bad position of the summation symbol
that can be seen in the screenshot above.
From the layout perspective, most of the rendering logic was
implemented in the flexbox classes and the MathML “renderer” classes were really
just managing the creation and update of anonymous nodes and style.
Although this sounds like good code
reuse, it actually made it impossible to understand how and when layout steps
happen or to add more advanced features.
The new implementation replaced this manipulation of the render tree with
simple arithmetic calculations on box metrics which is safer and more reliable.
Also, complex renderers such as
RenderMathMLScripts or RenderMathMLRoot
actually achieve better rendering quality with less code!
As an example of the complexity, RenderMathMLUnderOver can behave as a
RenderMathMLScripts
in some situations so we really want the former class to reuse the
latter class. As we will see below, the old implementations of the two renderers
were quite different: RenderMathMLUnderOver only relied on setting column
direction in the user agent stylesheet while RenderMathMLScripts created
a complex render tree structure with anonymous style. Hence it seemed difficult
to share code between the two cases or to handle DOM changes causing a move from one
case to the other one. With our new implementation, this is reduced to
simple C++ inheritance.
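The idea of one renderer falling back to the behavior of the other through inheritance can be sketched as follows. The `ToyScripts` and `ToyUnderOver` classes and the `behaves_as_scripts` flag are hypothetical names for this illustration; the real WebKit classes decide this from the element and its style, and their layout functions do actual box computations rather than returning a label.

```cpp
#include <string>

// Simplified stand-in for the scripts renderer.
class ToyScripts {
 public:
  virtual ~ToyScripts() = default;
  // Returns a label naming the layout that was applied (illustration only).
  virtual std::string Layout() const { return "scripts layout"; }
};

// Simplified stand-in for the under/over renderer: it inherits from the
// scripts renderer and reuses its layout when it must behave like one.
class ToyUnderOver : public ToyScripts {
 public:
  explicit ToyUnderOver(bool behaves_as_scripts)
      : behaves_as_scripts_(behaves_as_scripts) {}

  std::string Layout() const override {
    if (behaves_as_scripts_)
      return ToyScripts::Layout();  // Reuse the parent class implementation.
    return "under/over layout";
  }

 private:
  bool behaves_as_scripts_;
};
```

Switching from one case to the other then only means taking a different branch during layout, with no render tree surgery.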
When I started to work on WebKit some years ago,
I made the mistake of continuing with
the existing approach. The implementation of multiscripts or automatic italic mathvariant added more anonymous objects
and made the situation even worse. After the end of my crowdfunding project,
Alex Castro
did more cleanup and tried to implement important features such as
displaystyle
but he also soon realized that it was too hard to work with the current code
base…
Layout Refactoring
In order to solve the issues discussed in the previous section,
Javi and Alex worked on a new MathML branch where the first step was to remove
the inheritance on the flexbox layout classes.
During the Web Engines Hackfest 2015, I collaborated with Igalia’s
web platform team
to continue the work on this branch. In a second step, we rewrote many
MathML renderer functions so that they stop creating anonymous nodes or style.
We obtained very encouraging results: The implementation looked much
simpler and much more understandable!
Alex announced the initial plan on the webkit-dev mailing list. He started opening bugs
and attaching patches to merge the first step.
However, that step still required much of the flexbox logic and so made the code
hard to understand for reviewers. Hence when I joined Igalia four months ago,
Alex asked me to try and see how to reorganize patches so that the two initial
steps can be submitted in one go.
This corresponds to the first phase mentioned in the introduction. As indicated
on the wiki page,
the layout refactoring consisted in rewriting the following member functions
of each renderer class:
computePreferredLogicalWidths: calculate preferred widths, based on the
preferred widths of child renderers.
layoutBlock: set final position and size of child renderers.
firstLineBaseLine: calculate the ascent of the renderer.
paint (optional): perform special painting such as fraction bars.
Refactored renderers no longer rely on any flexbox code nor anonymous
renderers and the functions mentioned above essentially perform arithmetic
computations. By reading the code, one can be sure that we follow
standard layout rules and that we do not perform unnecessary reflow.
Moreover, the rules specific to math rendering are only located in the MathML renderers and can be
better understood. Details for each class are provided in the next subsections.
After all the layout functions were rewritten and the code managing the
render tree structure removed, we were able to make the
RenderMathMLBlock class
inherit from RenderBlock instead of RenderFlexibleBox.
Many of the bugs could
then be immediately closed or otherwise fixed with small follow-up patches.
Spacing
RenderMathMLSpace is a simple class inserting blank boxes for adjusting
spacing of math formulas.
Obviously, we do not need any of the complexity of flexbox so it was
straightforward to write the layout functions.
$$3\phantom{\rule{3em}{0ex}}x$$
Large space between 3 and x.
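As a rough idea of how little is needed for such a blank box, here is a sketch of the three layout functions for a space renderer. The `ToySpaceRenderer` class is hypothetical and works with plain integer pixel values taken from the width, height and depth of the space; the real RenderMathMLSpace uses WebKit's layout types and resolves lengths from attributes.

```cpp
// Hypothetical, simplified space renderer: a blank box described by its
// width, its height above the baseline and its depth below the baseline.
class ToySpaceRenderer {
 public:
  ToySpaceRenderer(int width, int height, int depth)
      : width_(width), height_(height), depth_(depth) {}

  // The preferred width of a blank box is simply its width.
  int ComputePreferredLogicalWidth() const { return width_; }

  // Layout sets the final size: the box extends above and below the baseline.
  void LayoutBlock() {
    logical_width_ = width_;
    logical_height_ = height_ + depth_;
  }

  // The baseline sits |height_| below the top edge of the box.
  int FirstLineBaseline() const { return height_; }

  int logical_width() const { return logical_width_; }
  int logical_height() const { return logical_height_; }

 private:
  int width_;
  int height_;
  int depth_;
  int logical_width_ = 0;
  int logical_height_ = 0;
};
```

For the example above, assuming 1em is 16px, a space of width 3em with zero height and depth would be `ToySpaceRenderer(48, 0, 0)`.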
Grouping
RenderMathMLRow performs rendering of a row of math items.
Since WebKit does not support linebreaking in MathML at the
moment, this is just putting child boxes on a same baseline.
One specificity is that some operators can be stretched vertically and so
their width may depend on their height.
$$\{\frac{2}{x}{x}^{3}$$
Row containing a stretched brace, a fraction and a scripted element.
Again, flexbox features are useless here. With the old code, it was
not clear whether we were violating the CSS invariant with preferred and
logical widths and which kind of relayout or render tree changes would happen
when doing the stretch call. By properly implementing the layout functions
previously mentioned, all of this became much more trustworthy.
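Putting child boxes on the same baseline boils down to a few additions and maximums. The sketch below shows this arithmetic with hypothetical `ChildMetrics`, `ChildPosition` and `RowLayout` types; it deliberately ignores the vertical stretching of operators, which the real RenderMathMLRow also has to handle.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical metrics of a child box, in pixels.
struct ChildMetrics {
  int width;
  int ascent;   // Distance from the top edge to the baseline.
  int descent;  // Distance from the baseline to the bottom edge.
};

// Position of the top-left corner of a child inside the row.
struct ChildPosition {
  int x;
  int y;
};

struct RowLayout {
  int width;
  int ascent;
  int descent;
  std::vector<ChildPosition> positions;
};

// Lays out children left to right so that all baselines coincide.
RowLayout LayoutRow(const std::vector<ChildMetrics>& children) {
  RowLayout row{0, 0, 0, {}};
  // The row is as tall as its tallest child above and below the baseline.
  for (const ChildMetrics& child : children) {
    row.ascent = std::max(row.ascent, child.ascent);
    row.descent = std::max(row.descent, child.descent);
  }
  // Shift each child down so that its baseline matches the row baseline.
  for (const ChildMetrics& child : children) {
    row.positions.push_back(
        ChildPosition{row.width, row.ascent - child.ascent});
    row.width += child.width;
  }
  return row;
}
```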
Fractions
RenderMathMLFraction draws a fraction with numerator and denominator.
$$\frac{x+1}{y+2}$$
Simple fraction.
This used to be implemented using a column direction for the fraction element.
Numerator and denominator were wrapped into anonymous nodes with additional
style to leave space for the fraction bar and to adjust the horizontal
alignments.
It was relatively easy to implement this without any anonymous nodes and again
the use of flexbox did not sound justified.
For example, to calculate the preferred width we just take the maximum
of the preferred widths of the numerator and denominator.
For the layout, the calculation of the logical width is similar and we
calculate the horizontal coordinates of numerator and denominator so that
their centers are aligned. Vertical metrics are similarly calculated
from the vertical metrics of the numerator and denominator.
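The horizontal part of this computation can be written down directly. The following sketch uses a hypothetical `ToyFractionLayout` struct and plain integer widths; it only covers the width and the centering described above, not the vertical metrics or the fraction bar.

```cpp
#include <algorithm>

// Hypothetical result of the horizontal layout of a fraction.
struct ToyFractionLayout {
  int width;          // Width of the fraction box.
  int numerator_x;    // Horizontal offset of the numerator.
  int denominator_x;  // Horizontal offset of the denominator.
};

// The fraction is as wide as its widest child, and each child is offset so
// that its center coincides with the center of the fraction box.
ToyFractionLayout LayoutFractionHorizontally(int numerator_width,
                                             int denominator_width) {
  const int width = std::max(numerator_width, denominator_width);
  return ToyFractionLayout{width,
                           (width - numerator_width) / 2,
                           (width - denominator_width) / 2};
}
```

For a numerator of width 30 and a denominator of width 50, the fraction is 50 wide, the numerator is shifted by 10 and the denominator stays at 0.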
During that step, we also fixed some bugs with the linethickness attribute and
added support for some OpenType MATH table constants.
Scripts above and below
RenderMathMLUnderOver is used to attach some scripts above and below a base.
Each child can itself be a horizontal stretchy operator.
$$\overrightarrow{\text{base}}$$
Base with stretchy arrow over it.
This was implemented in the user agent stylesheet by using
flexboxes with column direction for the corresponding MathML elements and
the C++ class had
additional rules to fire the stretching. So the problems and solutions for
this class were essentially a mix of the cases of
RenderMathMLFraction and RenderMathMLRow we just discussed.
Subscripts and Superscripts
RenderMathMLScripts is used for a base with some arbitrary number of scripts.
All the scripts can have different positions (pre, post, sub, super) and
metrics (width, ascent and descent). We must avoid collisions and take care
of horizontal and vertical alignments.
$${}^{d}{}_{e}{}^{f}\text{base}_{a}^{b}{}_{c}$$
Base with pre and post scripts.
The old code used a complex render tree with additional style to achieve the
best possible result.
However, the quality was still bad as you can see for the script
attached to
the integral in the screenshot above.
Managing the render tree was a nightmare. Just to give an idea, additional
anonymous nodes and style were used to allow horizontal and vertical
adjustments (similar to RenderMathMLFraction above) and prescripts had a
negative order property so that they were positioned before the base.
RenderMathMLScripts
  Base Wrapper (anonymous flexbox)
    RenderMathMLRow (base)
      ...
  SubSupPair Wrapper (anonymous flexbox with column direction)
    RenderMathMLRow (post-subscript)
      ...
    RenderMathMLRow (post-superscript)
      ...
  SubSupPair Wrapper (anonymous flexbox with column direction)
    RenderMathMLRow (post-subscript)
      ...
    RenderMathMLRow (post-superscript)
      ...
  ... (more postscripts)
  RenderMathMLBlock (prescripts separator)
  SubSupPair Wrapper (anonymous flexbox with column direction and order -1)
    RenderMathMLRow (pre-subscript)
      ...
    RenderMathMLRow (pre-superscript)
      ...
  SubSupPair Wrapper (anonymous flexbox with column direction and order -1)
    RenderMathMLRow (pre-subscript)
      ...
    RenderMathMLRow (pre-superscript)
      ...
  ... (more prescripts)
Rules from TeX and the OpenType MATH table are
relatively complex and we decided to implement them directly in the new
refactoring as otherwise it was impossible to get decent quality. The code is
still complex but we now have clear rules, we only perform simple
calculations and the render tree structure matches the DOM tree.
“Enclosing” Notations
RenderMathMLMenclose is a row of math items with some additional notations.
Gurpreet Kaur implemented this element two years ago
but she followed the same approach, combining anonymous nodes and style for
some simple notations and special painting for others.
$$\overline{)x+1}$$
circle and strike notations
During the refactoring, the code has been completely
rewritten so that RenderMathMLMenclose is now essentially a derived class
of RenderMathMLRow
with the measuring and painting functions adjusted to take into account the
additional notations. During that refactoring, we also
removed support for unused radical notation, which was implemented using an anonymous
RenderMathMLSquareRoot (see Radicals section below).
Helper Classes for Operators
The RenderMathMLOperator class is used for math operators.
It was quite a complex class and we decided to extract from it two features that
are unrelated to layout:
The MathML operator dictionary and corresponding search functions have been moved into a
MathOperatorDictionary class.
The shaping and drawing of stretchy operators have been moved into a
MathOperator class.
The remaining code was indeed the real layout part but the mess with
anonymous nodes and style was only removed later (see Text Classes below).
Although it seems we just needed to move the code out of RenderMathMLOperator
into those new classes, the case of MathOperator was particularly difficult.
We had to split the effort into several small steps to make review possible
and also fixed many issues due to the entanglement and confusion of these
three different features of the RenderMathMLOperator class…
The work done for MathOperator actually improved the rendering
of stretchy operators as you can see for the horizontal braces in the
screenshot above.
Radicals
RenderMathMLRoot is used for square root or arbitrary N-th root.
Many of the TeX and OpenType MATH table rules
were already used by the old implementation with anonymous
nodes and style. However, there were bugs difficult to fix related to
zooming,
child removal or
style change due to the
management of the anonymous RenderMathMLOperator to draw the radical sign.
$$\sqrt{x+1}+\sqrt[3]{x+1}$$
square and cube roots
The old implementation actually had two classes for the square and general
cases (RenderMathMLSquareRoot and RenderMathMLRoot). The usual
technique
with various anonymous wrappers and style was used.
After the refactoring, we were able to merge everything in a single
RenderMathMLRoot
class. Because the square root behaves as an mrow, we also made that class
derive from RenderMathMLRow to reuse as much code as possible.
Here is what the render trees used to look like:
RenderMathMLSquareRoot
  RenderMathMLBlock (anonymous used for metric adjustments)
    RenderMathMLRadicalOperator (anonymous used for the radical symbol)
      ...
  RenderMathMLRootWrapper (anonymous used for the children)
    RenderMathMLRow (child 1)
      ...
    RenderMathMLRow (child 2)
      ...
    ...
    RenderMathMLRow (child N)
      ...
RenderMathMLRoot
  RenderMathMLRootWrapper (anonymous for the index)
    ...
  RenderMathMLBlock (anonymous used for metric adjustments)
    RenderMathMLRadicalOperator (anonymous used for the radical symbol)
      ...
  RenderMathMLRootWrapper (anonymous for the base)
    ...
Again, we rewrote the implementation using only simple box positioning.
The difficult part was to get rid of the anonymous
RenderMathMLRadicalOperator to draw the radical symbol. This class was
derived from RenderMathMLOperator and extended it with some fallback drawing
when math fonts were not available. After having extracted stretchy operator
shaping from RenderMathMLOperator it became possible to use the
MathOperator
helper class to draw the radical symbol. We implemented the fallback for
missing math fonts the same as Gecko: Use a scale transform to stretch
the base glyph for U+221A SQUARE ROOT. As a bonus, we used such a transform to
implement glyph mirroring, as required to draw right-to-left radicals in
some Arabic mathematical notations.
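As a rough illustration of that fallback, the following Python computes an affine matrix that stretches a glyph vertically to a target height and optionally mirrors it. The function names, the `(a, b, c, d, e, f)` matrix convention (the one used by CSS `matrix()`), and the choice to mirror inside the glyph's own width are assumptions made for this sketch. The actual WebKit and Gecko code is not reproduced here.

```python
def radical_fallback_transform(glyph_height, glyph_width, target_height, mirror=False):
    """Return (a, b, c, d, e, f) with x' = a*x + c*y + e and y' = b*x + d*y + f."""
    if glyph_height <= 0:
        raise ValueError("glyph_height must be positive")
    scale_y = target_height / glyph_height  # vertical stretch factor
    if mirror:
        # Flip horizontally, then translate so the glyph stays in [0, glyph_width].
        return (-1.0, 0.0, 0.0, scale_y, float(glyph_width), 0.0)
    return (1.0, 0.0, 0.0, scale_y, 0.0, 0.0)


def apply_transform(matrix, x, y):
    """Apply the affine matrix to the point (x, y)."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)
```

For a glyph of height 10 stretched to a target height of 25, the top of the glyph moves from y = 10 to y = 25. With `mirror=True`, the left and right edges of the glyph box are also swapped.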
Text Classes
These classes are containers for math text such as variables or
operators. There is a generic RenderMathMLToken class and
a derived class RenderMathMLOperator adding features
specific to operators such as spacing, dictionary property, stretching…
Anonymous wrappers and style were used to implement
automatic italic mathvariant
or operator spacing. The RenderText child
of RenderMathMLOperator was (re)built as an anonymous text node
so that it was possible to
convert U+002D HYPHEN-MINUS into U+2212 MINUS SIGN
or to provide some text for anonymous operators created by
RenderMathMLFenced (see Unchanged Classes section).
RenderMathMLToken (e.g. mi element)
  RenderMathMLBlock (anonymous flexbox used to apply CSS italic)
    RenderBlock (anonymous created elsewhere to honor CSS rules)
      RenderText
        text run "x"
RenderMathMLOperator (mo element)
  RenderMathMLBlock (anonymous flexbox used for spacing)
    RenderBlock (anonymous created elsewhere to honor CSS rules)
      RenderText (anonymous destroyed and built again)
        text run "−"
We did a big refactoring to remove all the anonymous nodes
created by the MathML renderer classes.
Just like for MathOperator, we had to be careful and submit
various small pieces as the text rendering was quite sensitive to code changes.
The simplified operator spacing that was supported by WebKit was easy to
implement with the new approach.
To do automatic italic mathvariant, we modified the paint function to use
Mathematical Alphanumeric Symbols instead of CSS italic as you can notice for the
variables displayed in the
screenshot above. Hence we could
remove the RenderMathMLBlock anonymous wrapper.
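To make the idea concrete, here is a minimal Python sketch of such a character mapping. It is not the WebKit paint code: the function name is hypothetical, only basic Latin letters are handled (Greek letters and other mathvariant values are ignored), and the rule that only single-character identifiers are italicized reflects the MathML default for the mi element.

```python
def italic_mathvariant(text: str) -> str:
    """Map a single Latin letter to its Mathematical Italic code point."""
    if len(text) != 1:
        return text  # e.g. "sin": multi-character identifiers stay upright
    ch = text
    if ch == "h":
        # U+1D455 is reserved; Unicode uses U+210E PLANCK CONSTANT instead.
        return "\u210E"
    if "A" <= ch <= "Z":
        return chr(0x1D434 + ord(ch) - ord("A"))
    if "a" <= ch <= "z":
        return chr(0x1D44E + ord(ch) - ord("a"))
    return text  # digits, punctuation, etc. are left unchanged
```

For instance, `italic_mathvariant("x")` yields U+1D465 MATHEMATICAL ITALIC SMALL X, so the glyph comes directly from the math font rather than from a synthesized CSS italic style.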
The use of an anonymous node for the text prevented it from appearing in the dumped
render tree of layout tests and also
required some hacks in the accessibility code
to expose that text. In order to address the cases of
the minus sign and of mfenced operators,
we decided to use our new MathOperator class again.
Indeed MathOperator is actually also able to draw unstretched operators
made of a single character and this works for the minus sign and for mfenced
operators used in practice.
Unchanged Classes
Two classes have not been modified but such modifications were not needed to
remove the dependency on RenderFlexibleBox:
RenderMathMLFenced is used for an mrow-like element that is
defined in the MathML specification as strictly equivalent
to constructions with rows and operators.
It is implemented as a derived class
of RenderMathMLRow and creates anonymous RenderMathMLOperators. This is the
only remaining class that modifies the render tree structure. Note that
prominent MathML websites and generators do not use the mfenced element,
so it is not a big concern.
RenderMathMLTable is used for table layout. It is just derived from
RenderTable, not RenderFlexibleBox. We did not change anything for now
but we considered creating our own
implementation in order to make our code independent from HTML table,
to support MathML-specific table features and to make it better integrated
with the rest of the MathML code.
Accessibility
Even if our main focus was on rendering, the changes we made also had an impact on
the MathML accessibility code. Indeed, the accessibility tree is generated
from the MathML renderer classes: Since we changed the latter during the
refactoring, we also had to adjust the accessibility code. Fortunately,
we are lucky to have Joanmarie Diggs in our team and she was able to provide some help here.
First, the accessibility code exposes the linethickness of fractions to
implement Apple’s AXMathLineThickness attribute. In practice, this is really
necessary to know whether the linethickness is null or not
(e.g.
binomial coefficient vs.
the Legendre symbol).
Apple’s unit test seemed to expose the ratio between the actual thickness and
the default thickness but the accessibility code really just reads the actual
thickness calculated by RenderMathMLFraction.
Our fix and improvement for linethickness made Apple’s unit test fail so we
had to adjust RenderMathMLFraction to expose the value expected by that test.
In general, the accessibility code does not care about anonymous nodes
created for layout purpose and there was some code to avoid exposing
them in the accessibility tree. So removing all the anonymous nodes during the
layout refactoring was actually a good and safe thing to do. There were some
helper functions to implement Apple’s AXMathRootRadicand and
AXMathRootIndex attributes that had to be adjusted, though. These functions
used to do some work to skip the anonymous wrappers and we were actually able
to simplify them.
There was also some specific code for the RenderMathMLOperators and their
anonymous RenderText that was necessary to expose the text content.
Actually, there was an old bug
in the accessibility code and the anonymous
RenderMathMLOperators created by mfenced were not correctly exposed. The
unit test
we had for mfenced operators was only checking the text content but it was still
passing and so the regression had never been detected before. After
the layout refactoring we removed the anonymous RenderText of mfenced
operators and so broke that test…
We thus spent some time to fix the RenderMathMLOperator code. Essentially,
we removed all the old hacks and only left a specific handling for mfenced
operators. We also used this opportunity to improve and extend our MathML
accessibility tests.
Finally, the MathML accessibility code was directly implemented into a generic
AccessibilityRenderObject class. There were some functions to access
math nodes and properties but also specific cases scattered all over the code
(anonymous boxes, mfenced operators, math roles etc). In order to
facilitate future work and maintenance we decided to move all the
MathML code into a new AccessibilityMathMLElement class. Hence the work implied
by the layout refactoring actually encouraged us to improve the organization and
testing of our accessibility code!
Conclusion
In the past four months, Igalia’s web platform team has successfully upstreamed
the refactoring of WebKit’s MathML renderer classes and we are now very confident
about the quality of the layout code.
In addition to the people mentioned above I would personally like to thank
everybody who helped with this work.
More specifically, I am very grateful to other people from Igalia
(Martin Robinson,
Sergio Villar and
Manuel Rego)
or Apple (Brent Fulgham
and Darin Adler) who have
spent some time to review patches.
As a nice side effect of this work,
mathematical formulas look better and the accessibility code has been improved.
More is happening in the
next two phases.
We are looking forward to continuing
implementation of Web standards and collaboration with browser vendors
at the next Web Engines Hackfest!
• Work is in progress to add OpenType MATH support in
HarfBuzz and will be instrumental for many math rendering engines relying
on that library, including browsers.
• For stretchy operators, an efficient way to determine the required number
of glyphs and their overlaps has been implemented and is described here.
• The MathConstants table, which contains layout constants. For example, the thickness of the fraction bar of $\frac{a}{b}$.
• The MathGlyphInfo table, which contains glyph properties. For instance, the italic correction indicating how slanted an integral is, e.g. to properly place the subscript in $\displaystyle\int_{D}$.
• The MathVariants table, which provides larger size variants for a base glyph or data to build a glyph assembly. For example, either a larger parenthesis or an assembly of U+239B, U+239C, U+239D to write something like:
Code to parse this table was added to Gecko and WebKit two years ago. The
existing code to build glyph assembly in these Web engines was adapted
to use the MathVariants data instead of only private tables. However, as
we will see below the MathVariants data to build glyph assembly is more
general, with arbitrary number of glyphs or with additional constraints on
glyph overlaps. Also
there are various fallback mechanisms
for old fonts and other bugs that I think we could get rid of
when we move to OpenType MATH fonts only.
In order to add MathML support in Blink, it is very easy to
import the OpenType MATH parsing code from WebKit. However, after discussions
with some Google developers,
it seems that the best option is to directly add support
for this table in HarfBuzz. Since this library is used by Gecko, by WebKit
(at least the GTK port) and by many other applications such as Servo, XeTeX or
LibreOffice, it makes sense to share the implementation to improve math rendering
everywhere.
The idea for HarfBuzz is to add an API to
1. Expose data from the MathConstants and MathGlyphInfo tables.
2. Shape stretchy operators to some target size with the help of the
MathVariants table.
It is then up to a higher-level math rendering engine
(e.g. TeX or MathML rendering engines) to beautifully display mathematical
formulas using this API. The design choice for exposing MathConstants and
MathGlyphInfo is almost obvious from the reading of the MATH table
specification. The choice for the shaping API is a bit more complex and
discussions are still in progress. For example, because we want to accept
stretching after glyph-level mirroring (e.g. to draw RTL clockwise integrals), we
should accept any glyph and not just an input Unicode string as is the case
for other HarfBuzz shaping functions. This shaping also depends on a stretching
direction (horizontal/vertical) or on a target size (and Gecko even currently
has various ways to approximate that target size). Finally,
we should also have a way to expose italic correction for a glyph assembly
or to approximate preferred width for Web rendering engines.
As I mentioned at the beginning, the data and algorithm to build glyph assemblies
are the most complex part of the OpenType MATH table and deserve special attention.
The idea is that you have a list of $n\geq 1$ glyphs available to build
the assembly. For each
$0\leq i\leq n-1$,
the glyph $g_{i}$ has advance $a_{i}$ in the stretch direction. Each $g_{i}$ has
a straight connector part at its start (of length $s_{i}$) and at its end
(of length $e_{i}$) so that we can align the glyphs on the stretch axis and glue
them together. Also, some of the glyphs are “extenders” which means that
they can be repeated 0, 1 or more times to make the assembly as large as
possible. Finally,
the end/start connectors of consecutive glyphs must overlap by at least a fixed
value $o_{\mathrm{min}}$ to avoid gaps at some resolutions but of course without
exceeding the length of the corresponding connectors. This gives some
flexibility to adjust the size of the assembly and get closer to the target
size $t$.
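The data described above can be modelled with a small record per glyph. The following Python is a minimal sketch under stated assumptions: the `GlyphPart` type, its field names and `assembly_size` are illustrative and do not correspond to any HarfBuzz or font API, and the same overlap is assumed at every connection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GlyphPart:
    """One glyph g_i of an assembly (field names are illustrative)."""
    advance: float          # a_i, advance in the stretch direction
    start_connector: float  # s_i, length of the straight part at the start
    end_connector: float    # e_i, length of the straight part at the end
    is_extender: bool       # True if the glyph may be repeated


def assembly_size(parts: List[GlyphPart], repeats: int, overlap: float) -> float:
    """Size of the assembly when each extender is used `repeats` times and
    every pair of consecutive glyphs overlaps by the same amount `overlap`."""
    counts = [repeats if p.is_extender else 1 for p in parts]
    glyph_count = sum(counts)
    if glyph_count == 0:
        return 0.0
    total_advance = sum(c * p.advance for c, p in zip(counts, parts))
    # Each of the glyph_count - 1 connections removes `overlap` from the size.
    return total_advance - overlap * (glyph_count - 1)
```

For a three-part brace (top, extender, bottom) with advances of 10 each, two extender copies and an overlap of 1 give four glyphs and a size of 40 - 3 = 37.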
To ensure that the width/height is distributed equally and the symmetry of the
shape is preserved, the MATH table specification suggests the following iterative
algorithm to determine the number of extenders and the connector overlaps
to reach a minimal target size $t$:
1. Assemble all parts by overlapping connectors by maximum amount, and
removing all extenders. This gives the smallest possible result.
2. Determine how much extra width/height can be distributed into all
connections between neighboring parts. If that is enough to achieve the size
goal, extend each connection equally by changing overlaps of connectors to
finish the job.
3. If all connections have been extended to minimum overlap and further
growth is needed, add one of each extender, and repeat the process from the
first step.
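Read literally, these steps amount to trying 0, 1, 2, ... copies of each extender until the target becomes reachable. The Python sketch below is one possible reading, not code from the specification or from any engine: the function name and the `(advance, is_extender)` input format are assumptions, and the distribution of overlaps in step 2 is deliberately left out so that only the repeat count is computed.

```python
def naive_repeat_count(parts, min_overlap, target, max_repeats=10000):
    """parts: list of (advance, is_extender) pairs in assembly order.

    Returns the first repeat count r whose largest reachable size (all
    overlaps at min_overlap) is at least target, or None if none is found
    within max_repeats.
    """
    for r in range(max_repeats + 1):
        # Rebuild the list of glyph advances for this repeat count.
        advances = [a for a, is_ext in parts for _ in range(r if is_ext else 1)]
        if not advances:
            continue
        largest = sum(advances) - min_overlap * (len(advances) - 1)
        if largest >= target:
            return r
    return None
```

Each iteration rebuilds the whole assembly, so the cost grows with the number of repetitions, which is what makes a direct computation of the repeat count attractive.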
We note that at each step, each extender is repeated the same number of times
$r\geq 0$. So if $I_{\mathrm{Ext}}$ (respectively $I_{\mathrm{NonExt}}$) is the set of
indices $0\leq i\leq n-1$ such that $g_{i}$ is an extender
(respectively is not
an extender) we have $r_{i}=r$ (respectively $r_{i}=1$). The size we can reach
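at step $r$ is at most the one obtained with the minimal connector overlap
$o_{\mathrm{min}}$, that is
$$\sum_{i=0}^{n-1}r_{i}a_{i}-o_{\mathrm{min}}\left(\sum_{i=0}^{n-1}r_{i}-1\right)=\sum_{i\in I_{\mathrm{NonExt}}}\left(a_{i}-o_{\mathrm{min}}\right)+r\sum_{i\in I_{\mathrm{Ext}}}\left(a_{i}-o_{\mathrm{min}}\right)+o_{\mathrm{min}}$$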
We let $N_{\mathrm{Ext}}={|I_{\mathrm{Ext}}|}$ and
$N_{\mathrm{NonExt}}={|I_{\mathrm{NonExt}}|}$ be the number of extenders and
non-extenders. We also let
$S_{\mathrm{Ext}}=\sum_{i\in I_{\mathrm{Ext}}}a_{i}$ and
$S_{\mathrm{NonExt}}=\sum_{i\in I_{\mathrm{NonExt}}}a_{i}$ be the sum of advances
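for extenders and non-extenders. If we want the advance of the glyph assembly
to reach the minimal size $t$ then
$$S_{\mathrm{NonExt}}-o_{\mathrm{min}}\left(N_{\mathrm{NonExt}}-1\right)+r\left(S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}\right)\geq t$$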
We can assume $S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}>0$ or otherwise we
would have the extreme case where the overlap takes at least the full
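advance of each extender. Then we obtain
$$r\geq r_{\mathrm{min}}=\max\left(0,\left\lceil\frac{t-S_{\mathrm{NonExt}}+o_{\mathrm{min}}\left(N_{\mathrm{NonExt}}-1\right)}{S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}}\right\rceil\right)$$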
This provides a first simplification of the algorithm sketched in the
MATH table specification:
Directly start iteration at step $r_{\mathrm{min}}$. Note that at each step we
start at possibly different maximum overlaps and decrease all of them
by the same value. It is not clear what to do when one of the overlaps reaches
$o_{\mathrm{min}}$ while others can still be decreased. However, the sketched
algorithm says all the connectors should reach minimum overlap before
the next increment of $r$,
which means the target size will indeed be reached at step $r_{\mathrm{min}}$.
One possible interpretation is to stop overlap decreasing for the adjacent
connectors that reached minimum overlap
and to continue uniform decreasing for the others until
all the connectors reach minimum overlap. In that case we may lose equal
distribution or symmetry. In practice, this should probably not matter much.
So we propose instead the dual option which should behave more or less the
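same in most cases: Start with all overlaps set to $o_{\mathrm{min}}$ and increase
them evenly to reach a common value $o$. By the same reasoning as above we want
the inequality
$$S_{\mathrm{NonExt}}-o\left(N_{\mathrm{NonExt}}-1\right)+r_{\mathrm{min}}\left(S_{\mathrm{Ext}}-oN_{\mathrm{Ext}}\right)\geq t$$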
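We note that $N=N_{\mathrm{NonExt}}+{r_{\mathrm{min}}N_{\mathrm{Ext}}}$ is just the exact
number of glyphs used in the assembly. If there is only a single glyph, then the
overlap value is irrelevant so we can assume
$N_{\mathrm{NonExt}}+{r_{\mathrm{min}}N_{\mathrm{Ext}}}-1=N-1\geq 1$. This provides the
greatest theoretical value for the overlap $o$:
$$o\leq o_{\mathrm{max}}^{\mathrm{theorical}}=\frac{S_{\mathrm{NonExt}}+r_{\mathrm{min}}S_{\mathrm{Ext}}-t}{N_{\mathrm{NonExt}}+r_{\mathrm{min}}N_{\mathrm{Ext}}-1}$$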
Of course, we also have to take into account the limit imposed by the start and
end connector lengths. So $o_{\mathrm{max}}$ must also be at most
$\min{(e_{i},s_{i+1})}$ for $0\leq i\leq n-2$. But if $r_{\mathrm{min}}\geq 2$
then extender copies are connected and so $o_{\mathrm{max}}$ must also be at most
$\min{(e_{i},s_{i})}$ for $i\in I_{\mathrm{Ext}}$. To summarize, $o_{\mathrm{max}}$ is
the minimum of $o_{\mathrm{max}}^{\mathrm{theorical}}$, of $e_{i}$ for $0\leq i\leq n-2$,
of $s_{i}$ for $1\leq i\leq n-1$ and, when $r_{\mathrm{min}}\geq 2$, possibly of $s_{0}$
(if $0\in I_{\mathrm{Ext}}$) and of $e_{n-1}$ (if ${n-1}\in I_{\mathrm{Ext}}$).
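The computation of $r_{\mathrm{min}}$ and $o_{\mathrm{max}}$ described above can be sketched as follows. This Python is an illustration under stated assumptions, not the HarfBuzz implementation: `GlyphPart` and `solve_assembly` are hypothetical names, extenders are dropped from the neighbor checks when $r_{\mathrm{min}}=0$, and the overlap is clamped to $o_{\mathrm{min}}$ when there is no extender and the target cannot be reached.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GlyphPart:
    advance: float          # a_i
    start_connector: float  # s_i
    end_connector: float    # e_i
    is_extender: bool


def solve_assembly(parts: List[GlyphPart], o_min: float, t: float) -> Tuple[int, float]:
    """Return (r_min, o_max) for target size t using simple loops on the parts."""
    ext = [p for p in parts if p.is_extender]
    non = [p for p in parts if not p.is_extender]
    n_ext, n_non = len(ext), len(non)
    s_ext = sum(p.advance for p in ext)
    s_non = sum(p.advance for p in non)

    growth = s_ext - o_min * n_ext  # size gained per extra repetition
    if n_ext == 0:
        r_min = 0  # nothing to repeat; the target may be out of reach
    elif growth <= 0:
        raise ValueError("overlap consumes the full advance of the extenders")
    else:
        r_min = max(0, math.ceil((t - s_non + o_min * (n_non - 1)) / growth))

    glyph_count = n_non + r_min * n_ext
    if glyph_count <= 1:
        return r_min, o_min  # no connection, so the overlap is irrelevant

    # Greatest overlap that still reaches t, ignoring connector lengths
    # (clamped to o_min for the case without extenders).
    o_max = max(o_min, (s_non + r_min * s_ext - t) / (glyph_count - 1))

    # Connector limits between neighbors actually present in the assembly.
    used = parts if r_min >= 1 else non
    for left, right in zip(used, used[1:]):
        o_max = min(o_max, left.end_connector, right.start_connector)
    if r_min >= 2:  # copies of an extender are connected to each other
        for p in ext:
            o_max = min(o_max, p.end_connector, p.start_connector)
    return r_min, o_max
```

For a brace made of a top, an extender and a bottom, each of advance 10 with connectors of length 3, a minimum overlap of 1 and a target of 45 give three extender copies and a common overlap of 1.25, hence an assembly of exactly 50 - 4 × 1.25 = 45.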
With the algorithm described above,
$N_{\mathrm{Ext}}$, $N_{\mathrm{NonExt}}$, $S_{\mathrm{Ext}}$, $S_{\mathrm{NonExt}}$,
$r_{\mathrm{min}}$ and $o_{\mathrm{max}}$ can all be obtained using simple loops
on the glyphs $g_{i}$
and so the complexity is $O(n)$. In practice $n$ is small: For
existing fonts, assemblies are made of at most three non-extenders and two extenders, that is
$n\leq 5$ (incidentally, Gecko and WebKit do not currently support larger values of $n$).
This means that all the operations described above can be considered to have
constant complexity. This is much better than a naive implementation of the
iterative algorithm sketched in the OpenType MATH table specification which
seems to require at worst
One issue is that the number of extender repetitions $r_{\mathrm{min}}$ and
the number of glyphs in the assembly $N$ can
become arbitrarily large since the target size $t$ can take large values,
e.g. if one writes
\underbrace{\hspace{65535em}}
in LaTeX. The improvement proposed here does not solve that issue since setting
the coordinates of each glyph in the assembly and painting them
require $\Theta(N)$ operations as well as
(in the case of HarfBuzz) a glyph buffer of size
$N$. However, such large stretchy operators do not
happen in real-life mathematical formulas. Hence to avoid possible hangs in
Web engines a solution is to impose a maximum limit $N_{\mathrm{max}}$ for the
number of glyphs in the assembly so that the complexity is limited by the
size of the DOM tree. Currently, the proposal for HarfBuzz is
$N_{\mathrm{max}}=128$. This means that if each assembly glyph is 1em large
you won’t be able to draw stretchy operators of size more than 128em, which
sounds like a quite reasonable bound.
With the above proposal, $r_{\mathrm{min}}$ and so
$N$ can be determined very quickly and the cases $N\geq N_{\mathrm{max}}$ rejected, so that we avoid losing time with such edge cases…
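Such an early rejection could look like the following Python sketch. The function name and its parameters are hypothetical, the value 128 is the proposal quoted above, and the code assumes at least one extender with $S_{\mathrm{Ext}}-o_{\mathrm{min}}N_{\mathrm{Ext}}>0$.

```python
import math

N_MAX = 128  # value quoted above as the current proposal for HarfBuzz


def glyph_count_or_none(n_non, s_non, n_ext, s_ext, o_min, t, n_max=N_MAX):
    """Return N = n_non + r_min * n_ext, or None when N >= n_max.

    Assumes n_ext >= 1 and s_ext - o_min * n_ext > 0.
    """
    growth = s_ext - o_min * n_ext
    r_min = max(0, math.ceil((t - s_non + o_min * (n_non - 1)) / growth))
    n = n_non + r_min * n_ext
    # The check happens before any glyph buffer is allocated or filled.
    return None if n >= n_max else n
```

Only a handful of arithmetic operations are needed before deciding to give up, whatever the target size is.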
Finally, because in our proposal we use the same overlap $o$ everywhere
an alternative for HarfBuzz would be to set the output buffer size to
$n$ (i.e. ignore $r-1$ copies of each extender and only keep the first one).
This will leave gaps that the client can fix by repeating extenders
as long as $o$ is also provided. Then HarfBuzz math shaping can be done
with a complexity in time and space of just $O(n)$ and it will be up to the
client to optimize or limit the painting of extenders for large values of $N$…
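On the client side, repeating the extenders from such a compact output is straightforward once the common overlap $o$ is known. The Python below is a minimal sketch of that idea: `expand_assembly` and its parameters are hypothetical names and do not describe an existing HarfBuzz API.

```python
def expand_assembly(advances, is_extender, repeats, overlap):
    """Place every glyph copy along the stretch axis.

    advances / is_extender describe the n parts of the assembly, repeats is
    the number of copies of each extender and overlap is the common overlap.
    Returns ([(part_index, offset), ...], total_size).
    """
    placed = []
    position = 0.0
    for index, (advance, extender) in enumerate(zip(advances, is_extender)):
        for _ in range(repeats if extender else 1):
            placed.append((index, position))
            # The next glyph starts `overlap` before the end of this one.
            position += advance - overlap
    total = position + overlap if placed else 0.0
    return placed, total
```

With advances of 10 for a top, an extender and a bottom part, three extender copies and an overlap of 1.25, the five glyphs are placed at offsets 0, 8.75, 17.5, 26.25 and 35, for a total size of 45. A client could also stop placing copies early or paint a single stretched extender instead.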