Sharing rust libraries across the Firefox (for Android) stack
=============================================================
`Agi Sferro <agi@sferro.dev>`
March 20th, 2021
The problem
-----------
We don’t have a good story for integrating a rust library so that it’s
available to use in Gecko, GeckoView, AC and Fenix, in a way that also lets
rust call rust directly, avoiding a C FFI layer.
Goals
-----
- Being able to integrate a rust library that can be called from Gecko,
GeckoView, AC, Fenix, including having singleton-like instances that are
shared across the stack, per-process.
- The rust library should be able to call and be called by other rust libraries
or rust code in Gecko directly (i.e. without a C FFI layer)
- A build-time assurance that all components in the stack compile against the
same version of the rust library
- Painless, quick and automated updates. Should be able to produce chemspill
updates for the rust library in under 24h with little manual intervention
(besides security checks / code review / QA).
- Support for non-Gecko consumers of the rust library is essential. I.e.
providing a version of Gecko that does not include any of the libraries
- (optional?) Provide an easy way to create bundles of rust libraries depending
on consumers’ needs.
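
To make the first two goals concrete, here is a minimal sketch of a per-process
singleton that rust callers reach directly, next to the kind of C ABI wrapper a
Kotlin binding would go through. ``MetricStore``, ``record_event`` and
``ffi_record_event`` are hypothetical names invented for illustration; they are
not Glean or Nimbus APIs.

.. code:: rust

   use std::ffi::{c_char, CStr};
   use std::sync::{Mutex, OnceLock};

   /// Hypothetical stand-in for a shared component such as a telemetry store.
   #[derive(Default)]
   pub struct MetricStore {
       events: Vec<String>,
   }

   /// One instance per process: every caller linked into the same binary
   /// sees the same static.
   static STORE: OnceLock<Mutex<MetricStore>> = OnceLock::new();

   fn store() -> &'static Mutex<MetricStore> {
       STORE.get_or_init(|| Mutex::new(MetricStore::default()))
   }

   /// Direct rust-to-rust entry point: typed arguments, no pointer handling.
   /// Returns the number of events recorded so far.
   pub fn record_event(name: &str) -> usize {
       let mut guard = store().lock().unwrap();
       guard.events.push(name.to_owned());
       guard.events.len()
   }

   /// C ABI entry point of the kind a Kotlin wrapper would bind to.
   /// Returns the number of recorded events, or -1 on invalid input.
   /// (A real exported symbol would also need a no_mangle attribute.)
   ///
   /// # Safety
   /// `name` must be null or point to a valid NUL-terminated string.
   pub unsafe extern "C" fn ffi_record_event(name: *const c_char) -> i64 {
       if name.is_null() {
           return -1;
       }
       match unsafe { CStr::from_ptr(name) }.to_str() {
           Ok(s) => record_event(s) as i64,
           Err(_) => -1,
       }
   }

Both entry points write to the same store, but only the second one has to
validate raw pointers and encode errors as integers. That extra layer is what
the second goal wants to avoid for rust-to-rust calls.
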
Proposal
--------
1. Rename libmegazord.so to librustcomponents.so to clarify what the purpose of
this artifact is.
2. Every rust library that wants to be called or wants to call rust code
directly will be included in libxul.so (which contains most of Gecko native
code), and vendored in mozilla-central. This includes, among others, Glean and
Nimbus.
3. libxul.so will expose the necessary FFI symbols for the Kotlin wrappers
needed by the libraries vendored in mozilla-central in step (2).
4. At every nightly / beta / release build of Gecko, we will generate one (or
possibly many) additional librustcomponents.so artifacts that will be published
as an AAR in maven.mozilla.org. This will also publish all the vendored
libraries in mozilla-central to maven, which will have a dependency on the
librustcomponents.so produced as part of this step. Doing this will ensure that
both libxul.so and librustcomponents.so contain the exact same code and can be
swapped freely in the dependency graph.
5. Provide a new GeckoView build with artifactId geckoview-omni which will
depend on all the rust libraries. The existing geckoview will not have such
dependency and will be kept for third-party users of GeckoView.
6. GeckoView will depend on the Kotlin wrappers of all the libraries that
depend on librustcomponents.so built in step (4) in the .pom file. For example:

.. code:: xml

   <dependency>
     <groupId>org.mozilla.telemetry</groupId>
     <artifactId>glean</artifactId>
     <version>33.1.2</version>
     <scope>compile</scope>
   </dependency>

It will also exclude the org.mozilla.telemetry.glean dependency on
librustcomponents.so, as the native code is now included in libxul.so as part
of step (2). Presumably Glean will discover where its native code lives by
trying either librustcomponents.so or libxul.so (or by some better method;
suggestions welcome).
7. Android Components and Fenix will remove their explicit dependency on Glean,
Nimbus and all other libraries provided by GeckoView, and instead consume the
one provided by GeckoView (this step is optional, note that any version
conflict would cause a build error).
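
The discovery mentioned in step (6) could be as simple as trying a fixed list
of library names in order. The sketch below only models that fallback logic.
The library names come from this proposal, while the lookup order and the
``can_load`` hook (a stand-in for whatever the platform loader reports) are
assumptions.

.. code:: rust

   /// Candidate libraries in the order they would be tried. The names come
   /// from the proposal; the order is an assumption.
   pub const CANDIDATES: [&str; 2] = ["librustcomponents.so", "libxul.so"];

   /// Returns the first candidate for which `can_load` reports success, or
   /// `None` if no candidate can be loaded.
   ///
   /// `can_load` is a hypothetical hook standing in for the platform loader.
   pub fn find_native_library<F>(
       candidates: &[&'static str],
       can_load: F,
   ) -> Option<&'static str>
   where
       F: Fn(&str) -> bool,
   {
       candidates.iter().copied().find(|&name| can_load(name))
   }

In a geckoview-omni build only libxul.so would be present, so the first probe
fails and the second succeeds. A non-Gecko consumer would stop at
librustcomponents.so.
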
The good
--------
- We get automated integration with AC for free. When an update for a library
is pushed to mozilla-central, a new nightly build of GeckoView will be
produced, which is already consumed by AC automatically (and indirectly by
Fenix).
- Publishing infrastructure to maven is already figured out, and we can reuse
the existing process for GeckoView to publish all the dependencies.
- If a consumer (say AC) uses a mismatched version for a dependency, a
compile-time error will be thrown.
- All consumers of the rust libraries packaged this way are on the same version
(provided they stay up to date with releases)
- Non-Mozilla consumers get first-class visibility into what is packaged into
GeckoView, and can independently discover Glean, Nimbus, etc, since we define
our dependencies in the pom file.
- Gecko Desktop and Gecko Mobile consume Glean and other libraries in the same
way, removing unnecessary deviation.
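
The mismatch guarantee can be pictured as a check of every consumer’s declared
version against the version pinned by GeckoView. The following is a toy model
of that rule, not the actual Gradle or Maven resolver; ``check_versions`` and
its data layout are hypothetical.

.. code:: rust

   use std::collections::BTreeMap;

   /// Checks each `(consumer, library, version)` declaration against the
   /// versions pinned by GeckoView and returns every mismatch as a message.
   /// Libraries that GeckoView does not pin are ignored.
   pub fn check_versions(
       pinned: &BTreeMap<&str, &str>,
       declared: &[(&str, &str, &str)],
   ) -> Result<(), Vec<String>> {
       let errors: Vec<String> = declared
           .iter()
           .filter_map(|&(consumer, library, version)| match pinned.get(library) {
               Some(&expected) if expected != version => Some(format!(
                   "{consumer} wants {library} {version}, GeckoView pins {expected}"
               )),
               _ => None,
           })
           .collect();
       if errors.is_empty() {
           Ok(())
       } else {
           Err(errors)
       }
   }

With Glean pinned at 33.1.2, a consumer that declares any other Glean version
produces an error that names both versions, which is the kind of build-time
signal the proposal relies on.
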
Worth Noting
------------
- Uplifts to beta / release versions of Fenix will involve more checks as they
impact Gecko too.
The Bad
-------
- Libraries need to be vendored in mozilla-central. Dependencies will follow
the Gecko train which might not be right for them, as some dependencies don’t
really have a nightly vs stable version. This could change in the future, as
the integration gets deeper and updates to the library become more frequent /
at every commit.
- Locally testing a change in a rust library involves rebuilding all of Gecko.
This is a side effect of statically linking rust libraries to Gecko.
- All rust libraries that are used by both Android and Gecko will need to be
updated together, and we cannot have separate versions on Desktop/Mobile.
This can be mitigated by providing flexible dependencies on the library
side (e.g. nimbus doesn’t need to depend on a specific version of Glean and
can accept whatever is in Gecko).
- Code that doesn’t natively live in mozilla-central has double the work to get
into a product - first a release process is needed from the native repo, then
a phabricator process for the vendoring.
Alternatives Considered
-----------------------
Telemetry delegate
^^^^^^^^^^^^^^^^^^
GeckoView provides a Java Telemetry delegate interface that Glean can implement
on the AC layer to provide Glean functionality to consumers. Glean would offer
a rust wrapper to the Java delegate API to transparently call either the
delegate (when built for mobile) or the Glean instance directly (when built for
Desktop).
Drawbacks
"""""""""
- This involves a lot of work on the Glean side to build and maintain the
delegate
- A large section of the Glean API is embedded in the GeckoView API without a
direct dependency
- We don’t expect the telemetry delegate to have implementations other
than Glean itself, despite the apparent generic nature of the telemetry
delegate
- Glean and GeckoView engineers need to coordinate for every API update, as an
update to the Glean API likely triggers an update to the GV API.
- Gecko Desktop and Gecko Mobile use Glean in a meaningfully different way
- Doesn’t solve the dependency problem: even though in theory this would allow
Gecko to work with multiple Glean versions, in practice the GV Telemetry
delegate is going to track Glean so closely that it will inevitably require
pretty specific Glean versions to work.
Advantages
""""""""""
- Explicit code dependency: an uninformed observer can understand how telemetry
is extracted from GeckoView by just looking at the API
- No hard Glean version requirement: AC can be (in theory) built with a
different Glean version than Gecko and things would still work
Why we decided against
""""""""""""""""""""""
The amount of ongoing maintenance work involved on the Glean side far outweighs
the small advantages, namely not tying AC to a specific Glean version.
It also significantly complicates the stack.
Dynamic Discovery
^^^^^^^^^^^^^^^^^
Gecko discovers when it’s being loaded as part of Fenix (or some other
Gecko-powered browser) by calling dlsym on the Glean library. When the
discovery is successful, and the Glean version matches, Gecko will directly use
the Glean provided by Fenix.
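
As a rough model of this flow, the sketch below separates the two ways
discovery can fail: the symbol is missing, or it resolves but reports a
different version. ``discover_glean`` and the ``probe`` closure (standing in for
a dlsym-based lookup that returns the host’s Glean version) are hypothetical.

.. code:: rust

   /// Ways the runtime discovery can fail in this model.
   #[derive(Debug, PartialEq)]
   pub enum DiscoveryError {
       /// The Glean symbol could not be resolved in the host app.
       SymbolNotFound,
       /// A Glean was found, but its version differs from what Gecko expects.
       VersionMismatch { expected: String, found: String },
   }

   /// Succeeds only when `probe` finds a Glean whose version equals `expected`.
   ///
   /// `probe` is a hypothetical stand-in for a dlsym-based lookup: it returns
   /// the version reported by the host's Glean, or `None` when the symbol
   /// cannot be resolved.
   pub fn discover_glean<F>(expected: &str, probe: F) -> Result<(), DiscoveryError>
   where
       F: Fn() -> Option<String>,
   {
       match probe() {
           None => Err(DiscoveryError::SymbolNotFound),
           Some(found) if found == expected => Ok(()),
           Some(found) => Err(DiscoveryError::VersionMismatch {
               expected: expected.to_owned(),
               found,
           }),
       }
   }

Both error variants can only be observed once the app is running, since
nothing in this scheme is checked when the app is built.
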
Drawbacks
"""""""""
- Non-standard: non-Mozilla apps will not expect this to work the way it does
- “Magic”: there’s no way to know that the discovery is happening (or what
version of Glean is provided with Gecko) unless you know it’s there.
- The standard failure mode is at runtime, as there’s no built-in way to check
that the version provided by Gecko is the same as the one provided by Fenix
at build time.
- Doesn’t solve the synchronization problem: Gecko and Fenix will have to be on
the same Glean version for this to work.
- Gecko Mobile deviates meaningfully from Desktop in the way it uses Glean for
no intrinsic reason
Advantages
""""""""""
- This system is transparent to consuming apps, e.g. Nimbus can use Glean as
is, with no significant modifications needed.
Why we decided against
""""""""""""""""""""""
- This alternative does not provide substantial benefits over the proposal
outlined in this doc and has significant drawbacks like the runtime failure
case and the non-standard linking process.
Hybrid Dynamic Discovery
^^^^^^^^^^^^^^^^^^^^^^^^
This is a variation of the Dynamic Discovery where Gecko and GeckoView include
Glean directly and consumers get Glean from Gecko dynamically (i.e. they dlsym
libxul.so).
Drawbacks
"""""""""
- Glean still needs to build a wrapper for libraries not included in Gecko
(like Nimbus) that want to call Glean directly.
Advantages
""""""""""
- The dependency on Glean is explicit and clear from an uninformed observer’s
point of view.
- Smaller scope, only Glean would need to be moved to mozilla-central
Why we decided against
""""""""""""""""""""""
Not enough advantages over the proposal, and significant ongoing maintenance
work is required from the Glean side.
Open Questions
--------------
- How does iOS consume megazord today? Do they have a maven-like dependency
system we can use to publish the iOS megazord?
- How do we deal with licenses in about:license? Application-services has a
build step that extracts rust dependencies and puts them in the pom file
- What would be the process for coordinating a-c breaking changes?
- Would the desire to vendor apply even if this were not Rust code?
Common Questions
----------------
- **How do we make sure GV/AC/Gecko consume the same version of the native
libraries?** The pom dependency in GeckoView ensures that all GeckoView
consumers, including AC and Fenix, depend on the same version of a given
library.
- **What happens to non-Gecko consumers of megazord?** This plan is transparent
to a non-Gecko consumer of megazord, as they will still consume the native
libraries through the megazord dependency in Glean/Nimbus/etc. As an added
benefit, if the consumer stays up to date with the megazord dependency,
they will use the same version that Gecko uses.
- **What’s the process to publish an update to the megazord?** When a team
wants to publish an update to the megazord it will need to commit the update
to mozilla-central. A new build will be generated in the next nightly cycle,
producing an updated version of the megazord. My understanding is that current
megazord releases are stable (and don’t have beta/nightly cycles) so for
external consumers, consuming the nightly build could be adequate, and provide
the fastest turnaround on updates. For Gecko consumers the turnaround will be
the same as for Firefox Desktop (i.e. roughly 6-8 weeks from commit to release
build).
- **How do we handle security uplifts?** If you have a security release for one
rust library you would need to request uplift to beta/release branches
(depending on impact) like all other Gecko changes. The process in itself can
be expedited and have a fast turnaround when needed (below 24h). We have been
using this process for all Gecko changes so I would not expect particular
problems with it.
- **What about OOP cases? E.g. GeckoView as a service?** We briefly discussed
this in the email chain; there are ways we could make that work (e.g.
providing an IPC shim). The details are fuzzy, but since we don’t have any
immediate need for such support, knowing that it’s doable with a reasonable
amount of work is enough for now.
- **Vendoring in mozilla-central seems excessive.** I agree. This is an
  unfortunate requirement stemming from a few assumptions (which could be
  challenged! We are choosing not to):

  - Gecko wants to vendor whatever it consumes for rust
  - We want rust to call rust directly (without a C FFI layer)
  - We want adding new libraries to be a painless experience

  Because of the above, vendoring in mozilla-central seems to be the best if not
  the only way to achieve our goals.