============
SpiderMonkey
============

*SpiderMonkey* is the *JavaScript* and *WebAssembly* implementation library of
the *Mozilla Firefox* web browser. The implementation behaviour is defined by
the `ECMAScript <https://tc39.es/ecma262/>`_ and `WebAssembly
<https://webassembly.org/>`_ specifications.

Much of the internal technical documentation of the engine can be found
throughout the source files themselves by looking for comments labelled with
`[SMDOC]`_. Information about the team, our processes, and about embedding
*SpiderMonkey* in your own projects can be found at https://spidermonkey.dev.

Specific documentation on a few topics is available at:

.. toctree::
   :maxdepth: 1

   build
   test
   hacking_tips
   Debugger/index
   SavedFrame/index
   feature_checklist
   bytecode_checklist
   use_counter

Components of SpiderMonkey
##########################

🧹 Garbage Collector
*********************

.. toctree::
   :maxdepth: 2
   :hidden:

   Overview <gc>
   Rooting Hazard Analysis <HazardAnalysis/index>
   Running the Analysis <HazardAnalysis/running>

*JavaScript* is a garbage-collected language and at the core of *SpiderMonkey*
we manage a garbage-collected memory heap. Elements of this heap have a base
C++ type of `gc::Cell`_. Each round of garbage collection frees any *Cell*
that is not reachable from a *root*, either directly or through a chain of
other live *Cells*.

See :doc:`GC overview<gc>` for more details.
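
The reachability rule can be illustrated with a toy mark-and-sweep pass. This
is only a sketch of the idea, not how the engine's collector is implemented:
``ToyCell`` and ``ToyHeap`` are hypothetical names invented for this example
and have no relation to the real ``gc::Cell`` class.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a heap cell; NOT SpiderMonkey's gc::Cell.
struct ToyCell {
    std::vector<ToyCell*> edges;  // outgoing references to other cells
    bool marked = false;
};

// Toy heap (hypothetical): owns all cells and knows the root set.
struct ToyHeap {
    std::vector<std::unique_ptr<ToyCell>> cells;  // every allocated cell
    std::vector<ToyCell*> roots;                  // cells known to be live

    ToyCell* allocate() {
        cells.push_back(std::make_unique<ToyCell>());
        return cells.back().get();
    }

    // Mark everything reachable from the roots, then free the rest.
    // Returns the number of cells that were freed.
    std::size_t collect() {
        std::vector<ToyCell*> worklist(roots.begin(), roots.end());
        while (!worklist.empty()) {
            ToyCell* cell = worklist.back();
            worklist.pop_back();
            if (cell->marked) continue;  // already visited
            cell->marked = true;
            for (ToyCell* next : cell->edges) worklist.push_back(next);
        }

        const std::size_t before = cells.size();
        std::vector<std::unique_ptr<ToyCell>> survivors;
        for (auto& cell : cells) {
            if (cell->marked) {
                cell->marked = false;  // reset for the next collection
                survivors.push_back(std::move(cell));
            }
        }
        cells = std::move(survivors);  // unmarked cells are destroyed here
        return before - cells.size();
    }
};
```

In this sketch a cell survives only if some path of ``edges`` leads to it from
an entry in ``roots``; a cell with no such path is freed on the next call to
``collect()``.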

📦 JS::Value and JSObject
**************************

*JavaScript* values are divided into either objects or primitives
(*Undefined*, *Null*, *Boolean*, *Number*, *BigInt*, *String*, or *Symbol*).
Values are represented with the `JS::Value`_ type which may in turn point to
an object that extends from the `JSObject`_ type. Objects include both plain
*JavaScript* objects and exotic objects representing various things from
functions to *ArrayBuffers* to *HTML Elements* and more.

Most objects extend ``NativeObject`` (which is a subtype of ``JSObject``)
which provides a way to store properties as key-value pairs similar to a hash
table. These objects hold their *values* and point to a *Shape* that
represents the set of *keys*. Similar objects point to the same *Shape* which
saves memory and allows the JITs to quickly work with objects similar to ones
they have seen before. See the `[SMDOC] Shapes`_ comment for more details.
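
The split between per-object *values* and a shared description of *keys* can
be sketched as follows. This is a simplified illustration under stated
assumptions: ``ToyShape``, ``ToyObject`` and ``makeShape`` are hypothetical
names, and the real ``Shape`` data structures are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Toy shape: maps each key to a slot index. NOT the engine's Shape class.
struct ToyShape {
    std::map<std::string, std::size_t> slotOf;
};

// Toy object: owns its values, shares the description of its keys.
struct ToyObject {
    std::shared_ptr<const ToyShape> shape;
    std::vector<double> slots;

    // Look the key up in the shared shape, then read the object's own slot.
    std::optional<double> get(const std::string& key) const {
        auto it = shape->slotOf.find(key);
        if (it == shape->slotOf.end()) return std::nullopt;
        return slots[it->second];
    }
};

// Hypothetical helper: builds a shape from an ordered list of keys.
std::shared_ptr<const ToyShape> makeShape(const std::vector<std::string>& keys) {
    auto shape = std::make_shared<ToyShape>();
    for (std::size_t i = 0; i < keys.size(); ++i) shape->slotOf[keys[i]] = i;
    return shape;
}
```

Two objects built from the same ``makeShape({"x", "y"})`` result store only
their own slot values; the key-to-slot mapping exists once and is shared.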

C++ (and Rust) code may create and manipulate these objects using the
collection of interfaces we traditionally call the **JSAPI**.

🗃️ JavaScript Parser
*********************

In order to evaluate script text, we parse it using the *Parser* into a
temporary `Abstract Syntax Tree`_ (AST) and then run the *BytecodeEmitter*
(BCE) to generate `Bytecode`_ and associated metadata. We refer to this
resulting format as `Stencil`_ and it has the helpful characteristic that it
does not utilize the Garbage Collector. The *Stencil* can then be
instantiated into a series of GC *Cells* that can be mutated and understood
by the execution engines described below.

Each function as well as the top-level itself generates a distinct script.
This is the unit of execution granularity since functions may be set as
callbacks that the host runs at a later time. There are both
``ScriptStencil`` and ``js::BaseScript`` forms of scripts.

By default, the parser runs in a mode called *syntax* or *lazy* parsing where
we avoid generating full bytecode for functions within the source that we are
parsing. This lazy parsing is still required to check for all *early errors*
that the specification describes. When such a lazily compiled inner function
is first executed, we recompile just that function in a process called
*delazification*. Lazy parsing avoids allocating the AST and bytecode which
saves both CPU time and memory. In practice, many functions are never
executed during a given load of a webpage so this delayed parsing can be
quite beneficial.
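
The compile-on-first-call pattern behind *delazification* can be sketched
roughly like this. It is a toy model only: ``ToyLazyFunction`` is a
hypothetical class, the "compiler" is a callback supplied by the caller, and
nothing here corresponds to actual engine API.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Toy model of a lazily compiled function; not SpiderMonkey code.
class ToyLazyFunction {
public:
    using Compiled = std::function<int(int)>;
    using Compiler = std::function<Compiled(const std::string&)>;

    ToyLazyFunction(std::string source, Compiler compiler)
        : source_(std::move(source)), compiler_(std::move(compiler)) {}

    // The first call pays for full compilation; later calls reuse the result.
    int call(int arg) {
        if (!compiled_) {
            compiled_ = compiler_(source_);
            ++compileCount_;
        }
        return (*compiled_)(arg);
    }

    bool isLazy() const { return !compiled_.has_value(); }
    int compileCount() const { return compileCount_; }

private:
    std::string source_;                // kept so the body can be compiled later
    Compiler compiler_;                 // stand-in for the full parse + emit step
    std::optional<Compiled> compiled_;  // empty until the first call
    int compileCount_ = 0;
};
```

A function that is never called in this model never runs its compiler
callback, which mirrors the CPU and memory savings described above.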

⚙️ JavaScript Interpreter
**************************

The *bytecode* generated by the parser may be executed by an interpreter
written in C++ that manipulates objects in the GC heap and invokes native
code of the host (e.g. web browser). See `[SMDOC] Bytecode Definitions`_ for
descriptions of each bytecode opcode and ``js/src/vm/Interpreter.cpp`` for
their implementation.
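
As a rough illustration of what a bytecode interpreter loop looks like, here
is a minimal stack machine. The opcodes (``ToyOp``) and instruction layout
(``ToyInstr``) are made up for this sketch; the engine's real opcodes are the
ones documented in the comment linked above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical opcodes for illustration only.
enum class ToyOp : std::uint8_t { PushInt, Add, Mul, Return };

struct ToyInstr {
    ToyOp op;
    std::int32_t operand;  // only meaningful for PushInt
};

// Executes one instruction at a time against an operand stack.
std::int32_t interpret(const std::vector<ToyInstr>& code) {
    std::vector<std::int32_t> stack;
    auto pop = [&stack] {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        std::int32_t value = stack.back();
        stack.pop_back();
        return value;
    };

    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        const ToyInstr& instr = code[pc];
        switch (instr.op) {
            case ToyOp::PushInt:
                stack.push_back(instr.operand);
                break;
            case ToyOp::Add: {
                std::int32_t rhs = pop();
                std::int32_t lhs = pop();
                stack.push_back(lhs + rhs);
                break;
            }
            case ToyOp::Mul: {
                std::int32_t rhs = pop();
                std::int32_t lhs = pop();
                stack.push_back(lhs * rhs);
                break;
            }
            case ToyOp::Return:
                return pop();
        }
    }
    throw std::runtime_error("bytecode ended without Return");
}
```

The ``switch`` inside the loop is the per-opcode dispatch cost that the JIT
tiers described below try to reduce or remove.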

⚡ JavaScript JITs
*******************

.. toctree::
   :maxdepth: 1
   :hidden:

   MIR-optimizations/index

In order to speed up execution of *bytecode*, we use a series of Just-In-Time
(JIT) compilers to generate specialized machine code (e.g. x86, ARM, etc.)
tailored to the *JavaScript* that is run and the data that is processed.

As an individual script runs more times (or has a loop that runs many times)
we describe it as getting *hotter* and at certain thresholds we *tier-up* by
JIT-compiling it. Each subsequent JIT tier spends more time compiling but
aims for better execution performance.
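
A warm-up counter with thresholds is one simple way to picture tiering. In
the sketch below the tier names follow the sections of this page, but the
threshold numbers, ``ToyScript`` and ``noteExecution`` are invented for
illustration and are not the engine's actual values or API.

```cpp
#include <cassert>
#include <cstdint>

enum class ToyTier { Interpreter, BaselineInterpreter, BaselineCompiler, Warp };

// Made-up warm-up thresholds; the engine's real values differ.
struct ToyThresholds {
    std::uint32_t baselineInterpreter = 10;
    std::uint32_t baselineCompiler = 100;
    std::uint32_t warp = 1000;
};

struct ToyScript {
    std::uint32_t warmUpCount = 0;
    ToyTier tier = ToyTier::Interpreter;

    // Called on each entry (or loop iteration): bumps the counter and
    // moves to a higher tier once the matching threshold is reached.
    ToyTier noteExecution(const ToyThresholds& t = ToyThresholds{}) {
        ++warmUpCount;
        if (warmUpCount >= t.warp) {
            tier = ToyTier::Warp;
        } else if (warmUpCount >= t.baselineCompiler) {
            tier = ToyTier::BaselineCompiler;
        } else if (warmUpCount >= t.baselineInterpreter) {
            tier = ToyTier::BaselineInterpreter;
        }
        return tier;
    }
};
```

With these assumed thresholds, a script stays in the plain interpreter for its
first nine executions and only reaches the most expensive tier after a
thousand.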

Baseline Interpreter
--------------------

The *Baseline Interpreter* is a hybrid interpreter/JIT that interprets the
*bytecode* one opcode at a time, but attaches small fragments of code called
*Inline Caches* (ICs) that speed up executing the same opcode the next
time (if the data is similar enough). See the `[SMDOC] JIT Inline Caches`_
comment for more details.
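
The caching idea can be sketched for a single property-read site. This is a
deliberately tiny model: real ICs are generated machine code, whereas here the
cache is an ordinary struct, and ``MiniShape``, ``MiniObject`` and
``MiniGetPropSite`` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Minimal stand-ins for a shape and an object; not engine types.
struct MiniShape {
    std::unordered_map<std::string, std::size_t> slotOf;
};

struct MiniObject {
    const MiniShape* shape;
    std::vector<int> slots;
};

// One cache per property-access site in the bytecode.
struct MiniGetPropSite {
    std::string key;
    const MiniShape* cachedShape = nullptr;  // shape seen on the last miss
    std::size_t cachedSlot = 0;              // slot of `key` in that shape
    int hits = 0;
    int misses = 0;

    int get(const MiniObject& obj) {
        if (obj.shape == cachedShape) {
            ++hits;  // fast path: one pointer compare, no key lookup
            return obj.slots[cachedSlot];
        }
        ++misses;  // slow path: full lookup, then remember the result
        std::size_t slot = obj.shape->slotOf.at(key);  // throws if key is absent
        cachedShape = obj.shape;
        cachedSlot = slot;
        return obj.slots[slot];
    }
};
```

The second object to pass through the same site with the same shape skips the
hash lookup entirely, which is the "similar enough data" condition in
miniature.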

Baseline Compiler
-----------------

The *Baseline Compiler* uses the same *Inline Caches* mechanism from the
*Baseline Interpreter* but additionally translates the entire bytecode to
native machine code. This removes dispatch overhead and does minor local
optimizations. This machine code still calls back into C++ for complex
operations. The translation is very fast but the ``BaselineScript`` uses
memory and requires ``mprotect`` and flushing CPU caches.

WarpMonkey
----------

The *WarpMonkey* JIT replaces the former *IonMonkey* engine and is the
highest level of optimization for the most frequently run scripts. It is able
to inline other scripts and specialize code based on the data and arguments
being processed.

We translate the *bytecode* and *Inline Cache* data into a Mid-level
`Intermediate Representation`_ (Ion MIR). This graph is
transformed and optimized before being *lowered* to a Low-level Intermediate
Representation (Ion LIR). Register allocation is performed on this *LIR*,
which is then turned into native machine code in a process called
*Code Generation*.

See `MIR Optimizations`_ for an overview of MIR optimizations.

The optimizations here assume that a script continues to see data similar to
what has been seen before. The *Baseline* JITs are essential to success here
because they generate *ICs* that match observed data. If, after a script is
compiled with *Warp*, it encounters data that it is not prepared to handle, it
performs a *bailout*. The *bailout* mechanism reconstructs the native machine
stack frame to match the layout used by the *Baseline Interpreter* and then
branches to that interpreter as though we were running it all along. Building
this stack frame may use a special side-table saved by *Warp* to reconstruct
values that are not otherwise available.
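
The guard-and-fall-back control flow of a *bailout* can be pictured with the
toy below. It models only the decision to abandon a specialized path when its
assumption fails; it does not model stack frame reconstruction, and
``ToyValue``, ``genericAdd`` and ``ToySpecializedAdd`` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Toy dynamic value: either an integer or a string.
using ToyValue = std::variant<int, std::string>;

// Generic slow path: handles every combination, like a lower tier would.
ToyValue genericAdd(const ToyValue& a, const ToyValue& b) {
    if (std::holds_alternative<int>(a) && std::holds_alternative<int>(b)) {
        return std::get<int>(a) + std::get<int>(b);
    }
    auto toString = [](const ToyValue& v) {
        return std::holds_alternative<int>(v) ? std::to_string(std::get<int>(v))
                                              : std::get<std::string>(v);
    };
    return toString(a) + toString(b);
}

// Code "specialized" on the assumption that both operands are integers.
struct ToySpecializedAdd {
    int bailouts = 0;

    ToyValue run(const ToyValue& a, const ToyValue& b) {
        const int* x = std::get_if<int>(&a);
        const int* y = std::get_if<int>(&b);
        if (x && y) return *x + *y;  // guard passed: fast path
        ++bailouts;                  // guard failed: give up on the fast path
        return genericAdd(a, b);     // and continue in the generic code
    }
};
```

As long as every call sees two integers the generic path is never entered;
the first string operand triggers the fallback and is counted in
``bailouts``.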

🟪 WebAssembly
***************

In addition to *JavaScript*, the engine is also able to execute *WebAssembly*
(WASM) sources.

WASM-Baseline (RabaldrMonkey)
-----------------------------

This engine performs fast translation to machine code in order to minimize
latency to first execution.

WASM-Ion (BaldrMonkey)
----------------------

This engine translates the WASM input into the same *MIR* form that
*WarpMonkey* uses and relies on the *IonBackend* to optimize it. These
optimizations (and in particular, the register allocation) generate very fast
native machine code.

.. _gc::Cell: https://searchfox.org/mozilla-central/search?q=[SMDOC]+GC+Cell

.. _JSObject: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JSObject+layout

.. _JS::Value: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JS%3A%3AValue+type&path=js%2F

.. _[SMDOC]: https://searchfox.org/mozilla-central/search?q=[SMDOC]&path=js%2F

.. _[SMDOC] Shapes: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Shapes

.. _[SMDOC] Bytecode Definitions: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Bytecode+Definitions&path=js%2F

.. _[SMDOC] JIT Inline Caches: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JIT+Inline+Caches

.. _Stencil: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Script+Stencil

.. _Bytecode: https://en.wikipedia.org/wiki/Bytecode

.. _Abstract Syntax Tree: https://en.wikipedia.org/wiki/Abstract_syntax_tree

.. _Intermediate Representation: https://en.wikipedia.org/wiki/Intermediate_representation

.. _MIR Optimizations: ./MIR-optimizations/index.html
