/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*-
* vim: set ts=8 sts=2 et sw=2 tw=80:
*
* Copyright 2016 Mozilla Foundation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/*
* [SMDOC] WebAssembly baseline compiler (RabaldrMonkey)
*
* For now, see WasmBCClass.h for general comments about the compiler's
* structure.
*
* ----------------
*
* General assumptions for 32-bit vs 64-bit code:
*
* - A 32-bit register can be extended in-place to a 64-bit register on 64-bit
* systems.
*
* - Code that knows that Register64 has a '.reg' member on 64-bit systems and
* '.high' and '.low' members on 32-bit systems, or knows the implications
* thereof, is #ifdef JS_PUNBOX64. All other code is #if(n)?def JS_64BIT.
*
* Coding standards are a little fluid:
*
* - In "small" code generating functions (eg emitMultiplyF64, emitQuotientI32,
* and surrounding functions; most functions fall into this class) where the
* meaning is obvious:
*
* Old school:
* - if there is a single source + destination register, it is called 'r'
* - if there is one source and a different destination, they are called 'rs'
* and 'rd'
* - if there is one source + destination register and another source register
* they are called 'r' and 'rs'
* - if there are two source registers and a destination register they are
* called 'rs0', 'rs1', and 'rd'.
*
* The new thing:
* - what is called 'r' in the old-school naming scheme is increasingly called
* 'rsd' in source+dest cases.
*
* - Generic temp registers are named /temp[0-9]?/ not /tmp[0-9]?/.
*
* - Registers can be named non-generically for their function ('rp' for the
* 'pointer' register and 'rv' for the 'value' register are typical) and those
* names may or may not have an 'r' prefix.
*
* - "Larger" code generating functions make their own rules.
*/
/*
* [SMDOC] WebAssembly baseline compiler -- Lazy Tier-Up mechanism
*
* For baseline functions, we compile in code to monitor the function's
* "hotness" and request tier-up once that hotness crosses a threshold.
*
* (1) Each function has an associated int32_t counter,
* FuncDefInstanceData::hotnessCounter. These are stored in an array in
* the Instance. Hence access to them is fast and thread-local.
*
* (2) On instantiation, the counters are set to some positive number
* (Instance::init, Instance::computeInitialHotnessCounter), which is a
* very crude estimate of the cost of Ion compilation of the function.
*
* (3) In baseline compilation, a function decrements its counter at every
* entry (BaseCompiler::beginFunction) and at the start of every loop
* iteration (BaseCompiler::emitLoop). The decrement code is created by
* BaseCompiler::addHotnessCheck.
*
* (4) The decrement is by some value in the range 1 .. 127, as computed from
* the function or loop-body size, by BlockSizeToDownwardsStep.
*
* (5) For loops, the body size is known only at the end of the loop, but the
* check is required at the start of the body. Hence the value is patched
* in at the end (BaseCompiler::emitEnd, case LabelKind::Loop).
*
* (6) BaseCompiler::addHotnessCheck creates the shortest possible
* decrement/check code, to minimise both time and code-space overhead. On
* Intel it is only two instructions. The counter has the value from (4)
* subtracted from it. If the result is negative, we jump to OOL code
* (class OutOfLineRequestTierUp) which requests tier up; control then
* continues immediately after the check.
*
* (7) The OOL tier-up request code calls the stub pointed to by
* Instance::requestTierUpStub_. This always points to the stub created by
* GenerateRequestTierUpStub. This saves all registers and calls onwards
* to WasmHandleRequestTierUp in C++-land.
*
* (8) WasmHandleRequestTierUp figures out which function in which Instance is
* requesting tier-up. It sets the function's counter (1) to the largest
* possible value, which is 2^31-1. It then calls onwards to
* Code::requestTierUp, which requests off-thread Ion compilation of the
* function, then immediately returns.
*
* (9) It is important that (8) sets the counter to 2^31-1 (as close to
* infinity as possible). This is because it may be arbitrarily long
* before the optimised code becomes available. In the meantime the
* baseline version of the function will continue to run. We do not want
* it to make frequent duplicate requests for tier-up. Although a request
* for tier-up is relatively cheap (a few hundred instructions), it is
* still way more expensive than the fast-case for a hotness check (2 insns
* on Intel), and performance of the baseline code will be badly affected
* if it makes many duplicate requests.
*
* (10) Of course it is impossible to *guarantee* that a baseline function will
* not make a duplicate request, because the Ion compilation of the
* function could take arbitrarily long, or even fail completely (eg OOM).
* Hence it is necessary for Code::requestTierUp (8) to detect and
* ignore duplicate requests.
*
* (11) Each Instance of a Module runs in its own thread and has its own array
* of counters. This makes the counter updating thread-local and cheap.
* But it means that, if a Module has multiple threads (Instances), it
* could be that a function never gets hot enough to request tier up,
* because it is not hot enough in any single thread, even though the
* total hotness summed across all threads is enough to request tier up.
* Whether this inaccuracy is a problem in practice remains to be seen.
*
* (12) Code::requestTierUp (8) creates a PartialTier2CompileTask and queues it
* for execution. It does not do the compilation itself.
*
* (13) A PartialTier2CompileTask's runHelperThreadTask (running on a helper
* thread) calls CompilePartialTier2. This compiles the function with Ion
* and racily updates the tiering table entry for the function, which
* lives in Code::jumpTables_::tiering_.
*
* (14) Subsequent calls to the function's baseline entry points will then jump
* to the Ion version of the function. Hence lazy tier-up is achieved.
*/
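/*
* Illustrative sketch only, not part of the compiler. It models steps (2),
* (4), (6) and (8) above in plain C++, to show why parking the counter at
* 2^31-1 suppresses duplicate tier-up requests. HotnessModel and
* tierUpRequests are hypothetical names; the real counter is
* FuncDefInstanceData::hotnessCounter and the real check is emitted as
* machine code by BaseCompiler::addHotnessCheck.
*
```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for one function's hotness counter, plus a tally of
// how many tier-up requests the model has made.
struct HotnessModel {
  int32_t counter;
  uint32_t tierUpRequests = 0;

  explicit HotnessModel(int32_t initial) : counter(initial) {
    assert(initial > 0);  // step (2): counters start at a positive value
  }

  // Steps (6) and (8): subtract the step; if the result is negative, record
  // a tier-up request and park the counter at 2^31-1.
  void check(int32_t step) {
    assert(step >= 1 && step <= 127);  // step (4)
    counter -= step;
    if (counter < 0) {
      tierUpRequests++;
      counter = std::numeric_limits<int32_t>::max();
    }
  }
};
```
*
* In this model a counter that starts at 10 and is checked with step 4 makes
* its single request on the third check; later checks only count down from
* 2^31-1.
*/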
#include "wasm/WasmBaselineCompile.h"
#include "wasm/WasmAnyRef.h"
#include "wasm/WasmBCClass.h"
#include "wasm/WasmBCDefs.h"
#include "wasm/WasmBCFrame.h"
#include "wasm/WasmBCRegDefs.h"
#include "wasm/WasmBCStk.h"
#include "wasm/WasmValType.h"
#include "jit/MacroAssembler-inl.h"
#include "wasm/WasmBCClass-inl.h"
#include "wasm/WasmBCCodegen-inl.h"
#include "wasm/WasmBCRegDefs-inl.h"
#include "wasm/WasmBCRegMgmt-inl.h"
#include "wasm/WasmBCStkMgmt-inl.h"
namespace js {
namespace wasm {
using namespace js::jit;
using mozilla::Maybe;
using mozilla::Nothing;
using mozilla::Some;
////////////////////////////////////////////////////////////
//
// Out of line code management.
// The baseline compiler will use OOL code more sparingly than Ion since our
// code is not high performance and frills like code density and branch
// prediction friendliness will be less important.
class OutOfLineCode : public TempObject {
private:
NonAssertingLabel entry_;
NonAssertingLabel rejoin_;
StackHeight stackHeight_;
public:
OutOfLineCode() : stackHeight_(StackHeight::Invalid()) {}
Label* entry() { return &entry_; }
Label* rejoin() { return &rejoin_; }
void setStackHeight(StackHeight stackHeight) {
MOZ_ASSERT(!stackHeight_.isValid());
stackHeight_ = stackHeight;
}
void bind(BaseStackFrame* fr, MacroAssembler* masm) {
MOZ_ASSERT(stackHeight_.isValid());
masm->bind(&entry_);
fr->setStackHeight(stackHeight_);
}
// The generate() method must be careful about register use because it will be
// invoked when there is a register assignment in the BaseCompiler that does
// not correspond to the available registers when the generated OOL code is
// executed. The register allocator *must not* be called.
//
// The best strategy is for the creator of the OOL object to allocate all
// temps that the OOL code will need.
//
// Input, output, and temp registers are embedded in the OOL object and are
// known to the code generator.
//
// Scratch registers are available to use in OOL code.
//
// All other registers must be explicitly saved and restored by the OOL code
// before being used.
virtual void generate(MacroAssembler* masm) = 0;
};
OutOfLineCode* BaseCompiler::addOutOfLineCode(OutOfLineCode* ool) {
if (!ool || !outOfLine_.append(ool)) {
return nullptr;
}
ool->setStackHeight(fr.stackHeight());
return ool;
}
bool BaseCompiler::generateOutOfLineCode() {
for (auto* ool : outOfLine_) {
if (!ool->entry()->used()) {
continue;
}
ool->bind(&fr, &masm);
ool->generate(&masm);
}
return !masm.oom();
}
//////////////////////////////////////////////////////////////////////////////
//
// Sundry code generation.
bool BaseCompiler::addInterruptCheck() {
#ifdef RABALDR_PIN_INSTANCE
Register tmp(InstanceReg);
#else
ScratchI32 tmp(*this);
fr.loadInstancePtr(tmp);
#endif
Label ok;
masm.branch32(Assembler::Equal,
Address(tmp, wasm::Instance::offsetOfInterrupt()), Imm32(0),
&ok);
masm.wasmTrap(wasm::Trap::CheckInterrupt, bytecodeOffset());
masm.bind(&ok);
return createStackMap("addInterruptCheck");
}
void BaseCompiler::checkDivideByZero(RegI32 rhs) {
Label nonZero;
masm.branchTest32(Assembler::NonZero, rhs, rhs, &nonZero);
trap(Trap::IntegerDivideByZero);
masm.bind(&nonZero);
}
void BaseCompiler::checkDivideByZero(RegI64 r) {
Label nonZero;
ScratchI32 scratch(*this);
masm.branchTest64(Assembler::NonZero, r, r, scratch, &nonZero);
trap(Trap::IntegerDivideByZero);
masm.bind(&nonZero);
}
void BaseCompiler::checkDivideSignedOverflow(RegI32 rhs, RegI32 srcDest,
Label* done, bool zeroOnOverflow) {
Label notMin;
masm.branch32(Assembler::NotEqual, srcDest, Imm32(INT32_MIN), &notMin);
if (zeroOnOverflow) {
masm.branch32(Assembler::NotEqual, rhs, Imm32(-1), &notMin);
moveImm32(0, srcDest);
masm.jump(done);
} else {
masm.branch32(Assembler::NotEqual, rhs, Imm32(-1), &notMin);
trap(Trap::IntegerOverflow);
}
masm.bind(&notMin);
}
void BaseCompiler::checkDivideSignedOverflow(RegI64 rhs, RegI64 srcDest,
Label* done, bool zeroOnOverflow) {
Label notmin;
masm.branch64(Assembler::NotEqual, srcDest, Imm64(INT64_MIN), &notmin);
masm.branch64(Assembler::NotEqual, rhs, Imm64(-1), &notmin);
if (zeroOnOverflow) {
masm.xor64(srcDest, srcDest);
masm.jump(done);
} else {
trap(Trap::IntegerOverflow);
}
masm.bind(&notmin);
}
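/*
* Illustrative sketch only, not part of the compiler. It restates in plain
* C++ the condition that both checkDivideSignedOverflow overloads test for:
* the dividend is the minimum integer and the divisor is -1. The helper name
* checkedSignedOp is hypothetical. The sketch also assumes that
* zeroOnOverflow is passed as true for remainder and false for quotient;
* the callers are not shown in this part of the file.
*
```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns std::nullopt where the emitted code would trap with
// Trap::IntegerOverflow. The caller is assumed to have excluded a zero
// divisor already (compare checkDivideByZero).
inline std::optional<int32_t> checkedSignedOp(int32_t lhs, int32_t rhs,
                                              bool zeroOnOverflow) {
  assert(rhs != 0);
  if (lhs == INT32_MIN && rhs == -1) {
    if (zeroOnOverflow) {
      return 0;  // result is forced to zero and control jumps to 'done'
    }
    return std::nullopt;  // trap
  }
  // Not the overflow case: the ordinary operation is safe to perform.
  return zeroOnOverflow ? lhs % rhs : lhs / rhs;
}
```
*
* Every other operand pair falls through to the 'notMin' label in the
* emitted code, which corresponds to the final return in the sketch.
*/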
void BaseCompiler::jumpTable(const LabelVector& labels, Label* theTable) {
// Flush constant pools to ensure that the table is never interrupted by
// constant pool entries.
masm.flush();
#if defined(JS_CODEGEN_ARM) || defined(JS_CODEGEN_ARM64)
// Prevent nop sequences from appearing in the jump table.
AutoForbidNops afn(&masm);
#endif
masm.bind(theTable);
for (const auto& label : labels) {
CodeLabel cl;
masm.writeCodePointer(&cl);
cl.target()->bind(label.offset());
masm.addCodeLabel(cl);
}
}
void BaseCompiler::tableSwitch(Label* theTable, RegI32 switchValue,
Label* dispatchCode) {
masm.bind(dispatchCode);
#if defined(JS_CODEGEN_X64) || defined(JS_CODEGEN_X86)
ScratchI32 scratch(*this);
CodeLabel tableCl;
masm.mov(&tableCl, scratch);
tableCl.target()->bind(theTable->offset());
masm.addCodeLabel(tableCl);
masm.jmp(Operand(scratch, switchValue, ScalePointer));
#elif defined(JS_CODEGEN_ARM)
// Flush constant pools: offset must reflect the distance from the MOV
// to the start of the table; as the address of the MOV is given by the
// label, nothing must come between the bind() and the ma_mov().
AutoForbidPoolsAndNops afp(&masm,
/* number of instructions in scope = */ 5);
ScratchI32 scratch(*this);
// Compute the offset from the ma_mov instruction to the jump table.
Label here;
masm.bind(&here);
uint32_t offset = here.offset() - theTable->offset();
// Read PC+8
masm.ma_mov(pc, scratch);
// ARM scratch register is required by ma_sub.
ScratchRegisterScope arm_scratch(*this);
// Compute the absolute table base pointer into `scratch`, offset by 8
// to account for the fact that ma_mov read PC+8.
masm.ma_sub(Imm32(offset + 8), scratch, arm_scratch);
// Jump indirect via table element.
masm.ma_ldr(DTRAddr(scratch, DtrRegImmShift(switchValue, LSL, 2)), pc, Offset,
Assembler::Always);
#elif defined(JS_CODEGEN_MIPS64) || defined(JS_CODEGEN_LOONG64) || \
defined(JS_CODEGEN_RISCV64)
ScratchI32 scratch(*this);
CodeLabel tableCl;
masm.ma_li(scratch, &tableCl);
tableCl.target()->bind(theTable->offset());
masm.addCodeLabel(tableCl);
masm.branchToComputedAddress(BaseIndex(scratch, switchValue, ScalePointer));
#elif defined(JS_CODEGEN_ARM64)
AutoForbidPoolsAndNops afp(&masm,
/* number of instructions in scope = */ 4);
ScratchI32 scratch(*this);
ARMRegister s(scratch, 64);
ARMRegister v(switchValue, 64);
masm.Adr(s, theTable);
masm.Add(s, s, Operand(v, vixl::LSL, 3));
masm.Ldr(s, MemOperand(s, 0));
masm.Br(s);
#else
MOZ_CRASH("BaseCompiler platform hook: tableSwitch");
#endif
}
// Helpers for accessing the "baseline scratch" areas: all targets
void BaseCompiler::stashWord(RegPtr instancePtr, size_t index, RegPtr r) {
MOZ_ASSERT(r != instancePtr);
MOZ_ASSERT(index < Instance::N_BASELINE_SCRATCH_WORDS);
masm.storePtr(r,
Address(instancePtr, Instance::offsetOfBaselineScratchWords() +
index * sizeof(uintptr_t)));
}
void BaseCompiler::unstashWord(RegPtr instancePtr, size_t index, RegPtr r) {
MOZ_ASSERT(index < Instance::N_BASELINE_SCRATCH_WORDS);
masm.loadPtr(Address(instancePtr, Instance::offsetOfBaselineScratchWords() +
index * sizeof(uintptr_t)),
r);
}
// Helpers for accessing the "baseline scratch" areas: X86 only
#ifdef JS_CODEGEN_X86
void BaseCompiler::stashI64(RegPtr regForInstance, RegI64 r) {
static_assert(sizeof(uintptr_t) == 4);
MOZ_ASSERT(Instance::sizeOfBaselineScratchWords() >= 8);
MOZ_ASSERT(regForInstance != r.low && regForInstance != r.high);
# ifdef RABALDR_PIN_INSTANCE
# error "Pinned instance not expected"
# endif
fr.loadInstancePtr(regForInstance);
masm.store32(
r.low, Address(regForInstance, Instance::offsetOfBaselineScratchWords()));
masm.store32(r.high, Address(regForInstance,
Instance::offsetOfBaselineScratchWords() + 4));
}
void BaseCompiler::unstashI64(RegPtr regForInstance, RegI64 r) {
static_assert(sizeof(uintptr_t) == 4);
MOZ_ASSERT(Instance::sizeOfBaselineScratchWords() >= 8);
# ifdef RABALDR_PIN_INSTANCE
# error "Pinned instance not expected"
# endif
fr.loadInstancePtr(regForInstance);
if (regForInstance == r.low) {
masm.load32(
Address(regForInstance, Instance::offsetOfBaselineScratchWords() + 4),
r.high);
masm.load32(
Address(regForInstance, Instance::offsetOfBaselineScratchWords()),
r.low);
} else {
masm.load32(
Address(regForInstance, Instance::offsetOfBaselineScratchWords()),
r.low);
masm.load32(
Address(regForInstance, Instance::offsetOfBaselineScratchWords() + 4),
r.high);
}
}
#endif
// Given the bytecode size of a block (a complete function body, or a loop
// body), return the required downwards step for the associated hotness
// counter. Returned value will be in 1 .. 127 inclusive.
static uint32_t BlockSizeToDownwardsStep(size_t blockBytecodeSize) {
MOZ_RELEASE_ASSERT(blockBytecodeSize <= size_t(MaxFunctionBytes));
const uint32_t BYTECODES_PER_STEP = 20; // tunable parameter
size_t step = blockBytecodeSize / BYTECODES_PER_STEP;
step = std::max<uint32_t>(step, 1);
step = std::min<uint32_t>(step, 127);
return uint32_t(step);
}
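/*
* Illustrative sketch only, not part of the compiler. It is a standalone copy
* of the clamping arithmetic in BlockSizeToDownwardsStep, without the
* MaxFunctionBytes release assertion, so that the mapping from block size to
* step can be checked in isolation. The name downwardsStepSketch is
* hypothetical.
*
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

inline uint32_t downwardsStepSketch(size_t blockBytecodeSize) {
  const size_t bytecodesPerStep = 20;  // same tunable value as above
  size_t step = blockBytecodeSize / bytecodesPerStep;
  step = std::max<size_t>(step, 1);    // tiny blocks still step by 1
  step = std::min<size_t>(step, 127);  // large blocks saturate at 127
  assert(step >= 1 && step <= 127);
  return uint32_t(step);
}
```
*
* With 20 bytecode bytes per step, blocks of up to 39 bytes step by 1, and
* blocks of 2540 bytes or more step by the maximum of 127.
*/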
//////////////////////////////////////////////////////////////////////////////
//
// Function entry and exit
bool BaseCompiler::beginFunction() {
AutoCreatedBy acb(masm, "(wasm)BaseCompiler::beginFunction");
JitSpew(JitSpew_Codegen, "# ========================================");
JitSpew(JitSpew_Codegen, "# Emitting wasm baseline code");
JitSpew(JitSpew_Codegen,
"# beginFunction: start of function prologue for index %d",
(int)func_.index);
// Make a start on the stackmap for this function. Inspect the args so
// as to determine which of them are both in-memory and pointer-typed, and
// add entries to machineStackTracker as appropriate.
ArgTypeVector args(funcType());
size_t inboundStackArgBytes = StackArgAreaSizeUnaligned(args);
MOZ_ASSERT(inboundStackArgBytes % sizeof(void*) == 0);
stackMapGenerator_.numStackArgBytes = inboundStackArgBytes;
MOZ_ASSERT(stackMapGenerator_.machineStackTracker.length() == 0);
if (!stackMapGenerator_.machineStackTracker.pushNonGCPointers(
stackMapGenerator_.numStackArgBytes / sizeof(void*))) {
return false;
}
// Identify GC-managed pointers passed on the stack.
for (WasmABIArgIter i(args); !i.done(); i++) {
ABIArg argLoc = *i;
if (argLoc.kind() == ABIArg::Stack &&
args[i.index()] == MIRType::WasmAnyRef) {
uint32_t offset = argLoc.offsetFromArgBase();
MOZ_ASSERT(offset < inboundStackArgBytes);
MOZ_ASSERT(offset % sizeof(void*) == 0);
stackMapGenerator_.machineStackTracker.setGCPointer(offset /
sizeof(void*));
}
}
GenerateFunctionPrologue(
masm, CallIndirectId::forFunc(codeMeta_, func_.index),
compilerEnv_.mode() != CompileMode::Once ? Some(func_.index) : Nothing(),
&offsets_);
// GenerateFunctionPrologue pushes exactly one wasm::Frame's worth of
// stuff, and none of the values are GC pointers. Hence:
if (!stackMapGenerator_.machineStackTracker.pushNonGCPointers(
sizeof(Frame) / sizeof(void*))) {
return false;
}
// Initialize DebugFrame fields before the stack overflow trap so that
// we have the invariant that all observable Frames in a debugEnabled
// Module have valid DebugFrames.
if (compilerEnv_.debugEnabled()) {
#ifdef JS_CODEGEN_ARM64
static_assert(DebugFrame::offsetOfFrame() % WasmStackAlignment == 0,
"aligned");
#endif
masm.reserveStack(DebugFrame::offsetOfFrame());
if (!stackMapGenerator_.machineStackTracker.pushNonGCPointers(
DebugFrame::offsetOfFrame() / sizeof(void*))) {
return false;
}
masm.store32(Imm32(func_.index), Address(masm.getStackPointer(),
DebugFrame::offsetOfFuncIndex()));
masm.store32(Imm32(0),
Address(masm.getStackPointer(), DebugFrame::offsetOfFlags()));
// No need to initialize cachedReturnJSValue_ or any ref-typed spilled
// register results, as they are traced if and only if a corresponding
// flag (hasCachedReturnJSValue or hasSpilledRefRegisterResult) is set.
}
// Generate a stack-overflow check and its associated stackmap.
fr.checkStack(ABINonArgReg0, BytecodeOffset(func_.lineOrBytecode));
ExitStubMapVector extras;
if (!stackMapGenerator_.generateStackmapEntriesForTrapExit(args, &extras)) {
return false;
}
if (!createStackMap("stack check", extras, masm.currentOffset(),
HasDebugFrameWithLiveRefs::No)) {
return false;
}
size_t reservedBytes = fr.fixedAllocSize() - masm.framePushed();
MOZ_ASSERT(0 == (reservedBytes % sizeof(void*)));
masm.reserveStack(reservedBytes);
fr.onFixedStackAllocated();
if (!stackMapGenerator_.machineStackTracker.pushNonGCPointers(
reservedBytes / sizeof(void*))) {
return false;
}
// Locals are stack allocated. Mark ref-typed ones in the stackmap
// accordingly.
for (const Local& l : localInfo_) {
// Locals that are stack arguments were already added to the stackmap
// before pushing the frame.
if (l.type == MIRType::WasmAnyRef && !l.isStackArgument()) {
uint32_t offs = fr.localOffsetFromSp(l);
MOZ_ASSERT(0 == (offs % sizeof(void*)));
stackMapGenerator_.machineStackTracker.setGCPointer(offs / sizeof(void*));
}
}
// Copy arguments from registers to stack.
for (WasmABIArgIter i(args); !i.done(); i++) {
if (args.isSyntheticStackResultPointerArg(i.index())) {
// If there are stack results and the pointer to stack results
// was passed in a register, store it to the stack.
if (i->argInRegister()) {
fr.storeIncomingStackResultAreaPtr(RegPtr(i->gpr()));
}
// If we're in a debug frame, copy the stack result pointer arg
// to a well-known place.
if (compilerEnv_.debugEnabled()) {
Register target = ABINonArgReturnReg0;
fr.loadIncomingStackResultAreaPtr(RegPtr(target));
size_t debugFrameOffset =
masm.framePushed() - DebugFrame::offsetOfFrame();
size_t debugStackResultsPointerOffset =
debugFrameOffset + DebugFrame::offsetOfStackResultsPointer();
masm.storePtr(target, Address(masm.getStackPointer(),
debugStackResultsPointerOffset));
}
continue;
}
if (!i->argInRegister()) {
continue;
}
Local& l = localInfo_[args.naturalIndex(i.index())];
switch (i.mirType()) {
case MIRType::Int32:
fr.storeLocalI32(RegI32(i->gpr()), l);
break;
case MIRType::Int64:
fr.storeLocalI64(RegI64(i->gpr64()), l);
break;
case MIRType::WasmAnyRef: {
mozilla::DebugOnly<uint32_t> offs = fr.localOffsetFromSp(l);
MOZ_ASSERT(0 == (offs % sizeof(void*)));
fr.storeLocalRef(RegRef(i->gpr()), l);
// We should have just visited this local in the preceding loop.
MOZ_ASSERT(stackMapGenerator_.machineStackTracker.isGCPointer(
offs / sizeof(void*)));
break;
}
case MIRType::Double:
fr.storeLocalF64(RegF64(i->fpu()), l);
break;
case MIRType::Float32:
fr.storeLocalF32(RegF32(i->fpu()), l);
break;
#ifdef ENABLE_WASM_SIMD
case MIRType::Simd128:
fr.storeLocalV128(RegV128(i->fpu()), l);
break;
#endif
default:
MOZ_CRASH("Function argument type");
}
}
fr.zeroLocals(&ra);
fr.storeInstancePtr(InstanceReg);
if (compilerEnv_.debugEnabled()) {
insertBreakablePoint(CallSiteDesc::EnterFrame);
if (!createStackMap("debug: enter-frame breakpoint")) {
return false;
}
}
JitSpew(JitSpew_Codegen,
"# beginFunction: enter body with masm.framePushed = %u",
masm.framePushed());
MOZ_ASSERT(stackMapGenerator_.framePushedAtEntryToBody.isNothing());
stackMapGenerator_.framePushedAtEntryToBody.emplace(masm.framePushed());
if (compilerEnv_.mode() == CompileMode::LazyTiering) {
size_t funcBytecodeSize = func_.end - func_.begin;
uint32_t step = BlockSizeToDownwardsStep(funcBytecodeSize);
// Create a patchable hotness check and patch it immediately (only because
// there's no way to create a non-patchable check directly).
Maybe<CodeOffset> ctrDecOffset = addHotnessCheck();
if (ctrDecOffset.isNothing()) {
return false;
}
patchHotnessCheck(ctrDecOffset.value(), step);
}
return true;
}
bool BaseCompiler::endFunction() {
AutoCreatedBy acb(masm, "(wasm)BaseCompiler::endFunction");
JitSpew(JitSpew_Codegen, "# endFunction: start of function epilogue");
// Always branch to returnLabel_.
masm.breakpoint();
// Patch the add in the prologue so that it checks against the correct
// frame size. Flush the constant pool in case it needs to be patched.
masm.flush();
// Precondition for patching.
if (masm.oom()) {
return false;
}
fr.patchCheckStack();
masm.bind(&returnLabel_);
ResultType resultType(ResultType::Vector(funcType().results()));
popStackReturnValues(resultType);
if (compilerEnv_.debugEnabled()) {
// Store and reload the return value from DebugFrame::return so that
// it can be clobbered, and/or modified by the debug trap.
saveRegisterReturnValues(resultType);
insertBreakablePoint(CallSiteDesc::Breakpoint);
if (!createStackMap("debug: return-point breakpoint",
HasDebugFrameWithLiveRefs::Maybe)) {
return false;
}
insertBreakablePoint(CallSiteDesc::LeaveFrame);
if (!createStackMap("debug: leave-frame breakpoint",
HasDebugFrameWithLiveRefs::Maybe)) {
return false;
}
restoreRegisterReturnValues(resultType);
}
#ifndef RABALDR_PIN_INSTANCE
// To satisfy the instance extent invariant we need to reload InstanceReg
// because baseline can clobber it.
fr.loadInstancePtr(InstanceReg);
#endif
GenerateFunctionEpilogue(masm, fr.fixedAllocSize(), &offsets_);
#if defined(JS_ION_PERF)
// FIXME - profiling code missing. No bug for this.
// Note the end of the inline code and start of the OOL code.
// gen->perfSpewer().noteEndInlineCode(masm);
#endif
JitSpew(JitSpew_Codegen, "# endFunction: end of function epilogue");
JitSpew(JitSpew_Codegen, "# endFunction: start of OOL code");
if (!generateOutOfLineCode()) {
return false;
}
JitSpew(JitSpew_Codegen, "# endFunction: end of OOL code");
if (compilerEnv_.debugEnabled()) {
JitSpew(JitSpew_Codegen, "# endFunction: start of per-function debug stub");
insertPerFunctionDebugStub();
JitSpew(JitSpew_Codegen, "# endFunction: end of per-function debug stub");
}
offsets_.end = masm.currentOffset();
if (!fr.checkStackHeight()) {
return decoder_.fail(decoder_.beginOffset(), "stack frame is too large");
}
JitSpew(JitSpew_Codegen, "# endFunction: end of OOL code for index %d",
(int)func_.index);
return !masm.oom();
}
//////////////////////////////////////////////////////////////////////////////
//
// Debugger API.
// [SMDOC] Wasm debug traps -- code details
//
// There are four pieces of code involved.
//
// (1) The "breakable point". This is placed at every location where we might
// want to transfer control to the debugger, most commonly before every
// bytecode. It must be as short and fast as possible. It checks
// Instance::debugStub_, which is either null or a pointer to (3). If
// non-null, a call to (2) is performed; when null, nothing happens.
//
// (2) The "per function debug stub". There is one per function. It consults
// a bit-vector attached to the Instance, to see whether breakpoints for
// the current function are enabled. If not, it returns (to (1), hence
// having no effect). Otherwise, it jumps (not calls) onwards to (3).
//
// (3) The "debug stub" -- not to be confused with the "per function debug
// stub". There is one per module. This saves all the registers and
// calls onwards to (4), which is in C++ land. When that call returns,
// (3) itself returns, which transfers control directly back to (after)
// (1).
//
// (4) In C++ land -- WasmHandleDebugTrap, corresponding to
// SymbolicAddress::HandleDebugTrap. This contains the detailed logic
// needed to handle the breakpoint.
void BaseCompiler::insertBreakablePoint(CallSiteDesc::Kind kind) {
#ifndef RABALDR_PIN_INSTANCE
fr.loadInstancePtr(InstanceReg);
#endif
// The breakpoint code must call the breakpoint handler installed on the
// instance if it is not null. There is one breakable point before
// every bytecode, and one at the beginning and at the end of the function.
//
// There are many constraints:
//
// - Code should be read-only; we do not want to patch
// - The breakpoint code should be as dense as possible, given the volume of
// breakable points
// - The handler-is-null case should be as fast as we can make it
//
// The scratch register is available here.
//
// An unconditional callout would be densest but is too slow. The best
// balance results from an inline test for null with a conditional call. The
// best code sequence is platform-dependent.
//
// The conditional call goes to a stub attached to the function that performs
// further filtering before calling the breakpoint handler.
#if defined(JS_CODEGEN_X64)
// REX 83 MODRM OFFS IB
static_assert(Instance::offsetOfDebugStub() < 128);
masm.cmpq(Imm32(0),
Operand(Address(InstanceReg, Instance::offsetOfDebugStub())));
// 74 OFFS
Label L;
L.bind(masm.currentOffset() + 7);
masm.j(Assembler::Zero, &L);
// E8 OFFS OFFS OFFS OFFS
masm.call(&perFunctionDebugStub_);
masm.append(CallSiteDesc(iter_.lastOpcodeOffset(), kind),
CodeOffset(masm.currentOffset()));
// Branch destination
MOZ_ASSERT_IF(!masm.oom(), masm.currentOffset() == uint32_t(L.offset()));
#elif defined(JS_CODEGEN_X86)
// 83 MODRM OFFS IB
static_assert(Instance::offsetOfDebugStub() < 128);
masm.cmpl(Imm32(0),
Operand(Address(InstanceReg, Instance::offsetOfDebugStub())));
// 74 OFFS
Label L;
L.bind(masm.currentOffset() + 7);
masm.j(Assembler::Zero, &L);
// E8 OFFS OFFS OFFS OFFS
masm.call(&perFunctionDebugStub_);
masm.append(CallSiteDesc(iter_.lastOpcodeOffset(), kind),
CodeOffset(masm.currentOffset()));
// Branch destination
MOZ_ASSERT_IF(!masm.oom(), masm.currentOffset() == uint32_t(L.offset()));
#elif defined(JS_CODEGEN_ARM64)
ScratchPtr scratch(*this);
ARMRegister tmp(scratch, 64);
Label L;
masm.Ldr(tmp,
MemOperand(Address(InstanceReg, Instance::offsetOfDebugStub())));
masm.Cbz(tmp, &L);
masm.Bl(&perFunctionDebugStub_);
masm.append(CallSiteDesc(iter_.lastOpcodeOffset(), kind),
CodeOffset(masm.currentOffset()));
masm.bind(&L);
#elif defined(JS_CODEGEN_ARM)
ScratchPtr scratch(*this);
masm.loadPtr(Address(InstanceReg, Instance::offsetOfDebugStub()), scratch);
masm.ma_orr(scratch, scratch, SetCC);
masm.ma_bl(&perFunctionDebugStub_, Assembler::NonZero);
masm.append(CallSiteDesc(iter_.lastOpcodeOffset(), kind),
CodeOffset(masm.currentOffset()));
#elif defined(JS_CODEGEN_LOONG64) || defined(JS_CODEGEN_MIPS64) || \
defined(JS_CODEGEN_RISCV64)
ScratchPtr scratch(*this);
Label L;
masm.loadPtr(Address(InstanceReg, Instance::offsetOfDebugStub()), scratch);
masm.branchPtr(Assembler::Equal, scratch, ImmWord(0), &L);
masm.call(&perFunctionDebugStub_);
masm.append(CallSiteDesc(iter_.lastOpcodeOffset(), kind),
CodeOffset(masm.currentOffset()));
masm.bind(&L);
#else
MOZ_CRASH("BaseCompiler platform hook: insertBreakablePoint");
#endif
}
void BaseCompiler::insertPerFunctionDebugStub() {
// The per-function debug stub performs out-of-line filtering before jumping
// to the per-module debug stub if necessary. The per-module debug stub
// returns directly to the breakable point.
//
// NOTE, the link register is live here on platforms that have LR.
//
// The scratch register is available here (as it was at the call site).
//
// It's useful for the per-function debug stub to be compact, as every
// function gets one.
Label L;
masm.bind(&perFunctionDebugStub_);
#if defined(JS_CODEGEN_X86) || defined(JS_CODEGEN_X64)
{
ScratchPtr scratch(*this);
// Get the per-instance table of filtering bits.
masm.loadPtr(Address(InstanceReg, Instance::offsetOfDebugFilter()),
scratch);
// Check the filter bit. There is one bit per function in the module.
// Table elements are 32-bit because the masm makes that convenient.
masm.branchTest32(Assembler::NonZero, Address(scratch, func_.index / 32),
Imm32(1 << (func_.index % 32)), &L);
// Fast path: return to the execution.
masm.ret();
}
#elif defined(JS_CODEGEN_ARM64)
{
ScratchPtr scratch(*this);
// Logic as above, except abiret to jump to the LR directly
masm.loadPtr(Address(InstanceReg, Instance::offsetOfDebugFilter()),
scratch);
masm.branchTest32(Assembler::NonZero, Address(scratch, func_.index / 32),
Imm32(1 << (func_.index % 32)), &L);
masm.abiret();
}
#elif defined(JS_CODEGEN_ARM)
{
// We must be careful not to use the SecondScratchRegister, which usually
// is LR, as LR is live here. This means avoiding masm abstractions such
// as branchTest32.
static_assert(ScratchRegister != lr);
static_assert(Instance::offsetOfDebugFilter() < 0x1000);
ScratchRegisterScope tmp1(masm);
ScratchI32 tmp2(*this);
masm.ma_ldr(
DTRAddr(InstanceReg, DtrOffImm(Instance::offsetOfDebugFilter())), tmp1);
masm.ma_mov(Imm32(func_.index / 32), tmp2);
masm.ma_ldr(DTRAddr(tmp1, DtrRegImmShift(tmp2, LSL, 0)), tmp2);
masm.ma_tst(tmp2, Imm32(1 << func_.index % 32), tmp1, Assembler::Always);
masm.ma_bx(lr, Assembler::Zero);
}
#elif defined(JS_CODEGEN_LOONG64) || defined(JS_CODEGEN_MIPS64) || \
defined(JS_CODEGEN_RISCV64)
{
ScratchPtr scratch(*this);
// Logic same as ARM64.
masm.loadPtr(Address(InstanceReg, Instance::offsetOfDebugFilter()),
scratch);
masm.branchTest32(Assembler::NonZero, Address(scratch, func_.index / 32),
Imm32(1 << (func_.index % 32)), &L);
masm.abiret();
}
#else
MOZ_CRASH("BaseCompiler platform hook: insertPerFunctionDebugStub");
#endif
// Jump to the per-module debug stub, which calls onwards to C++ land.
masm.bind(&L);
masm.jump(Address(InstanceReg, Instance::offsetOfDebugStub()));
}
void BaseCompiler::saveRegisterReturnValues(const ResultType& resultType) {
MOZ_ASSERT(compilerEnv_.debugEnabled());
size_t debugFrameOffset = masm.framePushed() - DebugFrame::offsetOfFrame();
size_t registerResultIdx = 0;
for (ABIResultIter i(resultType); !i.done(); i.next()) {
const ABIResult result = i.cur();
if (!result.inRegister()) {
#ifdef DEBUG
for (i.next(); !i.done(); i.next()) {
MOZ_ASSERT(!i.cur().inRegister());
}
#endif
break;
}
size_t resultOffset = DebugFrame::offsetOfRegisterResult(registerResultIdx);
Address dest(masm.getStackPointer(), debugFrameOffset + resultOffset);
switch (result.type().kind()) {
case ValType::I32:
masm.store32(RegI32(result.gpr()), dest);
break;
case ValType::I64:
masm.store64(RegI64(result.gpr64()), dest);
break;
case ValType::F64:
masm.storeDouble(RegF64(result.fpr()), dest);
break;
case ValType::F32:
masm.storeFloat32(RegF32(result.fpr()), dest);
break;
case ValType::Ref: {
uint32_t flag =
DebugFrame::hasSpilledRegisterRefResultBitMask(registerResultIdx);
// Tell Instance::traceFrame that we have a pointer to trace.
masm.or32(Imm32(flag),
Address(masm.getStackPointer(),
debugFrameOffset + DebugFrame::offsetOfFlags()));
masm.storePtr(RegRef(result.gpr()), dest);
break;
}
case ValType::V128:
#ifdef ENABLE_WASM_SIMD
masm.storeUnalignedSimd128(RegV128(result.fpr()), dest);
break;
#else
MOZ_CRASH("No SIMD support");
#endif
}
registerResultIdx++;
}
}
void BaseCompiler::restoreRegisterReturnValues(const ResultType& resultType) {
MOZ_ASSERT(compilerEnv_.debugEnabled());
size_t debugFrameOffset = masm.framePushed() - DebugFrame::offsetOfFrame();
size_t registerResultIdx = 0;
for (ABIResultIter i(resultType); !i.done(); i.next()) {
const ABIResult result = i.cur();
if (!result.inRegister()) {
#ifdef DEBUG
for (i.next(); !i.done(); i.next()) {
MOZ_ASSERT(!i.cur().inRegister());
}
#endif
break;
}
size_t resultOffset =
DebugFrame::offsetOfRegisterResult(registerResultIdx++);
Address src(masm.getStackPointer(), debugFrameOffset + resultOffset);
switch (result.type().kind()) {
case ValType::I32:
masm.load32(src, RegI32(result.gpr()));
break;
case ValType::I64:
masm.load64(src, RegI64(result.gpr64()));
break;
case ValType::F64:
masm.loadDouble(src, RegF64(result.fpr()));
break;
case ValType::F32:
masm.loadFloat32(src, RegF32(result.fpr()));
break;
case ValType::Ref:
masm.loadPtr(src, RegRef(result.gpr()));
break;
case ValType::V128:
#ifdef ENABLE_WASM_SIMD
masm.loadUnalignedSimd128(src, RegV128(result.fpr()));
break;
#else
MOZ_CRASH("No SIMD support");
#endif
}
}
}
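// Illustration only (not part of the compiler): both functions above rely on
// register results forming a prefix of the result sequence, which the DEBUG
// loops assert. The standalone sketch below checks that invariant on a toy
// model. `ResultLoc` and `CountRegisterPrefix` are hypothetical names, not
// SpiderMonkey APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ABIResult: records only where a result lives.
struct ResultLoc {
  bool inRegister;
};

// Returns the number of leading register results, or -1 if a register result
// appears after the first stack result (the case the DEBUG loops assert
// against).
inline long CountRegisterPrefix(const std::vector<ResultLoc>& results) {
  size_t n = 0;
  while (n < results.size() && results[n].inRegister) {
    n++;
  }
  for (size_t i = n; i < results.size(); i++) {
    if (results[i].inRegister) {
      return -1;
    }
  }
  return static_cast<long>(n);
}
```

// In this model, only the first CountRegisterPrefix(...) results would be
// saved to, or restored from, the debug frame.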
//////////////////////////////////////////////////////////////////////////////
//
// Support for lazy tiering
//
// The key point is that we generate a short piece of code which, most of the
// time, has no effect, but occasionally needs to call out to C++ land. That
// is a similar requirement to the Debugger API support (see above), and so we
// have a similar, but simpler, solution. Specifically, we use a single stub
// routine for the whole module, whereas debugging involves per-function stub
// routines as well as a whole-module stub routine.
class OutOfLineRequestTierUp : public OutOfLineCode {
Register instance_; // points at the instance at entry; must remain unchanged
Maybe<RegI32> scratch_; // only provided on arm32
size_t lastOpcodeOffset_; // a bytecode offset
public:
OutOfLineRequestTierUp(Register instance, Maybe<RegI32> scratch,
size_t lastOpcodeOffset)
: instance_(instance),
scratch_(scratch),
lastOpcodeOffset_(lastOpcodeOffset) {}
virtual void generate(MacroAssembler* masm) override {
// Generate:
//
// [optionally, if `instance_` != InstanceReg: swap(instance_, InstanceReg)]
// call * $offsetOfRequestTierUpStub(InstanceReg)
// [optionally, if `instance_` != InstanceReg: swap(instance_, InstanceReg)]
// goto rejoin
//
// This is the unlikely path, where we call the (per-module)
// request-tier-up stub. The stub wants the instance pointer to be in the
// official InstanceReg at this point, but InstanceReg itself might hold
// arbitrary other live data. Hence, if necessary, swap `instance_` and
// InstanceReg before the call and swap them back after it.
#ifndef RABALDR_PIN_INSTANCE
if (Register(instance_) != InstanceReg) {
# ifdef JS_CODEGEN_X86
// On x86_32 this is easy.
masm->xchgl(instance_, InstanceReg);
# elif defined(JS_CODEGEN_ARM)
// Swap via the scratch register. Note that the destination is the second
// argument of mov.
masm->mov(instance_, scratch_.value());
masm->mov(InstanceReg, instance_);
masm->mov(scratch_.value(), InstanceReg);
# else
MOZ_CRASH("BaseCompiler::OutOfLineRequestTierUp #1");
# endif
}
#endif
// Call the stub
masm->call(Address(InstanceReg, Instance::offsetOfRequestTierUpStub()));
masm->append(CallSiteDesc(lastOpcodeOffset_, CallSiteDesc::RequestTierUp),
CodeOffset(masm->currentOffset()));
// And swap again, if we swapped above.
#ifndef RABALDR_PIN_INSTANCE
if (Register(instance_) != InstanceReg) {
# ifdef JS_CODEGEN_X86
masm->xchgl(instance_, InstanceReg);
# elif defined(JS_CODEGEN_ARM)
masm->mov(instance_, scratch_.value());
masm->mov(InstanceReg, instance_);
masm->mov(scratch_.value(), InstanceReg);
# else
MOZ_CRASH("BaseCompiler::OutOfLineRequestTierUp #2");
# endif
}
#endif
masm->jump(rejoin());
}
};
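// Illustration only (not part of the compiler): a minimal standalone sketch
// of the three-move swap that the arm32 path above performs, using a toy
// register file. `ToyRegs`, `Mov` and `SwapViaScratch` are hypothetical
// names; `Mov` copies its first argument into its second, mirroring the
// (source, destination) argument order of the moves above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy register file: each entry models the contents of one register.
using ToyRegs = std::array<uint32_t, 4>;

// Copy register `src` into register `dest` (source first, destination second).
inline void Mov(ToyRegs& regs, size_t src, size_t dest) {
  regs[dest] = regs[src];
}

// Exchange registers `a` and `b`, clobbering `scratch`.
inline void SwapViaScratch(ToyRegs& regs, size_t a, size_t b, size_t scratch) {
  assert(a != scratch && b != scratch);
  Mov(regs, a, scratch);  // scratch := a
  Mov(regs, b, a);        // a := b
  Mov(regs, scratch, b);  // b := old a
}
```

// Applying the swap twice restores both registers, which is why the same
// sequence can be emitted before and after the stub call.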
Maybe<CodeOffset> BaseCompiler::addHotnessCheck() {
// Here's an example of what we'll create. The path that almost always
// happens, where the counter doesn't go negative, has just one branch.
//
// subl $to_be_filled_in_later, 0x170(%r14)
// js oolCode // almost never taken
// rejoin:
// ----------------
// oolCode: // we get here when the counter is negative, viz, almost never
// call *0x160(%r14) // RequestTierUpStub
// jmp rejoin
//
// Note that the counter is updated regardless of whether or not it has gone
// negative. That means that, at entry to RequestTierUpStub, we know the
// counter must be negative, and not merely zero.
//
// Non-Intel targets will have to generate a load / subtract-and-set-flags /
// store / jcond sequence.
//
// To ensure the shortest possible encoding, `to_be_filled_in_later` must be
// a value in the range 1 .. 127 inclusive. This is good enough for
// hotness-counting purposes.
AutoCreatedBy acb(masm, "BC::addHotnessCheck");
#ifdef RABALDR_PIN_INSTANCE
Register instance(InstanceReg);
#else
// This seems to assume that any non-RABALDR_PIN_INSTANCE target is 32-bit
ScratchI32 instance(*this);
fr.loadInstancePtr(instance);
#endif
Address addressOfCounter = Address(
instance, wasm::Instance::offsetInData(
codeMeta_.offsetOfFuncDefInstanceData(func_.index)));
#ifdef JS_CODEGEN_ARM
Maybe<RegI32> scratch = Some(needI32());
#else
Maybe<RegI32> scratch = Nothing();
#endif
OutOfLineCode* ool = addOutOfLineCode(new (alloc_) OutOfLineRequestTierUp(
instance, scratch, iter_.lastOpcodeOffset()));
if (!ool) {
return Nothing();
}
// Because of the Intel arch instruction formats, `patchPoint` points to the
// byte immediately following the last byte of the instruction to patch.
CodeOffset patchPoint = masm.sub32FromMemAndBranchIfNegativeWithPatch(
addressOfCounter, ool->entry());
masm.bind(ool->rejoin());
if (scratch.isSome()) {
freeI32(scratch.value());
}
// `patchPoint` might be invalid if the assembler OOMd at some point.
return masm.oom() ? Nothing() : Some(patchPoint);
}
void BaseCompiler::patchHotnessCheck(CodeOffset offset, uint32_t step) {
// A step of zero would make the hotness check pointless. A step above 127 is
// not representable in the short-form Intel encoding.
MOZ_RELEASE_ASSERT(step > 0 && step <= 127);
masm.patchSub32FromMemAndBranchIfNegative(offset, Imm32(step));
}
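// Illustration only (not part of the compiler): a minimal standalone model
// of the counter update described in addHotnessCheck. The subtraction always
// happens, and the slow path is taken only when the result is negative.
// `HotnessCounter`, `SubtractAndCheckNegative` and `ExecutionsUntilTierUp`
// are hypothetical names, and the initial counter values are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the per-function hotness counter.
struct HotnessCounter {
  int32_t value;
};

// Models "subl $step, counter; js oolCode": subtract unconditionally, then
// report whether the result is negative.
inline bool SubtractAndCheckNegative(HotnessCounter& counter, uint32_t step) {
  assert(step > 0 && step <= 127);
  counter.value -= static_cast<int32_t>(step);
  return counter.value < 0;
}

// Number of executions of the check, starting from a non-negative `initial`,
// up to and including the first one that takes the slow path.
inline uint32_t ExecutionsUntilTierUp(int32_t initial, uint32_t step) {
  assert(initial >= 0);
  HotnessCounter counter{initial};
  uint32_t executions = 1;
  while (!SubtractAndCheckNegative(counter, step)) {
    executions++;
  }
  return executions;
}
```

// In this model a counter that reaches exactly zero does not take the slow
// path; only a strictly negative result does.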
//////////////////////////////////////////////////////////////////////////////
//
// Results and block parameters
void BaseCompiler::popStackReturnValues(const ResultType& resultType) {
uint32_t bytes = ABIResultIter::MeasureStackBytes(resultType);
if (bytes == 0) {
return;
}
Register target = ABINonArgReturnReg0;
Register temp = ABINonArgReturnReg1;
fr.loadIncomingStackResultAreaPtr(RegPtr(target));
fr.popStackResultsToMemory(target, bytes, temp);
}
// TODO / OPTIMIZE (Bug 1316818): At the moment we use the Wasm
// inter-procedure ABI for block returns, which allocates ReturnReg as the
// single block result register. It is possible other choices would lead to
// better register allocation, as ReturnReg is often first in the register set
// and will be heavily wanted by the register allocator that uses takeFirst().
//
// Obvious options:
// - pick a register at the back of the register set
// - pick a random register per block (different blocks have
// different join regs)
void BaseCompiler::popRegisterResults(ABIResultIter& iter) {
// Pop register results. Note that in the single-value case, popping to a
// register may cause a sync(); for multi-value we sync'd already.
for (; !iter.done(); iter.next()) {
const ABIResult& result = iter.cur();
if (!result.inRegister()) {
// TODO / OPTIMIZE: We sync here to avoid solving the general parallel
// move problem in popStackResults. However we could avoid syncing the
// values that are going to registers anyway, if they are already in
// registers.
sync();
break;
}
switch (result.type().kind()) {
case ValType::I32:
popI32(RegI32(result.gpr()));
break;
case ValType::I64:
popI64(RegI64(result.gpr64()));
break;
case ValType::F32:
popF32(RegF32(result.fpr()));
break;
case ValType::F64:
popF64(RegF64(result.fpr()));
break;
case ValType::Ref:
popRef(RegRef(result.gpr()));
break;
case ValType::V128:
#ifdef ENABLE_WASM_SIMD
popV128(RegV128(result.fpr()));
#else
MOZ_CRASH("No SIMD support");
#endif
}
}
}
void BaseCompiler::popStackResults(ABIResultIter& iter, StackHeight stackBase) {
MOZ_ASSERT(!iter.done());
// The iterator should be advanced beyond register results, and register
// results should be popped already from the value stack.
uint32_t alreadyPopped = iter.index();
// At this point, only stack results remain. Iterate through them to measure
// how much stack space they will take up.
for (; !iter.done(); iter.next()) {
MOZ_ASSERT(iter.cur().onStack());
}
// Calculate the space needed to store stack results, in bytes.
uint32_t stackResultBytes = iter.stackBytesConsumedSoFar();
MOZ_ASSERT(stackResultBytes);
// Compute the stack height including the stack results. Note that it's
// possible that this call expands the stack, for example if some of the
// results are supplied by constants and so are not already on the machine
// stack.
uint32_t endHeight = fr.prepareStackResultArea(stackBase, stackResultBytes);
// Find a free GPR to use when shuffling stack values. If none is
// available, push ReturnReg and restore it after we're done.
bool saved = false;
RegPtr temp = ra.needTempPtr(RegPtr(ReturnReg), &saved);
// The sequence of Stk values is in the same order on the machine stack as
// the result locations, but there is a complication: constant values are
// not actually pushed on the machine stack. (At this point registers and
// locals have been spilled already.) So, moving the Stk values into place
// isn't simply a shuffle-down or shuffle-up operation. There is a part of
// the Stk sequence that shuffles toward the FP, a part that's already in
// place, and a part that shuffles toward the SP. After shuffling, we have
// to materialize the constants.
// Shuffle mem values toward the frame pointer, copying deepest values
// first. Stop when we run out of results, get to a register result, or
// find a Stk value that is closer to the FP than the result.
for (iter.switchToPrev(); !iter.done(); iter.prev()) {
const ABIResult& result = iter.cur();
if (!result.onStack()) {
break;
}
MOZ_ASSERT(result.stackOffset() < stackResultBytes);
uint32_t destHeight = endHeight - result.stackOffset();
uint32_t stkBase = stk_.length() - (iter.count() - alreadyPopped);
Stk& v = stk_[stkBase + iter.index()];
if (v.isMem()) {
uint32_t srcHeight = v.offs();
if (srcHeight <= destHeight) {
break;
}
fr.shuffleStackResultsTowardFP(srcHeight, destHeight, result.size(),
temp);
}
}
// Reset iterator and skip register results.
for (iter.reset(); !iter.done(); iter.next()) {
if (iter.cur().onStack()) {
break;
}
}
// Revisit top stack values, shuffling mem values toward the stack pointer,
// copying shallowest values first.
for (; !iter.done(); iter.next()) {
const ABIResult& result = iter.cur();
MOZ_ASSERT(result.onStack());
MOZ_ASSERT(result.stackOffset() < stackResultBytes);
uint32_t destHeight = endHeight - result.stackOffset();
Stk& v = stk_[stk_.length() - (iter.index() - alreadyPopped) - 1];
if (v.isMem()) {
uint32_t srcHeight = v.offs();
if (srcHeight >= destHeight) {
break;
}
fr.shuffleStackResultsTowardSP(srcHeight, destHeight, result.size(),
temp);
}
}
// Reset iterator and skip register results, which are already popped off
// the value stack.
for (iter.reset(); !iter.done(); iter.next()) {
if (iter.cur().onStack()) {
break;
}
}
// Materialize constants and pop the remaining items from the value stack.
for (; !iter.done(); iter.next()) {
const ABIResult& result = iter.cur();
uint32_t resultHeight = endHeight - result.stackOffset();
Stk& v = stk_.back();
switch (v.kind()) {
case Stk::ConstI32:
#if defined(JS_CODEGEN_MIPS64) || defined(JS_CODEGEN_LOONG64) || \
defined(JS_CODEGEN_RISCV64)
fr.storeImmediatePtrToStack(v.i32val_, resultHeight, temp);
#else
fr.storeImmediatePtrToStack(uint32_t(v.i32val_), resultHeight, temp);
#endif
break;
case Stk::ConstF32:
fr.storeImmediateF32ToStack(v.f32val_, resultHeight, temp);
break;
case Stk::ConstI64:
fr.storeImmediateI64ToStack(v.i64val_, resultHeight, temp);
break;
case Stk::ConstF64:
fr.storeImmediateF64ToStack(v.f64val_, resultHeight, temp);
break;
#ifdef ENABLE_WASM_SIMD
case Stk::ConstV128:
fr.storeImmediateV128ToStack(v.v128val_, resultHeight, temp);
break;
#endif
case Stk::ConstRef:
fr.storeImmediatePtrToStack(v.refval_, resultHeight, temp);
break;
case Stk::MemRef:
// Update bookkeeping as we pop the Stk entry.
stackMapGenerator_.memRefsOnStk--;
break;
default:
MOZ_ASSERT(v.isMem());
break;
}
stk_.popBack();
}
ra.freeTempPtr(temp, saved);
// This will pop the stack if needed.
fr.finishStackResultArea(stackBase, stackResultBytes);
}
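// Illustration only (not part of the compiler): a standalone sketch of how
// popStackResults decides which way a memory value must move. It assumes,
// as the comparisons in the two shuffle loops suggest, that a larger stack
// height means a location farther from the frame pointer. `ResultMove`,
// `ClassifyMemResult` and `DestHeight` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

enum class ResultMove { TowardFP, InPlace, TowardSP };

// Mirrors `destHeight = endHeight - result.stackOffset()`.
inline uint32_t DestHeight(uint32_t endHeight, uint32_t stackOffset) {
  assert(stackOffset <= endHeight);
  return endHeight - stackOffset;
}

// A source above its destination moves toward the FP, a source below it
// moves toward the SP, and equal heights mean the value is already in place.
inline ResultMove ClassifyMemResult(uint32_t srcHeight, uint32_t destHeight) {
  if (srcHeight > destHeight) {
    return ResultMove::TowardFP;
  }
  if (srcHeight < destHeight) {
    return ResultMove::TowardSP;
  }
  return ResultMove::InPlace;
}
```

// Constants have no source height at all; in the function above they are
// materialized into their destination only after both shuffles are done.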
void BaseCompiler::popBlockResults(ResultType type, StackHeight stackBase,
ContinuationKind kind) {
if (!type.empty()) {
ABIResultIter iter(type);
popRegisterResults(iter);
if (!iter.done()) {
popStackResults(iter, stackBase);
// Because popStackResults might clobber the stack, it leaves the stack
// pointer already in the right place for the continuation, whether the
// continuation is a jump or fallthrough.
return;
}
}
// We get here if there are no stack results. For a fallthrough, the stack
// is already at the right height. For a jump, we may need to pop the stack
// pointer if the continuation's stack height is lower than the current
// stack height.
if (kind == ContinuationKind::Jump) {
fr.popStackBeforeBranch(stackBase, type);
}
}
// This function is similar to popBlockResults, but additionally handles the
// implicit exception pointer that is pushed to the value stack on entry to
// a catch handler by dropping it appropriately.
void BaseCompiler::popCatchResults(ResultType type, StackHeight stackBase) {
if (!type.empty()) {
ABIResultIter iter(type);
popRegisterResults(iter);
if (!iter.done()) {
popStackResults(iter, stackBase);
// Since popStackResults clobbers the stack, we only need to pop the
// exception off the value stack.
popValueStackBy(1);
} else {
// If there are no stack results, we have to adjust the stack by
// dropping the exception reference that's now on the stack.
dropValue();
}
} else {
dropValue();
}
fr.popStackBeforeBranch(stackBase, type);
}
Stk BaseCompiler::captureStackResult(const ABIResult& result,
StackHeight resultsBase,
uint32_t stackResultBytes) {