{"@attributes":{"version":"2.0"},"channel":{"title":"Tech blog","link":"https:\/\/tlog.quasinomial.net\/","description":"Recent content on Tech blog","generator":"Hugo -- gohugo.io","language":"en-us","lastBuildDate":"Wed, 08 Jan 2020 13:00:26 +0100","item":[{"title":"A Dive Into JavaScriptCore","link":"https:\/\/tlog.quasinomial.net\/posts\/dive-into-jsc\/","pubDate":"Wed, 08 Jan 2020 13:00:26 +0100","guid":"https:\/\/tlog.quasinomial.net\/posts\/dive-into-jsc\/","description":"<ul>\n<li><a href=\"#org7b42e4b\">In medias res<\/a><\/li>\n<li><a href=\"#org0abf6e5\">Backstory<\/a><\/li>\n<li><a href=\"#orga9519dc\">Diving in<\/a>\n<ul>\n<li><a href=\"#orgfd10d33\">The <code>CodeBlock<\/code><\/a><\/li>\n<li><a href=\"#org6c68086\">Virtual registers, locals and arguments, oh my!<\/a><\/li>\n<li><a href=\"#org6f6cbae\">Bytecode operands<\/a><\/li>\n<li><a href=\"#org9382a1a\">Variables with values<\/a><\/li>\n<li><a href=\"#org70ea9c4\">Values that must be handled<\/a><\/li>\n<li><a href=\"#org8a1fb79\">Callee save space as virtual registers<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"#org87c98de\">Conclusion<\/a><\/li>\n<\/ul>\n<p>Recently, the compiler team at Igalia was discussing the available resources for the <a href=\"https:\/\/webkit.org\/\">WebKit<\/a> project, both for the purpose of onboarding new Igalians and for lowering the bar for third-party contributors. As compiler people, we are mainly concerned with JavaScriptCore (JSC), WebKit&rsquo;s javascript engine implementation. 
There are many high quality blog posts on the <a href=\"https:\/\/webkit.org\/blog\/\">webkit blog<\/a> that describe various phases in the evolution of JSC, but finding one&rsquo;s bearings in the actual source can be a daunting task.<\/p>\n<p>The aim of this post is twofold: first, document some aspects of JavaScriptCore at the source level; second, show how one can figure out what a piece of code actually does in a large and complex source base (which JSC&rsquo;s certainly is).<\/p>\n<p><a id=\"org7b42e4b\"><\/a><\/p>\n<h1 id=\"in-medias-res\">In medias res<\/h1>\n<p>As an exercise, we&rsquo;re going to arbitrarily use a commit I had open in a web browser tab. Specifically, we will be looking at <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/JITOperations.cpp#L1655\">this snippet<\/a>:<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\">Operands<span style=\"color:#f92672\">&lt;<\/span>Optional<span style=\"color:#f92672\">&lt;<\/span>JSValue<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#f92672\">&gt;<\/span> mustHandleValues(codeBlock<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>numParameters(), numVarsWithValues);\n<span style=\"color:#66d9ef\">int<\/span> localsUsedForCalleeSaves <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">static_cast<\/span><span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#66d9ef\">int<\/span><span style=\"color:#f92672\">&gt;<\/span>(CodeBlock<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>llintBaselineCalleeSaveSpaceAsVirtualRegisters());\n<span style=\"color:#66d9ef\">for<\/span> (size_t i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> 
mustHandleValues.size(); <span style=\"color:#f92672\">+<\/span><span style=\"color:#f92672\">+<\/span>i) {\n    <span style=\"color:#66d9ef\">int<\/span> operand <span style=\"color:#f92672\">=<\/span> mustHandleValues.operandForIndex(i);\n    <span style=\"color:#66d9ef\">if<\/span> (operandIsLocal(operand) <span style=\"color:#f92672\">&amp;<\/span><span style=\"color:#f92672\">&amp;<\/span> VirtualRegister(operand).toLocal() <span style=\"color:#f92672\">&lt;<\/span> localsUsedForCalleeSaves)\n\t<span style=\"color:#66d9ef\">continue<\/span>;\n    mustHandleValues[i] <span style=\"color:#f92672\">=<\/span> callFrame<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>uncheckedR(operand).jsValue();\n}\n<\/code><\/pre><\/div><p>This seems like a good starting point for taking a dive into the low-level details of JSC internals. Virtual registers look like a concept that&rsquo;s good to know about. And what are those &ldquo;locals used for callee saves&rdquo; anyway? How do locals differ from vars? What are &ldquo;vars with values&rdquo;? Let&rsquo;s find out!<\/p>\n<p><a id=\"org0abf6e5\"><\/a><\/p>\n<h1 id=\"backstory\">Backstory<\/h1>\n<p><a href=\"https:\/\/webkit.org\/blog\/9329\/a-new-bytecode-format-for-javascriptcore\/\">Recall<\/a> that JSC is a multi-tiered execution engine. Most JavaScript code is only executed once; compiling takes longer than simply interpreting the code, so JavaScript code is always interpreted the first time through. If it turns out that a piece of code is executed frequently though<sup><a id=\"fnr.1\" class=\"footref\" href=\"#fn.1\">1<\/a><\/sup>, compiling it becomes a more attractive proposition.<\/p>\n<p>Initially, the tier up happens to the <strong>baseline JIT<\/strong>, a simple and fast non-optimizing compiler that produces native code for a JavaScript function. 
If the code continues to see much use, it will be recompiled with <strong>DFG<\/strong>, an optimizing compiler that is geared towards low compilation times and decent performance of the produced native code. Eventually, the code might end up being compiled with the <strong>FTL<\/strong> backend too, but the upper tiers won&rsquo;t be making an appearance in our story here.<\/p>\n<p>What do <strong>tier up<\/strong> and <strong>tier down<\/strong> mean? In short, tier up is when code execution switches to a more optimized version, whereas tier down is the reverse operation. So the code might tier up from the baseline JIT to the DFG, but later tier down (under conditions we&rsquo;ll briefly touch on later) back to the baseline JIT. You can read a more extensive overview <a href=\"https:\/\/webkit.org\/blog\/3362\/introducing-the-webkit-ftl-jit\/\">here<\/a>.<\/p>\n<p><a id=\"orga9519dc\"><\/a><\/p>\n<h1 id=\"diving-in\">Diving in<\/h1>\n<p>With this context now in place, we can revisit the snippet above. The code is part of <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/JITOperations.cpp#L1482\"><code>operationOptimize<\/code><\/a>. Just looking at the two sites it&rsquo;s <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/JIT.cpp#L106\">referenced<\/a> <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/JITOpcodes.cpp#L1040\">in<\/a>, we can see that it&rsquo;s only ever used if the <code>DFG_JIT<\/code> option is enabled. This is where the baseline JIT \u279e DFG tier up happens!<\/p>\n<p>The sites that make use of <code>operationOptimize<\/code> both run during the generation of native code by the baseline JIT. The first one runs in response to the <code>op_enter<\/code> bytecode opcode, i.e. 
the opcode that marks entry to the function. The second one runs when encountering an <code>op_loop_hint<\/code> opcode (an opcode that <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/dfg\/DFGByteCodeParser.cpp#L6782\">only appears at the beginning of a basic block<\/a> marking the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/NodesCodegen.cpp#L3146\">entry to a loop<\/a>). Those are the two kinds of program points at which execution might tier up to the DFG.<\/p>\n<p>Notice that calls to <code>operationOptimize<\/code> only occur during execution of the native code produced by the baseline JIT. In fact, if you look at the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/JIT.cpp#L94\">emitted code<\/a> surrounding the call to <code>operationOptimize<\/code> for the function entry case, you&rsquo;ll see that the call is conditional and only happens if the function has been executed enough times that it&rsquo;s worth making a C++ call to consider it for optimization.<\/p>\n<p>The function accepts two arguments: a <code>vmPointer<\/code> which is, umm, a pointer to a <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/runtime\/VM.h#L262\"><code>VM<\/code> structure<\/a> (i.e. the &ldquo;state of the world&rdquo; as far as this function is concerned) and the <code>bytecodeIndex<\/code>. Remember that the bytecode is the intermediate representation (IR) that all higher tiers start compiling from. 
In <code>operationOptimize<\/code>, the <code>bytecodeIndex<\/code> is used for<\/p>\n<ul>\n<li>distinguishing between function and loop entry points<\/li>\n<li>the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/dfg\/DFGPlan.cpp#L728\">DFG to be able to do program analysis of the values at the respective program point<\/a><\/li>\n<li>various diagnostics.<\/li>\n<\/ul>\n<p>Again, the <code>bytecodeIndex<\/code> is a parameter that has already been set in stone during generation of the native code by the baseline JIT.<\/p>\n<p>The other parameter, the <code>VM<\/code>, is used in a number of things. The part that&rsquo;s relevant to the snippet we started out to understand is that the <code>VM<\/code> is (<a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/interpreter\/CallFrame.h#L318\">sometimes<\/a>) used to give us access to the current <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/interpreter\/CallFrame.h#L96\"><code>CallFrame<\/code><\/a>. <code>CallFrame<\/code> inherits from <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/interpreter\/Register.h#L43\"><code>Register<\/code><\/a>, which is a thin wrapper around a (maximally) 64-bit value.<\/p>\n<p><a id=\"orgfd10d33\"><\/a><\/p>\n<h2 id=\"the-codeblock\">The <code>CodeBlock<\/code><\/h2>\n<p>In this case, the various accessors defined by <code>CallFrame<\/code> effectively treat the (pointer) value that <code>CallFrame<\/code> consists of as a pointer to an array of <code>Register<\/code> values. 
Specifically, a <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/interpreter\/CallFrame.h#L86\">set of constant expressions<\/a><\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\"><span style=\"color:#66d9ef\">struct<\/span> <span style=\"color:#a6e22e\">CallFrameSlot<\/span> {\n    <span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">constexpr<\/span> <span style=\"color:#66d9ef\">int<\/span> codeBlock <span style=\"color:#f92672\">=<\/span> CallerFrameAndPC<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>sizeInRegisters;\n    <span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">constexpr<\/span> <span style=\"color:#66d9ef\">int<\/span> callee <span style=\"color:#f92672\">=<\/span> codeBlock <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n    <span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">constexpr<\/span> <span style=\"color:#66d9ef\">int<\/span> argumentCount <span style=\"color:#f92672\">=<\/span> callee <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n    <span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">constexpr<\/span> <span style=\"color:#66d9ef\">int<\/span> thisArgument <span style=\"color:#f92672\">=<\/span> argumentCount <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n    <span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">constexpr<\/span> <span style=\"color:#66d9ef\">int<\/span> firstArgument <span style=\"color:#f92672\">=<\/span> thisArgument <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">1<\/span>;\n};\n<\/code><\/pre><\/div><p>give the offset (relative to the callframe) of the pointer 
to the codeblock, the callee, the argument count and the <code>this<\/code> pointer. Note that the first <code>CallFrameSlot<\/code> is the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/interpreter\/CallFrame.h#L79\"><code>CallerFrameAndPC<\/code><\/a>, i.e. a pointer to the <code>CallFrame<\/code> of the caller and the <code>returnPC<\/code>.<\/p>\n<p>The <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/CodeBlock.h#L105\"><code>CodeBlock<\/code><\/a> is definitely something we&rsquo;ll need to understand better, as it appears in our motivational code snippet. However, it&rsquo;s a large class that is intertwined with a number of other interesting code paths. For the purposes of this discussion, we need to know that it<\/p>\n<ul>\n<li>is associated with a code block (i.e. a <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/CodeBlock.cpp#L1956\">function, eval, program or module<\/a> code block)<\/li>\n<li>holds data relevant to tier up\/down decisions and operations for the associated code block<\/li>\n<\/ul>\n<p>We&rsquo;ll focus on three of its data members:<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\"><span style=\"color:#66d9ef\">int<\/span> m_numCalleeLocals;\n<span style=\"color:#66d9ef\">int<\/span> m_numVars;\n<span style=\"color:#66d9ef\">int<\/span> m_numParameters;\n<\/code><\/pre><\/div><p>So, it seems that a <code>CodeBlock<\/code> can have at least some parameters (makes sense, right?) but also has both variables and callee locals.<\/p>\n<p>First things first: what&rsquo;s the difference between callee locals and vars? 
Well, it turns out that <code>m_numCalleeLocals<\/code> is only incremented in <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGeneratorBaseInlines.h#L155\"><code>BytecodeGeneratorBase&lt;Traits&gt;::newRegister<\/code><\/a> whereas <code>m_numVars<\/code> is only incremented in <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGeneratorBaseInlines.h#L176\"><code>BytecodeGeneratorBase&lt;Traits&gt;::addVar()<\/code><\/a>. Except, <code>addVar<\/code> calls into <code>newRegister<\/code>, so vars are a subset of callee locals (and therefore <code>m_numVars<\/code> \u2264 <code>m_numCalleeLocals<\/code>).<\/p>\n<p>Somewhat surprisingly, <code>newRegister<\/code> is only called in 3 places:<\/p>\n<ul>\n<li><code>addVar<\/code><\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGeneratorBaseInlines.h#L165\"><code>BytecodeGeneratorBase&lt;Traits&gt;::newTemporary<\/code><\/a><\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L1246\"><code>BytecodeGenerator::newBlockScopeVariable<\/code><\/a><\/li>\n<\/ul>\n<p>So there you have it. Callee locals<\/p>\n<ol>\n<li>are allocated by a function called <code>newRegister<\/code><\/li>\n<li>are either a var or a temporary.<\/li>\n<\/ol>\n<p>Let&rsquo;s start with the second point. What is a var? 
Well, let&rsquo;s look at where vars are created (via <code>addVar<\/code>):<\/p>\n<p>There is definitely <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L1841\">a var for every lexical variable (<code>VarKind::Stack<\/code>)<\/a>, i.e. a non-local variable accessible from the current scope. Vars are also generated (via <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2253\"><code>BytecodeGenerator::createVariable<\/code><\/a>) for<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L608\">the <code>arguments<\/code> object<\/a>, if needed<\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L617\">function definitions in scope<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L626\">declared function variables<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L950\">declared module variables<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L939\">the module &lsquo;meta&rsquo; private variable<\/a><\/li>\n<\/ul>\n<p>So, intuitively, vars are allocated more or less for &ldquo;every JS construct that could be called a variable&rdquo;. Conversely, temporaries are storage locations that have been allocated as part of bytecode generation (i.e. 
there is no corresponding storage location in the JS source). They can store intermediate calculation results and whatnot.<\/p>\n<p>Coming back to the first point regarding callee locals, how come they&rsquo;re allocated by a function called <code>newRegister<\/code>? Why, because JSC&rsquo;s bytecode operates on a register VM! The <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/RegisterID.h#L38\"><code>RegisterID<\/code><\/a> returned by <code>newRegister<\/code> wraps the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L47\"><code>VirtualRegister<\/code><\/a> that our register VM is all about.<\/p>\n<p><a id=\"org6c68086\"><\/a><\/p>\n<h2 id=\"virtual-registers-locals-and-arguments-oh-my\">Virtual registers, locals and arguments, oh my!<\/h2>\n<p>A virtual register (of type <code>VirtualRegister<\/code>) consists simply of an <code>int<\/code> (which is also called its <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L71\">offset<\/a>). Each virtual register corresponds to one of<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.cpp#L72\">local<\/a> (i.e. 
variable or temporary)<\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.cpp#L64\">argument<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.cpp#L59\">reference to a constant<\/a> in the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/CodeBlock.h#L1015\">constant pool<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.cpp#L40\">reference to a field in the header<\/a> of a <code>CallFrame<\/code> (caller frame, return address, argument count, callee, code block)<\/li>\n<\/ul>\n<p>There is no differentiation between locals and arguments at the type level (everything is a (signed) <code>int<\/code>); however, virtual registers that map to locals <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L34\">are negative<\/a> and those that map to arguments <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L39\">are nonnegative<\/a>. 
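<\/p>\n<p>To make the cases above concrete, here is a minimal sketch (not JSC code) that classifies a raw offset. The names <code>Kind<\/code> and <code>classify<\/code> are made up for illustration, and both constants are assumptions: a 64-bit build in which the <code>this<\/code> slot sits at offset 5, and a constant pool that starts at <code>0x40000000<\/code>. JSC&rsquo;s own predicates overlap more than this (header slots and constants are nonnegative too), so treat it as a simplification.<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\">\/\/ Illustrative sketch only; requires C++17 for message-less static_assert.\n\/\/ Assumption: on a 64-bit build the header takes offsets 0..4, so this is slot 5.\nconstexpr int thisArgumentOffset = 5;\n\/\/ Assumption: constants start at this large offset.\nconstexpr int firstConstantRegisterIndex = 0x40000000;\n\nenum class Kind { Local, Header, Argument, Constant };\n\n\/\/ Negative offsets are locals; everything else is split by range.\nconstexpr Kind classify(int offset)\n{\n    return offset &lt; 0 ? Kind::Local\n        : offset &gt;= firstConstantRegisterIndex ? Kind::Constant\n        : offset &lt; thisArgumentOffset ? Kind::Header\n        : Kind::Argument;\n}\n\nstatic_assert(classify(-1) == Kind::Local);\nstatic_assert(classify(0) == Kind::Header);\nstatic_assert(classify(5) == Kind::Argument);\nstatic_assert(classify(0x40000000) == Kind::Constant);\n\nint main() { return 0; }\n<\/code><\/pre><\/div><p>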
In the context of bytecode generation, the <code>int<\/code><\/p>\n<ul>\n<li>for a local <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.h#L1098\">indexes<\/a> into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGeneratorBase.h#L87\"><code>m_calleeLocals<\/code><\/a><\/li>\n<li>for an argument <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.h#L1104\">indexes<\/a> into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.h#L1226\"><code>m_parameters<\/code><\/a><\/li>\n<li>for a constant, it <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/CodeBlock.h#L575\">indexes<\/a> into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/CodeBlock.h#L1015\"><code>m_constantRegisters<\/code><\/a><\/li>\n<\/ul>\n<p>It feels like JSC is underusing C++ here.<\/p>\n<p>In all cases, what we get after indexing with a local, argument or constant is a <code>RegisterID<\/code>. As explained, the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/RegisterID.h#L52\"><code>RegisterID<\/code><\/a> <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/RegisterID.h#L121\">wraps<\/a> a <code>VirtualRegister<\/code>. Why do we need this indirection?<\/p>\n<p>Well, there are two extra bits of info in the <code>RegisterID<\/code>. 
The <code>m_refcount<\/code> and an <code>m_isTemporary<\/code> flag. The reference count is always <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGeneratorBaseInlines.h#L181\">greater than zero for a variable<\/a>, but the rules under which a <code>RegisterID<\/code> is ref&rsquo;d and unref&rsquo;d are too complicated to go into here.<\/p>\n<p>When you have an argument, you get the <code>VirtualRegister<\/code> for it by <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L115\">directly adding it<\/a> to <code>CallFrame::thisArgumentOffset()<\/code>.<\/p>\n<p>When you have a local, you <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L112\">map it<\/a> to <code>(-1 - local)<\/code> to get the corresponding <code>VirtualRegister<\/code>. So<\/p>\n<table>\n<thead>\n<tr>\n<th>local<\/th>\n<th>vreg<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td>-1<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>-2<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>-3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>(remember, virtual registers that correspond to locals are negative).<\/p>\n<p>For an argument, you <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L113\">map it<\/a> to <code>(arg + CallFrame::thisArgumentOffset())<\/code>:<\/p>\n<table>\n<thead>\n<tr>\n<th>argument<\/th>\n<th>vreg<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>0<\/td>\n<td>this<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>this + 1<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<td>this + 2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Which makes all the sense in the world when you remember what the <a href=\"#org13dd53c\"><code>CallFrameSlot<\/code><\/a> looks like. 
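<\/p>\n<p>As a sanity check, the two mappings and their inverses can be written down in a few lines. This is a sketch modeled on the helpers linked above rather than a copy of them: it assumes a 64-bit build where <code>CallerFrameAndPC::sizeInRegisters<\/code> is 2, the constant <code>callerFrameAndPCSize<\/code> is a stand-in name, and the inverse helpers are named by analogy.<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\">\/\/ Illustrative sketch only; requires C++17 for message-less static_assert.\n\/\/ Assumption: the caller frame pointer and return PC take two slots (64-bit build).\nconstexpr int callerFrameAndPCSize = 2;\n\nstruct CallFrameSlot {\n    static constexpr int codeBlock = callerFrameAndPCSize;\n    static constexpr int callee = codeBlock + 1;\n    static constexpr int argumentCount = callee + 1;\n    static constexpr int thisArgument = argumentCount + 1;\n    static constexpr int firstArgument = thisArgument + 1;\n};\n\n\/\/ Locals count downwards from -1.\nconstexpr int localToOperand(int local) { return -1 - local; }\nconstexpr int operandToLocal(int operand) { return -1 - operand; }\n\n\/\/ Arguments count upwards from the slot that holds this.\nconstexpr int argumentToOperand(int argument) { return argument + CallFrameSlot::thisArgument; }\nconstexpr int operandToArgument(int operand) { return operand - CallFrameSlot::thisArgument; }\n\nstatic_assert(localToOperand(0) == -1);\nstatic_assert(localToOperand(2) == -3);\nstatic_assert(operandToLocal(localToOperand(7)) == 7);\nstatic_assert(argumentToOperand(0) == CallFrameSlot::thisArgument);\nstatic_assert(argumentToOperand(1) == CallFrameSlot::firstArgument);\nstatic_assert(operandToArgument(argumentToOperand(3)) == 3);\n\nint main() { return 0; }\n<\/code><\/pre><\/div><p>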
So argument 0 is always the <code>this<\/code> pointer.<\/p>\n<p>If the vreg is at or above some <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/BytecodeConventions.h#L32\">large offset<\/a> (<code>s_firstConstantRegisterIndex<\/code>), then it is an index into the <code>CodeBlock<\/code>&rsquo;s constant pool (after <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L70\">subtracting<\/a> the offset).<\/p>\n<p><a id=\"org6f6cbae\"><\/a><\/p>\n<h2 id=\"bytecode-operands\">Bytecode operands<\/h2>\n<p>If you&rsquo;ve followed any of the links to the functions doing the actual mapping of locals and arguments to a virtual register, you may have noticed that the functions are called <code>localToOperand<\/code> and <code>argumentToOperand<\/code>. Yet they&rsquo;re only ever used in <code>virtualRegisterForLocal<\/code> and <code>virtualRegisterForArgument<\/code> respectively. This raises the obvious question: what are those virtual registers operands of?<\/p>\n<p>Well, of the bytecode instructions in our register VM of course. Instead of recreating the pictures, I&rsquo;ll simply encourage you to take a look at a recent <a href=\"https:\/\/webkit.org\/blog\/9329\/a-new-bytecode-format-for-javascriptcore\/\">blog post<\/a> describing it at a high level.<\/p>\n<p>How do we know that&rsquo;s what &ldquo;operand&rdquo; refers to? Well, let&rsquo;s look at a use of <code>virtualRegisterForLocal<\/code> in the bytecode generator. 
<a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2253\"><code>BytecodeGenerator::createVariable<\/code><\/a> will allocate<sup><a id=\"fnr.2\" class=\"footref\" href=\"#fn.2\">2<\/a><\/sup> the next available local index (<a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2284\">using the size<\/a> of <code>m_calleeLocals<\/code> to keep track of it). This calls into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L122\"><code>virtualRegisterForLocal<\/code><\/a>, which maps the <code>local<\/code> to a virtual register by calling <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/VirtualRegister.h#L112\"><code>localToOperand<\/code><\/a>.<\/p>\n<p>The newly allocated local is inserted into the function symbol table, along with its offset (i.e. the ID of the virtual register).<\/p>\n<p>The <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/runtime\/SymbolTable.h#L75\"><code>SymbolTableEntry<\/code><\/a> is looked up when we generate bytecode for a variable reference. 
A variable reference is represented by a <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/Nodes.h#L658\"><code>ResolveNode<\/code><\/a><sup><a id=\"fnr.3\" class=\"footref\" href=\"#fn.3\">3<\/a><\/sup>.<\/p>\n<p>So looking into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/NodesCodegen.cpp#L245\"><code>ResolveNode::emitBytecode<\/code><\/a>, we dive into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2188\"><code>BytecodeGenerator::variable<\/code><\/a> and <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2217\">there&rsquo;s<\/a> our <code>symbolTable-&gt;get()<\/code> call. And then the <code>symbolTableEntry<\/code> is passed to <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2239\"><code>BytecodeGenerator::variableForLocalEntry<\/code><\/a> which uses <code>entry.varOffset()<\/code> to initialize the returned <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.h#L296\"><code>Variable<\/code><\/a> with <code>offset<\/code>. 
It also uses <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.h#L1095\"><code>registerFor<\/code><\/a> to retrieve the <code>RegisterID<\/code> from <code>m_calleeLocals<\/code>.<\/p>\n<p><a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/NodesCodegen.cpp#L245\"><code>ResolveNode::emitBytecode<\/code><\/a> will then pass the <code>local<\/code> <code>RegisterID<\/code> to <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.h#L489\"><code>move<\/code><\/a> which calls into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L1509\"><code>emitMove<\/code><\/a>, which just calls <code>OpMov::emit<\/code> (a function generated by the <code>JavaScriptCore\/generator<\/code> code). Note that the compiler implicitly converts the <code>RegisterID<\/code> arguments to <code>VirtualRegister<\/code> type at this step. 
Eventually, we end up in the (generated) function<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\"><span style=\"color:#66d9ef\">template<\/span><span style=\"color:#f92672\">&lt;<\/span>OpcodeSize __size, <span style=\"color:#66d9ef\">bool<\/span> recordOpcode, <span style=\"color:#66d9ef\">typename<\/span> BytecodeGenerator<span style=\"color:#f92672\">&gt;<\/span>\n<span style=\"color:#66d9ef\">static<\/span> <span style=\"color:#66d9ef\">bool<\/span> emitImpl(BytecodeGenerator<span style=\"color:#f92672\">*<\/span> gen, VirtualRegister dst, VirtualRegister src)\n{\n    <span style=\"color:#66d9ef\">if<\/span> (__size <span style=\"color:#f92672\">=<\/span><span style=\"color:#f92672\">=<\/span> OpcodeSize<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>Wide16)\n\tgen<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>alignWideOpcode16();\n    <span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#a6e22e\">if<\/span> (__size <span style=\"color:#f92672\">=<\/span><span style=\"color:#f92672\">=<\/span> OpcodeSize<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>Wide32)\n\tgen<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>alignWideOpcode32();\n    <span style=\"color:#66d9ef\">if<\/span> (checkImpl<span style=\"color:#f92672\">&lt;<\/span>__size<span style=\"color:#f92672\">&gt;<\/span>(gen, dst, src)) {\n\t<span style=\"color:#66d9ef\">if<\/span> (recordOpcode)\n\t    gen<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>recordOpcode(opcodeID);\n\t<span style=\"color:#66d9ef\">if<\/span> (__size <span style=\"color:#f92672\">=<\/span><span style=\"color:#f92672\">=<\/span> OpcodeSize<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>Wide16)\n\t    gen<span 
style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>write(Fits<span style=\"color:#f92672\">&lt;<\/span>OpcodeID, OpcodeSize<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>Narrow<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>convert(op_wide16));\n\t<span style=\"color:#66d9ef\">else<\/span> <span style=\"color:#a6e22e\">if<\/span> (__size <span style=\"color:#f92672\">=<\/span><span style=\"color:#f92672\">=<\/span> OpcodeSize<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>Wide32)\n\t    gen<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>write(Fits<span style=\"color:#f92672\">&lt;<\/span>OpcodeID, OpcodeSize<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>Narrow<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>convert(op_wide32));\n\tgen<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>write(Fits<span style=\"color:#f92672\">&lt;<\/span>OpcodeID, __size<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>convert(opcodeID));\n\tgen<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>write(Fits<span style=\"color:#f92672\">&lt;<\/span>VirtualRegister, __size<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>convert(dst));\n\tgen<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>write(Fits<span style=\"color:#f92672\">&lt;<\/span>VirtualRegister, __size<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>convert(src));\n\t<span style=\"color:#66d9ef\">return<\/span> true;\n    }\n    <span 
style=\"color:#66d9ef\">return<\/span> false;\n}\n<\/code><\/pre><\/div><p>where <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/Fits.h#L134\"><code>Fits::convert(VirtualRegister)<\/code><\/a> will trivially encode the VirtualRegister into the target type. <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/Fits.h#L113\">Specifically<\/a>, the mapping is nicely summed up in the following comment:<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\"><span style=\"color:#75715e\">\/\/ Narrow:\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/ -128..-1  local variables\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/    0..15  arguments\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/   16..127 constants\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/ Wide16:\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/ -2**15..-1  local variables\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/      0..64  arguments\n<\/span><span style=\"color:#75715e\"><\/span><span style=\"color:#75715e\">\/\/     64..2**15-1 constants\n<\/span><\/code><\/pre><\/div><p>You may have noticed that the <code>Variable<\/code> returned by <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2239\"><code>BytecodeGenerator::variableForLocalEntry<\/code><\/a> has already been initialized with the virtual register <code>offset<\/code> we set when 
inserting the <code>SymbolTableEntry<\/code> for the local variable. And yet we use <code>registerFor<\/code> to look up the <code>RegisterID<\/code> for the local and then use the offset of the <code>VirtualRegister<\/code> contained therein. Surely those are the same? Oh well, something for a runtime assert to check.<\/p>\n<p><a id=\"org9382a1a\"><\/a><\/p>\n<h2 id=\"variables-with-values\">Variables with values<\/h2>\n<p>Whew! Quite the detour there. Time to get back to our original snippet:<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\">Operands<span style=\"color:#f92672\">&lt;<\/span>Optional<span style=\"color:#f92672\">&lt;<\/span>JSValue<span style=\"color:#f92672\">&gt;<\/span><span style=\"color:#f92672\">&gt;<\/span> mustHandleValues(codeBlock<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>numParameters(), numVarsWithValues);\n<span style=\"color:#66d9ef\">int<\/span> localsUsedForCalleeSaves <span style=\"color:#f92672\">=<\/span> <span style=\"color:#66d9ef\">static_cast<\/span><span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#66d9ef\">int<\/span><span style=\"color:#f92672\">&gt;<\/span>(CodeBlock<span style=\"color:#f92672\">:<\/span><span style=\"color:#f92672\">:<\/span>llintBaselineCalleeSaveSpaceAsVirtualRegisters());\n<span style=\"color:#66d9ef\">for<\/span> (size_t i <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>; i <span style=\"color:#f92672\">&lt;<\/span> mustHandleValues.size(); <span style=\"color:#f92672\">+<\/span><span style=\"color:#f92672\">+<\/span>i) {\n    <span style=\"color:#66d9ef\">int<\/span> operand <span style=\"color:#f92672\">=<\/span> mustHandleValues.operandForIndex(i);\n    <span style=\"color:#66d9ef\">if<\/span> (operandIsLocal(operand) <span style=\"color:#f92672\">&amp;<\/span><span 
style=\"color:#f92672\">&amp;<\/span> VirtualRegister(operand).toLocal() <span style=\"color:#f92672\">&lt;<\/span> localsUsedForCalleeSaves)\n\t<span style=\"color:#66d9ef\">continue<\/span>;\n    mustHandleValues[i] <span style=\"color:#f92672\">=<\/span> callFrame<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>uncheckedR(operand).jsValue();\n}\n<\/code><\/pre><\/div><p>What are those <code>numVarsWithValues<\/code> then? Well, the definition is right before our snippet:<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\"><span style=\"color:#66d9ef\">unsigned<\/span> numVarsWithValues;\n<span style=\"color:#66d9ef\">if<\/span> (bytecodeIndex)\n    numVarsWithValues <span style=\"color:#f92672\">=<\/span> codeBlock<span style=\"color:#f92672\">-<\/span><span style=\"color:#f92672\">&gt;<\/span>numCalleeLocals();\n<span style=\"color:#66d9ef\">else<\/span>\n    numVarsWithValues <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">0<\/span>;\n<\/code><\/pre><\/div><p>OK, so this looks straightforward for a change. If the <code>bytecodeIndex<\/code> is <strong>not<\/strong> zero, we&rsquo;re doing the tier up from the baseline JIT to DFG in the body of a function (i.e. at a loop entry). In that case, we consider all our callee locals to have values. Conversely, when we&rsquo;re tiering up at the function entry (i.e. <code>bytecodeIndex == 0<\/code>), none of the callee locals are live yet. Do note that the variable is incorrectly named. Vars are not the same as callee locals; we&rsquo;re dealing with the latter here.<\/p>\n<p>A second gotcha is that, whereas vars are always live, temporaries might not be. 
The DFG compiler will <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/dfg\/DFGPlan.cpp#L728\">do liveness analysis<\/a> at compile time to make sure it&rsquo;s only looking at live values. That must have been a <a href=\"https:\/\/github.com\/WebKit\/webkit\/commit\/e66e8921c44e622574b0120bccb58aec7ebb0b03\">fun bug<\/a> to track down!<\/p>\n<p><a id=\"org70ea9c4\"><\/a><\/p>\n<h2 id=\"values-that-must-be-handled\">Values that must be handled<\/h2>\n<p>Back to our snippet, <code>numVarsWithValues<\/code> is used as an argument to the constructor of <code>mustHandleValues<\/code> which is of type <code>Operands&lt;Optional&lt;JSValue&gt;&gt;<\/code>. Right, so what are the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/Operands.h#L43\"><code>Operands<\/code><\/a>? They simply hold a number of <code>T<\/code> objects (here <code>T<\/code> is <code>Optional&lt;JSValue&gt;<\/code>) of which the first <code>m_numArguments<\/code> correspond to, well, arguments whereas the remaining correspond to locals.<\/p>\n<p>What we&rsquo;re doing here is recording all the live (non-heap, obviously) values when we try to do the tier up. <a href=\"https:\/\/github.com\/WebKit\/webkit\/commit\/0fd7ec908ba87eea05f8d6fd701d4b3b322aabd5\">The idea<\/a> is to be able to <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/dfg\/DFGCFAPhase.cpp#L161\">mix those values in<\/a> with the previously observed values that DFG&rsquo;s <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/dfg\/DFGCFAPhase.cpp#L277\">Control Flow Analysis<\/a> will use to emit code which will bail us out of the optimized version (i.e. do a tier down). 
According to the comments and commit logs, this is in order to increase the chances of a successful OSR entry (tier up), even if the resulting optimized code may be slightly less conservative.<\/p>\n<p>Remember that the optimized code that we tier up to makes assumptions with regard to the types of the incoming values (based on what we&rsquo;ve observed when executing at lower tiers) and will bail out if those assumptions are not met. Taking the values of the current execution at the time of the tier up attempt ensures we won&rsquo;t be doing all this work only to immediately have to tier down again.<\/p>\n<p><code>Operands<\/code> provides an <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecode\/Operands.h#L224\"><code>operandForIndex<\/code><\/a> method which will directly give you a virtual register for every kind of element. For example, if you had called <code>Operands&lt;T&gt; opnds(2, 1)<\/code>, then the first iteration of the loop would give you<\/p>\n<pre><code>operandForIndex(0)\n-&gt; virtualRegisterForArgument(0).offset()\n  -&gt; VirtualRegister(argumentToOperand(0)).offset()\n    -&gt; VirtualRegister(CallFrame::thisArgumentOffset).offset()\n      -&gt; CallFrame::thisArgumentOffset\n<\/code><\/pre>\n<p>The second iteration would similarly give you <code>CallFrame::thisArgumentOffset + 1<\/code>.<\/p>\n<p>In the third iteration, we&rsquo;re now dealing with a local, so we&rsquo;d get<\/p>\n<pre><code>operandForIndex(2)\n-&gt; virtualRegisterForLocal(2 - 2).offset()\n  -&gt; VirtualRegister(localToOperand(0)).offset()\n    -&gt; VirtualRegister(-1).offset()\n      -&gt; -1\n<\/code><\/pre>\n<p><a id=\"org8a1fb79\"><\/a><\/p>\n<h2 id=\"callee-save-space-as-virtual-registers\">Callee save space as virtual registers<\/h2>\n<p>So, finally, what <strong>is<\/strong> our snippet doing here? 
It&rsquo;s iterating over the values that are likely to be live at this program point and storing them in <code>mustHandleValues<\/code>. It will first iterate over the arguments (if any) and then over the locals. However, it will use the &ldquo;operand&rdquo; (remember, everything is an int\u2026) to get the index of the respective local and then skip the first locals up to <code>localsUsedForCalleeSaves<\/code>. So, in fact, even though we allocated space for (arguments + callee locals), we skip some slots and only store (arguments + callee locals - <code>localsUsedForCalleeSaves<\/code>). This is OK, as the <code>Optional&lt;JSValue&gt;<\/code> values in the <code>Operands<\/code> will have been initialized by the default constructor of <code>Optional&lt;&gt;<\/code> which gives us an object without a value (i.e. an object that will later be ignored).<\/p>\n<p>Here, callee-saved register (<code>csr<\/code>) refers to a register that is available for use by the LLInt and\/or the baseline JIT. 
This is described a bit in <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/llint\/LowLevelInterpreter.asm#L24\"><code>LowLevelInterpreter.asm<\/code><\/a>, but is more apparent when one looks at what <code>csr<\/code> sets are <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/llint\/LowLevelInterpreter.asm#L261\">used on each platform<\/a> (or, <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/RegisterSet.cpp#L173\">in C++<\/a>).<\/p>\n<table>\n<thead>\n<tr>\n<th>platform<\/th>\n<th><code>metadataTable<\/code><\/th>\n<th>PC-base (<code>PB<\/code>)<\/th>\n<th><code>numberTag<\/code><\/th>\n<th><code>notCellMask<\/code><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><code>X86_64<\/code><\/td>\n<td>csr1<\/td>\n<td>csr2<\/td>\n<td>csr3<\/td>\n<td>csr4<\/td>\n<\/tr>\n<tr>\n<td><code>x86_64_win<\/code><\/td>\n<td>csr3<\/td>\n<td>csr4<\/td>\n<td>csr5<\/td>\n<td>csr6<\/td>\n<\/tr>\n<tr>\n<td><code>ARM64<\/code>\/<code>ARM64E<\/code><\/td>\n<td>csr6<\/td>\n<td>csr7<\/td>\n<td>csr8<\/td>\n<td>csr9<\/td>\n<\/tr>\n<tr>\n<td><code>C_LOOP<\/code> 64b<\/td>\n<td>csr0<\/td>\n<td>csr1<\/td>\n<td>csr2<\/td>\n<td>csr3<\/td>\n<\/tr>\n<tr>\n<td><code>C_LOOP<\/code> 32b<\/td>\n<td>csr3<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<\/tr>\n<tr>\n<td><code>ARMv7<\/code><\/td>\n<td>csr0<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<\/tr>\n<tr>\n<td><code>MIPS<\/code><\/td>\n<td>csr0<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<\/tr>\n<tr>\n<td><code>X86<\/code><\/td>\n<td>-<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<td>-<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>On 64-bit platforms, offlineasm (JSC&rsquo;s portable assembler) makes a range of callee-saved registers available to <code>.asm<\/code> files. 
Those are properly <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/llint\/LowLevelInterpreter.asm#L765\">saved<\/a> and <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/llint\/LowLevelInterpreter.asm#L789\">restored<\/a>. For example, for <code>X86_64<\/code> on non-Windows platforms, the returned <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/RegisterSet.h#L41\"><code>RegisterSet<\/code><\/a> <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/RegisterSet.cpp#L173\">contains<\/a> registers <code>r12<\/code>-<code>r15<\/code> (inclusive), i.e. the callee-saved registers as defined in the <a href=\"https:\/\/github.com\/hjl-tools\/x86-psABI\/wiki\/x86-64-psABI-1.0.pdf\">System V AMD64 ABI<\/a>. The mapping from symbolic names to architecture registers can be found in <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/jit\/GPRInfo.h#L337\"><code>GPRInfo<\/code><\/a>.<\/p>\n<p>On 32-bit platforms, the assembler doesn&rsquo;t make any <code>csr<\/code> regs available, so there&rsquo;s nothing to save <strong>except<\/strong> if the platform makes special use of some register (like <code>C_LOOP<\/code> <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/llint\/LowLevelInterpreter.asm#L294\">does<\/a> for the <a href=\"https:\/\/webkit.org\/blog\/9329\/a-new-bytecode-format-for-javascriptcore\/\"><code>metadataTable<\/code><\/a> <sup><a id=\"fnr.4\" class=\"footref\" href=\"#fn.4\">4<\/a><\/sup>).<\/p>\n<p>What are the <code>numberTag<\/code> and <code>notCellMask<\/code> registers? 
Out of scope, that&rsquo;s what they are!<\/p>\n<p><a id=\"org87c98de\"><\/a><\/p>\n<h1 id=\"conclusion\">Conclusion<\/h1>\n<p>Well, that wraps it up. Hopefully you now have a better understanding of what the original snippet does. In the process, we learned about a few concepts by reading through the source and, importantly, we added lots of links to JSC&rsquo;s source code. This way, not only can you check that the textual explanations are still valid when you read this blog post, you can also use the links as springboards for further source code exploration to your heart&rsquo;s delight!<\/p>\n<h2 id=\"footnotes\">Footnotes<\/h2>\n<p><sup><a id=\"fn.1\" class=\"footnum\" href=\"#fnr.1\">1<\/a><\/sup> Both the interpreter \u2013 better known as <strong>LLInt<\/strong> \u2013 and the baseline JIT keep track of execution statistics, so that JSC can make informed decisions on when to tier up.<\/p>\n<p><sup><a id=\"fn.2\" class=\"footnum\" href=\"#fnr.2\">2<\/a><\/sup> Remarkably, no <code>RegisterID<\/code> has been allocated at this point \u2013 we used the size of <code>m_calleeLocals<\/code> but never modified it. Instead, <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGenerator.cpp#L2290\">later in the function<\/a> (<strong>after<\/strong> adding the new local to the symbol table!) the code will call <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/bytecompiler\/BytecodeGeneratorBaseInlines.h#L176\"><code>addVar<\/code><\/a> which will allocate a new &ldquo;anonymous&rdquo; local. But then the code asserts that the index of the newly allocated local (i.e. 
the offset of the virtual register it contains) is the same as the offset we previously used to create the virtual register, so it&rsquo;s all good.<\/p>\n<p><sup><a id=\"fn.3\" class=\"footnum\" href=\"#fnr.3\">3<\/a><\/sup> How did we know to look for the <code>ResolveNode<\/code>? Well, the <code>emitBytecode<\/code> method needs to <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/Nodes.h#L173\">be implemented by subclasses<\/a> of <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/Nodes.h#L168\"><code>ExpressionNode<\/code><\/a>. If we look at how a simple binary expression is <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/Parser.cpp#L3943\">parsed<\/a> (and given that <code>ASTBuilder<\/code> <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/ASTBuilder.h#L119\">defines<\/a> <code>BinaryOperand<\/code> as <code>std::pair&lt;ExpressionNode*, BinaryOpInfo&gt;<\/code>), it&rsquo;s clear that any variable reference has already been lifted to an <code>ExpressionNode<\/code>.<\/p>\n<p>So instead, we take the bottom up approach. We find the lexer\/parser token definitions, one of which is the <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/ParserTokens.h#L116\"><code>IDENT<\/code><\/a> token. Then it&rsquo;s simply a matter of going over its uses in <code>Parser.cpp<\/code>, until we find our <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/Parser.cpp#L4497\">smoking gun<\/a>. 
This gets us into <a href=\"https:\/\/github.com\/WebKit\/webkit\/blob\/44bc82609c81ab4bf61d478b0442378b7526c339\/Source\/JavaScriptCore\/parser\/ASTBuilder.h#L197\"><code>createResolve<\/code><\/a> aaaaand<\/p>\n<div class=\"highlight\"><pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-C++\" data-lang=\"C++\"><span style=\"color:#66d9ef\">return<\/span> <span style=\"color:#a6e22e\">new<\/span> (m_parserArena) ResolveNode(location, ident, start);\n<\/code><\/pre><\/div><p>That&rsquo;s the node we&rsquo;re looking for!<\/p>\n<p><sup><a id=\"fn.4\" class=\"footnum\" href=\"#fnr.4\">4<\/a><\/sup> <code>C_LOOP<\/code> is a special backend for JSC&rsquo;s portable assembler. What is special about it is that it generates C++ code, so that it can be used on otherwise unsupported architectures. Remember that the portable assembler (<code>offlineasm<\/code>) runs at compilation time.<\/p>\n"},{"title":"About","link":"https:\/\/tlog.quasinomial.net\/about\/","pubDate":"Mon, 01 Jan 0001 00:00:00 +0000","guid":"https:\/\/tlog.quasinomial.net\/about\/","description":"<p>Freelancer working on compiler-y things. Ex-<a href=\"https:\/\/www.vusec.net\/people\/angelos-oikonomopoulos\/\">VUSec<\/a>. <a href=\"https:\/\/www.igalia.com\/\">Igalian<\/a>.<\/p>\n<p>Contact: [name of the language described <a href=\"https:\/\/en.wikipedia.org\/wiki\/The_C_Programming_Language\">here<\/a>]@quasinomial.net<\/p>\n"}]}}